Nov 2025
www.medrxiv.org

Genomic privacy risks in GWAS summary statistics

4
1. Public_Reviews 18 Nov 2025
  
  in eLife
  
  eLife Assessment
  
  This important study provides a theoretical framework for quantifying privacy risk from publicly shared genome-wide association summary statistics. The findings reveal the conditions under which genotype reconstruction may become feasible, challenging long-held assumptions about personal data safety. While the evidence is solid, supported by clear mathematical derivations and simulations, validation on large empirical datasets would further strengthen the claims.
  
  Summary
2. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The authors aim to demonstrate that GWAS summary statistics, previously considered safe for open sharing, can, under certain conditions, be used to recover individual-level genotypes when combined with large numbers of high-dimensional phenotypes. By reformulating the GWAS linear model as a system of linear programming constraints, they identify a critical phenotype-to-sample size ratio (R/N) above which genotype reconstruction becomes theoretically feasible.
  
  Strengths:
  
  There is conceptual originality and mathematical clarity. The authors establish a fundamental quantitative relationship between data dimensionality and privacy leakage and validate their theory through well-designed simulations and application to the GTEx dataset. The derivation is rigorous, the implementation reproducible, and the work provides a formal framework for assessing privacy risks in genomic research.
  
  Weaknesses:
  
  The study makes the simplifying assumptions that phenotypes are independent, which is not the case, and are measured without noise. Real-world data are highly correlated across different levels, not only genotype but also multi-omics, so the analysis may overstate recovery potential. The empirical evidence, while illustrative, is limited to small-scale data and idealized conditions; thus, the full practical impact remains to be demonstrated. The GTEx analysis used only whole blood eQTL data from 369 individuals, which cannot capture the complexity, sample heterogeneity, or cross-tissue dependencies typical of biobank-scale studies.
  
  Review 1
3. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This study focuses on the genomic privacy risks associated with Genome-Wide Association Study (GWAS) summary statistics, employing a three-tiered demonstration framework of "theoretical derivation - simulation experiments - real-data validation". The research finds that when GWAS summary statistics are combined with high-dimensional phenotypic data, genotype recovery and individual re-identification can be achieved using linear programming methods. It further identifies key influencing factors such as the effective phenotype-to-sample size ratio (R/N) and minor allele frequency (MAF). These findings provide practical reference for improving data governance policies in genomic research, holding certain real-world significance.
  
  Strengths:
  
  This study integrates theoretical analysis, simulation validation, and the application of real-world datasets to construct a comprehensive research framework, which is conducive to understanding and mitigating the risk of private information leakage in genomic research.
  
  Weaknesses:
  
  (1) Limited scope of variant types covered:
  
  The analysis is conducted solely on Single Nucleotide Polymorphisms (SNPs), omitting other crucial genomic variant types such as Copy Number Variations (CNVs), Insertions/Deletions (InDels), and chromosomal translocations/inversions. From a genomic structure perspective, variants like CNVs and InDels are also core components of individual genetic characteristics, and in some disease-related studies, association signals for these variants can be even more significant than those for SNPs. From the perspective of privacy risk logic, the genotypes of these variants (e.g., copy number for CNVs, base insertion/deletion status for InDels) can also be quantified and could theoretically be inferred backwards using the combination of "summary statistics + high-dimensional phenotypes". Their privacy leakage risks might differ from those of SNPs (for instance, rare CNVs might be more easily re-identified due to higher genetic specificity).
  
  (2) Bias in data applicability scope:
  
  Both the simulation experiments and real-data validation in the study primarily rely on European population samples (e.g., 489 European samples from the 1000 Genomes Project; the genetic background of whole blood tissue samples from the GTEx project is not explicitly mentioned regarding non-European proportions). It only briefly notes a higher risk for African populations in the individual re-identification risk assessment, without conducting systematic analyses for other populations, such as East Asian, South Asian, or admixed American populations. Significant differences in genetic structure (e.g., MAF distribution, linkage disequilibrium patterns) exist across different populations. This may result in the R/N threshold and the relationship between MAF and recovery accuracy identified in the study not being fully applicable to other populations.
  
  Hence, addressing the aforementioned issues through supplementary work would enhance the study's scientific rigor and application value, potentially providing more comprehensive theoretical and technical support for "privacy protection" in genomic data sharing.
  
  Review 2
4. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Author response:
  
  Reviewer #1 (Public Review):
  
  Summary:
  
  The authors aim to demonstrate that GWAS summary statistics, previously considered safe for open sharing, can, under certain conditions, be used to recover individual-level genotypes when combined with large numbers of high-dimensional phenotypes. By reformulating the GWAS linear model as a system of linear programming constraints, they identify a critical phenotype-to-sample size ratio (R/N) above which genotype reconstruction becomes theoretically feasible.
  
  Strengths:
  
  There is conceptual originality and mathematical clarity. The authors establish a fundamental quantitative relationship between data dimensionality and privacy leakage and validate their theory through well-designed simulations and application to the GTEx dataset. The derivation is rigorous, the implementation reproducible, and the work provides a formal framework for assessing privacy risks in genomic research.
  
  We thank the reviewer for the positive assessment of our work’s conceptual originality, mathematical rigor, and reproducible implementation.
  
  Weaknesses:
  
  The study makes the simplifying assumptions that phenotypes are independent, which is not the case, and are measured without noise. Real-world data are highly correlated across different levels, not only genotype but also multi-omics, so the analysis may overstate recovery potential. The empirical evidence, while illustrative, is limited to small-scale data and idealized conditions; thus, the full practical impact remains to be demonstrated. The GTEx analysis used only whole blood eQTL data from 369 individuals, which cannot capture the complexity, sample heterogeneity, or cross-tissue dependencies typical of biobank-scale studies.
  
  We recognize the concern regarding the independence and noiselessness assumptions in our framework. While assuming independent, noiseless phenotypes represents an idealized scenario, it allows us to clearly demonstrate the conceptual potential of our framework. The GTEx whole blood analysis is intended as a proof-of-concept, illustrating feasibility rather than capturing full biological complexity. In the revised manuscript, we will clarify these assumptions, emphasize that practical reconstruction accuracy may be lower in correlated and noisy real-world data, and expand empirical validation to multiple GTEx tissues and independent cohorts to demonstrate robustness under more realistic conditions.
  
  Reviewer #2 (Public Review):
  
  Summary:
  
  This study focuses on the genomic privacy risks associated with Genome-Wide Association Study (GWAS) summary statistics, employing a three-tiered demonstration framework of "theoretical derivation - simulation experiments - real-data validation". The research finds that when GWAS summary statistics are combined with high-dimensional phenotypic data, genotype recovery and individual re-identification can be achieved using linear programming methods. It further identifies key influencing factors such as the effective phenotype-to-sample size ratio (R/N) and minor allele frequency (MAF). These findings provide practical reference for improving data governance policies in genomic research, holding certain real-world significance.
  
  Strengths:
  
  This study integrates theoretical analysis, simulation validation, and the application of real-world datasets to construct a comprehensive research framework, which is conducive to understanding and mitigating the risk of private information leakage in genomic research.
  
  We are glad the reviewer values our integration of theory, simulation, and real data.
  
  Weaknesses:
  
  (1) Limited scope of variant types covered:
  
  The analysis is conducted solely on Single Nucleotide Polymorphisms (SNPs), omitting other crucial genomic variant types such as Copy Number Variations (CNVs), Insertions/Deletions (InDels), and chromosomal translocations/inversions. From a genomic structure perspective, variants like CNVs and InDels are also core components of individual genetic characteristics, and in some disease-related studies, association signals for these variants can be even more significant than those for SNPs. From the perspective of privacy risk logic, the genotypes of these variants (e.g., copy number for CNVs, base insertion/deletion status for InDels) can also be quantified and could theoretically be inferred backwards using the combination of "summary statistics + high-dimensional phenotypes". Their privacy leakage risks might differ from those of SNPs (for instance, rare CNVs might be more easily re-identified due to higher genetic specificity).
  
  This point raises an important clarification regarding variant types beyond SNPs. We would like to clarify that our mathematical framework is not inherently restricted to SNPs. In fact, it is broadly applicable to any genetic variant that can be represented numerically, e.g., allelic dosage (0/1/2), copy number counts for CNVs, or presence/absence indicators for InDels. Conceptually, CNVs, InDels, and other structural variants can be incorporated in the same way as SNPs.
  
  The main limitation arises from the current availability of GWAS summary statistics for these non-SNP variant types (e.g., CNV dosages ≥ 3), which are still relatively scarce. As a result, empirically evaluating our framework on these variant classes would be challenging. In the revision, we will explicitly emphasize the general applicability of our framework to diverse genetic variants while clearly noting this practical limitation. We also plan to include simulations to investigate the recovery accuracy associated with CNVs and InDels, which will further demonstrate the extensibility of our approach. It should be noted, however, that leaking genotypic data of ordinary SNPs already raises concerns, regardless of other types of genetic variants.
  
  (2) Bias in data applicability scope:
  
  Both the simulation experiments and real-data validation in the study primarily rely on European population samples (e.g., 489 European samples from the 1000 Genomes Project; the genetic background of whole blood tissue samples from the GTEx project is not explicitly mentioned regarding non-European proportions). It only briefly notes a higher risk for African populations in the individual re-identification risk assessment, without conducting systematic analyses for other populations, such as East Asian, South Asian, or admixed American populations. Significant differences in genetic structure (e.g., MAF distribution, linkage disequilibrium patterns) exist across different populations. This may result in the R/N threshold and the relationship between MAF and recovery accuracy identified in the study not being fully applicable to other populations.
  
  Hence, addressing the aforementioned issues through supplementary work would enhance the study’s scientific rigor and application value, potentially providing more comprehensive theoretical and technical support for "privacy protection" in genomic data sharing.
  
  We acknowledge this valid concern regarding the generalizability of our findings. Our analysis already identifies MAF as a key factor influencing recovery accuracy, which begins to address population-specific genetic differences. Importantly, because our reconstruction method treats each variant independently, its success does not rely on population-specific LD patterns. The core determinant of feasibility is the ratio of phenotypic dimensions to sample size (R/N), a relationship we expect to hold across populations.
  
  Nevertheless, we agree that further validation across diverse ancestries can be helpful. In the revised manuscript, we will try to include additional cohorts as extended validation analyses.
  
  AuthorResponse

Tags

Summary

AuthorResponse

Review 2

Review 1

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2025.09.09.25335252v1
www.biorxiv.org

Kinesin-1 conformational dynamics are controlled by a cargo-sensitive TPR switch

4
1. Public_Reviews 18 Nov 2025
  
  in eLife
  
  eLife Assessment
  
  The manuscript by Shukla et al. provides important mechanistic insights into kinesin-1 autoinhibition and cargo-mediated activation. Using a convincing combination of protein engineering, computational modeling, biophysical assays, HDX-MS, and electron microscopy, the authors reveal how cargo binding induces an allosteric transition that propagates to the motor domains and enhances MAP7 binding. Despite limitations arising from conformational heterogeneity and structural resolution, the study presents a unified mechanism for kinesin-1 activation that will be of broad interest to the motor protein, structural biology, and cell biology communities.
  
  Summary
2. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The authors aim to interrogate the sets of intramolecular interactions that cause kinesin-1 hetero-tetramer autoinhibition and the mechanism by which cargo interactions via the light chain tetratricopeptide repeat domains can initiate motor activation. The molecular mechanisms of kinesin regulation remain an important question with respect to intracellular transport. It has implications for the accuracy and efficiency of motor transport by different motor families, for example, the direction of cargos towards one or other microtubules.
  
  Strengths:
  
  The authors focus on the response of inactivated kinesin-1 to peptides found in cargos and the cascade of conformational changes that occur. They also test the effects of the known activator of kinesin-1 - MAP7 - in the context of their model. The study benefits from multiple complementary methods - structural prediction using AlphaFold3, 2D and 3D analysis of (mainly negative stain) TEM images of several engineered kinesin constructs, biophysical characterisation of the complexes, peptide design, hydrogen/deuterium-exchange mass spectrometry, and simple cell-based imaging. Each set of experiments is thoughtfully designed, and the intrinsic limitations of each method are offset by other approaches such that the assembled data convincingly support the authors' conclusions. This study benefits from prior work by the authors on this system and the tools and constructs they previously accrued, as well as from other recent contributions to the field.
  
  Weaknesses:
  
  It is not always straightforward to follow the design logic of a particular set of experiments, with the result that the internal consistency of the data appears unconvincing in places. For example, i) the Figure 1 AlphaFold3 models do not include motor domains, whereas nearly all of the rest of the data involve constructs with the motor domains; ii) the kinesin constructs are chemically cross-linked prior to TEM sample preparation - this is clear in the Methods but should be included in the Results text, together with some discussion of how this might influence consistency with other methods where crosslinking was not used. Can those cross-links themselves be used to probe the intramolecular interactions in the molecular populations by mass spec? In general, the information content of some of the figure panels could also be improved with more annotations (e.g. the angular relationship between views in Figure 1B and approximate interpretations of the various blobs in Fig 3F), and more thought could be given to what the reader should extract from the representative micrographs in several figures - inclusion of the raw data is welcome, but extraction and magnification of exemplar particles (as is done more effectively in Fig S5) could convey more useful information elsewhere.
  
  Review 1
3. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  In this paper, Shukla, Cross, Kish, and colleagues investigate how binding of a cargo-adaptor mimic (KinTag) to the TPR domains of the kinesin-1 light chain, or disruption of the TPR docking site (TDS) on the kinesin-1 heavy chain, triggers release of the TPR domains from the holoenzyme. This dislocation provides a plausible mechanism for transition out of the autoinhibited lambda-particle toward the open and active conformation of kinesin-1. Using a combination of negative-stain electron microscopy, AlphaFold modeling, biochemical assays, hydrogen-deuterium exchange mass spectrometry (HDX-MS), and other methods, the authors show how TPR undocking propagates conformational changes through the coiled-coil stalk to the motor domains, increasing their mobility and enhancing interactions with the microtubule-bound cofactor MAP7. Together, they propose a model in which the TDS on CC1 of the heavy chain forms a "shoulder" in the compact, autoinhibited state. Cargo-adaptor binding, mimicked here by KinTag, dislodges this shoulder, liberating the motor domains and promoting MAP7 association, driving kinesin-1 activation.
  
  Strengths:
  
  Throughout the study, the authors use a clever construct design - e.g., delta-Elbow, ElbowLock, CC-Di, and the high-affinity KinTag - to test specific mechanisms by directly perturbing structural contacts or affecting interactions. The proposed mechanism of releasing autoinhibition via adaptor-induced TPR undocking is also interrogated with a number of complementary techniques that converge on a convincing model for activation that can be further tested in future studies. The paper is well-written and easy to follow, though some more attention to figure labels and legends would improve the manuscript (detailed in recommendations for the authors).
  
  Weaknesses:
  
  These reflect limits of what the current data can establish rather than flaws in execution. It remains to be tested if the open state of kinesin-1 initiated by TPR undocking is indeed an active state of kinesin-1 capable of processive movement and/or cargo transport. It also remains to be determined what the mechanism of motor domain undocking from the autoinhibited conformation is, and perhaps this could have been explored more here. The authors have shown by HDX-MS that the motor domains become more mobile on KinTag binding, but perhaps molecular dynamics would also be useful for modelling how that might occur.
  
  Review 2
4. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  The manuscript by Shukla and colleagues presents a comprehensive study that addresses a central question in kinesin-1 regulation - how cargo binding to the kinesin light chain (KLC) tetratricopeptide repeat (TPR) domains triggers activation of full-length kinesin-1 (KHC). The authors combine AlphaFold3 modeling, biophysical analysis (fluorescence polarization, hydrogen-deuterium exchange), and electron microscopy to derive a mechanistic model in which the KLC-TPR domains dock onto coiled-coil 1 (CC1) of the KHC to form the "TPR shoulder," stabilizing the autoinhibited (λ-particle) conformation. Binding of a W/Y-acidic cargo motif (KinTag) or deletion of the CC1 docking site (TDS) dislocates this shoulder, liberating the motor domains and enhancing accessibility to cofactors such as MAP7. The results link cargo recognition to allosteric structural transitions and present a unified model of kinesin-1 activation.
  
  Strengths:
  
  (1) The study addresses a fundamental and long-standing question in kinesin-1 regulation using a multidisciplinary approach that combines structural modeling, quantitative biophysics, and electron microscopy.
  
  (2) The mechanistic model linking cargo-induced dislocation of the TPR shoulder to activation of the motor complex is well supported by both structural and biochemical evidence.
  
  (3) The authors employ elegant protein-engineering strategies (e.g., ElbowLock and ΔTDS constructs) that enable direct testing of model predictions, providing clear mechanistic insight rather than purely correlative data.
  
  (4) The data are internally consistent and align well with previous studies on kinesin-1 regulation and MAP7-mediated activation, strengthening the overall conclusion.
  
  Weaknesses:
  
  (1) While the EM and HDX-MS analyses are informative, the conformational heterogeneity of the complex limits structural resolution, making some aspects of the model (e.g., stoichiometry or symmetry of TPR docking) indirect rather than directly visualized.
  
  (2) The dynamics of KLC-TPR docking and undocking remain incompletely defined; it is unclear whether both TPR domains engage CC1 simultaneously or in an alternating fashion.
  
  (3) The interplay between cargo adaptors and MAP7 is discussed but not experimentally explored, leaving open questions about the sequence and exclusivity of their interactions with CC1.
  
  Review 3

Tags

Summary

Review 3

Review 2

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.04.08.647705v2
www.biorxiv.org

Interplay between cohesin and TORC1 links chromosome segregation and gene expression to environmental changes

3
1. Public_Reviews 18 Nov 2025
  
  in eLife
  
  eLife Assessment
  
  This important study describes a new link between nutrient signaling and chromosome regulation, providing compelling evidence that reduced activity in the central nutrient-sensing pathway governed by TORC1 improves chromosome stability and alters gene expression in S. pombe through effects on cohesin. While the biological importance of this newly described circuit is not yet fully known, and some data would benefit from further clarification, the overall body of evidence supports the main conclusions.
  
  Summary
2. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  In this study, Besson et al. investigate how environmental nutrient signals regulate chromosome biology through the TORC1 signaling pathway in Schizosaccharomyces pombe. Specifically, the authors explore the impact of TORC1 on cohesin function - a protein complex essential for chromosome segregation and transcriptional regulation. Through a combination of genetic screens, biochemical analysis, phospho-proteomics, and transcriptional profiling, they uncover a functional and physical interaction between TORC1 and cohesin. The data suggest that reduced TORC1 activity enhances cohesin binding to chromosomes and improves chromosome segregation, with implications for stress-responsive gene expression, especially in subtelomeric regions.
  
  Strengths:
  
  This work presents a compelling link between nutrient sensing and chromosome regulation. The major strength of the study lies in its comprehensive and multi-disciplinary approach. The authors integrate genetic suppression screens, live-cell imaging, chromatin immunoprecipitation, co-immunoprecipitation, and mass spectrometry to uncover the functional connection between TORC1 signaling and cohesin. The use of phospho-mutant alleles of cohesin subunits and their loader provides mechanistic insight into the regulatory role of phosphorylation. The addition of transcriptomic analysis further strengthens the biological relevance of the findings and places them in a broader physiological context. Altogether, the dataset convincingly supports the authors' main conclusions and opens up new avenues of investigation.
  
  Weaknesses:
  
  While the study is strong overall, a few limitations are worth noting. The consistency of cohesin phosphorylation changes under different TORC1-inhibiting conditions (e.g., genetic mutants vs. rapamycin treatment) is unclear and could benefit from further clarification. The phosphorylation sites identified on cohesin subunits do not match known AGC kinase consensus motifs, raising the possibility that the modifications are indirect. The study relies heavily on one TORC1 mutant allele (mip1-R401G), and additional alleles could strengthen the generality of the findings. Furthermore, while the results suggest that nutrient availability influences cohesin function, this is not directly tested by comparing growth or cohesin dynamics under defined nutrient conditions.
  
  Review 1
3. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  In this study, the authors follow up on a previous suppressor screen of a temperature-sensitive allele of mis4 (mis4-G1487D), the cohesin loading factor in S. pombe, and identify additional suppressor alleles tied to the S. pombe TORC1 complex. Their analysis suggests that these suppressor mutations attenuate TORC1 activity, while enhanced TORC1 activity is deleterious in this context. Suppression of TORC1 activity also ameliorates chromosome segregation and spindle defects observed in the mis4-G1487D strain, although some more subtle effects are not reconstituted. The authors provide evidence that this genetic suppression is also tied to the reconstitution of cohesin loading. Moreover, disrupting TORC1 also enhances Mis4/cohesin association with chromatin (likely reflecting enhanced loading) in WT cells, while rapamycin treatment can enhance the robustness of chromosome transmission. These effects likely arise directly through TORC1 or its downstream effector kinases, as TORC1 co-purifies with Mis4 and Rad21; these factors are also phosphorylated in a TORC1-dependent fashion. Disrupting Sck2, a kinase downstream of TORC1, also suppresses the mis4-G1487D allele while simultaneous disruption of Sck1 and Sck2 enhances cohesin association with chromatin, albeit with differing effects on phosphorylation of Mis4 and Psm1/Smc1. Phosphomutants of Mis4 and Psm1 that mimic observed phosphorylation states identified by mass spectrometry that are TORC1-dependent also suppressed phenotypes observed in the mis4-G1487D background. Last, the authors provide evidence that the mis4-G1487D background and TORC1 mutant backgrounds display an overlap in the dysregulation of genes that respond to environmental conditions, particularly in genes tied to meiosis or other "stress".
  
  Overall, the authors provide compelling evidence from genetics, biochemistry, and cell biology to support a previously unknown mechanism by which nutrient sensing regulates cohesin loading with implications for the stress response. The technical approaches are generally sound, well-controlled, and comprehensive.
  
  Specific Points:
  
  (1) While the authors favor the model that the enhanced cohesin loading upon diminished TORC1 activity helps cells to survive harsh environmental conditions, starvation of S. pombe also drives commitment to meiosis, so it seems equally plausible that enhanced cohesin loading is related to preparing the chromosomes to mate.
  
  (2) Related to Point 1, the lab of Sophie Martin previously published that phosphorylation of Mis4 characterizes a cluster of phosphotargets during starvation/meiotic induction (PMID: 39705284). This work should be cited, and the authors should interrogate how their observations do or do not relate to these prior observations (are these the same phosphosites?).
  
  (3) It would be useful for the authors to combine their experimental data sets to interrogate whether there is a relationship between the regions where gene expression is altered in the mis4-G1487D strain and changes in the loading of cohesin in their ChIP experiments.
  
  (4) Given that the genes that are affected are predominantly sub-telomeric while most genes are not affected in the mis4-G1487D strain, one possibility that the authors may wish to consider is that the regions that become dysregulated are tied to heterochromatic regions where Swi6/HP1 has been implicated in cohesin loading.
  
  (5) It would be helpful to show individual data points from replicates in the bar graphs - it is not always clear what comprises the data sets, and superplots would be of great help.
  
  Review 2

Tags

Summary

Review 2

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.07.24.603895v2
www.biorxiv.org

UV irradiation alters TFAM binding to mitochondrial DNA

4
1. Public_Reviews 18 Nov 2025
  
  in eLife
  
  eLife Assessment
  
  Mitochondrial DNA (mtDNA) exhibits a degree of resistance to mutagenesis under genotoxic stress, and this study on the mitochondrial Transcription Factor A (TFAM) presents valuable data concerning the possible mechanisms involved. The presented data are solid, technically rigorous, and consistent with established literature findings. The experiments are well-executed, providing reliable evidence on the change of TFAM-DNA interactions following UVC irradiation. However, the evidence is inadequate to support the primary claims.
  
  Summary
2. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The authors investigate how UVC-induced DNA damage alters the interaction between the mitochondrial transcription factor TFAM and mtDNA. Using live-cell imaging, qPCR, atomic force microscopy (AFM), fluorescence anisotropy, and high-throughput DNA-chip assays, they show that UVC irradiation reduces TFAM sequence specificity and increases mtDNA compaction without protecting mtDNA from lesion formation. From these findings, the authors suggest that TFAM acts as a "sensor" of damage rather than a protective or repair-promoting factor.
  
  Strengths:
  
  (1) The focus on UVC damage offers a clean system to study mtDNA damage sensing independently of more commonly studied repair pathways, such as oxidative DNA damage. The impact of UVC damage is not well understood in the mitochondria, and this study fills that gap in knowledge.
  
  (2) In particular, the custom mitochondrial genome DNA chip provides high-resolution mapping of TFAM binding and reveals a global loss of sequence specificity following UVC exposure.
  
  (3) The combination of in vitro TFAM DNA biophysical approaches, combined with cellular responses (gene expression, mtDNA turnover), provides a coherent multi-scale view.
  
  (4) The authors demonstrate that TFAM-induced compaction does not protect mtDNA from UVC lesions, an important contribution given assumptions about TFAM providing protection.
  
  Weaknesses:
  
  (1) The authors show a decrease in mtDNA levels and increased lysosomal colocalization but do not define the pathway responsible for degradation. Distinguishing between replication dilution, mitophagy, or targeted degradation would strengthen the interpretation.
  
  (2) The sudden induction of mtDNA replication genes and transcription at 24 h suggests that intermediate timepoints (e.g., 12 hours) could clarify the kinetics of the response and avoid the impression that the sampling coincidentally captured the peak.
  
  (3) The authors report no loss of mitochondrial membrane potential, but this single measure is limited. Complementary assays such as Seahorse analysis, ATP quantification, or reactive oxygen species measurement could more fully assess functional integrity.
  
  (4) The manuscript briefly notes enrichment of TFAM at certain regions of the mitochondrial genome but provides little interpretation of why these regions are favored. Discussion of whether high-occupancy sites correspond to regulatory or structural elements would add valuable context.
  
  (5) It remains unclear whether the altered DNA topology promotes TFAM compaction or vice versa. Addressing this directionality, perhaps by including UVC-only controls for plasmid conformation, would help disentangle these effects if UVC is causing compaction alone.
  
  (6) The authors report a discrepancy between the anisotropy and binding array results. The reason for this is not clear, and one wonders whether an orthogonal approach to the binding experiments would explain the difference (minor point).
  
  Assessment of conclusions:
  
  The manuscript successfully meets its primary goal of testing whether TFAM protects mtDNA from UVC damage and the impact this has on the mtDNA. While the data point to an intriguing model in which TFAM acts as a sensor of damaged mtDNA, validating this model requires further investigation, which is likely better suited to a follow-up study. Also, the biological impact of this compaction, such as altered transcription levels, is not clear in this study.
  
  Impact and utility of the methods:
  
  This work advances our understanding of how mitochondria manage UVC genome damage and proposes a structural mechanism for damage "sensing" independent of canonical repair. The methodology, including the custom TFAM DNA chip, will be broadly useful to the scientific community.
  
  Context:
  
  The study supports a model in which mitochondrial genome integrity is maintained not only by repair factors, but also by selective sequestration or removal of damaged genomes. The demonstration that TFAM compaction correlates with damage rather than protection reframes an interesting role in mtDNA quality control.
  
  Review 1
3. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  King et al. present several sets of experiments aimed to address the potential impact of UV irradiation on human mitochondrial DNA as well as the possible role of mitochondrial TFAM protein in handling UV-irradiated mitochondrial genomes. The carefully worded conclusion derived from the results of experiments performed with human HeLa cells, in vitro small plasmid DNA, with PCR-generated human mitochondrial DNA, and with UV-irradiated small oligonucleotides is presented in the title of the manuscript: "UV irradiation alters TFAM binding to mitochondrial DNA". The authors also interpret results of somewhat unconnected experimental approaches to speculate that "TFAM is a potential DNA damage sensing protein in that it promotes UVC-dependent conformational changes in the [mitochondrial] nucleoids, making them more compact." They further propose that such a proposed compaction triggers the removal of UV-damaged mitochondrial genomes as well as facilitates replication of undamaged mitochondrial genomes.
  
  Strengths:
  
  (1) The authors presented convincing evidence that a very high dose (1500 J/m2) of UVC applied to oligonucleotides covering the entire mitochondrial DNA genome reduces the sequence specificity of TFAM binding (Figure 3). This high dose was sufficient to cause UV lesions in a large fraction of individual oligonucleotides. The method was developed in the lab of one of the corresponding authors (reference 74) and is technically well-refined. This result can be published as is or in combination with other data.
  
  (2) The manuscript also presents AFM evidence (Figure 4) that TFAM, which was long known to facilitate compaction of the mitochondrial genome (Alam et al., 2003; PMID 12626705 and follow-up citations), causes in vitro compaction of a small pUC19 plasmid and that approximately 3 UVC lesions per plasmid molecule result in a slight, albeit detectable, increase in TFAM compaction of the plasmid. Both results can be discussed in line with a possible extrapolation to in vivo phenomena, but such a discussion should include a clear statement that no in vivo support was provided within the set of experiments presented in the manuscript.
  
  Weaknesses:
  
  Besides the experiments presented in Figures 3 and 4, other results neither support nor contradict the speculation that TFAM can play a protective role, eliminating mitochondrial genomes with bulky lesions by way of excessive compaction and removing damaged genomes from the in vivo pool.
  
  To specify these weaknesses:
  
  (1) Figure 1 - presents evidence that UVC causes a reduction in the number of mitochondrial spots in cells. The role of TFAM is not assessed.
  
  (2) Figure 2 - presents evidence that UVC causes lesions in mitochondrial genomes in vivo, detectable by qPCR. No direct assessment of TFAM roles in damage repair or mitochondrial DNA turnover is provided, despite the statements in the title of Figure 2 or in the associated text. An approximately 2-fold change in expression of TFAM and of the three other genes does not reasonably support the suggestion of increased mitochondrial DNA turnover over multiple alternative explanations related to mitochondrial DNA maintenance.
  
  (3) Figure 5 - shows that TFAM does not protect either mitochondrial nucleoids formed in vitro or mitochondrial DNA in vivo from UVC lesions, and has no effect on in vivo repair of UV lesions.
  
  (4) Figure 6: Based on the above analysis, the model of the role of TFAM in sensing mtDNA damage and elimination of damaged genomes in vivo appears unsupported.
  
  (5) Additional concern about Figure 3 and relevant discussion: It is not clear whether the more uniform TFAM binding to UV-irradiated oligonucleotides of varying sequence, compared with non-irradiated oligonucleotides, can be explained simply by an overall reduction in binding that eliminates sequence-specific peaks.
  
  Review 2
4. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  The study is grounded in the observations that mitochondrial DNA (mtDNA) exhibits a degree of resistance to mutagenesis under genotoxic stress. The manuscript focuses on the effects of UVC-induced DNA damage on TFAM-DNA binding in vitro and in cells. The authors demonstrate increased TFAM-DNA compaction following UVC irradiation in vitro based on high-throughput protein-DNA binding and atomic force microscopy (AFM) experiments. They did not observe a similar trend in fluorescence polarization assays. In cells, the authors found that UVC exposure upregulated TFAM, POLG, and POLRMT mRNA levels without affecting the mitochondrial membrane potential. Overexpressing TFAM in cells or varying TFAM concentration in reconstituted nucleoids did not alter the accumulation or disappearance of mtDNA damage. Based on their data, the authors proposed a plausible model that, following UVC-induced DNA damage, TFAM facilitates nucleoid compaction, which may serve to signal damage in the mitochondrial genome.
  
  Strengths:
  
  The presented data are solid, technically rigorous, and consistent with established literature findings. The experiments are well-executed, providing reliable evidence on the change of TFAM-DNA interactions following UVC irradiation. The proposed model may inspire future follow-up studies to further study the role of TFAM in sensing UVC-induced damage.
  
  Weaknesses:
  
  The manuscript could be further improved by refining specific interpretations and ensuring terminology aligns precisely with the data presented.
  
  (1) In line 322, the claim of increased "nucleoid compaction" in cells should be removed, as there is a lack of direct cellular evidence. Given that non-DNA-bound TFAM is subject to protease digestion, it is uncertain to what extent the overexpressed TFAM actually integrates into and compacts mitochondrial nucleoids in the absence of supporting immunofluorescence data.
  
  (2) In lines 405 and 406, the authors should avoid equating TFAM overexpression with compaction in the cellular context unless the compaction is directly visualized or measured.
  
  (3) In lines 304 and 305 (and several other places throughout the manuscript), the authors use the term "removal rates". A "removal rate" requires a direct comparison of accumulated lesion levels over a time course under different conditions. Given the complexity of UV-induced DNA damage, which involves both damage formation and potential removal via multiple pathways, a more accurate term that reflects the net result of these opposing processes is "accumulated DNA damage levels." This terminology better reflects the final state measured and avoids implying a single, active 'removal' pathway without sufficient kinetic data.
  
  (4) In line 357, the authors refer to the decrease in the total DNA damage level as "The removal of damaged mtDNA". The decrease may be simply due to the turnover and resynthesis of non-damaged mtDNA molecules. The term "removal" may mislead the casual reader into interpreting the effect as an active repair/removal process.
  
  Review 3

Tags

Summary

Review 3

Review 2

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.10.24.620005v4
www.biorxiv.org www.biorxiv.org

Multi-barrier unfolding of the double-knotted protein, TrmD–Tm1570, revealed by single-molecule force spectroscopy and molecular dynamics

3
1. Public_Reviews 18 Nov 2025
  
  in eLife
  
  eLife Assessment
  
  This study investigates the folding and unfolding behavior of the doubly knotted protein TrmD-Tm1570, providing insight into the molecular mechanisms underlying protein knotting. The findings reveal multiple unfolding pathways and suggest that the formation of double knots may require chaperone assistance, offering valuable insights into topologically complex proteins. The evidence is solid, supported by consistent agreement between simulation and experiment, though some aspects of the presentation and experimental scope could be clarified or expanded.
  
  Summary
2. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  This paper investigates the thermal and mechanical unfolding pathways of the doubly knotted protein TrmD-Tm1570 using molecular simulations, optical tweezers experiments, and other methods. In particular, the detailed analysis of the four major unfolding pathways using a well-established simulation method is an interesting and valuable result.
  
  Strengths:
  
  A key finding that lends credibility to the simulation results is that the molecular simulations at least qualitatively reproduce the characteristic force-extension distance profiles obtained from optical tweezers experiments during mechanical unfolding. Furthermore, a major strength is that the authors have consistently studied the folding and unfolding processes of knotted proteins, and this paper represents a careful advancement building upon that foundation.
  
  Weaknesses:
  
  While optical tweezers experiments offer valuable insights, the knowledge gained from them is limited, as the experiments are restricted to this single technique.
  
  The paper mentions that the high aggregation propensity of the TrmD-Tm1570 protein appears to hinder other types of experiments. This is likely the reason why a key aspect, such as whether a ribosome or molecular chaperones are essential for the folding of TrmD-Tm1570, has not been experimentally clarified, even though it should be possible in principle.
  
  Review 1
3. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  In this manuscript, the authors combined coarse-grained structure-based model simulation, optical tweezer experiments, and AI-based analysis to assess the knotting behavior of the TrmD-Tm1570 protein. Interestingly, they found that while the structure-based model can fold the single knots of TrmD and Tm1570, the double-knotted protein TrmD-Tm1570 cannot knot on its own, suggesting the need for chaperone proteins to facilitate this knotting process. This study has strong potential to advance understanding of the molecular mechanism of knotted proteins, and it is supported by substantial experimental and simulation evidence. However, a few places appear to lack sufficient detail, and the presentation needs more clarification.
  
  Strengths:
  
  A combination of both experimental and computational studies.
  
  Weaknesses:
  
  There is a lack of detail to support some statements.
  
  (1) The use of the AI-based method, SOM, can be emphasized further, especially in its analysis of the simulated unfolding trajectories and discovery of the four unfolding/folding pathways. This will strengthen the statistical robustness of the discovery.
  
  (2) The manuscript would benefit from a clearer description of the correlation between the simulation and experimental results. The current correlation, presented in the paragraph starting from Line 250, focuses on measured distances. The authors could consider providing additional evidence on the order of events observed experimentally and computationally. More statistical analyses on the experimental curves presented in Figure 4 supplement would be helpful.
  
  (3) How did the authors calibrate the timescale between simulation and experiment? Specifically, what is the value of \tau used in Line 270, and how was it calculated? Relevant information would strengthen the connection between simulation and experiment.
  
  (4) In Line 342, the authors comment that whether using native contacts or not, they cannot fold double-knotted TrmD-Tm1570. Could the authors provide more details on how non-native interactions were analyzed?
  
  (5) It appears that the manuscript lacks simulation or experimental evidence to support the statement at Line 343: While each domain can self-tie into its native knot, this process inhibits the knotting of the other domain. Specifically, more clarification on this inhibition is needed.
  
  Review 2

Tags

Summary

Review 2

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.08.27.672562v1
www.biorxiv.org www.biorxiv.org

Ptbp1 is not required for retinal neurogenesis and cell fate specification

4
1. Public_Reviews 18 Nov 2025
  
  in eLife
  
  eLife Assessment
  
  This study used a conditional knockout mouse line to remove Ptbp1 in retinal progenitors and demonstrated that its deletion has no effect on retinal neurogenesis or cell fate specification, thereby challenging the prevailing view of Ptbp1 as a master regulator of neuronal fate. The data are convincing, supported by transcriptomic analysis, histology, and proliferation assays. This study is important, and the broader implications for other CNS regions warrant further investigation.
  
  Summary
2. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The researchers sought to determine whether Ptbp1, an RNA-binding protein formerly thought to be a master regulator of neuronal differentiation, is required for retinal neurogenesis and cell fate specification. They used a conditional knockout mouse line to remove Ptbp1 in retinal progenitors and analyzed the results using bulk RNA-seq, single-cell RNA-seq, immunohistochemistry, and EdU labeling. Their findings show that Ptbp1 deletion has no effect on retinal development, since no defects were found in retinal lamination, progenitor proliferation, or cell type composition. Although bulk RNA-seq indicated changes in RNA splicing and increased expression of late-stage progenitor and photoreceptor genes in the mutants, and single-cell RNA-seq detected relatively minor transcriptional shifts in Müller glia, the overall phenotypic impact was low. As a result, the authors conclude that Ptbp1 is not required for retinal neurogenesis and development, thus contradicting prior statements about its important role as a master regulator of neurogenesis. They argue for a reassessment of this stated role. While the findings are strong in the setting of the retina, the larger implications for other areas of the CNS require more investigation. Furthermore, questions about potential compensation by Ptbp2 warrant further research.
  
  Strengths:
  
  This study calls into question the commonly held belief that Ptbp1 is a critical regulator of neurogenesis in the CNS, particularly in retinal development. The adoption of a conditional knockout mouse model provides a reliable way to eliminate Ptbp1 in retinal progenitors while avoiding the off-target effects often reported in RNAi experiments. The combination of bulk RNA-seq, scRNA-seq, and immunohistochemistry enables a thorough examination of molecular and cellular alterations at both embryonic and postnatal stages, which strengthens the study's findings. Furthermore, using publicly available RNA-Seq datasets for comparison improves the investigation of splicing and expression across tissues and cell types. The work is well-organized, with informative figure legends and supplemental data that clearly show no substantial phenotypic changes in retinal lamination, proliferation, or cell fate, despite identified transcriptional and splicing modifications.
  
  Weaknesses:
  
  The retina-specific approach raises questions regarding whether Ptbp1 is required in other CNS locations where its neurogenic roles were first proposed. Although the study performs well in transcriptome and histological analyses, it lacks functional assessments (such as electrophysiological or behavioral testing) to determine if small changes in splicing or gene expression affect retinal function.
  
  Review 1
3. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  Ptbp1 has been proposed as a key regulator of neuronal fate through its role in repressing neurogenesis. In this study, the authors conditionally inactivated Ptbp1 in mouse retinal progenitor cells using the Chx10-Cre line. While RNA-seq analysis at E16 revealed some changes in gene expression, there were no significant alterations in retinal cell type composition, and only modest transcriptional changes in the mature retina, as assessed by immunofluorescence and scRNA-seq. Based on these findings, the authors conclude that Ptbp1 is not essential for cell fate determination during retinal development.
  
  Strengths:
  
  Despite some effects of Ptbp1 inactivation (initiated around E11.5 with the onset of Chx10-Cre activity) on gene expression and splicing, the data convincingly demonstrate that retinal cell type composition remains largely unaffected. This study is highly significant since it challenges the prevailing view of Ptbp1 as a central repressor of neurogenesis and highlights the need to further investigate, or re-evaluate, its role in other model systems and regions of the CNS.
  
  Weaknesses:
  
  A limitation of the study is the use of the Chx10-Cre driver, which initiates recombination around E11. This timing does not permit assessment of Ptbp1 function during the earliest phases of retinal development, if expressed at that time.
  
  Comments on revisions:
  
  The authors have thoroughly and satisfactorily addressed all my previous comments.
  
  Review 2
4. Public_Reviews 18 Nov 2025
  
  in eLife
  
  Author response:
  
  The following is the authors’ response to the original reviews.
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The researchers sought to determine whether Ptbp1, an RNA-binding protein formerly thought to be a master regulator of neuronal differentiation, is required for retinal neurogenesis and cell fate specification. They used a conditional knockout mouse line to remove Ptbp1 in retinal progenitors and analyzed the results using bulk RNA-seq, single-cell RNA-seq, immunohistochemistry, and EdU labeling. Their findings show that Ptbp1 deletion has no effect on retinal development, since no defects were found in retinal lamination, progenitor proliferation, or cell type composition. Although bulk RNA-seq indicated changes in RNA splicing and increased expression of late-stage progenitor and photoreceptor genes in the mutants, and single-cell RNA-seq detected relatively minor transcriptional shifts in Müller glia, the overall phenotypic impact was low. As a result, the authors conclude that Ptbp1 is not required for retinal neurogenesis and development, thus contradicting prior statements about its important role as a master regulator of neurogenesis. They argue for a reassessment of this stated role. While the findings are strong in the setting of the retina, the larger implications for other areas of the CNS require more investigation. Furthermore, questions about potential compensation by Ptbp2 warrant further research.
  
  Strengths:
  
  This study calls into question the commonly held belief that Ptbp1 is a critical regulator of neurogenesis in the CNS, particularly in retinal development. The adoption of a conditional knockout mouse model provides a reliable way to eliminate Ptbp1 in retinal progenitors while avoiding the off-target effects often reported in RNAi experiments. The combination of bulk RNA-seq, scRNA-seq, and immunohistochemistry enables a thorough examination of molecular and cellular alterations at both embryonic and postnatal stages, which strengthens the study's findings. Furthermore, using publicly available RNA-Seq datasets for comparison improves the investigation of splicing and expression across tissues and cell types. The work is well-organized, with informative figure legends and supplemental data that clearly show no substantial phenotypic changes in retinal lamination, proliferation, or cell fate, despite identified transcriptional and splicing modifications.
  
  We thank the Reviewer for their evaluation of the strengths of the study.
  
  Weaknesses:
  
  The retina-specific approach raises questions regarding whether Ptbp1 is required in other CNS locations where its neurogenic roles were first proposed. The claim that Ptbp1 is "fully dispensable" for retinal development may need to be toned down, given the transcriptional and splicing modifications identified. The possibility of subtle or transitory impacts, such as ectopic neuron development followed by cell death, is postulated, but not completely investigated. Furthermore, as the authors point out, the compensating potential of increased Ptbp2 warrants additional exploration. Although the study performs well in transcriptome and histological analyses, it lacks functional assessments (such as electrophysiological or behavioral testing) to determine if small changes in splicing or gene expression affect retinal function. While 864 splicing events have been found, the functional significance of these alterations, notably the 7% that are neuronal-enriched and the 35% that are rod-specific, has not been thoroughly investigated. The manuscript might be improved by describing how these splicing changes affect retinal development or function.
  
  We have revised the text to address these points as requested.
  
  Reviewer #2 (Public review):
  
  Summary:
  
  Ptbp1 has been proposed as a key regulator of neuronal fate through its role in repressing neurogenesis. In this study, the authors conditionally inactivated Ptbp1 in mouse retinal progenitor cells using the Chx10-Cre line. While RNA-seq analysis at E16 revealed some changes in gene expression, there were no significant alterations in retinal cell type composition, and only modest transcriptional changes in the mature retina, as assessed by immunofluorescence and scRNA-seq. Based on these findings, the authors conclude that Ptbp1 is not essential for cell fate determination during retinal development.
  
  Strengths:
  
  Despite some effects of Ptbp1 inactivation (initiated around E11.5 with the onset of Chx10-Cre activity) on gene expression and splicing, the data convincingly demonstrate that retinal cell type composition remains largely unaffected. This study is highly significant since it challenges the prevailing view of Ptbp1 as a central repressor of neurogenesis and highlights the need to further investigate, or re-evaluate, its role in other model systems and regions of the CNS.
  
  We thank the Reviewer for their evaluation of the strengths of the study.
  
  Weaknesses:
  
  A limitation of the study is the use of the Chx10-Cre driver, which initiates recombination around E11. This timing does not permit assessment of Ptbp1 function during the earliest phases of retinal development, if expressed at that time.
  
  We have revised the text to address the potential limitations of the use of the Chx10-Cre driver in this study.
  
  Reviewer #1 (Recommendations for the authors):
  
  (1) The author only selected scRNA-Seq datasets to examine the expression patterns of Ptbp1 in the retina; incorporating immunostaining analysis in the mouse retina is necessary.
  
  Ptbp1 expression patterns in the mouse retina were examined in Fig. 1b-1d, where Ptbp1 protein was detected by immunostaining in Chx10-Cre control and Ptbp1KO retinas at E14, P1, and P30, and quantified in Fig. 1e.
  
  (2) In Figure 1, Ptbp1 signals were still detected in the KO mice, with the author suggesting that this may indicate cross-reactivity with an unknown epitope. Why is this unknown epitope only detected in the ganglion cell layer? Additional antibodies are needed to confirm the staining results. Furthermore, it is essential to verify the KO at the mRNA level using PCR.
  
  We are unsure of the identity of this cross-reacting epitope, although it might be Ptbp2, which is enriched in immature retinal ganglion cells (Fig. S1). In any case, we do not believe that the identity of this epitope is relevant to assessing the efficiency of Ptbp1 deletion, as Ptbp1 itself is not detectably expressed in retinal ganglion cells (Fig. S1).
  
  Although the heatmap in Figure 2B indicates a decrease in Ptbp1 levels in the KO mice, the absence of statistical data makes it difficult to evaluate the KO efficiency.
  
  Respectfully, we believe that Ptbp1 knockout efficiency is adequately addressed using immunohistochemistry, and that further statistical analysis is not essential here.
  
  Cre staining of the Chx10-Cre;Ptbp1lox/lox mice, or the use of reporter lines, is also suggested to indicate the cells in which knockout is expected. Providing high-power images of the Ptbp1 staining would help readers clearly recognize the staining signals.
  
  To clarify the identity of the knockout cells, we have updated Figure 1 to include the Chx10-CreEGFP staining which more clearly delineates the cells in which Ptbp1 is deleted. Regarding verification of the knockout, we believe additional PCR assays are not necessary, as we have already demonstrated efficient loss of Ptbp1 in Chx10-Cre-expressing cells at the RNA level by both single-cell RNA-sequencing and bulk RNA-sequencing, and also at the protein level by immunohistochemistry. Sun1-GFP Cre reporter lines are also used in Figures 1 and S2 to visualize patterns of Cre activity, a point which is now highlighted in the text. Together, these approaches provide sufficient evidence for effective Ptbp1 knockout.
  
  (3) The possibility of ectopic neuron formation followed by cell death is intriguing but underexplored. Consider adding apoptosis assays (e.g., TUNEL staining) at early developmental stages to test this hypothesis.
  
  While apoptosis assays such as TUNEL staining would be helpful to address this hypothesis, we feel incorporating these additional experiments is currently beyond the scope of this study. We agree the possibility of cell death is intriguing and plan to explore this in future work.
  
  (4) On page 4, the statement "We did not observe any significant differences ... Chx10-Cre;Ptbp1lox/lox mice (Fig. 2b,c)" should refer to Fig. 3b,c instead.
  
  We have changed the text to refer to Fig. 3b,c.
  
  (5) The labeling in Figure 3 as "Cre-Ptbp1" is inconsistent with the figure legend "Ptbp1-Ctrl.".
  
  This language was used because the samples for EdU staining in Figure 3 were Chx10-Cre negative Ptbp1lox/lox mice. We have updated the language in the manuscript and figure to reflect the genotypes more clearly.
  
  (6) P30 mice are still sexually immature; the term "adolescent" or "juvenile" should be used instead of "adult."
  
  We have updated the language in the text from “adult” to “adolescent” to describe P30 mice, although the retina itself is mature by this age.
  
  Reviewer #2 (Recommendations for the authors):
  
  (1) As mentioned in the public review, a limitation of the study is that Ptbp1 KO is not induced prior to E11. The authors should acknowledge this limitation and include in the Discussion that the use of the Chx10-Cre line does not permit evaluation of a potential role for Ptbp1 during very early stages of retinal development, should it be expressed at that time (an aspect that would be important to determine).
  
  We have added this limitation to the Discussion in the sentence highlighted below.
  
  Furthermore, the use of the Chx10-Cre transgene in this study does not exclude a potential role for Ptbp1 during very early stages of retinal development prior to E11 (pg. 6).
  
  (2) While the data convincingly show no significant changes in retinal cell type distribution in Ptbp1 mutants, the claims in the abstract and introduction that Ptbp1 is "dispensable for retinal development" or "dispensable for the process of neurogenesis" may be overstated. Indeed, the results indicate that loss of Ptbp1 function influences retinal development by promoting neurogenesis through induction of a neuronal-like splicing program in neural progenitors. Concluding solely that Ptbp1 is dispensable for retinal cell fate specification, rather than for retinal development as a whole, would thus seem more accurate.
  
  We have updated the language in the text to reflect Ptbp1’s role in regulating retinal cell fate specification more clearly.
  
  (3) The authors conclude from Figure 5 that "No changes in the identity or composition of any retinal cell type were observed." Which statistical test was applied to support this conclusion? The figure indicates that Müller cells comprise 10.5% of the total cell population in controls versus 8.2% in Ptbp1-KO retinas. It may be important to consider the overall distribution of glia versus all neurons (rather than each neuron subtype individually). While the observed difference (~2% more glia at the expense of neurons) appears modest, it would be important to determine whether this trend is consistent and statistically significant.
  
  To evaluate cell type composition, we performed differential expression analysis across all major retinal cell types and compared proportional cell type representation between control and Ptbp1 KO retinas. While these analyses did not reveal marked differences in any specific cell type, we acknowledge that the scRNA-Seq dataset includes a single experimental replicate, containing two retinas in each replicate. Therefore, we cannot draw firm statistical conclusions regarding the relative distribution of glia versus neurons, and the modest difference observed in glia cell proportion should be interpreted with caution. We agree that assessing glia-to-neuron ratios across additional replicates will be important in future studies.
  
  (4) Referring to Figure S1 (scRNA-seq data), the authors state that Ptbp1 mRNA is robustly expressed in retinal progenitors and Müller glia in both mouse and human retina. While the immunostaining in Figure 4 indeed clearly shows strong expression in Müller cells, the scRNA-seq data presented in Figure S1 do not support the claim of "robust" expression in Müller glia in the mouse retina. This is even more striking in the human data, where panels F and H show that Ptbp1 is expressed at extremely low, certainly not "robust", levels in Müller cells. The corresponding sentence in the Results section should therefore be revised to more accurately reflect the data presented in Figure S1, or be supported by complementary immunofluorescence evidence.
  
  We thank the reviewer for this comment. We have revised this section of the Results to better reflect Fig S1, as follows:
  
  We observe high expression levels of Ptbp1 mRNA in primary retinal progenitors in both species and Müller glia in mouse retina, with weaker expression in neurogenic progenitors, and little expression detectable in neurons at any developmental age.
  
  (5) When mentioning potential compensation by Ptbp2, the authors may also consider discussing the possibility that compensatory mechanisms can differ between knockdown and knockout approaches. In this context, it is noteworthy that a recent study by Konar et al., Exp Eye Res, 2025 (published after the submission of the present manuscript) reports that Ptbp1 knockdown promotes Müller glia proliferation in zebrafish.
  
  We thank the reviewer for this suggestion. To address this, we have included a section considering this possibility in the discussion section highlighted below.
  
  It is also possible that compensatory mechanisms differ between knockdown and knockout approaches. Notably, a recent study (Konar et al. 2025) reported that Ptbp1 knockdown promotes Müller glia proliferation in zebrafish, suggesting that effects of acute reduction of Ptbp1 may not fully mirror those of complete loss-of-function.
  
  (6) The statistical analyses were performed using a t-test. However, this parametric test is not appropriate for experiments with low sample sizes. A non-parametric test, such as the Mann-Whitney test, would be more suitable in this context. Furthermore, performing statistical analysis on n = 2 (Figure 3C) is not statistically valid.
  
  We thank the reviewer for this comment. We agree that with a small n, non-parametric tests are more appropriate. We have added additional retinas (now n = 5) for the Ptbp1-KO condition in Figure 3C and reanalyzed with the appropriate non-parametric Mann-Whitney test. For all other datasets with sufficient replicates (n ≥ 4 per genotype), parametric tests such as unpaired t-tests remain valid, and the results are consistent with non-parametric testing.
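
  The reviewer's point about very small groups can be made concrete with a minimal sketch. It is not the authors' analysis: the function names and the toy values are hypothetical, and the code only enumerates every way of relabeling the pooled observations to obtain an exact two-sided rank-sum (Mann-Whitney) p-value. It also reports the smallest p-value that such a test can return for given group sizes.

```python
from itertools import combinations
from math import comb


def min_two_sided_p(n1: int, n2: int) -> float:
    """Smallest two-sided exact p-value a rank-sum test can return (no ties)."""
    return min(1.0, 2 / comb(n1 + n2, n1))


def exact_mann_whitney_p(x, y) -> float:
    """Two-sided exact p-value obtained by enumerating all group relabelings."""
    pooled = list(x) + list(y)
    n1, n = len(x), len(pooled)
    sorted_vals = sorted(pooled)

    def midrank(value):
        # Average rank of all observations equal to `value` (handles ties).
        first = sorted_vals.index(value) + 1
        last = first + sorted_vals.count(value) - 1
        return (first + last) / 2

    ranks = [midrank(v) for v in pooled]
    expected = n1 * (n + 1) / 2  # mean rank sum of group x under the null
    observed = abs(sum(ranks[:n1]) - expected)

    hits = total = 0
    for idx in combinations(range(n), n1):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical, fully separated toy groups (not data from the manuscript).
print(exact_mann_whitney_p([1, 2], [3, 4]))
print(min_two_sided_p(2, 2), min_two_sided_p(4, 4))
```

  With two observations per group the floor is 2/6 (about 0.33), so no outcome can reach p < 0.05, whereas with four per group the floor is 2/70 (about 0.029). This is one way to see why adding replicates matters more than the choice of test at n = 2.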
  
  (7) Figure S3 is accompanied by only a brief explanation in the Results section (a single sentence despite the figure containing six panels), which makes it difficult for readers unfamiliar with this type of data to interpret.
  
  We thank the reviewer for the suggestion. To address this, we have included a more detailed explanation of Supplementary Figure S3 to better clarify our analysis of mature neuronal and glial cell types in both Ptbp1-deficient and wild-type animals. The relevant text now reads:
  
  Notably, splicing patterns in Ptbp1-deficient retinas showed stronger correlation with Thy1-positive neurons, which exhibit low Ptbp1 expression, and minimal overlap with microglia and auditory hair cells, the adult cell types with the highest Ptbp1 levels (Fig. S3).
  
  Gene expression and splicing changes were compared across several reference tissues: heart tissue and Thy1-positive neurons, mature hair cells, microglia, and astrocytes (Fig. S3a,b). A heatmap of differentially expressed genes showed that while Ptbp1-deficient retinas diverged from WT retinas, their expression profiles did not resemble those of fully differentiated cell types like rods, astrocytes, or adult WT retina (Fig. S3c). Consistently, Pearson correlation analysis revealed that Ptbp1-deficient and WT retinas were more similar to each other than to fully differentiated neuronal or glial populations (Fig. S3d). Splicing profile analysis further revealed that while there was high correlation of PSI between Ptbp1-deficient and WT retinas, Ptbp1-deficient retinas more closely resembled Thy1-positive neurons, whereas WT retinas aligned more strongly with mature cells such as astrocytes, microglia, and auditory hair cells (Fig. S3e,f). Together, these results suggest that although Ptbp1 loss induces hundreds of alternative splicing events, the magnitude of PSI changes in the KO retinas remains considerably lower than that seen in fully differentiated cell types (Extended Data 3). Thus, while a subset of splicing events overlaps with those characteristic of mature neurons or rods, the overall splicing and expression profiles of KO retinas are more similar to those of developing retinal tissue rather than terminally differentiated neuronal or glial populations.
  
  (8) To assess progenitor proliferation, the authors performed EdU labeling experiments in P0 retinas. Is there a rationale for not examining earlier developmental time points to evaluate potential effects on early RPCs?
  
  We thank the reviewer for this comment. We chose to perform EdU labeling experiments at P0 for several reasons. P0 represents a developmental stage where RPCs are actively proliferating and represent ~35% of all retina cells, and the retina is transitioning to intermediate-late-stage development, providing sufficient time to ensure efficient and widespread disruption of Ptbp1. Earlier embryonic timepoints were not examined here, as addressing all stages of development was beyond the scope of this current study. However, we agree that investigating whether Ptbp1 plays stage-specific roles during development on early RPCs is an important question and potential future direction.
  
  (9) In Figure S2, panel D shows staining in GCL under the Ptbp1 condition that does not make sense and is inconsistent with panel C. If possible, the authors should provide an alternative image to prevent any confusion.
  
  Thank you for bringing this to our attention. The image shown for Ptbp1-KO in Figure 2d shows Sun1-eGFP labeling, which labels every cell affected by the Cre condition. The genotype for this mouse was Chx10-Cre;Ptbp1lox/lox;Sun1-GFP. We apologize for any confusion and have updated the genotype in the figure legend.
  
  (10) The authors should revise the following sentence at the end of the Discussion section, as its meaning is unclear: "...and conditions for in vitro analysis may have accurately replicated conditions in the native CNS."
  
  We thank the reviewer for this comment and have revised this sentence in the discussion to read as follows.
  
  Previous studies using knockdown may have been complicated by off-target effects (Jackson et al. 2003), and conditions for in vitro analysis may not have accurately replicated conditions in the native CNS.
  
  AuthorResponse

Tags

Summary

AuthorResponse

Review 2

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.07.02.662808v2
www.biorxiv.org www.biorxiv.org

The Anti-Inflammatory Role of GPNMB in Post-Traumatic Osteoarthritis

4
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  eLife Assessment
  
  This study demonstrates the cartilage-protective effects of osteoactivin in inflammatory experimental models. The work offers valuable insights advancing current knowledge regarding regulation of joint inflammation and tissue degeneration. The evidence provided is compelling and suggests that osteoactivin may serve as a promising therapeutic target for inflammatory joint diseases.
  
  Summary
2. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  While previous studies by this group and others have demonstrated the anti-inflammatory properties of osteoactivin, its specific role in cartilage homeostasis and disease pathogenesis remains unknown.
  
  Strengths:
  
  Strengths of the study include its clinical relevance, given the lack of curative treatments for osteoarthritis, as well as the clarity of the narrative and the quality of most results.
  
  Weaknesses:
  
  A limitation of the study is the reliance on standard techniques; however, this is a minor concern that does not diminish the overall impact or significance of the work.
  
  Comments on revisions:
  
  The authors have satisfactorily addressed my concerns.
  
  Review 1
3. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This manuscript presents compelling evidence for a novel anti-inflammatory function of glycoprotein non-metastatic melanoma protein B (GPNMB) in chondrocyte biology and osteoarthritis (OA) pathology. Through a combination of in vitro, ex vivo, and in vivo models, including the destabilization of the medial meniscus (DMM) surgery in mice, the authors demonstrate that GPNMB expression is upregulated in OA-affected cartilage and that recombinant GPNMB treatment reduces the expression of key catabolic markers (MMPs, Adamts-4, and IL-6) without impairing anabolic gene expression. Notably, DBA/2J mice lacking functional GPNMB exhibit exacerbated cartilage degradation post-injury. Mechanistically, GPNMB appears to mitigate inflammation via the MAPK/ERK pathway. Overall, the work is thorough, methodologically sound, and significantly advances our understanding of GPNMB as a protective modulator in osteoarthritic joint disease. The findings could open pathways for therapeutic development.
  
  Strengths:
  
  (1) Clear hypothesis addressing a well-defined knowledge gap.
  
  (2) Robust and multi-modal experimental design: includes human, mouse, cell-line, explant, and surgical OA models.
  
  (3) Elegant use of DBA/2J GPNMB-deficient mice to mimic endogenous loss-of-function.
  
  (4) Mechanistic insight provided through MAPK signaling analysis.
  
  (5) Statistical analysis appears rigorous and the figures are informative.
  
  Weaknesses:
  
  (1) Clarify the strain background of the DBA/2J GPNMB+ mice: While DBA/2J GPNMB+ is described as a control, it would help to explicitly state whether these are transgenically rescued mice or another background strain. Are they littermates, congenic, or a separate colony?
  
  (2) Provide exact sample sizes and variance in all figure legends: Some figures (e.g., Figure 2 panels) do not consistently mention how many replicates were used (biological vs. technical) for each experimental group. Standardizing this across all panels would improve reproducibility.
  
  (3) Expand on potential sex differences: The DMM model is applied only in male mice, which is noted in the methods. It would be helpful if the authors added 1-2 lines in the discussion acknowledging potential sex-based differences in OA progression and GPNMB function.
  
  (4) Visual clarity in schematic (Figure 7): The proposed mechanism is helpful but the text within the schematic is somewhat dense and could be made more readable with spacing or enlarged font. Also, label the MAPK/ERK pathway explicitly in panel B.
  
  Comments on revisions:
  
  The authors have addressed all the concerns raised in the initial review.
  
  Review 2
4. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author response:
  
  The following is the authors’ response to the original reviews.
  
  Reviewer #1 (Public Reviews):
  
  Weaknesses:
  
  A limitation of the study is the reliance on standard techniques; however, this is a minor concern that does not diminish the overall impact or significance of the work.
  
  We agree that standard techniques were utilized. We believe this approach enhances the reliability and reproducibility of our findings. These methods are well-validated in the field and allow for robust interpretation of the results presented.
  
  Reviewer #2 (Public Reviews):
  
  Weaknesses:
  
  (1) Clarify the strain background of the DBA/2J GPNMB+ mice: While DBA/2J GPNMB+ is described as a control, it would help to explicitly state whether these are transgenically rescued mice or another background strain. Are they littermates, congenic, or a separate colony?
  
  The following language was added to the manuscript, “The DBA/2J GPNMB+ mice are a coisogenic strain purchased from Jackson Laboratories. Jackson Laboratories generated these mice by knocking in the wild-type allele of Gpnmb into the DBA/2J background. By doing so, they rescued the phenotype of the DBA/2J mice. This description has been highlighted in our previous publications (Abdelmagid et al., 2014; Abdelmagid et al., 2015).”
  
  (2) Provide exact sample sizes and variance in all figure legends: Some figures (e.g., Figure 2 panels) do not consistently mention how many replicates were used (biological vs. technical) for each experimental group. Standardizing this across all panels would improve reproducibility.
  
  The manuscript has been updated to include replicates in each figure legend.
  
  (3) Expand on potential sex differences: The DMM model is applied only in male mice, which is noted in the methods. It would be helpful if the authors added 1-2 lines in the discussion acknowledging potential sex-based differences in OA progression and GPNMB function.
  
  To our knowledge, there are no reports of sex-based differences in OA progression and GPNMB function in the literature. It was initially reported that only male C57BL/6J mice (Jackson Laboratories) develop OA following DMM; however, recent literature has shown that both male and female mice develop the disease (Hwang et al., 2021; Ma et al., 2007). For the purpose of this manuscript, only male mice were used to provide preliminary results; however, we plan to repeat the included studies in female mice in the near future.
  
  (4) Visual clarity in schematic (Figure 7): The proposed mechanism is helpful, but the text within the schematic is somewhat dense and could be made more readable with spacing or enlarged font. Also, label the MAPK/ERK pathway explicitly in panel B.
  
  We updated the schematic diagram in figure 7 and the figure legend.
  
  Reviewer #1 (Recommendations for the Authors):
  
  Several concerns must be addressed to improve the clarity and scientific rigor of the manuscript:
  
  (1) Abstract: Specify which MMPs and MAPKs are modulated by osteoactivin.
  
  We specified the MMPs and clarified that GPNMB plays a role in pERK inhibition following inflammation induced by IL-1β stimulation.
  
  (2) Human explant validation: The regulation of MMP-9, MMP-13, and IL-6 should be validated in the human cartilage explant model to support the claim that "GPNMB has an anti-inflammatory role in human primary chondrocytes" (line 123). Additionally, the anatomical origin of the explants must be stated.
  
  Thank you very much for the recommendation. We agree that validating the explant culture for MMP-9, MMP-13, and IL-6 would strengthen our data. Unfortunately, this experiment has been terminated and we no longer have access to the tissue. Human explants were obtained from discarded knee articular cartilage following arthroplasty. The manuscript has been updated to include this information.
  
  (3) DBA/2J GPNMB expression: GPNMB is known to be produced as a truncated protein in DBA/2J cells. The manuscript should address why its expression is reduced. Does this involve mRNA instability? Also, the nomenclature "DBA/2J GPNMB+" versus "DBA/2J" is confusing, especially since both mRNA and protein are still detectable, albeit at reduced levels. Figure 2C is not convincing; therefore, Figures 2C and 2D can be omitted.
  
  The following language was added to the manuscript, “Our results are consistent with the literature which shows that the GPNMB gene in DBA/2J mice carries a nonsense mutation that leads to reduced RNA stability (Anderson et al., 2008).” We can appreciate that the nomenclature "DBA/2J GPNMB+" versus "DBA/2J" could be confusing. However, this is the standard language used in multiple publications, and we want to remain consistent with the literature. Based on your recommendation we have removed Figure 2 C and D and updated the methods and results sections accordingly.
  
  (4) Figures 2J-L: The claim that gene expression changes are "significantly higher in DBA/2J animals compared to fold changes seen in chondrocytes from DBA/2J GPNMB+ controls" is not supported by the current presentation. The data should be plotted on the same graphs, and appropriate statistical analysis (e.g., two-way ANOVA) must be performed.
  
  Graphs for figure 2 have been updated and the appropriate analyses have been performed.
  
  (5) Figure 6: The GPNMB expression data in the presence and absence of IL-1β at 0 and 10 minutes are missing.
  
  We apologize for the confusion. We corrected the mistake and removed the mention of the timepoints 0 and 10 minutes.
  
  Reviewer #2 (Recommendations for the Authors):
  
  Consider unifying terminology around "GPNMB" and "osteoactivin": The term "osteoactivin" is used in some contexts and "GPNMB" in others. Since the focus is GPNMB's role in cartilage, suggest using a single term throughout to prevent confusion.
  
  Thank you for your comment. We include osteoactivin for clarification purposes once in the abstract, introduction and discussion.
  
  In summary, we believe we have addressed all comments/concerns raised by the reviewers. We appreciate the opportunity to improve the quality of our manuscript.
  
  References
  
  Abdelmagid, S. M., Belcher, J. Y., Moussa, F. M., Lababidi, S. L., Sondag, G. R., Novak, K. M., Sanyurah, A. S., Frara, N. A., Razmpour, R., & Del Carpio-Cano, F. E. (2014). Mutation in osteoactivin decreases bone formation in vivo and osteoblast differentiation in vitro. The American journal of pathology, 184(3), 697-713.
  
  Abdelmagid, S. M., Sondag, G. R., Moussa, F. M., Belcher, J. Y., Yu, B., Stinnett, H., Novak, K., Mbimba, T., Khol, M., Hankenson, K. D., Malcuit, C., & Safadi, F. F. (2015). Mutation in Osteoactivin Promotes Receptor Activator of NFκB Ligand (RANKL)-mediated Osteoclast Differentiation and Survival but Inhibits Osteoclast Function. J Biol Chem, 290(33), 20128-20146. https://doi.org/10.1074/jbc.M114.624270
  
  Anderson, M. G., Nair, K. S., Amonoo, L. A., Mehalow, A., Trantow, C. M., Masli, S., & John, S. W. (2008). GpnmbR150X allele must be present in bone marrow derived cells to mediate DBA/2J glaucoma. BMC genetics, 9(1), 1-14.
  
  Hwang, H., Park, I., Hong, J., Kim, J., & Kim, H. (2021). Comparison of joint degeneration and pain in male and female mice in DMM model of osteoarthritis. Osteoarthritis and Cartilage, 29(5), 728-738.
  
  Ma, H.-L., Blanchet, T., Peluso, D., Hopkins, B., Morris, E., & Glasson, S. (2007). Osteoarthritis severity is sex dependent in a surgical mouse model. Osteoarthritis and Cartilage, 15(6), 695-700.
  
  AuthorResponse

Tags

Summary

AuthorResponse

Review 2

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.06.06.658389v2
www.biorxiv.org www.biorxiv.org

A dentate gyrus-CA3 inhibitory circuit promotes evolution of hippocampal-cortical ensembles during memory consolidation

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This ms targets an interesting question, whether changes of feedforward inhibition at the DG-CA3 synapses regulate the representational capabilities of contextual fear memory at CA1 and the anterior cingulate cortex (ACC). The paper exploits a recent tool developed by the group (viral-mediated shRNA interference of Ablim3 in DG), to enhance PV+ mediated inhibition of CA3 pyramidal cells by increasing both their recruitment by DG cells and their number of contacts over postsynaptic cells. Using micro-endoscopic imaging of mice experiencing contextual fear conditioning, the authors nicely evaluate the effect of feedforward inhibitory control of CA3 outputs in the formation, stabilization and specificity of contextual fear memory representations in the CA1 and ACC. Data is relevant to understand how specific microcircuit motifs can influence representational dynamics in downstream regions. I have some methodological comments and recommendations for authors to improve their presentation and to exclude potential confounding factors.
  
  1- Since imaging is performed in CA1 and ACC separately, the study design entails 4 groups: shNT vs shRNA which is the main experimental manipulation, plus CA1 vs ACC. While data is in general carefully presented, some analysis may require additional validation to rule out that some regional effects attributed to the manipulation actually reflect group differences. This is important because there may be some differences between ACC and CA1 groups in some behavioral readouts (e.g. Fig.2c; Fig.S2b) which may actually explain the different effects of the manipulation. Formal comparisons of behavior in ACC and CA1 shNT groups may be required to rule out this effect.
  
  We compared behavior data in the control groups across brain region to test if our calcium imaging findings are driven by differences in groups rather than virus manipulation. We did not find a significant difference for any of the data sets (see figure legend Rebuttal Figure 1 a-d for details). In general, we tried to avoid presenting the same (or part of the same) dataset in multiple figures. An alternative would be to plot all 4 groups in 1 graph and test as such but that would decrease readability in our opinion. Therefore, we are happy to provide the additional graphs and analysis but prefer not to include them in the main manuscript. (Rebuttal Figure 1a-d).
  
  2- Differences of activity level (calcium rate) are examined using bins of 5 seconds for a total of 360 sec of exploratory activity. To discard motility effects an analysis is implemented using 1 sec bins. Thus, the two data samples are not commensurate. Also, an ANOVA on calcium rate is applied over uneven multiple comparisons to account for statistical effects of region x time or context x time. This is relevant for fig.1g vs 1i and Fig.S2j,l and may require correction.
  
  We assume you mean “1 minute” and not “1 sec” here. We did indeed present the two datasets (calcium event rate and moving index) using different time bins (5 sec and 1 minute, respectively). It is true that a difference in binning, and therefore a different sample size in one factor (time), could affect the result of the ANOVA. Rebuttal Figure 1 e-f shows the behavior comparison made in Suppl. Figure 2b in the original manuscript with a 5 second bin. A two-way repeated-measures ANOVA reveals no main virus effect [Two-way repeated measures ANOVA, ACC (e): virus x time effect p=0.0113; virus main effect N.S., time main effect N.S., n=5 per group; CA1 (f): virus x time effect N.S.; virus main effect N.S., time main effect N.S., n=5 shNT, n=6 shRNA]. In ACC, we find a significant interaction effect, but a post hoc Sidak test did not reveal a difference between virus groups at any time point. This confirms our previous findings that differences in movement do not seem to drive the differences between virus groups.
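
  The binning issue discussed here can be illustrated with a small sketch. It is an assumption-laden illustration rather than the authors' pipeline: `bin_counts` and the event times are hypothetical, and only the 360 s session length and the two bin widths (5 s and 1 minute) come from the exchange above.

```python
def bin_counts(event_times, bin_width, duration=360.0):
    """Count events per time bin over a session of `duration` seconds."""
    n_bins = int(duration // bin_width)
    counts = [0] * n_bins
    for t in event_times:
        index = int(t // bin_width)
        # Ignore events outside the session or beyond the last full bin.
        if 0 <= t < duration and index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical event times in seconds (not data from the manuscript).
events = [0.0, 4.9, 5.0, 61.0, 200.5, 359.0]

fine = bin_counts(events, bin_width=5)     # 5 s bins
coarse = bin_counts(events, bin_width=60)  # 1 minute bins
print(len(fine), len(coarse))
```

  The same 360 s of recording yields 72 levels of the time factor with 5 s bins but only 6 with 1 minute bins, which is why analysing both measures on a common bin width makes the two ANOVAs comparable.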
  
  3- Fig.3 nicely show accurate context classification based on calcium activity from A&C contexts neurons using support-vector machine. The authors report very interesting representational effects for shNT vs shRNA manipulations. Is prediction accuracy of the SVM classifier correlated with behavioral discrimination? That would reinforce conclusions.
  
  Thank you for raising this very interesting point. Indeed, we found a positive correlation between the discrimination ratio and the accuracy of the SVM classifier (Pearson’s r, shNT: R2 = 0.5794, p = 0.0282, n = 4; shRNA: R2 = 0.5771, p = 0.0288, n = 4). We added these data to Figure 4 (Figure 4c) and to Rebuttal Figure 1g.
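
  For readers who want to relate the reported R2 values to a correlation coefficient, the following sketch may help. The `pearson_r` helper and the paired values are hypothetical placeholders, not the authors' data or code. Only the R2 of 0.5794 is taken from the response above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-animal values (illustration only).
discrimination_ratio = [0.10, 0.25, 0.40, 0.55]
svm_accuracy = [0.55, 0.60, 0.72, 0.70]

r = pearson_r(discrimination_ratio, svm_accuracy)
print(r, r ** 2)  # r and the corresponding R2 for the toy data

# Magnitude of r implied by the reported R2 for the shNT group.
print(sqrt(0.5794))
```

  An R2 of 0.5794 corresponds to |r| of roughly 0.76; the sign is positive here because the authors describe the correlation as positive.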
  
  Regarding conclusions and physiological relevance, the authors may need to discuss why enhanced feedforward inhibition at DG-CA3 synapses is not naturally established given the beneficial effect in context discrimination.
  
  We apologize that we did not make that aspect of our manipulation clearer in our discussion. We edited the introduction and discussion (LL 65, LL 365) to clearly convey that FFI in DG-CA3 is naturally temporarily increased following learning (Ruediger 2011, Ruediger 2012, Guo et al 2018).
  
  Reviewer #3 (Public Review):
  
  In this study, Twarkowski et al. aim to understand the role of a specific circuit motif, dentate gyrus (DG) to CA3 feed-forward inhibition (FFI), for memory encoding and consolidation. FFI is a ubiquitous circuit motif in the brain. As a result, providing insights on its function is an interesting and a potentially very impactful contribution to neuroscience.
  
  To tackle this issue, the authors describe how increasing DG-CA3 FFI impacts the ensemble activity in hippocampal area CA1 and the anterior cingulate cortex (ACC) in mice undergoing a contextual fear conditioning paradigm. To selectively increase FFI onto CA3 neurons, the study uses a molecular tool (downregulation of Ablim3 using virally mediated expression of shRNA), which has been developed by the same group (Guo et al, 2018, Nature Medicine). The impact of this manipulation is assessed via chronic in vivo one-photon Ca2+ imaging of dorsal CA1 and ACC neurons on the day of fear conditioning, one day after (recent recall), and 16 days after (remote recall) the fear conditioning. During and after fear conditioning, the results show in both experimental groups (shRNA and control) various population activity changes in both CA1 and ACC. Furthermore, the study finds improved context discrimination in the shRNA group only at the remote recall timepoint. The authors' conclusion is that increasing FFI enhances the formation of learning-specific ensembles, first in CA1 and later in ACC, which is associated with an improved memory recall. The experiments presented here were very technically challenging and produced a comprehensive and valuable dataset describing the parallel ensemble activity changes in CA1 and ACC after fear conditioning, with or without increasing DG-CA3 FFI. However, a causal relationship between the manipulation of DG-CA3 FFI, the network activity changes in CA1 and ACC, and the behavioral improvement is, in my opinion, not fully demonstrated. This is for a couple of reasons:
  
  1) The magnitude of the effect of the shRNA manipulation on the immediate downstream area CA3 remains unclear. Therefore, the findings in the downstream areas CA1 or even ACC (which is at least three synapses removed from CA3) are, in my opinion, difficult to interpret. This uncertainty includes (1) the extent of the virus injection in the dentate gyrus and the extent of subsequent changes in CA3, and (2) the effect of the manipulation on CA3 pyramidal cell activity in vivo. The original paper (Guo et al, 2018) uses in vitro voltage-clamp recordings to record EPSCs/IPSCs in CA3, but does not exclude possible compensatory changes in vivo, e.g., in the excitability of CA3 neurons, which could result from increasing FFI chronically over a few weeks. The data in Figures 1f and g seems to suggest that there are baseline activity changes in CA1, which might be caused by changes in the upstream CA3 network activity. Along the same lines, I am unsure how to interpret the comparisons between CA1 and ACC in Figure 1; within brain region comparisons are more relevant and should be shown instead.
  
  This is a great point and was raised by all reviewers. We acknowledge the weakness of this comparison, apologize for this misstep in our analysis and have accordingly, removed this dataset from our manuscript. Instead, we performed new experiments using in vivo electrophysiology to allow for cross-region comparison of LFPs in CA1 and ACC within the same animal. We removed data from Figure 1 e-i and added new, simultaneous electrophysiological LFP recordings (Figure 5 and supplementary Figure 4 in revised manuscript).
  
  We found an increased number of CA1 ripples that are coupled with ACC spindles (“coupled ripples”) in shRNA mice compared to control mice prior to a learning event (Figure 5c, two-tailed unpaired Student’s t-test with Welch’s correction, p=0.0499, n=5) with no difference in time spent in slow-wave sleep (SWS) (supplementary Figure 4a) or total numbers of spindles or ripples (supplementary Figure 4b-c). Control mice show a learning-dependent increase in coupled ripples (Figure 5f, two-tailed paired Student’s t-test, p=0.019, n=5) to a similar level as seen in shRNA mice prior to learning. No further increase is seen in shRNA mice, indicating a saturation of circuit changes that cannot be further amplified following learning.
  
  2) Several parameters are used in this study to describe the network activity in CA1 and ACC. These include the number of correlated neuron pairs, the number of neurons active in both the training context and a neutral context (so-called A-C neurons), or the event rate observed in these A-C neurons. Most of the activity changes observed do not appear specific to the shRNA group and occur also under control condition, suggesting that they are not caused by an increase in DG-CA3 FFI. It would be helpful to clarify the sequence, how increasing FFI onto CA3 is hypothesized to cause the changes in CA1 or even ACC.
  
  We apologize for failing to make this clearer. Prior work has shown that learning increases FFI in DG-CA3 and downregulates Ablim3 in DG (Ruediger 2011, 2012, Guo et al 2018). Therefore, it is not surprising that we observe similar changes in the control (shNT) group as shRNA group.
  
  From previous work we know that shNT mice show increased DG-CA3 FFI following learning (training day) for approximately 24 hours (Guo et al, 2018). Thus, our manipulation allows us to mimic and boost a naturally occurring learning-induced synaptic modification in an inhibitory microcircuit in DG-CA3 and examine the impact on network mechanisms underlying systems consolidation. Importantly, enhanced feedforward inhibition at the DG-CA3 synapses is naturally established for several hours following a spatial learning event (see Ruediger et al, 2011, Guo et al, 2018). Leveraging a molecular tool to enhance FFI prior to learning, we were able to reveal that DG-CA3 FFI plays a role in tuning the circuit towards cross-regional long-term storage of precise neuronal representations. (see also edits in text, LL 365).
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.05.21.445117v1
www.biorxiv.org www.biorxiv.org

TLR4 regulation in human fetal membranes as an explicative mechanism of a pathological preterm case

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  [...]
  
  A notable shortcoming of the authors' interpretation is the generalization of their findings to preterm premature rupture of membranes (PPROM). As noted by the authors, term labor is considered a "sterile" process, which is particularly important in terms of the authors' findings since TLR4 in the fetal membranes may be responding to endogenous signals such as danger signals. However, a large proportion of PPROM cases are associated with microbial invasion of the amniotic cavity, and thus in this context TLR4 would be responding to bacterial products.
  
  To bring in some new elements and address this reviewer’s concern, along with the potential extrapolation between physiological rupture and pathological rupture in the case of PPROM, we decided first to remove Figure 3C (expression of TLR4 in the presence of LPS from bacterial origin) from the revised version of the manuscript. To address this comment, it is well known that the percentage of PPROM cases associated with microbial invasion varies with the weeks of gestation. In fact, early gestational ages are clearly linked to a high prevalence of microbial-associated intra-amniotic inflammation (64.3% when <25 WGA), whereas this percentage subsequently decreases throughout gestation (Romero et al., 2015), reaching one-third at term, which better links with the gestational stage of the current study. Such observations support the fact that the TLR4 model in physiological rupture could be transposed, at least in part, to sterile PPROM and initiated by the presence of alarmins (i.e., HMGB1) and their binding to such type of receptors. Indeed, TLR4 is now well described as being stimulated by ligands other than LPS, such as HMGB1, a member of the DAMPs (Robertson et al., 2020). Furthermore, the quantification of TLR4 mRNA expression and protein in the case of PPROM without chorioamnionitis compared with term no labor without chorioamnionitis was already carried out (Kim et al., 2004), indicating the absence of a clear link between chorioamnionitis and TLR4 expression. Finally, in an animal model of PPROM, an article underlined the importance of TLR4 in preterm labor by using TLR4 mutant mice in a sterile context (Wahid et al., 2015).
  
  It is a well-known concept that TLR4 is expressed by the fetal membranes and is responsive to LPS stimulation, and thus the confirmatory set of experiments performed by the authors do not seem to be as novel. Indeed, given that this study was focused on the "sterile" process of term labor, perhaps the utilization of danger signals that can interact with TLR4 would be more appropriate.
  
  LPS was used (Figure 3C) only to confirm that TLR4 activation leads to a proinflammatory response in the amnion and choriodecidua, demonstrating that the pathway downstream of TLR4 is functional in the fetal membrane environment. We completely agree that these are not novel data, which is why we removed this part of the results from the revised version of the manuscript. Furthermore, we decided not to repeat the use of DAMPs (such as HMGB1) to stimulate the TLR4 pathway in this work because this has already been published in the fetal membrane context (Bredeson et al., 2014). In accordance with your comments, we have modified the end of the results paragraph entitled ‘Combination of transcriptomic and methylomic results in the ZAM zone demonstrate that genes more expressed in the choriodecidua are linked to pregnancy pathologies’ to better justify the choice to focus on TLR4 global transcriptional regulation.
  
  The distinction between the ZAM and ZIM seems to have been lost among the TLR4-focused experiments, and thus it is unclear how these fetal membrane zones fit into the conceptual model proposed by the authors in the final figure.
  
  The reviewer is correct here, so to avoid confusion between the ZIM and ZAM used, we decided to do the following:
  
  -Read carefully all the successive paragraphs of the results to check for the presence of ‘ZAM specification’.
  
  -Add ‘ZAM’ in the legend of Figure 4. This information was present in the related text of the article.
  
  -Update Figure 7 and its legend (model of regulation). We had ‘ZAM zone’ in the discussion part regarding Figure 7.
  
  The study is largely descriptive and would benefit from the addition of fetal membrane tissues from pregnancy complications such as PPROM and/or animal models in which premature rupture of the membranes has been induced.
  
  We agree that animal models are available. Nevertheless, we consider that such models are far from the human reality. Animal models are often used for fetal membrane studies, but they differ in pregnancy physiology, structure and uterine environment, which hampers their use. We used ‘term’ fetal membranes to decipher the physiological rupture of membranes and demonstrate the importance of TLR4 in this process. To address this comment and the possible extrapolation between physiological rupture and pathological rupture in the case of PPROM, we decided to remove Figure 3C (expression of TLR4 in the presence of LPS of bacterial origin) to focus more on the physiological rupture of fetal membranes without the involvement of bacteria. Previously published data answer the reviewer’s question: Kim et al. (2004) clearly demonstrated that TLR4 mRNA levels are higher in PPROM (31.2 weeks of gestation) fetal membranes without chorioamnionitis than in term (39.1 weeks of gestation) ones without chorioamnionitis.
  
  The study focuses on the mechanisms of rupture of membranes, but does not provide an explanation as to how the regulation of TLR4 mediates the process of membrane rupture.
  
  We agree with your comment; however, ‘how the regulation of TLR4 mediates the process of membrane rupture’ is not the topic of the manuscript. In addition, this has already been well established in previous publications. Nevertheless, we added a sentence to the introduction at lines 97-100: ‘The mechanisms implying TLR4 in the physiological or pathological rupture of membrane in case of PPROM are well known. Triggering TLR4 will lead to NFκB activation, leading to an increase of the release of proinflammatory cytokine, concentration of matrix metalloprotease and prostaglandin, which are well established actors of fetal membrane rupture (Robertson et al., 2020).’
  
  Reviewer #2 (Public Review):
  
  This is a well-conceived and executed paper that adds novel data to improve our understanding of rupture of the human fetal membranes. The new information presented not only addresses gaps in our understanding of normal parturition mechanisms but also the significant issue of preterm birth. The authors highlight the need to understand the understudied human fetal membranes to be able to understand its role in normal parturition but also to lower the rates of preterm birth. They not only establish the need to study this tissue but also to improve our appreciation for regional differences within it, using a comprehensive genetic approach. The authors provide data from a genome wide methylation study and cross reference this with transcriptome data. Using this new knowledge, they then zero in on a specific gene of interest TLR4. This receptor is already established as an extremely important receptor for preterm birth but little is known about its role in normal parturition. Strengths of this paper stem from the comprehensive data set provided, answering both the questions pertaining to the specific aims of this paper but also potentially future questions and providing potential focused targets of study. One example of this may be the common methylated genes that are found in both the ZIM and ZAM, illustrating not regional changes but gestational programming of this tissue.
  
  We thank the reviewer for the positive and constructive comments regarding the article. Following all the reviewers’ comments, we now have an improved version.
  
  Reviewer #3 (Public Review):
  
  Manuscript by Belville et al describes the significance of epigenetic and transcription associated changes to TLR4 as a mechanistic event for sterile inflammation associated with fetal membrane weakening, specifically in the zone of altered morphology. This manuscript is timely in an understudied area of research.
  
  The authors have taken an extensive set of experiments to derive their conclusions.
  
  However, it is unclear why the focus is on TLR4. Although LPS is a ligand for TLR4, Gram-negative infections are rare in PPROM, where genital mycoplasmas predominate. The methylome and transcriptome analyses do not necessarily warrant examination of a single marker. A clear rationale would need to be included.
  
  We would like to thank the reviewer for their comments regarding the article. For the last part of the public review, we would like to underline the following:
  
  -The choice of focusing on TLR4 is explained in the article text between lines 161 and 165 by the following sentences: ‘Of all the genes classified in these processes, TLR4 was the only one represented in all these biological processes and, therefore, seems to play a central role in parturition at term. To validate this in-silico observation and pave the way for describing TLR4’s importance, immunofluorescence experiments were first conducted to confirm the protein’s presence in the amnion and choriodecidua of the ZAM (Figure 3B)’. Furthermore, this choice arises from the analysis described in Figure 3A, which underlines that the four most represented GO terms have only one gene in common: ‘TLR4’. The combination of two large-scale studies does not permit us to characterize individually how each gene is regulated. Nevertheless, the focus on TLR4 provides an original and interesting hypothesis on how layer-specific regulation between the amnion and choriodecidua could be realised at the cellular level in the ZAM’s weaker zone. Finally, because the results of the large-scale studies are public, this type of analysis could be conducted on other candidate genes.
  
  -Throughout the text, we changed all the ‘E. Coli’ to ‘Gram-negative bacteria’. Furthermore, as found in the literature, genital mycoplasmas are considered ‘Gram-negative bacteria’. We focused on the ‘sterile inflammation phenomenon’, and to support the hypothesis concerning the importance of TLR4, we produced a supplementary transcriptome ‘ZAM heatmap’, which confirmed overexpression of DAMPs in the choriodecidua (S100A7, S100A8 and S100A9, for example), which are well-known ligands of TLR4 (given below as an image).
  
  Heatmap of genes differentially expressed in the ZAM zone in relation to the sterile inflammation phenomenon.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.28.450131v1
www.medrxiv.org www.medrxiv.org

Prediction of type 2 diabetes mellitus onset using logistic regression-based scoreboards

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  The authors analyzed several models for predicting the early onset of T2D, where they trained and tested on a UKB based cohort, aged 40 - 69 and suggest two simple logistic regression models: the anthropometric and the five blood tests models in reference to FINDRISC and GDRS models. Their models achieved better auROC, APS, and decile prevalence OR, and better-calibrated predictions.
  
  Strengths:
  
  1. The authors have neatly explained their objectives and performed well-justified analyses.
  
  2. The authors highlight how using both features, the HbA1C% measure and the reticulocyte count, may provide a better indication of the average blood sugar level during the last two to three months than using just the standard HbA1C% measure.
  
  3. Further verification showed that the proposed anthropometric-based and five-blood-test-based models can discriminate within a group of normoglycemic participants and within a group of pre-diabetic participants, outperforming the FINDRISC-based and GDRS-based models.
  
  Weaknesses:
  
  As the authors point out in the manuscript, these models are suited to the UKB cohort or populations with similar characteristics. This limits the extrapolation of these findings to cohorts from a different background until the models are analyzed on a cohort from another country or continent.
  
  We agree with this comment, as we indeed pointed out in the paper. We recommend adjusting these models when applying them to populations with distinct characteristics.
  
  In the methods section, an additional explanation of how the T2D prevalence bins were formed would be useful to a reader.
  
  We thank the reviewer for this note. We added the following explanation in section 4.11: “We considered several potential risk score limits that separate T2D onset probability in each of the scores groups, and we chose boundaries that showed a separation between the risk groups on the validation datasets. Once we decided on the boundaries of the score, we report the prevalence in each risk group on the test set and we report these results.”
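
  The binning step described in this response can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `prevalence_by_group`, the boundaries, and the toy scores and outcomes are all hypothetical.

```python
from bisect import bisect_right


def prevalence_by_group(scores, outcomes, boundaries):
    """Return the fraction of positive outcomes in each risk group.

    `boundaries` must be sorted. A score s is assigned to group i, where i
    is the number of boundaries that are <= s. Empty groups yield None.
    """
    n_groups = len(boundaries) + 1
    cases = [0] * n_groups
    totals = [0] * n_groups
    for s, y in zip(scores, outcomes):
        g = bisect_right(boundaries, s)
        totals[g] += 1
        cases[g] += y
    return [c / t if t else None for c, t in zip(cases, totals)]


# Hypothetical test-set scores and outcomes (1 = developed T2D).
test_scores = [1, 2, 3, 5, 6, 7, 9, 10, 11, 12]
test_outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
# Hypothetical boundaries, assumed to have been fixed on validation data.
boundaries = [4, 8]
prev = prevalence_by_group(test_scores, test_outcomes, boundaries)
```

  Fixing the boundaries before touching the test set, as the response describes, keeps the reported prevalences from being tuned to the data they are evaluated on.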
  
  The authors have mentioned that the prevalence of diabetes has been rising more rapidly in low and middle-income countries (LMICs) than in high-income countries and the objective of the present research was to develop clinically usable models which are easy to use and highly predictive of T2D onset. As lifestyle is also one of the contributory factors for T2D, additional analysis that includes a comparison of groups between low-income and high-income subjects within UKB-based cohort provided such metadata available would help understand if the prevalence for T2D differs or not between such groups.
  
  We thank the reviewer for this comment. Below we add an analysis that we ran on our data, showing the differences in deprivation index between the sick and healthy populations. The sick population has a higher deprivation index, as expected. Running a Mann-Whitney U test on the full data gives a p-value of zero; repeating it with a sample of just 1000 participants from each group gives a p-value of 2.37e-137. This indicates a significant association between deprivation index and the tendency to develop T2D. We have also added this finding to the supplementary material, with a reference to it.
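
  For readers unfamiliar with the test named here, a minimal pure-Python sketch of a two-sided Mann-Whitney U test is given below. It uses the normal approximation with a tie correction and no continuity correction, which may differ from whatever implementation the authors used; the function name and the toy deprivation values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for x, p-value). Tied values receive average
    ranks and the variance is tie-corrected. Assumes the pooled values
    are not all identical (otherwise the variance is zero).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p


# Hypothetical deprivation-index values for two small groups.
healthy = [1, 2, 3, 4, 5]
sick = [6, 7, 8, 9, 10]
u_stat, p_value = mann_whitney_u(healthy, sick)
```

  With very large groups the approximation can return a p-value too small to represent in double precision, in which case it is reported as exactly zero; that is one plausible reading of the "p-value of zero" on the full data.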
  
  You can also find below a SHAP diagram showing that a higher Townsend deprivation index pushes the prediction for T2D upwards.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2020.08.02.20165092v2
www.biorxiv.org www.biorxiv.org

New submission 06/06/2022, 15:02:00

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Summary: This substantial collaborative effort utilized virus-based retrograde tracing from cervical, thoracic and lumbar spinal cord injection sites, tissue clearing and cutting-edge imaging to develop a supraspinal connectome or map of neurons in the brain that project to the spinal cord. The need for such a connectome-atlas resource is nicely described, and the combination of the actual data with the means to probe that data is truly outstanding.
  
  They then compared the connectome from intact mice to those of mice with mild, moderate and severe spinal cord injuries to reveal the neuronal populations that retain axons and synapses below the level of injury. Finally, they look for correlations between the remaining neuronal populations and functional recovery to reveal which are likely contributing to recovery and its variability after injury. Overall, they successfully achieve their primary goals with the following caveats: The injury model chosen is not the most widely employed in the field, and the anatomical assessment of the injuries is incomplete/not ideal.
  
  Concerns/issues:
  
  1) I would like to see additional discussion/rationale for the chosen injury model and how it compares to other more commonly employed animal models and clinical injuries. Please relate how what is being observed with the supraspinal connectome might be different for these other models and for clinical injuries.
  
  We have added text to the Results and Discussion to explain our rationale for selecting the crush injury model, and to acknowledge differences between this model and more clinically relevant contusion models. (Results: line 360-364, Discussion 608-615). We agree wholeheartedly that a critical future direction will be to deploy brain-wide quantification in contusion models, and we are currently seeking funding to obtain the needed equipment.
  
  2) The assessment of the thoracic injuries employed is not ideal because it provides no anatomical description of spared white matter (or numbers of spared axons) at the injury epicenter.
  
  We address this more fully in the related point below. Briefly, we agree with a need to improve the assessment of the lesion but are hampered by tissue availability. We are unable to assess white matter sparing but can offer quantification of the width of residual astrocyte tissue bridges in four spinal sections from each animal (new Figure 5 – figure supplement 3). As discussed below, however, we recognize the limitations of the lesion assessment and agree with the larger point that the current quantification methods do not position us to make claims about the relative efficacy of spinal injury analyses versus whole-brain sparing analyses to stratify severity or predict outcomes. Our approach should be seen as a complement, not a substitute, for existing lesion-based analyses. We have edited language throughout the manuscript to make this position clearer.
  
  3) Related to this, but an issue that requires separate attention is the highly variable appearance of the injury and tracer/virus injection sites, the variability in the spatial relationship with labeled neurons (lumbar) and how these differences could influence labeling, sprouting of axons of passage and interpretation of the data. In particular this is referring to the data shown in Figure 6 (and related data).
  
  It is true that there is some variability in the relative position of the injury and injection, a surgical reality. The degree of variability was perhaps exaggerated in the original Figure 6 (Now Figure 5), in which one image came from one of two animals in the cohort with a notably larger gap between the injury and injection. Nevertheless, this comment raises the important question of how variability in injection-to-injury distance might affect supraspinal label. First, we would emphasize the data in Figure 1 – Figure Supplement 6, in which we showed that the number of retrogradely labeled supraspinal neurons is relatively stable as injection sites are deliberately varied across the lower thoracic and lumbar cord. Indeed, the question raised here is precisely the reason we performed this early test to determine how sensitive the results might be to shifts in segmental targeting. The results indicate that retrograde labeling is fairly insensitive to L1 versus L4 targeting. As an additional check for this specific experiment we also measured the distance between the rostral spread of viral label and the caudal edge of the lesion and plotted it against the total number of retrogradely labeled neurons in the brain. If a smaller injury/injection gap favored more labeling we might expect negative correlation, but none is apparent. We conclude that although the injury/injection distance did vary in the experiment, it likely did not exert a strong influence on retrograde labeling.
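
  The check described at the end of this response (injury-to-injection distance plotted against total labeled neurons) can be summarized with a correlation coefficient. The sketch below computes Pearson's r in plain Python; the response does not say which statistic, if any, was computed, and the distances and counts here are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-animal values: gap between the rostral edge of viral
# label and the caudal edge of the lesion (micrometres), and the total
# number of retrogradely labeled neurons in the brain.
gap_um = [100, 200, 300, 400, 500]
labeled_neurons = [5200, 4800, 5300, 4900, 5100]
r = pearson_r(gap_um, labeled_neurons)
```

  If a smaller gap favoured more labeling, r would be clearly negative; a value near zero, as in this toy example, is what the response describes as no apparent correlation.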
  
  Reviewer #3 (Public Review):
  
  In this manuscript, Wang et al describe a series of experiments aimed at optimizing the experimental and computational approach to the detection of projection-specific neurons across the entire mouse brain. This work builds on a large body of work that has developed nuclear-fused viral labelling, next-generation fluorophores, tissue clearing, image registration, and automated cell segmentation. They apply their techniques to understand projection-specific patterns of supraspinal neurons to the cervical and lumbar spinal cord, and to reveal brain and brainstem connections that are preferentially spared or lost after spinal cord injury.
  
  Strengths:
  
  Although this work does not put forward any fundamentally new methodologies, their careful optimization of the experimental and quantification process will be appreciated by other laboratories attempting to use these types of methods. Moreover, the observations of topological arrangement of various supraspinal centres are important and I believe will be interesting to others in the field.
  
  The web app provided by the authors provides a nice interface for users to explore these data. I think this will be appreciated by people in the field interested in what happens to their brain or brainstem region of interest.
  
  Weaknesses:
  
  Overall the work is well done; however, some of the novelty claims should be better aligned with the experimental findings. Moreover, the statistical approaches put forward to understand the relationship between spinal cord injury severity and cell counts across the mouse brain needs to be more carefully considered.
  
  The authors state that they provide an experimental platform for these types of analysis to be done. My apologies if I missed it but I could not find anywhere the information on viral construct availability or code availability to reproduce the results. Certainly both of these aspects would be required for people to replicate the pipeline. Moreover, the described methodology for imaging and processing is quite sparse. While I appreciate that this information is widely provided in papers that have developed these methods, I do not think it is appropriate to claim to have provided a platform for people to enable these types of analyses without a more in-depth description of the methods. Alternatively, the authors could instead focus on how they optimized current methodologies and avoid the overstatement that this work provides a tool for users. The exception to this is of course the viral constructs, the plasmids of which should be deposited.
  
  We agree that we have not provided a tool per se, more of an example that could be followed. We have revised language in the abstract, introduction, and discussion to make it clear that we optimized existing methods and provide an example of how this can be done, but are not offering a “plug and play” solution to the problem of registration that would, for example, allow upload of external data. For example, in the abstract we replaced “We now provide an experimental platform” with “Here we assemble an experimental workflow.” (Line 28). The term “platform” no longer appears in the manuscript and has been replaced throughout by “example.” We hope this matches the intention of the comment and are happy to revise further as needed. Note that the plasmids have been deposited to Addgene.
  
  It was not completely clear to me why or when the authors switch back and forth between different resolutions throughout the manuscript. In the abstract it states that 60 regions were examined, but elsewhere the number is as many as 500. My understanding is that current versions of the Allen Brain Annotation include more than 2000 regions. I think it would make things clearer for readers if a single resolution were used throughout, or at least if each change were justified in the text to avoid confusion.
  
  Thank you for pointing this out. The Cellfinder application recognizes 645 discrete regions in the brain, and across all experiments we detected supraspinal nuclei in 69 of these. This number, however, includes some very fine distinctions, for example three separate subregions of vestibular nuclei, three subregions of the superior olivary complex, etc. True experts may desire this level of information, but with the goal of accessibility we find it useful to collapse closely related / adjacent regions to an umbrella term. Doing so generates a list of 25 grouped or summary regions. In the revised version we move the 69-region data completely to the supplemental data (there for the experts who wish to parse), and use the consistent 25-region system (plus cervical spinal cord in later sections) to present data in the main figures. We have added text to the Results section (lines 157-162) to clarify this grouping system.
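
  The grouping described here, collapsing closely related atlas regions under an umbrella term, amounts to summing counts through a lookup table. The sketch below illustrates the idea only; the region labels, the mapping, and the counts are placeholders and do not reproduce the authors' 69-to-25 region assignment.

```python
from collections import defaultdict


def collapse_counts(fine_counts, group_of):
    """Sum per-region cell counts into umbrella groups.

    Regions without an entry in `group_of` keep their own name.
    """
    grouped = defaultdict(int)
    for region, count in fine_counts.items():
        grouped[group_of.get(region, region)] += count
    return dict(grouped)


# Placeholder mapping from fine labels to a summary label.
GROUP_OF = {
    "vestibular_sub_1": "vestibular_nuclei",
    "vestibular_sub_2": "vestibular_nuclei",
    "vestibular_sub_3": "vestibular_nuclei",
}

# Placeholder per-region counts of labeled nuclei.
fine_counts = {
    "vestibular_sub_1": 120,
    "vestibular_sub_2": 80,
    "vestibular_sub_3": 40,
    "red_nucleus": 300,
}
summary = collapse_counts(fine_counts, GROUP_OF)
```

  Keeping the fine-grained table alongside the mapping, as the response does by moving the 69-region data to the supplement, lets readers regroup the counts differently if they wish.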
  
  The authors provide an interesting analysis of the difference between cervical and lumbar projections. I think this might be one of the more interesting aspects of the paper - yet I found myself a bit confused by the analysis, and whether any of the differences observed were robust. Just prior to this experiment the authors provide a comparison of the mScarlet vs. the mGL, and demonstrate that mGL may label more cells. Yet, in the cervical vs. lumbar analysis it appears they are being treated 1 to 1. Moreover, I could not find any actual statistical analysis of these data. My impression would be that given the potential difference in labelling efficiency between the mScarlet and mGL this should be done using some kind of count analysis that takes into account the overall number of neurons labelled, such as a Chi-sq test or perhaps something more sophisticated. Then, with this kind of statistical analysis in place, do any of the discussed differences hold up? If not, I do not think this would detract from the interesting topological observations - but would call on the authors to be a bit more conservative about their statements and discussion regarding differences in the proportions of neurons projecting to certain supraspinal centers.
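
  The kind of count analysis suggested here can be illustrated with a chi-square test of independence on a table of injection site by brain region. Because expected counts are built from the row and column totals, an overall difference in labelling efficiency between fluorophores does not by itself produce a large statistic. This is a sketch under stated assumptions: the table is hypothetical and every row and column total is assumed to be nonzero.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an
    r x c contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows are cervical- and lumbar-projecting neurons,
# columns are three brain regions.
counts = [
    [30, 70, 100],
    [60, 40, 100],
]
stat, df = chi_square_independence(counts)
```

  A p-value follows from the chi-square distribution with `df` degrees of freedom, for instance via `scipy.stats.chi2.sf(stat, df)` where SciPy is available.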
  
  This is an important point. In response to this input and related comments from other reviewers we performed new experiments to assess co-localization. The new data address the point above by including quantification of the degree of colocalization that results from titer-matched co-injection of the two fluorophores, providing baseline data. The results of this can be found in Figure 3 – figure supplement 3 and form the basis for statistical comparisons to experimental animals shown in Figure 3.
  
  Finally, I do have some concerns about the authors' use of linear regression in their analysis of brain regions after varying severities of SCI. First of all, the BMS score is notoriously non-linear. Despite wide use of linear regressions in the field to attempt to associate various outcomes with these kinds of ordinal measures, this is not appropriate. Some have suggested a rank conversion of the BMS prior to linear analyses, but even this comes with its own problems. Ultimately, the authors have here 2-3 clear cohorts of behavioral scores, and drawing a linear regression between these is unlikely to be robustly informative. Moreover, it is unclear whether the authors properly adjusted their p-values after running these regressions on 60 (600?) regions. Finally, the statement in the abstract and discussion that the authors "explain more variability" compared to typical lesion severity analysis is also unsupported. My suggestion would be the following:
  
  Remove the linear regression analyses associated with BMS. I do not think these add value to the paper, and if anything provide a large window of false interpretation due to a violation of the assumptions of this test.
  
  Consider adding a more appropriate statistical analysis of the brain regions, such as a non-parametric group analysis. Knowing which brain regions are severity dependent, and which ones are not, would already be an interesting finding. This finding would not be confounded by any attempt to link it to crude measures of behavior.
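
  The multiple-testing concern raised above, running one test per brain region, is commonly handled by adjusting the resulting p-values. The sketch below implements the Benjamini-Hochberg procedure as one such option; neither the reviewer nor the authors specify a method, and the p-values shown are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-region p-values from a group comparison.
region_p = [0.01, 0.04, 0.03, 0.005]
region_q = benjamini_hochberg(region_p)
```

  A region would then be called severity dependent only if its adjusted value falls below the chosen false discovery rate.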
  
  We agree that the linear regression approach was flawed and appreciate the opportunity to correct it. After consultation with two groups of statisticians we were forced to conclude that the data are simply underpowered for mixed model and ranking approaches. We therefore adopted a much simpler strategy. As you point out (and as noted by the statisticians), the behavioral data are bimodal; one group of animals regained plantar stepping ability, albeit with varying degrees of coordination (BMS 6-8), while the others showed at most rare plantar steps (BMS 0-3.5). We therefore asked whether the number of spared neurons in each brain region differed between the two groups and also examined the degree of “overlap” in the sparing values between the two groups. The data are now presented in Figure 6.
  
  If the authors would like to state anything about 'explaining more variability' then the proper statistical analysis should be used, which in this case would be to compare the models using a LRT or equivalent. However, as I mentioned it does not seem to be appropriate to be doing this with linear models so the authors should consider a non-linear equivalent if they choose to proceed with this.
  
  We thank the reviewer for the excellent suggestion. However, as we explained above, after consultation with two groups of statisticians we were forced to conclude that the data are underpowered, and we could not apply some of the methods suggested. Especially in light of our simplified analysis, we think it is better to remove any claims about the relative success of sparing in different regions in explaining more or less variability. Instead we can simply report that sparing in some regions, but not others, is significantly different between “low-performing” and “high-performing” groups.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.10.447885v2
www.biorxiv.org www.biorxiv.org

Evolution of cytokine production capacity in ancient and modern European populations

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This paper focuses on the role of historical evolutionary patterns that lead to genetic adaptation in cytokine production and immune mediated diseases including infectious, inflammatory, and autoimmune diseases. The overall goal of this research was to track the evolutionary trajectories of cytokine production capacity over time in a number of patients with different exposure to infectious organisms, infectious disease, autoimmune and inflammatory diseases using the 500 Functional Genomics cohort of the Human Functional Genomics Project. The identified cohort is made up of 534 individuals of Western European ancestry. Much of this focus is on the impact and limitations of certain datasets that they have chosen to use such as the "average genotyped dosage" to be substituted for missing variants and data interpretation.
  
  We fully agree with the reviewer. We replace missing variants in a sample with the variant’s average dosage across the entire dataset, so that missing variants in a sample do not bias the trends we observe over time. If we were to impute using only samples from the same era, we would inflate the differences between eras, whereas using only shared variants would increase the noise for older samples because of the higher error rates associated with DNA degradation.
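
  The imputation strategy described in this response can be sketched as a column-wise mean fill. This is an illustration only, not the authors' pipeline: the function name, the use of `None` for a missing genotype, and the toy dosage matrix are assumptions.

```python
def impute_mean_dosage(dosages):
    """Replace missing dosages (None) with the dataset-wide mean per variant.

    `dosages` is a list of samples, each a list of per-variant dosages.
    A variant missing in every sample stays None.
    """
    n_variants = len(dosages[0])
    means = []
    for v in range(n_variants):
        observed = [row[v] for row in dosages if row[v] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [
        [means[v] if row[v] is None else row[v] for v in range(n_variants)]
        for row in dosages
    ]


# Hypothetical dosage matrix: 3 samples x 3 variants, None = missing.
raw = [
    [0, 1, None],
    [2, None, 1],
    [1, 2, 2],
]
filled = impute_mean_dosage(raw)
```

  Because every sample with a missing variant receives the same dataset-wide value, that variant adds the same amount to each such sample's score regardless of era, which is the property the response relies on.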
  
  Moreover, some data pairings in the data set are not complete or had varying time points.
  
  The stimulation periods were chosen based on extensive studies that showed that the timepoints used were best suited for assessing monocyte-derived and lymphocyte-derived cytokines per stimulus. Not all the stimuli induce the production of all cytokines, so the selection of the cytokine-stimulus pairs was performed for those pairs in which a cytokine production could be measured (PMID: 1385767; PMID: 19380112; PMID: 27814509; PMID: 27814508; PMID: 27814507). The differences in the cytokine availability and time points are adjusted to the optimal time of production per stimuli. Monocyte-derived cytokines (IL-1b, IL-6 and TNFa) are early response cytokines, produced by innate immune cells shortly after stimulation. IFNg, IL-17 and IL-22 are lymphocyte-derived cytokines, produced by adaptive immune cells, in this case T helper cells. These cells need to differentiate for several days before they start to produce these cytokines, this is the reason why the time point of the measurements of these cytokines is 7 days. In the case of IFNg, it can also be produced by NK cells, so it was measured after 48h after stimulation in whole blood samples. We have included these considerations in the new version of the text (lines 82 to 87).
  
  Similarly, a split was done to look at before and after the Neolithic era, and the linear regressions correspond to those two eras. However, the authors do not comment on or show data to demonstrate why they chose that specific breakpoint as opposed to looking at every historical era transition, i.e., from early Upper Paleolithic to late Upper Paleolithic to Mesolithic to Neolithic to post-Neolithic to modern.
  
  We thank the reviewer for this remark and acknowledge that we did not sufficiently explain the rationale behind our choice to look at this split specifically. We hypothesized that the start of the Neolithic, with its increase in population density and contact with animals, would also be a turning point for many immune responses and immune-related traits. We added various analyses to better highlight this and also to show differences between adjacent time periods.
  
  -The original figures showed only models using two separate linear regression lines, and the different thresholds for missing genotype rates showed consistent results. In the new figures we depict LOESS regression models to better show the difference in mean PRS at every point in time, and we additionally show boxplots for the major age periods, pooling the Paleolithic and Mesolithic samples together as pre-Neolithic samples to account for the lower sample number in the earlier historical periods. To highlight this we have added a new section at lines 123 to 129 and new versions of Figures 1, 2, 3 and 4.
  
  -In the new Figure 2 we add LOESS regression models, which do not bias our analysis toward a break at a particular time period. We furthermore show boxplots with pairwise comparisons (Student’s t-test) for broader time periods, highlighting the changes in PRS that would correspond to major changes in human lifestyle, such as the shift from a hunter-gatherer to a Neolithic lifestyle or the rapid urbanization of human society.
  
  -In the new Figure 3 we confirm that, for the various traits showing a clear change in PRS, the change starts at the advent of the Neolithic or post-Neolithic era, using both the LOESS regression and pairwise comparisons (Student’s t-test).
  
  -Similarly the heatmap in our original figure 4 has also been revised to only show the large sample set.
  
  Lastly, the authors should highlight additional limitations of this current study in terms of the generalizability to other populations or to clearly state that this is limited to the European population at the specified latitude and longitudes used.
  
  We thank the reviewer for his feedback and agree we should put more emphasis on this. In our study we focus on summary statistics obtained from European populations and only employ European aDNA samples, so our results should not be extrapolated to other populations from other geographical areas. We have included this in the Discussion of the new version of the manuscript (lines 289 to 292). However, our findings are mostly in agreement with previous studies in other populations, which adds robustness to the results of our study.
  
  Reviewer #2 (Public Review):
  
  In "Evolution of cytokine production capacity in ancient and modern European populations", Dominguez-Andrés et al. collect a large amount of trait association data from various studies on immune-mediated disorders and cytokine production, and use this data to create polygenic scores in ancient genomes. They then use the scores to attempt to test whether the Neolithic transition was characterized by strong changes in the adaptive response to pathogens. The impact of pathogens in human prehistory and the evolutionary response to them is an intriguing line of inquiry that is now beginning to be approachable with the rapidly increasing availability of ancient genomes.
  
  While the study shows a commendable collection of association data, great expertise in immune biology and an interesting study question, the manuscript suffers from severe statistical issues, which makes me doubt the validity and robustness of their conclusions. I list my concerns below, in rough order of how important I believe they are to the claims of the paper:
  
  —In addition to the magnitude of an effect away from the null, P-values are a function of the amount of data one has to fit a model or test a hypothesis. In this case, the authors have vastly more data after the Neolithic Revolution than before, and so have much higher power to reject the null hypothesis of "no relationship to time" after the revolution than before. One can see this in the plots the authors provided, which show vastly more data after the Neolithic, and consequently a greater ability to fit a significant linear model (in any direction) afterwards as well.
  
  We thank the reviewer for raising this very important point. To account for the difference in sample size between historical periods, we pooled all samples prior to the Neolithic era and tested for differences in mean PRS between neighbouring historical periods. This way we lose some resolution in terms of the carbon-dated age of each sample, but we gain the ability to compare more pairings than just pre- and post-Neolithic samples. We added several analyses to make this clearer and to show differences between adjacent time periods:
  
  -The original figures showed only models using two separate linear regression lines, and the different thresholds for missing genotype rates gave consistent results. In the new figures we depict LOESS regression models to better show the difference in mean PRS at every point in time. We additionally show boxplots for the major age periods, pooling the Paleolithic and Mesolithic samples together as pre-Neolithic samples to account for the lower sample numbers in the earlier historical periods. To highlight this we have added a new section in lines 123 to 129 and new versions of Figures 1, 2, 3 and 4.
  
  -In the new Figure 2 we add LOESS regression models, which do not bias the analysis by defining a break at a particular time period. We furthermore show boxplots with pairwise comparisons (Student's t-test) for broader time periods, highlighting the changes in PRS that correspond to major changes in human lifestyle, such as the shift from a hunter-gatherer to a Neolithic lifestyle or the rapid urbanization of human society.
  
  -In the new Figure 3 we confirm that, for the various traits showing a clear change in PRS, the change starts at the advent of the Neolithic or post-Neolithic era, using both the LOESS regression and pairwise comparisons (Student's t-test).
  
  -Similarly, the heatmap in our original Figure 4 has been revised to show only the large sample set.
  
  —The authors argue that Figure S2 makes their results robust to sample size differences, but showing a consistency in direction before and after downsampling in the post-neolithic samples is not enough, because:
  
  1) you still lack power to detect changes in direction before the Neolithic.
  
  2) even for the post-Neolithic, the relationship may be in the same direction but no longer significant after downsampling. How much the significance of the linear model fit is affected by the downsampling is not shown.
  
  We thank the reviewer for pointing this out. The low number of samples dating to before the Neolithic era indeed makes it hard to detect changes in PRS that are significantly correlated with time. Instead, we now pool these samples and compare the distribution of their PRS with that of Neolithic samples, so that we are better able to detect significant differences in PRS between these historical periods.
  
  To show the significance of each linear model as well, we now report the -log10 of the P value multiplied by the sign of the correlation coefficient. This highlights both the consistency in direction and the significance, and shows that downsampling affects the order of significance. Please see the new Figure 4-figure supplement 1. We also discuss this in more depth on lines 267-272 of the new version of the text.
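The signed significance measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `signed_neg_log10_p` is hypothetical, and it assumes the P value and correlation coefficient have already been obtained from the fitted linear model.

```python
import math


def signed_neg_log10_p(p_value: float, correlation: float) -> float:
    """Return -log10(p) carrying the sign of the correlation coefficient.

    Hypothetical helper: large positive values mean a significant positive
    trend, large negative values a significant negative trend.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    magnitude = -math.log10(p_value)
    # copysign keeps the magnitude and takes the sign of the correlation.
    return math.copysign(magnitude, correlation)
```

Plotting this single value per trait shows direction and strength of evidence together, which is presumably why it suits a before-and-after downsampling comparison.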
  
  —The authors chose to test "relationship between PRS with time" before and after the Neolithic as a way to demonstrate that "the advent of the Neolithic was a turning point for immune-mediated traits in Europeans". A more appropriate way to test this would be to create a model that incorporates both sets of scores together, accounts for both sample size and genetic drift in the change of polygenic scores, and shows that a significant shift occurs particularly in the Neolithic, rather than in any other time period, instead of choosing the Neolithic as an "a priori" partition of the data. My guess is that one could have partitioned the data into pre- and post-Mesolithic and gotten similar results, largely due to imbalances in data availability.
  
  We agree with the reviewer that the exact pairing of the groups might influence the conclusions, which shows the importance of avoiding bias in the a priori partitioning of the data, as the reviewer rightly pointed out. We aim to account for sample imbalances by pooling the Paleolithic and Mesolithic samples together. Instead of testing only pre- versus post-Neolithic samples, we perform pairwise comparisons between neighbouring historical periods using a t-test, thereby taking into account the sample size of each group.
  
  —The authors only talk about partitions before and after the Neolithic, but plots are colored by multiple other periods. Why is the pre- and post-Neolithic the only transition that is mentioned?
  
  Our initial hypothesis was that the pre- versus post-Neolithic shift was a turning point for immune responses. However, based on the suggestions of the reviewers, we have decided to perform the analysis in a more unbiased way, so we now show comparisons between the individual eras. The new analyses and the new figures address these issues.
  
  —Extrapolating polygenic scores to the distant past is especially problematic given recent findings about the poor portability of scores across populations (Martin et al. 2017, 2019) and the sensitivity of tests of polygenic adaptation to the choice of GWAS reference used to derive effect size estimates (Berg et al. 2019, Sohail et al. 2019). In addition to being more heavily under-represented, paleolithic hunter-gatherers are the most differentiated populations in the time series relative to the GWAS reference data, and so presumably they are also the genomes for which PGS estimates built using such a reference would have higher error (see, e.g. Rosenberg et al. 2019). Some analyses showing how believable these scores are is warranted (perhaps by comparing to phenotypes in distant present-day populations with equivalent amounts of differentiation to the GWAS panel).
  
  A similar study of standing height in ancient populations (PMID: 31594846) validated this approach by comparing polygenic scores based on modern populations with skeletal remains from ancient individuals. We acknowledge that the absolute polygenic scores are less accurate for aDNA samples than for a modern European cohort. Effect size estimates obtained from a modern cohort are less accurate for aDNA samples than for unrelated modern samples, and this is an unavoidable limitation of the study. This is why we focus on the direction of change of the trends rather than on the absolute polygenic scores, since such subtle differences do not affect the conclusions of our study.
  
  —In multiple parts of the paper, the authors mention "adaptation" as equivalent to the patterns they claim to have found, but alternative hypotheses like genetic drift are not tested (see e.g. Guo et al. 2018 for a review of methods that could be used for this).
  
  We thank the reviewer for this feedback. Based on it, we have added an Fst-based test for selection to determine whether the changes we see in PRS over time are due to selection or to genetic drift. This test shows that changes from the pre-Neolithic to the Neolithic are not significantly different from drift, whereas after the onset of the Neolithic we do see a significant amount of selection. We have explained this further in the manuscript on lines 130-135 and included the new Table S2.
  
  New Table S2: Tests for selection as opposed to genetic drift were performed between populations from adjacent time periods. A two-tailed test was used to determine whether the mean trait Fst between pre-Neolithic and Neolithic, Neolithic and post-Neolithic, and post-Neolithic and Modern samples differed significantly from 10,000 random LD- and MAF-matched mean Fst values calculated using the same number of SNPs.
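One common way to turn such a comparison into a two-tailed empirical P value is sketched below. The response does not state the exact formula used, so this is an assumption for illustration only: the function name, the centring on the null mean, and the +1 correction are all choices of this sketch, and the null values would be the mean Fst values of the matched random SNP sets.

```python
from statistics import mean


def empirical_two_tailed_p(observed: float, null_values: list[float]) -> float:
    """Two-tailed empirical P value of `observed` against a null sample.

    Counts null draws lying at least as far from the null mean as the
    observed value. The +1 in numerator and denominator (an assumption of
    this sketch) keeps the P value from ever being exactly zero.
    """
    if not null_values:
        raise ValueError("null_values must not be empty")
    centre = mean(null_values)
    observed_dev = abs(observed - centre)
    extreme = sum(1 for v in null_values if abs(v - centre) >= observed_dev)
    return (extreme + 1) / (len(null_values) + 1)
```

With 10,000 matched draws, the smallest attainable value under this convention is 1/10,001, which bounds how strong a claim of selection such a test can support.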
  
  —250 kb window is too short a physical distance for ensuring associated loci that are included in the score are not in LD, and much shorter than standard approaches for building polygenic scores in a population genomic context (e.g. see Berg et al. 2019, Berisa et al. 2016). Is this a robust correction for LD?
  
  We thank the reviewer for this remark. We tested multiple window sizes, increasing the window from 250 kb to 500 kb and 1000 kb (please see the new Figure 1-figure supplement 2 below). Although the level of significance changes for a few traits, the direction of the change remains stable across the three thresholds, demonstrating the robustness of our results. We chose this approach because the aDNA samples have too high an error rate and too much missing data to determine LD accurately, and determining LD from a modern reference cohort would bias our analysis by assuming that the aDNA samples have an LD structure similar to that of modern samples.
  
  New Figure 1-figure supplement 2: PRS correlation pre- and post-Neolithic revolution using polygenic scores calculated at varying window sizes.
  
  We have edited the manuscript accordingly to show the consistency between these varying window sizes on lines 111-113.
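Distance-based pruning of associated variants, as opposed to LD-based clumping, can be sketched as a greedy pass over SNPs sorted by P value. This is an illustrative assumption rather than the authors' pipeline: the tuple layout, the function name `prune_by_distance`, and the choice to treat the window as a distance on either side of a kept SNP are all hypothetical.

```python
def prune_by_distance(snps, window_kb=250):
    """Greedy distance-based pruning of associated SNPs.

    `snps` is an iterable of (snp_id, chromosome, position_bp, p_value).
    SNPs are visited from most to least significant; a SNP is kept only if
    no already-kept SNP on the same chromosome lies within `window_kb`.
    """
    window_bp = window_kb * 1000
    kept = []
    for snp in sorted(snps, key=lambda s: s[3]):
        _, chrom, pos, _ = snp
        clash = any(k[1] == chrom and abs(k[2] - pos) < window_bp for k in kept)
        if not clash:
            kept.append(snp)
    return kept
```

Rerunning the same input with `window_kb` set to 250, 500 and 1000 gives progressively sparser SNP sets, which is the kind of sensitivity check the response describes.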
  
  —If one substitutes dosage with the average genotyped dosage for a variant from the entire dataset, then one is biasing towards the partitions of the dataset that are over-represented, in this case, post-Neolithic samples.
  
  We fully agree with the reviewer. However, substituting missing dosages with average dosages prevents the bias that varying amounts of missing SNPs in the older samples would otherwise introduce into our models. Although our average scores are, on an absolute level, largely influenced by the more abundant post-Neolithic samples, this reduces the odds of wrongly observing significant trends caused by the sparsity of the data. While the absolute scores might be biased towards a certain value, the differences, and thus the direction of the change in PRS, are determined by the non-missing variants in each sample.
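The argument above can be made concrete with a small sketch of a polygenic score in which missing dosages are replaced by the dataset-wide mean dosage. The function name and argument layout are hypothetical, and the score is assumed to be a plain weighted sum of dosages, which the response does not spell out.

```python
from typing import Optional, Sequence


def polygenic_score(
    dosages: Sequence[Optional[float]],
    effect_sizes: Sequence[float],
    mean_dosages: Sequence[float],
) -> float:
    """Weighted sum of dosages; missing dosages (None) use the dataset mean."""
    if not (len(dosages) == len(effect_sizes) == len(mean_dosages)):
        raise ValueError("all inputs must have one entry per SNP")
    score = 0.0
    for dosage, beta, mean_dosage in zip(dosages, effect_sizes, mean_dosages):
        # An imputed SNP contributes the same amount for every sample missing it.
        value = mean_dosage if dosage is None else dosage
        score += beta * value
    return score
```

Because every sample missing a given SNP receives the same contribution for it, the difference between two such samples depends only on their observed variants. The reviewer's concern still applies to the mean itself, which is dominated by the over-represented post-Neolithic samples.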
  
  —It seems from Figure 2, that some scores are indeed very sensitive to the choice of P-value cutoff (e.g., Malaria, Tuberculosis) and to the amount of missing data (e.g. HIV). This should be highlighted in the main text.
  
  The reviewer is right. This is largely due to the smaller number of SNPs included in the model at stricter P-value cutoffs, which is in part a limitation of the available GWAS summary statistics. Using fewer SNPs in our PRS calculations reduces the variability between samples, which weakens our ability to accurately model changes in these specific complex traits and to detect statistical significance. We have highlighted this in the main text on lines 193-196.
  
  —Some of the score distributions look a bit strange, like the Tuberculosis ones in Figure 2, which appear concentrated into particular values. Could this be because some of the scores are made with very few component SNPs?
  
  We thank the reviewer for pointing this out; this is indeed correct. At stricter thresholds, fewer significant QTLs are included in the polygenic score model. We chose to still show these plots to point out that those results might well differ if more variants could be included. At more lenient thresholds more variants can be included, which increases the power of the model, but the score might then be less informative for the trait.
  

URL

biorxiv.org/content/10.1101/2021.01.14.426690v1
www.biorxiv.org

New submission 08/10/2023, 16:48:31

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  Myelodysplastic syndrome (MDS) is a heterogenous, clonal hematopoietic stem cell disorder characterized by morphological dysplasia in one or more hematopoietic lineages, cytopenias (most frequently anemia), and ineffective hematopoiesis. In patients with MDS, transfusion therapy treatment causes clinical iron overload; however it has been unclear if treatment with iron chelation yields clinical benefits. In the present study, the authors use a transgenic mouse model of MDS, NUP98-HOXD13 (referred to here as "MDS mice") to investigate this area. Starting at 5 months of age (before MDS mice progress to acute leukemia), the authors administered DFP in the drinking water for 4 weeks, and compared parameters to untreated MDS mice and WT controls.
  
  The authors first show that MDS mice exhibit systemic iron overload and macrocytic anemia that is improved by treatment with the iron chelator deferiprone (DFP). They then perform a detailed characterization of the effects of DFP treatment on erythroid differentiation and various parameters related to iron transport and trafficking in MDS erythroblasts. Strengths of the work are the use of a well-characterized mouse model of MDS with appropriate animal group sizes and detailed analyses of systemic iron parameters and erythroid subpopulations. A remediable weakness is that in certain areas of the Results and Discussion, the authors overinterpret their findings by inferring causation when they have only shown a correlation. Additionally, when drawing conclusions based on changes in erythroblast mRNA expression levels between groups, the authors should consider that translation efficiency may be altered in MDS and that the NUP98 fusion protein itself, by acting as a chimeric transcription factor, may also impact gene expression profiles. Given that the application of chelators for treatment of MDS remains controversial, this work will be of interest to scientists focused on erythroid maturation and iron dysregulation in MDS, as well as clinicians caring for patients with this disorder.
  
  Major Comments
  
  1) The authors define the stages of erythroblast differentiation using the CD44-FSC method, which assumes that CD44 expression levels during the stages of erythroid differentiation are not altered by MDS itself. Are morphologically abnormal erythroblasts, such as bi-nucleate forms, captured in this analysis, and if so, are they classified in the appropriate subset? The percentage of erythroblasts in the bone marrow of MDS mice in this current study is lower than that reported by Suragani et al (Nat Med 2014), who employed a different strategy to define erythroid precursors. While representative erythroblast gating is presented as Supplemental Figure 17, it would be important to present representative gating from all 3 animal groups: WT, MDS, and MDS+DFP mice.
  
  We appreciate this comment and have added representative gating for all 3 groups to Supplemental Figure 17 (new Figure 3 – figure supplement 6 in the revised manuscript).
  
  2) Methods, "Statistical analysis." The authors state that all comparisons were done with a 2-tailed paired Student's t-test, which would not be appropriate for comparisons between independent animal groups (i.e. when groups are not "paired").
  
  We appreciate this comment and have reanalyzed all revised mouse data using one-way ANOVA with multiple comparisons and Tukey post-test analyses when more than 2 groups were compared. This has been edited in the Methods section in the revised manuscript.
  
  3) The Results (p.7) indicates that both sexes showed similar responses to DFP; however, the figure legends do not indicate sex. Given that systemic iron metabolism in mice shows sex-related differences, sex should be specified.
  
  We appreciate this comment and present here the gender-specific data for the reviewers' evaluation (Author response image 1). Similarly elevated transferrin saturation (a) (n = 3-4 male mice/group and n = 4-6 female mice/group) and hemoglobin (b) (n = 4-6 male mice/group and n = 4-9 female mice/group) are observed in male and female DFP-treated MDS mice. (c) Bone marrow erythroblasts are decreased to a greater degree in male relative to female DFP-treated MDS mice (n = 4-7 male mice/group and n = 8-9 female mice/group). We have added the data on gender-specific measures to new Figure 1 - figure supplement 3, Figure 2 – figure supplement 1, and Figure 3 – figure supplement 1 in the revised manuscript.
  
  Author response image 1.
  

URL

biorxiv.org/content/10.1101/2022.10.05.510967v1
www.biorxiv.org

New submission 02/01/2023, 14:19:54

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The manuscript by Xu et al. provides a very thorough characterization and molecular dissection of the role of SSH2 in spermatogenesis. Loss of Ssh2 in germ cells results in germ cell arrest in step 2-3 spermatids and eventually leads to germ cell loss by apoptosis. Molecular characterization of the mutant mice shows that the loss of SSH2 prevents the fusion of proacrosomal vesicles, leading to the formation of a fragmented acrosome. The fragmentation of the acrosome is due to impaired actin bundling and dephosphorylation of COFILIN. In short, this is a comprehensive body of work.
  
  We thank the referee for these insightful comments.
  
  Reviewer #2 (Public Review):
  
  The acrosome is a unique sperm-specific subcellular organelle required for the fertilization process, and it is also an organelle undergoing extensive morphological and structural transformation during sperm development. The mechanism underlying the extensive acrosome morphogenesis and biogenesis remains incompletely understood. Xu et al., in their manuscript entitled "The Slingshot phosphatase 2 is required for acrosome biogenesis during spermatogenesis in mice", report that Slingshot phosphatase 2 is essential for acrosome biogenesis and male fertility, based on their characterization of spermatogenic and acrosomal defects in the Ssh2 knockout mice they generated. Specifically, the authors provide molecular, genetic, and subcellular evidence that Ssh2 mutation impairs the dephosphorylation of an actin-binding protein, COFILIN, during spermiogenesis and, accordingly, actin cytoskeleton remodeling, which is crucial for proacrosomal vesicle trafficking and acrosome biogenesis. The manuscript by Xu et al. provides a very thorough characterization and molecular dissection of the role of SSH2 in spermatogenesis. Loss of Ssh2 in germ cells results in germ cell arrest in step 2-3 spermatids and eventually leads to germ cell loss by apoptosis. Molecular characterization of the mutant mice shows that the loss of SSH2 prevents the fusion of proacrosomal vesicles, leading to the formation of a fragmented acrosome. The fragmentation of the acrosome is due to impaired actin bundling and dephosphorylation of COFILIN. In short, this is a comprehensive body of work.
  
  We appreciate and thank Referee #2 for the positive feedback and insightful comments.
  
  Strengths:
  
  This nicely written manuscript addresses an important mechanistic question about the role of cytoskeleton remodeling in acrosome biogenesis. It provides genetic, subcellular, and molecular evidence supporting the hypothesis that Ssh2 regulates actin cytoskeleton remodeling, a process essential for proacrosomal vesicle trafficking and acrosome biogenesis, through dephosphorylation of an actin-binding protein during spermiogenesis.
  
  We again thank Referee #2 for appreciating and encouraging our current research.
  
  Weaknesses:
  
  For body weight and testis weight of the mutants, the authors concluded that there is no significant difference between the mutant and wild type (Fig 1E-1G). However, they appear to have used mice between 6 and 8 weeks old, when both testis and body weight of males are still increasing. With only six mice analyzed, a significant difference in testis size and/or body weight could easily be missed given such varied ages and such a small sample size.
  
  We thank the referee for prompting this important discussion point, which we now cover in our revised manuscript. In our originally submitted manuscript we presented body weight, testis weight, and T/B ratio data only for mice between 6 and 8 weeks of age. In the revised manuscript we have added data from mice older than 8 weeks in a new Figure 1E-1G, with a sample size of 12 for each genotype. We have also updated the figure caption. The revised caption for Figure 1 panels E–G reads as follows: “(E-G) Body weights (26.3609 ± 0.4914 for WT; 25.1741 ± 0.5189 for Ssh2 KO), weights of the testes (0.0862 ± 0.0036 for WT; 0.0788 ± 0.0023 for Ssh2 KO), and the testis-to-body weight ratio (0.3281 ± 0.0153 for WT; 0.3154 ± 0.0135 for Ssh2 KO) of adult WT and Ssh2 KO males (n = 12). Data are presented as the mean ± SEM; p > 0.05 calculated by Student’s t-test. Bars indicate the range of the data.”
  
  Other points:
  
  Comments: 1) Could the uniform cytoplasmic distribution of diminutive actin filaments in the wild type and disrupted actin filament remodeling be examined at the EM level on the round spermatids?
  
  We apologize for the confusion. We previously conducted a transmission electron microscopy (TEM) analysis of testis samples to examine the distribution and ultrastructural organization of F-actin in WT and Ssh2 KO round spermatids. Unfortunately, even at high magnification (30,000x, right panel of Figure R1-Response Figure 1), TEM of testicular sections revealed no diminutive actin filaments in the cytoplasm of round spermatids. The only actin-related structures observed were the acroplaxome (an actin-rich specialized structure that anchors the acrosome) in WT spermatids and some thick bundle-like structures located at the acrosomal region of Ssh2 KO spermatids (Fig. R1). Based on their characteristic appearance, we interpreted these electron-dense bundles as aberrantly aggregated actin filaments whose lengths are consistent with those of COFILIN-saturated F-actin fragments (Bamburg et al., 2021), suggesting that Ssh2 KO disrupts actin filament remodeling during acrosome biogenesis. However, owing to the technological limitations of TEM and the complexity of the intracellular environment of round spermatids, we recognized only a few aggregated actin bundles that had lost their filamentous appearance in Ssh2 KO spermatids, and we detected none of the typical diminutive actin filaments that have been imaged by high-resolution cryo-TEM (Haviv et al., 2008) or live-cell total internal reflection fluorescence microscopy (Johnson et al., 2015) in purified actin bundles and cultured cells. Given the lack of effective approaches for culturing murine round spermatids in vitro, confocal microscopy of fluorescence-labelled F-actin (e.g., IF staining with FITC-phalloidin) is a more accessible method than EM for visualizing disrupted actin remodeling in murine spermatids, as several other studies of actin have demonstrated (Djuzenova et al., 2015; Meenderink et al., 2019).
  
  Comments: 2) Any other defects are seen besides acrosome in the mutant testis given the important roles of actin cytoskeleton network and high expression of Ssh2 in spermatocytes, were chromatoid bodies or mitochondria affected in any way? Any other defects in the mice overall including female fertility and other organs, given the previously reported roles in the nervous system. It could be helpful information for others interested in Ssh 2 protein and actin cytoskeleton's roles in general.
  
  The referee has raised an interesting point. Firstly, besides the acrosome-related defects in Ssh2 KO spermatids, we identified increased germ cell apoptosis and aberrant activation of the apoptotic Bcl-2/Caspase-3 pathway in the testes of Ssh2 KO mice. We speculate that these are triggered by disordered COFILIN-mediated F-actin remodeling, and we intend to elucidate the underlying mechanisms in the future. Secondly, given the high expression of SSH2 in spermatocytes demonstrated by IF staining in Figures 4B and 4C, we performed surface chromosome spreading on spermatocytes to examine whether the morphology of chromatid bodies and meiotic progression were affected by Ssh2 KO; no obvious defects were observed, as shown in supplementary Figure S3 of the originally submitted manuscript. Thirdly, no obvious morphological abnormality in chromatin or mitochondrial structure was detected under TEM in Ssh2 KO germ cells such as spermatocytes and round spermatids, so we did not pursue this further. Fourthly, we examined the potential effect(s) of Ssh2 KO on female fertility using Ssh2 KO female mice and found no obvious fertility defect in Ssh2 KO females compared with their WT littermates, as demonstrated by body weight, ovary weight, ovary-to-body weight ratio, ovary size, and fertility test data, as well as images of ovarian HE staining (Fig. R1). Moreover, although roles of SSH2 in neurite extension have been reported (Endo, Ohashi, & Mizuno, 2007), Ssh2 KO males and females showed no defective physical development, aberrant physiological status, or mental disorder during our investigation period. We therefore did not conduct experiments on the effect(s) of SSH2 in other organs, apart from the female fertility analysis.
  
  Fig. R1. No reproductive defects were found in Ssh2 KO females. (A-C) Body weights, weights of the ovaries, and the ovary-to-body weight ratio of adult WT and Ssh2 KO females aged 8-10 weeks (n = 5); p > 0.05 calculated by Student’s t-test. Bars indicate the range of the data. (D) The size of ovaries from Ssh2 KO mice was indistinguishable from that of ovaries from WT mice at 8 weeks of age, n = 4. (E) Histology of the ovaries from WT and Ssh2 KO mice. Sections were stained with hematoxylin and eosin. Scale bars: 200 μm. Images are representative of ovaries extracted from 8-week-old adult female mice per genotype. (F) Number of pups per litter from WT and Ssh2 KO female mice (8 weeks old) after crossing with WT adult male mice (n = 3); p > 0.05 calculated by Student’s t-test. Bars indicate the range of the data.
  
  Comments: 3) Providing detailed information on the number of animals used and cells analyzed in the legend is nice, but it might be even better for the readers to include sample size and the number of cells examined in the figure/graph if possible.
  
  We appreciate the reviewer's suggestions. We have added sample size information to the figures where appropriate. Firstly, we added the sample size to Figures 1C, 1E, 1F, 1G and 1I. Secondly, we included the sample size and the number of seminiferous tubules/epididymal ducts evaluated for TUNEL (+) cell counting in Figures 2C and 2D. Thirdly, we included the sample size and the number of spermatids analyzed for co-localization in Figures 6B and 6D.
  
  Comments: 4) Nice discussion and comparison with GOPC and GM130, how about comparison and discussion with other acrosome defective mutants like PICK1, and ATG to provide some insights into acrosome biogenesis and proacrosomal vesicle trafficking?
  
  We greatly appreciate the referee for positive appraisal of our work with constructive suggestions, unfortunately, we are unable to address these defective mutants with certainty due to the lack of proper sample accessibility (only 3 of 16-month-old Ssh2 KO mice are accessible now). We compared the cytological staining of GM130 and GOPC in WT and Ssh2 KO spermatids using tubule squash sections as the description in the originally submitted manuscript which are prepared from fresh testes originated from 8-week-old mice and we now have several aged Ssh2 KO mice which prevent us to achieve the staining of PICK1 and ATG. PICK1 was previously reported to facilitate vesicle trafficking from the Golgi apparatus to the acrosome which co-localizes with GOPC in the proacrosomal granules (Xiao et al., 2009) and the phenotypes of Pick1 KO mice share a lot of similar characteristics with that of Ssh2 KO mice such as the fragmentation of the acrosome and increased germ cell apoptosis. Both autophagy-related ATG5 (Huang et al., 2021) and ATG7 (Wang et al., 2014) were reported to participate in the process of acrosome biogenesis and ATG7 is required for proacrosomal vesicle transportation/fusion by conjugating LC3 to the membrane of proacrosomal vesicles. Although the spermatids evaluated in these KO mice models could still be developed into spermatozoa with defective acrosome that is different from the situation in Ssh2 KO mice, it would be meaningful to discover the affects by Ssh2 KO on the localization of these regulators of acrosome biogenesis in spermatids and their potential interactions with SSH2. 
Indeed, we plan to pursue these issues in future work, and content related to PICK1 has been added to the discussion in the revised manuscript as follows: “Moreover, it is intriguing to note that the phenotypes of Ssh2 KO mice share a lot of similarities with that of Pick1 KO model (Xiao et al., 2009) such as acrosome fragmentation and enhanced germ cell apoptosis, suggesting the possibility that SSH2 and PICK1 work together in a same trafficking machinery functioning in acrosome biogenesis which needs to be clarified further.”
  
  Comments: 5) Given the literature on Cofilin's requirement for male fertility and the increased p-Cofilin in Ssh2 mutant testis by Western and IF, the authors have a strong case for their hypothesis. But given the general role of phosphatase, it might be prudent to discuss alternative possibilities.
  
  We thank the reviewer for these valuable suggestions. Given that p-COFILIN is the only known substrate of SSH2 based on previous reports, we focused principally on this cascade in our investigation. As a phosphatase, SSH2 is very likely to interact with many other proteins, beyond the actin-binding proteins, that function in various cellular processes and remain elusive. As directed, we have now added content addressing this concern to the discussion section of the revised manuscript as follows: “Given the diverse physiological roles reported for Slingshot family proteins, the possibility of the alternative mechanism underlying involvement of SSH2 in cellular events beyond the COFILIN-mediated actin remodeling should be noted. According to some publicly accessible databases as the indicators of potential protein–protein interactions such as BioGRID (Oughtred et al., 2019) and IntAct (Del Toro et al., 2022), SSH2 might interact with a set of actin-based molecular motors covering MYH9, MYO19 and MYO18A, which have been implicated in the maintenance of Golgi morphology and Golgi anterograde vesicular trafficking via the PI4P/GOLPH3/MYO18A/F-actin pathway (Rahajeng et al., 2019).”
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.09.15.508144v1
www.biorxiv.org www.biorxiv.org

New submission 24/10/2022, 12:01:10

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Zylbertal and Bianco propose a new model of trial-to-trial neuronal variability that incorporates the spatial distance between neurons. The 7-parameter model is attractive because of its simplicity: A neuron's activity is a function of stimulus drive, neighboring neurons, and global inhibition. A neuroscientist studying almost any brain area in any model organism could make use of this model, provided that they have access to 1) simultaneously-recorded neurons and 2) the spatial locations of those neurons. I could foresee this model being the de-facto model to compare to all future models, as it is easy to code up and interpret. The paper explores the effectiveness of this distance model by modeling neural activity in the zebrafish optic tectum. They find that this distance-based model can capture 1) bursting found in spontaneous activity, 2) ongoing co-fluctuations during stimulus-evoked activity, and 3) adaptation effects during prey-catching behavior.
  
  Strengths:
  
  The main strength of the paper is the interpretability of the distance-based model. This model is agnostic to the brain area from which the population of neurons is recorded, making the model broadly applicable to many neuroscientists. I would certainly use this model for any baseline comparisons of trial-to-trial variability.
  
  The model is assessed in three different contexts, including spontaneous activity and behavior. That the model provides some prediction in all three contexts is a strong indicator that this model will be useful in other contexts, including other model organisms. The model could reasonably be extended to other cognitive states (e.g., spatial attention) or accounting for other neuron properties (such as feature tuning, as mentioned in the manuscript).
  
  The analyses and intuition to show how the distance-based model explains adaptation were insightful and concise.
  
  We thank the reviewer for these supportive comments.
  
  Weaknesses:
  
  Model evaluation and comparison: The paper does not fully evaluate the model or its assumptions; here, I note details in which evaluation is needed. A key assumption of the model - that correlations fall off in a gaussian manner (Fig. 1C-E) - is not supported by Fig. 1C, which appears to have an exponential fall-off. Functions other than gaussian may provide better fits.
  
  A key feature of our model is that connection strengths smoothly decrease with distance. However, we did not intend to make strong claims about the exact function parametrizing this distance relationship. In light of the reviewer’s comment, we have additionally tested an exponential function and find that it too can describe activity correlations in OT with a negligible decrease in r^2 (Figure 1 – figure supplement 1A-C). The main purpose of the analysis was to show that the correlation is maximal around the seed and decays uniformly with distance from it (i.e. no sub-networks or cliques are detected). We have emphasized this in a revised conclusion paragraph and note that while multiple functions can be used to parameterize the relationship, they are nonetheless certainly simplifications. Secondly, we also ran a version of the network simulation where the connections decay in space according to an exponential rather than Gaussian function and show that, as expected, tectal bursting is robust to this change.
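
  As a rough illustration of how a Gaussian and an exponential fall-off can be compared by r^2, the sketch below fits both forms to synthetic correlation-versus-distance data. It is a simplification and not the authors' procedure: it uses a 1-D distance and a log-linear least-squares fit rather than a 2-D nonlinear fit, and the function names (`fit_line`, `r_squared`, `compare_falloffs`) and the synthetic data are assumptions made for this example.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    my = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def compare_falloffs(dist, corr):
    """Fit log(corr) against d^2 (Gaussian) and against d (exponential).

    Returns the r^2 of each fitted curve on the original correlation scale.
    Assumes all correlations are strictly positive.
    """
    log_c = [math.log(c) for c in corr]
    transforms = {"gaussian": lambda d: d * d, "exponential": lambda d: d}
    scores = {}
    for name, transform in transforms.items():
        x = [transform(d) for d in dist]
        b0, b1 = fit_line(x, log_c)
        pred = [math.exp(b0 + b1 * xi) for xi in x]
        scores[name] = r_squared(corr, pred)
    return scores


# Synthetic data (hypothetical): exponential decay with multiplicative noise.
rng = random.Random(0)
dist = [rng.uniform(1.0, 100.0) for _ in range(500)]
corr = [0.5 * math.exp(-d / 30.0) * math.exp(rng.gauss(0.0, 0.05)) for d in dist]
scores = compare_falloffs(dist, corr)
print(scores)
```

  Because the synthetic data here were generated from an exponential, that form scores higher; on real data the two values can be close, which is the situation the response describes.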
  
  Furthermore, it is not clear whether the r^2s in Fig. 1E are computed in a held-out manner (more details about what goes into computing r^2 are needed).
  
  These values are computed by fitting the 2-d Gaussian (or exponential function) to all neurons excluding the seed itself (added a short clarification in the Methods).
  
  Assessing the model based on peak location alone (Fig. 1E) is not sufficient, as other smooth monotonically-decreasing functions may perform similarly.
  
  As discussed above, an exponential function indeed performs similarly to a Gaussian. However, goodness of fit is secondary to the main aim of Fig 1E, which is to show that the correlation peak tends to fall near the seed cell.
  
  Simulating from the model greatly improves the reader's understanding (Fig. 2D), but no explanation is given for why the simulations (Fig. 2D) have almost no background spikes and much fewer, non-co-occurring bursts than those of real data (Fig. 2E).
  
  In part this is because the simulation results depicted in Fig 2D were derived from the ‘baseline model’, prior to optimizing to match biological bursting statistics. It is thus expected that activity will differ from experimental observation, and this was our main motivation for tuning the model parameters (now emphasized in the text). However, the model will certainly not account for all aspects of tectal activity; rather, it was designed to reproduce bursting as a prominent feature of ongoing activity and in the second part of the paper we explore the extent to which it can account for other phenomena. As noted above, in the revised abstract, introduction and discussion we have tried to clarify the motivation for developing the model and how it was used to gain insight into activity-dependent changes in network excitability.
  
  A key assumption of the distance model (Fig. 2A) is that each neuron has the same gaussian fall-off (i.e., sigma_excitation and sigma_inhibition), but it is unclear if the data support this assumption.
  
  We intentionally opted for a simple model (i.e. described by few parameters), in part due to the lack of connectivity data and additionally to set a lower bound on the extent to which multiple features of tectal activity could be accounted for. More complex models with additional degrees of freedom (such as cell-specific connectivity) may well describe the data better, but likely at the cost of interpretability. We consider such extensions to be beyond the scope of the present study, but they might be fruitful avenues for future research.
  
  Although an excitatory and inhibitory gain is assumed (Fig. 2A), it is not clear from the data (Fig. 1C) that an inhibitory gain is needed (no negative correlations are observed in Fig. 1C-D).
  
  This is now explored in the revised Figure 3A which includes the condition of zero inhibition gain. See also response to reviewer 1.
  
  After optimization (Fig. 3), the model is evaluated on predicting burst properties but not evaluated on predicting held-out responses (R^2s or likelihoods), and no other model (e.g., fitting a GLM or a model with only an excitatory gain) is considered. In particular, one may consider a model in which "assemblies" do exist - does such an assembly model lead to better held-out prediction performance?
  
  The model we developed is a mechanistic, generative model. In contrast to Pillow et al 2008, we did not fit the model to data but rather we used it to simulate network activity and tuned the seven parameters (using EMOO) to best match biological observations. Thus, rather than assessing goodness-of-fit using cross-validation, our approach involved comparison of summary statistics related to the target emergent phenomenon (tectal bursting). This was necessary as bursting appears highly stochastic. Further to the comments above, we have expanded the parameter space to include instances with only an excitatory gain (where bursting failed) and no distance-dependence (again, bursting failed). Introducing assemblies into the model will inevitably support bursting (and introduce many more free parameters), but one of our key observations is that such assemblies are not required for this aspect of spontaneous activity. Again, our aim was not to produce a detailed picture of tectal connectivity, but rather to develop a minimal model and estimate the extent to which it can account for observed features of activity. Note that the second half of the paper (Figure 4 onwards) shows the model can explain phenomena that were not considered during parameter tuning.
  
  It is unclear why a genetic algorithm (Fig. 1A-C) is necessary versus a grid search; it appears that solutions in Generation 2 (Fig. 3C, leftmost plot, points close to the origin) are as good as solutions in Generation 30 and that the spreads of points across generations do not shrink (as one would expect from better mutations). Given the small number of parameters (7), a grid search is reasonable, computationally tractable, and easier to understand for all readers (Fig. 3A).
  
  Perhaps in hindsight a grid search would have worked, but at increased computational cost (each instantiation of the model is computationally expensive). At the time we chose EMOO, and since it produced satisfactory results, we kept it. As often happens with multi-objective optimization, an improvement in one objective usually happens at the expense of other objectives, so the spread of the points does not shrink much but they move closer to the axes (i.e. reduced error). The final parameter combination is closer to the origin than any point in generation 2, though admittedly not by much. Importantly, however, optimizing the model using the training features generalized to other burst-related statistics.
  
  It is unclear why the excitatory and inhibitory gains of the temporal profiles (Fig. 3I) appear to be gaussian but are formulated as exponential (formula for I_ij^X in Methods).
  
  The interactions indeed have exponential decay in time. These might appear Gaussian because the axis scale is logarithmic.
  
  Overall, comparing this model to other possible (similar) models and reporting held-out prediction performance will support the claim that the distance model is a good explanation for trial-to-trial variability.
  
  See comments above. A key point we want to stress is that we intentionally explored a minimal network model and found that, despite obvious simplifications of the biology, it was nonetheless able to explain multiple aspects of tectal physiology and behaviour. We hope that it inspires future studies and can be extended, in parallel to experimental findings, to more accurately represent the cell-type diversity and cell-specific connectivity of the tectal network.
  
  Data results: Data results were clear and straightforward. However, the explanation was not given for certain results. For example, the relationship between pre-stimulus linear drive and delta R was weak; the examples in Fig. 4C do not appear to be representative of the other sessions. The example sessions in Fig. 4C have R^2=0.17 and 0.19, the two outliers in the R^2 histogram (Fig. 4D).
  
  The revised figure 4 is based on new data and new analysis (see below), and the presented examples no longer represent the extreme tail of the distribution (they still, however, represent strong examples, as is now explicitly indicated in the figure legend).
  
  The black trace in Fig. 4D has large variations (e.g., a linear drive of 25 and 30 have a change in delta R of ~0.1 - greater than the overall change of the dashed line at both ends, ~0.08) but the SEMs are very tight. This suggests that either this last fluctuation is real and a major effect of the data (although not present in Fig. 4C) or the SEM is not conservative enough. No null distribution or statistics were computed on the R^2 distribution (Fig. 4C, blue distribution) to confirm the R^2s are statistically significant and not due to random fluctuations.
  
  We agree that this was not sufficiently robust and in response to this comment we undertook a significant revision to figure 4 and the associated text:
  
  i) The revised figure is based on an entirely new dataset, allowing us to verify the results on independent data. We used 5 min ISI for all stimulus presentations, regardless of stimulus type (high or low elevation), thus ensuring that we are only examining differences in state brought about by previous ongoing activity, without risk of ‘contamination’ by evoked activity.
  
  ii) As per the reviewer’s suggestion, we compared model-estimated pre-stimulus state to a null estimate using randomly sampled time-points. We additionally compared the optimised model with the baseline model. Whereas the null (random times) estimates had no predictive power, both models using pre-stimulus activity were able to explain a fraction of the response residuals with the optimised model performing better.
  
  iii) We refined the binning process by first computing, for each response, the mean of response residuals across neurons for each bin of estimated linear drive, and then averaging across responses. This prevents the relationship being skewed by rare instances involving unusually large numbers of neurons for a particular linear drive bin, and thereby eliminates the fluctuations the reviewer was referring to.
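
  The two-stage averaging described in (iii) can be sketched as follows. This is a minimal illustration under assumed data structures, not the authors' code: each response is represented as a list of hypothetical `(linear_drive, residual)` pairs, one per neuron, and `bin_width` is an assumed parameter.

```python
from collections import defaultdict


def binned_mean_residuals(responses, bin_width):
    """Average residuals within each response first, then across responses.

    responses: list of responses; each response is a list of
               (linear_drive, residual) pairs, one pair per neuron.
    Returns {bin_index: mean over responses of the per-response mean residual}.
    """
    per_bin = defaultdict(list)  # bin index -> per-response means
    for neurons in responses:
        sums = defaultdict(float)
        counts = defaultdict(int)
        for drive, resid in neurons:
            b = int(drive // bin_width)
            sums[b] += resid
            counts[b] += 1
        for b in sums:
            per_bin[b].append(sums[b] / counts[b])
    return {b: sum(v) / len(v) for b, v in sorted(per_bin.items())}


# Toy example: one response contributes 100 neurons to a bin, another only 1.
responses = [[(12.0, 1.0)] * 100, [(13.0, 0.0)]]
binned = binned_mean_residuals(responses, bin_width=5.0)

# Pooling all neurons together lets the large response dominate the bin.
all_pairs = [pair for neurons in responses for pair in neurons]
pooled = sum(r for _, r in all_pairs) / len(all_pairs)
print(binned, pooled)
```

  In this toy case the pooled mean is dominated by the response with 100 neurons, whereas the two-stage mean weights both responses equally, which is the skew the refinement is meant to prevent.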
  
  The absence of any background activity in Fig. 6B (e.g., during the rest blocks) is confusing, given that in spontaneous activity many bursts and background activity are present (Fig. 2E).
  
  The raster only presents evoked responses and no background activity is shown. This has been clarified in the revised figure and legend.
  
  Finally, it appears that the anterior optic tectum contributes to convergent saccades (CS) (Fig. 7E) but no post-saccadic activity is shown to assess how activity changes after the saccade (e.g., plotting activity from 0 to 60).
  
  Activity before and after the saccade is shown in Fig 7A. Fig 7E shows the ‘linear drive’ (or ‘excitability’), and how it changes leading up to the saccade. Since we were interested in the association between pre-saccade state and saccade-associated activity, we did not plot post-saccadic linear drive. However, as can be seen in the below figure for the reviewer, linear drive is strongly suppressed by the saccade, as expected due to CS-associated activity.
  
  No explanation is given why activity drops ~30 seconds before a convergent saccade (Fig. 7E).
  
  This is no longer shown after we trimmed the history data in Fig 7E in accordance with a comment from reviewer 1. We speculate, however, that the mean linear drive of a compact population of neurons would be somewhat periodic, since a high linear drive leads to a burst, which results in a prolonged inhibition (low linear drive) with a slow recovery, and so on.
  
  No statistical test is performed on the R^2 distribution (Fig. 7H) to confirm the R^2s (with a mean close to R^2=0.01) are meaningful and not due to random fluctuations.
  
  We revised the analysis in Fig 7 along the same lines as the revision of Fig 4. Model-estimated linear drive predicts CS-associated activity whereas a null estimate (random times) shows no such relationship.
  
  Presentation: A disjointed part of the paper is that for the first part (Figs. 1-3), the focus is on capturing burst activity, but for the second part (Figs. 4-7), the focus is on trial-to-trial variability with no mention of bursts. It is unclear how the reader should relate the two and if bursts serve a purpose for stimulus-evoked activity.
  
  In the first part of the paper (Figs. 1-3), we use ongoing activity to develop an understanding (formulated as a network model) of how activity modulates the network state. In the second part, we test this understanding in the context of evoked responses and show that model-estimated network state explains a fraction of visual response variability and experience-dependent changes in activity and behaviour. In the revised MS we further emphasize this idea and have edited the results text to strengthen the connections between these parts of the study. See also comments above.
  
  Citations: The manuscript may cite other relevant studies in electrophysiology that have investigated noise correlations, such as:
  
  Luczak et al., Neuron 2009 (comparing spontaneous and evoked activity).
  
  Cohen and Kohn, Nat Neuro 2011 (review on noise correlations).
  
  Smith and Kohn, JNeurosci 2008 (looking at correlations over distance).
  
  Lin et al., Neuron 2015 (modeling shared variability).
  
  Goris et al., Nat Neuro 2014 (check out Fig. 4).
  
  Umakantha et al., Neuron 2021 (links noise correlation and dim reduction; includes other recent references to noise correlations).
  
  We agree that the manuscript could benefit from citing some of these suggested studies and have added citations accordingly.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.30.486335v1
www.biorxiv.org www.biorxiv.org

Identification of blood autosomal cis-expression quantitative trait methylation (cis-eQTMs) in children

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In this manuscript, the authors find CpGs within 500Kb of a gene that associate with transcript abundance (cis-eQTMs) in children from the HELIX study. There is much to admire about this work. With two notable exceptions, their work is solid and builds/improves on the work that came before it. Their catalogue of eQTMs could be useful to many other researchers that utilize methylation data from whole blood samples in children. Their annotation of eQTMs is well thought out and exhaustive. As this portion of the work is descriptive, most of their methods are appropriate.
  
  Unfortunately, their use of results from a model that does not account for cell-type proportions across samples diminishes the utility and impact of their findings. I believe that their catalog of eQTMs contains a great deal of spurious results that primarily represent the differences in cell-type proportions across samples.
  
  Lastly, the authors postulate that the eQTM gene associations found uniquely in their unadjusted model (in comparison to results from a model that does account for cell type proportion) represent cell-specific associations that are lost when a fully-adjusted model is assumed. To test this hypothesis, the authors appear to repurpose methods that were not intended for the purposes used in this manuscript. The manuscript lacks adequate statistical validation to support their repurposing of the method, as well as the methodological detail needed to peer review it. This section is a distraction from an otherwise worthy manuscript. But provide evidences that enriched for cell sp CpGs.
  
  Major points
  
  Line 414-475: In this section, the authors are suggesting that CpGs that are significant without adjusting for cell type are due to methylation-expression associations that are found only in one cell type, while associations found in the fully adjusted model are associations that are shared across the cell types. I do not agree with this hypothesis, as I do not agree that the confounding that occurs when cell-type proportions are not accounted for would behave in this way. Although restricting their search for eQTMs to only those CpGs proximal to a gene will reduce the number of spurious associations, a great deal of the findings in the authors' unadjusted model likely reflect differences in cell-type proportions across samples alone. The Reinius manuscript, cited in this paper, indicates that gene-proximal CpGs can have methylation patterns that vary across cell types.
  
  Following reviewers’ recommendations, we have reconsidered our initial hypothesis about the role of cellular composition in the association between methylation and gene expression. Although we still think that some of the eQTMs only found in the model unadjusted for cellular composition could represent cell specific effects, we acknowledge that the majority might be confounded by the extensive gene expression and DNA methylation differences between cell types. Also, we recognize that more sophisticated statistical tests should be applied to prove our hypothesis. Because of this, we have decided to report the eQTMs of the model adjusted for cellular composition in the main manuscript and keep the results of the model unadjusted for cellular composition only in the online catalogue.
  
  Line 476-488: Their evidence due to F-statistics is tenuous. The authors do not give enough methodological detail to explain how they're assessing their hypothesis in the results or methods (lines 932-946) sections. The methods they give are difficult to follow. The results in figure S19A are not compelling. The citation in the methods (by Reinius) does not make sense, because Reinius et al did not use F-statistics as a proxy for cell type specificity. The citation that the authors give for this method in the results does not appear to be appropriate for this analysis, either. Jaffe and Irizarry state that a CpG with a high F-statistic indicates that the methylation at that CpG varies across cell type. They suggest removing these CpGs from significant results, or estimating and correcting for cell type proportions, as their presence would be evidence of statistical confounding. The authors of this manuscript indicate that they find higher F-statistics among the eQTMs uniquely found in the unadjusted model, which seems to only strengthen the idea that the unadjusted model is suffering from statistical confounding.
  
  We recognize our misinterpretation of the F-statistic in relation to cellular composition. We have removed this entire section from the updated version of the manuscript.
  
  The methods used to generate adjusted p-values in this manuscript are not appropriate as they are written. Further, they are nothing like the methods used in the paper cited by the authors. The Bonder paper used permutations to estimate an empirical FDR and cites a publication by Westra et al for their method (below). The Westra paper is a better one to cite, because the methods are more clear. Neither the Bonder nor the Westra paper uses the BH procedure for FDR.
  
  Westra, H.-J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238-1243 (2013).
  
  We apologize for this misleading citation. Although Bonder et al applied a permutation approach to adjust for multiple testing, our approach was inspired by the method applied in the GTEx project (GTEx consortium, 2020), using CpGs instead of SNPs. The citation has been corrected in the manuscript. Moreover, we have explained in more detail the whole multiple-testing processes in the Material and Methods section (page 14, line 316):
  
  “To ensure that CpGs paired to a higher number of Genes do not have higher chances of being part of an eQTM, multiple-testing was controlled at the CpG level, following a procedure previously applied in the Genotype-Tissue Expression (GTEx) project (Gamazon et al., 2018). Briefly, our statistic used to test the hypothesis that a pair CpG-Gene is significantly associated is based on considering the lowest p-value observed for a given CpG and all its pairs Gene (e.g. those in the 1 Mb window centered at the TSS). As we do not know the distribution of this statistic under the null, we used a permutation test. We generated 100 permuted gene expression datasets and ran our previous linear regression models obtaining 100 permuted p-values for each CpG-Gene pair. Then, for each CpG, we selected among all CpG-Gene pairs the minimum p-value in each permutation and fitted a beta distribution that is the distribution we obtain when dealing with extreme values (e.g. minimum) (Dudbridge and Gusnanto, 2008). Next, for each CpG, we took the minimum p-value observed in the real data and used the beta distribution to compute the probability of observing a lower p-value. We defined this probability as the empirical p-value of the CpG. Then, we considered as significant those CpGs with empirical p-values to be significant at 5% false discovery rate using Benjamini-Hochberg method. Finally, we applied a last step to identify all significant CpG-Gene pairs for all eCpGs. To do so, we defined a genome-wide empirical p-value threshold as the empirical p-value of the eCpG closest to the 5% false discovery rate threshold. We used this empirical p-value to calculate a nominal p-value threshold for each eCpG, based on the beta distribution obtained from the minimum permuted p-values. This nominal p-value threshold was defined as the value for which the inverse cumulative distribution of the beta distribution was equal to the empirical p-value.
Then, for each eCpG, we considered as significant all eCpG-Gene variants with a p-value smaller than nominal p-value.”
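
  The building blocks of the procedure quoted above can be sketched in a few lines. This is a simplified illustration, not the authors' or FastQTL's implementation: it assumes the first beta shape parameter is fixed at 1 (exact for the minimum of independent uniform p-values), so that the fit, the CDF and its inverse all have closed forms, whereas a full implementation would estimate both shape parameters. All function names are hypothetical.

```python
import math
import random


def fit_beta_1b(min_pvals):
    """Maximum-likelihood b for Beta(1, b): b = -m / sum(log(1 - x))."""
    return -len(min_pvals) / sum(math.log1p(-p) for p in min_pvals)


def empirical_pvalue(p_min_obs, b):
    """CDF of Beta(1, b) at the observed minimum p-value: 1 - (1 - p)^b."""
    return -math.expm1(b * math.log1p(-p_min_obs))


def nominal_threshold(p_emp_threshold, b):
    """Inverse CDF of Beta(1, b): nominal p-value matching an empirical one."""
    return -math.expm1(math.log1p(-p_emp_threshold) / b)


def benjamini_hochberg(pvals, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Toy CpG paired with 20 genes: simulate the minimum p-value of each of
# 100 permutations as the minimum of 20 independent uniform draws.
rng = random.Random(1)
n_genes, n_perm = 20, 100
perm_min = [min(rng.random() for _ in range(n_genes)) for _ in range(n_perm)]
b_hat = fit_beta_1b(perm_min)          # expected to be near n_genes
p_emp = empirical_pvalue(1e-4, b_hat)  # empirical p for an observed min of 1e-4
print(b_hat, p_emp)
```

  In a full pipeline, `benjamini_hochberg` would be applied to the empirical p-values of all CpGs, the largest rejected empirical p-value would serve as the genome-wide threshold, and `nominal_threshold` would convert it into a per-eCpG nominal cut-off.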
  
  References:
  
  GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science. 2020 Sep 11;369(6509):1318-1330. doi: 10.1126/science.aaz1776.
  
  Reviewer #2 (Public Review):
  
  Strength:
  
  Comprehensive analysis. Considering genetic factors such as meQTL and comparing results with adult data are interesting.
  
  We thank the reviewer for his/her positive feedback on the manuscript. We agree that the analysis of genetic data and the comparison with eQTMs described in adults are two important points of the study.
  
  Weakness:
  
  Manuscript is not summarized well. Please send less important findings to supplementary materials. The manuscript is not well written, which includes every little detail in the text, resulting in 86 pages of the manuscript.
  
  Following reviewers’ comments, we have simplified the manuscript. Now only the eQTMs identified in the model adjusted for cellular composition are reported. In addition, functional enrichment analyses have been simplified without reporting all odds ratios (OR) and p-values, which can be seen in the Figures.
  
  Any possible reason that the eQTM methylation probes are enriched in weak transcription regions? This is surprising.
  
  Bonder et al also found that blood eQTMs were slightly enriched for weak transcription regions (TxWk). Weak transcription regions are highly constitutive and found across many different cell types (Roadmap Epigenetics Consortium, 2015). However, hematopoietic stem cells and immune cells have lower representation of TxWk and other active states, which may be related to their capacity to generate sub-lineages and enter quiescence.
  
  Given that we analyzed whole blood and that ROADMAP chromatin states are only available for blood-specific cell types, each CpG in the array was annotated to one or several chromatin states by taking a state as present in that locus if it was described in at least 1 of the 27 blood-related cell types. By applying this strategy we may be “over-representing” TxWk chromatin states if TxWk states are cell-type specific. As a result, even if each blood cell type might have few TxWk, many positions can be TxWk in at least one cell type, inflating the number of CpGs considered as TxWk. This might have affected some of the enrichments.
  
  On the other hand, CpG probe reliability depends on methylation levels and variance. TxWk regions show high methylation levels, which tend to be measured with more error. This also might have impacted the results; however, the analysis considering only reliable probes (ICC >0.4) showed similar enrichment for TxWk.
  
  Besides these, we do not have a clear answer for the question raised by the reviewer.
  
  References:
  
  Bonder MJ, Luijk R, Zhernakova D V, Moed M, Deelen P, Vermaat M, et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat Genet [Internet]. 2017 [cited 2017 Nov 2];49:131–8. Available from: http://www.ncbi.nlm.nih.gov/pubmed/27918535
  
  Roadmap Epigenomics Consortium, Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, Heravi-Moussavi A, Kheradpour P, Zhang Z, Wang J, Ziller MJ, Amin V, Whitaker JW, Schultz MD, Ward LD, Sarkar A, Quon G, Sandstrom RS, Eaton ML, Wu YC, Pfenning AR, Wang X, Claussnitzer M, Liu Y, Coarfa C, Harris RA, Shoresh N, Epstein CB, Gjoneska E, Leung D, Xie W, Hawkins RD, Lister R, Hong C, Gascard P, Mungall AJ, Moore R, Chuah E, Tam A, Canfield TK, Hansen RS, Kaul R, Sabo PJ, Bansal MS, Carles A, Dixon JR, Farh KH, Feizi S, Karlic R, Kim AR, Kulkarni A, Li D, Lowdon R, Elliott G, Mercer TR, Neph SJ, Onuchic V, Polak P, Rajagopal N, Ray P, Sallari RC, Siebenthall KT, Sinnott-Armstrong NA, Stevens M, Thurman RE, Wu J, Zhang B, Zhou X, Beaudet AE, Boyer LA, De Jager PL, Farnham PJ, Fisher SJ, Haussler D, Jones SJ, Li W, Marra MA, McManus MT, Sunyaev S, Thomson JA, Tlsty TD, Tsai LH, Wang W, Waterland RA, Zhang MQ, Chadwick LH, Bernstein BE, Costello JF, Ecker JR, Hirst M, Meissner A, Milosavljevic A, Ren B, Stamatoyannopoulos JA, Wang T, Kellis M. Integrative analysis of 111 reference human epigenomes. Nature. 2015 Feb 19;518(7539):317-30. doi: 10.1038/nature14248. PMID: 25693563; PMCID: PMC4530010.
  
  The result that the magnitude of the effect was independent of the distance between the CpG and the TC TSS is surprising. Could you draw a figure where x-axis is the distance between the CpG site and TC TSS and y-axis is p-value?
  
  As suggested by the reviewer, we have taken a more detailed look at the relationship between the effect size and the distance between the CpG and the TC’s TSS. First, we confirmed that the relative orientation (upstream or downstream) did not affect the strength of the association (p-value=0.68). Second, we applied a linear regression between the absolute log2 fold change and the log10 of the distance (in absolute value), finding that they were inversely related. We have updated the manuscript with this information (page 22, line 504):
  
  “We observed an inverse linear association between the eCpG-eGene’s TSS distance and the effect size (p-value = 7.75e-9, Figure 2B); while we did not observe significant differences in effect size due to the relative orientation of the eCpG (upstream or downstream) with respect to the eGene’s TSS (p-value = 0.68).”
  
  Results are shown in Figure 2B. Of note, we winsorized effect size values in order to improve the visualization. The winsorizing process is also explained in Figure 2 legend. Moreover, we have done the plot suggested by the reviewer (see below). It shows that associations with smallest p-values are found close to the TC’s TSS. Nonetheless, as this pattern is also observed for the effect sizes, we have decided to not include it in the manuscript.
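  
  The regression described in this response can be sketched as follows. This is a minimal illustration, not the authors' code: it fits ordinary least squares of the absolute log2 fold change on the log10 of the absolute CpG-to-TSS distance. The function names and the toy values are hypothetical, and distances of exactly zero are assumed to be absent, since their logarithm is undefined.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def effect_vs_distance(log2_fold_changes, tss_distances_bp):
    """Regress |log2 fold change| on log10(|distance to the TSS|)."""
    xs = [math.log10(abs(d)) for d in tss_distances_bp]  # orientation is dropped
    ys = [abs(fc) for fc in log2_fold_changes]
    return fit_line(xs, ys)


# Hypothetical toy values, chosen only to show an inverse relationship.
toy_fold_changes = [-0.8, 0.6, -0.4, 0.2]
toy_distances = [-100, 1000, -10000, 100000]  # negative = upstream (assumed convention)
slope, intercept = effect_vs_distance(toy_fold_changes, toy_distances)
```

  A negative slope corresponds to the inverse association reported above: effect sizes shrink as the CpG lies further from the TSS.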
  
  I am concerned that there are too many significant eQTMs: almost half of the genes are associated with methylation. I wonder whether false positives are well controlled using the empirical p-values. Empirical p-values from permutations may mislead, especially since you only use 100 permutations. I wonder whether the result would be similar if you compared it with the traditional approach, either adjusting p-values using p-values from the entire set of TCs or adjusting p-values using a gene-based method as commonly used in GWAS. Please compare your previous result with my suggestion for the first analysis.
  
  Although the number of genes (TCs) whose expression is associated with DNA methylation is quite high, we do not think this is due to inadequate control of false positives. Our approach is based on the method used by GTEx (GTEx consortium) and implemented in the FastQTL package (Ongen et al. 2016) to control for false positives in eQTL discovery. As in GTEx, we ran 100 permutations to estimate the parameters of a beta distribution, which we used to model the distribution of p-values for each CpG. Then, to correct for the number of TCs among significant CpGs, we applied a False Discovery Rate (FDR) threshold of < 0.05. Finally, we defined the final set of significant eQTMs using the beta distribution defined in a previous step.
  
  For illustration, we compared the number of eQTMs with our approach to what we would obtain by uniquely applying the FDR method (adjusted p-value <0.05), getting fewer associations with our approach: eQTMs (45,203 with FDR vs 39,749 with our approach), eCpGs (24,611 vs 21,966) and eGenes (9,937 vs 8,886). Among the 8,886 significant eGenes, 6,288 of them are annotated to coding genes, thus representing 27% of the 23,054 eGenes coding for a gene included in the array.
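  
  The FDR-only comparison mentioned above can be illustrated with a short sketch. The response does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption. The sketch also does not reproduce the permutation-based beta approximation implemented in FastQTL. The function name and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p-values for four CpG-TC pairs.
toy_p_values = [0.01, 0.04, 0.03, 0.20]
toy_adjusted = benjamini_hochberg(toy_p_values)
toy_significant = [q < 0.05 for q in toy_adjusted]
```

  In this toy input only the first pair remains below the 0.05 threshold after adjustment, which shows how a set of nominally small p-values can shrink once multiplicity is accounted for.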
  
  References:
  
  GTEx consortium, The GTEx Consortium atlas of genetic regulatory effects across human tissues, Science (2020) Sep 11;369(6509):1318-1330. doi: 10.1126/science.aaz1776.
  
  Ongen et al. Fast and efficient QTL mapper for thousands of molecular phenotypes, Bioinformatics (2016) May 15;32(10):1479-85. doi: 10.1093/bioinformatics/btv722. Epub 2015 Dec 26.
  
  I recommend starting with cell type specific results. Without adjusting cell type, the result doesn't make sense.
  
  As suggested by other reviewers, we have withdrawn the model unadjusted for cellular composition.
  
  Reviewer #3 (Public Review):
  
  Although several DNA methylation-gene expression studies have been carried out in adults, this is the first in children. The importance of this is underlined by the finding that surprisingly few associations are observed in both adults and children. This is a timely study and certain to be important for the interpretation of future omic studies in blood samples obtained from children.
  
  We agree with the reviewer that eQTMs in children are important for interpreting EWAS findings conducted in child cohorts such as those of the Pregnancy And Childhood Epigenetics (PACE) consortium.
  
  It is unfortunate that the authors chose to base their reporting on associations unadjusted for cell count heterogeneity. They incorrectly claim that associations linked to cell count variation are likely to be cell-type-specific. While possible, it is probably more likely that the association exists entirely due to cell type differences (which tend to be large) with little or no association within any of the cell types (which tend to be much smaller). In the interests of interpretability, it would be better to report only associations obtained after adjusting for cell count variation.
  
  Following reviewers’ recommendations, we have reconsidered our initial hypothesis about the role of cellular composition in the association between methylation and gene expression. Although we still think that some of the eQTMs only found in the model unadjusted for cellular composition could represent cell specific effects, we acknowledge that the majority might be confounded by the extensive gene expression and DNA methylation differences between cell types. Also, we recognize that more sophisticated statistical tests should be applied to prove our hypothesis. Because of this we have decided to report the eQTMs of the model adjusted for cellular composition in the main manuscript and keep the results of the model unadjusted for cellular composition only in the online catalogue.
  
  Several enrichments could be related to variation in probe quality across the DNA methylation arrays.
  
  For example, enrichment for eQTM CpG sites among those that change with age could simply be due to the fact age and eQTM effects are more likely to be observed for CpG sites with high quality probes than low quality probes. It is more informative to instead ask if eQTM CpG sites are more likely to have increasing rather than decreasing methylation with age. This avoids the probe quality bias since probes with positive associations with age would be expected to have roughly the same quality as those with negative associations with age. There are several other analyses prone to the probe quality bias.
  
  See answer to question 2, below.
  
URL

biorxiv.org/content/10.1101/2020.11.05.368076v1
www.biorxiv.org www.biorxiv.org

Finger somatotopy is preserved after tetraplegia but deteriorates over time

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  This work provides insight into the effects of tetraplegia on the cortical representation of the body in S1. By using fMRI and an attempted finger movement task, the researchers were able to show preserved fine-grained digit maps - even in patients without sensory and motor hand function as well as no spared spinal tissue bridges. The authors also explored whether certain clinical and behavioral determinants may contribute to preserving S1 somatotopy after spinal cord injury.
  
  Overall I found the manuscript to be well-written, the study to be interesting, and the analysis reasonable. I do, however, think the manuscript would benefit by considering and addressing two main suggestions.
  
  1) Provide additional context / rationale for some of the methods. Specific examples below:
  
  a) The rationale behind using the RSA analysis seemed to be predicated on the notion that the signals elicited via a phase-encoded design can only yield information about each voxel's preferred digit and little-to-no information about the degree of digit overlap (see lines 163-166 and 571-575). While this is the case for conventional analyses of these signals, there are more recently developed approaches that are now capable of estimating the degree of somatotopic overlap from phase-encoded data (see: Da Rocha Amaral et al., 2020; Puckett et al., 2020). Although I personally would be interested in seeing one of these types of analyses run on this data, I do not think it is necessary given the RSA data / analysis. Rather, I merely think it is important to add some context so that the reader is not misled into believing that there is no way to estimate this type of information from phase-encoded signals. - Da Rocha Amaral S, Sanchez Panchuelo RM, Francis S (2020) A Data-Driven Multi-scale Technique for fMRI Mapping of the Human Somatosensory Cortex. Brain Topogr 33 (1):22-36. doi:10.1007/s10548-019-00728-6 - Puckett AM, Bollmann S, Junday K, Barth M, Cunnington R (2020) Bayesian population receptive field modeling in human somatosensory cortex. Neuroimage 208:116465. doi:10.1016/j.neuroimage.2019.116465
  
  We did not intend to give the impression that inter-finger overlap can only be estimated using RSA. To clarify this, we included a sentence in our methods section stating that inter-finger overlap cannot be estimated using the traditional travelling wave approach, but that new methods have estimated somatotopic overlap from travelling wave data. Since our RSA approach lends itself to estimating inter-finger overlap and is currently the gold standard in characterizing these representational patterns, we opt –in accordance with the reviewer’s comment– not to include this additional analysis.
  
  Revised text Methods:
  
  “While the traditional traveling wave approach is powerful to uncover the somatotopic finger arrangement, a fuller description of hand representation can be obtained by taking into account the entire fine-grained activity pattern of all fingers. RSA-based inter-finger overlap patterns have been shown to depict the invariant representational structure of fingers better than the size, shape, and exact location of the areas activated by finger movements (Ejaz et al., 2015). RSA-based measures are furthermore not prone to some of the problems of measurements of finger selectivity (e.g., dependence on map thresholds). The most common approach for investigating inter-finger overlap is RSA, as used here, though note that somatotopic overlap has recently been estimated from travelling wave data using an iterated Multigrid Priors (iMGP) method and population receptive field modelling (Da Rocha Amaral et al., 2020; Puckett et al., 2020).”
  
  b) The rationale for using minimally thresholded (Z>2) data for the Dice overlap analysis as opposed to the threshold used in data visualization (q<0.05) was unclear. Providing the minimally thresholded maps (in Supplementary) would also aid interpretation of the Dice overlap results.
  
  We followed previously published procedures for calculating the Dice overlap between the two split-halves of the data (Kikkert et al., 2016; J. Kolasinski et al., 2016; Sanders et al., 2019). We used minimally thresholded data to calculate the Dice overlap to ensure that our analysis was sensitive to overlaps that would be missed when using high thresholds. We clarified this in the revised manuscript. We thank the reviewer for their suggestion to add a Figure displaying the minimally thresholded split-half hard-edged finger maps - we have added this to the revised manuscript as Figure 2-Figure supplement 1.
  
  To ensure that our thresholding procedure did not change the results of the Dice overlap analysis, we repeated this analysis using split-half maps that were thresholded using a q < 0.05 FDR criterion (as was used to create the travelling wave maps in Figures 2A-B). We found the same results as when using the Z > 2 thresholding criterion: Overall, split-half consistency was not significantly different between patients and controls, as tested using a robust mixed ANOVA (F(1,17.69) = 0.08, p = 0.79). There was a significant difference in split-half consistency between pairs of same, neighbouring, and non-neighbouring fingers (F(2,14.77) = 38.80, p < 0.001). This neighbourhood relationship was not significantly different between the control and patient groups (i.e., there was no significant interaction; F(2,14.77) = 0.12, p = 0.89). We have included this analysis and the relating figure as Figure 2-Figure supplement 2 in the revised manuscript.
  
  Revised text Methods:
  
  “We followed previously described procedures for calculating the DOC between two halves of the travelling wave data (Kikkert et al., 2016; Kolasinski et al., 2016; Sanders et al., 2019). The averaged finger-specific maps of the first forward and backward runs formed the first data half. The averaged finger-specific maps of the second forward and backward runs formed the second data half. The finger-specific clusters were minimally thresholded (Z>2) on the cortical surface and masked using an S1 ROI, created based on Brodmann area parcellation using Freesurfer (see Figure 2– figure supplement 1 for a visualisation of the minimally thresholded split-half hard-edged finger maps used to calculate the DOC). We used minimally thresholded finger-specific clusters for the DOC analysis to ensure we were sensitive to overlaps that would be missed when using high thresholds. Note that results were unchanged when thresholding the finger-specific clusters using an FDR q < 0.05 criterion (see Figure 2 – figure supplement 2).”
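  
  The split-half Dice overlap described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' pipeline: each data half is represented as a flat list of Z values over the same hypothetical set of surface vertices, the S1 mask is assumed to have been applied already, and the function names are invented for the example.

```python
def threshold_map(z_values, z_min=2.0):
    """Binarise a Z map: True where the value exceeds the minimal threshold."""
    return [z > z_min for z in z_values]


def dice_overlap(mask_a, mask_b):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two equal-length masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same vertices")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    both = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 0.0  # convention chosen here for two empty maps
    return 2 * both / (size_a + size_b)


# Hypothetical Z values for one finger in the two data halves.
half_one = [2.5, 3.1, 0.4, 2.2, 1.0]
half_two = [2.1, 1.5, 0.2, 2.9, 2.4]
doc = dice_overlap(threshold_map(half_one), threshold_map(half_two))
```

  Lowering `z_min` enlarges both masks, which is why a minimal threshold keeps the measure sensitive to overlaps that a stricter threshold would discard.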
  
  2) Provide a more thorough discussion - particularly with respect to the possible role of top-down processes (e.g., attention).
  
  a) The authors discuss a few potential signal sources that may contribute to the maintenance of (and ability to measure) the somatotopic maps; however, the overall interpretation seems a bit "motor efferent heavy". That is, it seems the authors favor an explanation that the activity patterns measured in S1 were elicited by efference copies from the motor system and that occasional corollary discharges or attempted motor movements play a role in their maintenance over time. The authors consider other explanations, noting - for example - the potential role of attention in preserving the somatotopic representations given that attention has been shown to be able to activate S1 hand representations. The mention of this was, however, rather brief - and I believe the issue deserves a bit more of a balanced consideration.
  
  When the authors consider the possible role of attention in maintaining the somatotopic representations (lines 329-333), they mention that observing others' fingers being touched or attending to others' finger movements may contribute. But there is no mention of attending to one's own fingers (which has been shown to elicit activity as cited). I realize that the patients lack sensorimotor function (and hence may find it difficult to "attend" to their fingers); however, they have all had prior experience with their fingers and therefore might still be able to attend to them (or at least the idea of their digits) such that activity is elicited. For example, it is not clear to me that it would be any more difficult for the patients to be asked to attend to their digits compared to being asked to attempt to move their digits. I would even suggest that attempting to move a digit (regardless of whether you can or not) requires that one attends to the digit before attempting to initiate the movement as well as throughout the attempted motor movement. Because of this, it seems possible that attention-related processes could be playing a role in or even driving the signals measured during the attempted movement task - as well as those involved in the ongoing maintenance of the maps after injury. I don't think this possibility can be dismissed given the data in hand, but perhaps the issue could be addressed by a bit more thorough of a discussion on the process of "attempting to move" a digit (even one that does not move) - and the various top-down processes that might be involved.
  
  We thank the reviewer for their consideration and insights into the potential mechanisms underlying our results. We have now elaborated further on the possibility that attention-related processes might have contributed to the reported effects, also in consideration of comment 3.4.
  
  Revised text Discussion:
  
  “Spared spinal cord tissue bridges can be found in most patients with a clinically incomplete injury, their width being predictive of electrophysiological information flow, recovery of sensorimotor function, and neuropathic pain (Huber et al., 2017; Pfyffer et al., 2021, 2019; Vallotton et al., 2019). However, in this study, spared midsagittal spinal tissue bridges at the lesion level, motor function, and sensory function did not seem necessary to maintain and activate a somatotopic hand representation in S1. We found a highly typical hand representation in two patients (S01 and S03) who did not have any spared spinal tissue bridges at the lesion level, a complete (S01) or near complete (S03) hand paralysis, and a complete (S01) or near complete loss (S03) of hand sensory function. Our predictive modelling results were in line with this notion and showed that these behavioural and structural spinal cord determinants were not predictive of hand representation typicality. Note however that our sample size was limited, and it is challenging to draw definite conclusions from non-significant predictive modelling results.”
  
  “How may these representations be preserved over time and activated through attempted movements in the absence of peripheral information? S1 is reciprocally connected with various brain areas, e.g., M1, lateral parietal cortex, posterior parietal area 5, secondary somatosensory cortex, and supplementary motor cortex (Delhaye et al., 2019). After loss of sensory inputs and paralysis through SCI, S1 representations may be activated and preserved through its interconnections with these areas. First, it is possible that cortico-cortical efference copies may keep a representation ‘alive’ through occasional corollary discharge (London and Miller, 2013). While motor and sensory signals no longer pass through the spinal cord in the absence of spinal tissue bridges, S1 and M1 remain intact. When a motor command is initiated (e.g., in the form of an attempted hand movement) an efference copy is thought to be sent to S1 in the form of corollary discharge. This corollary discharge resembles the expected somatosensory feedback activity pattern and may drive somatotopic S1 activity even in the absence of ascending afferent signals from the hand (Adams et al., 2013; London and Miller, 2013). It is possible that our patients occasionally performed attempted movements which would result in corollary discharge in S1. Second, it is likely that attempting individual finger movements poses high attentional demands on tetraplegic patients. Accordingly, attentional processes might have contributed to eliciting somatotopic S1 activity. Evidence for this account comes from studies showing that it is possible to activate somatotopic S1 hand representations through attending to individual fingers (Puckett et al., 2017) or through touch observation (Kuehn et al., 2018). Attending to fingers during our attempted finger movement task may have been sufficient to elicit somatotopic S1 activity through top-down processes in the tetraplegic patients who lacked hand motor and sensory function. 
Furthermore, one might speculate that observing others’ or one’s own fingers being touched or directing attention to others’ hand movements or one’s own fingers may help preserve somatotopic representations. Third, it is possible that these somatotopic maps are relatively hardwired and while they deteriorate over time, they never fully disappear. Indeed, somatotopic mapping of a sensory deprived body part has been shown to be resilient after dystonia (Ejaz et al., 2016; though see Burman et al., (2009) and Taub et al., (1998)) and arm amputation (Bruurmijn et al., 2017; Kikkert et al., 2016; Wesselink et al., 2019). Fourth, it is possible that even though a patient is clinically assessed to be complete and is unable to perceive sensory stimuli on the deprived body part, there is still some ascending information flow that contributes to preserving somatotopy (Wrigley et al., 2018). A recent study found that although complete paraplegic SCI patients were unable to perceive a brushing stimulus on their toe, 48% of patients activated the location appropriate S1 area (Wrigley et al., 2018). However, the authors of this study defined the completeness of patients’ injuries via behavioural testing, while we additionally assessed the retained connections passing through the SCI directly via quantification of spared spinal tissue bridges through structural MRI. It is unlikely that spinal tissue carrying somatotopically organised information would be missed by our assessment (Huber et al., 2017; Pfyffer et al., 2019). Our experiment did not allow us to tease apart these potential processes and it is likely that various processes simultaneously influence the preservation of S1 somatotopy and elicited the observed somatotopic S1 activity.”
  
  Reviewer #2:
  
  The authors investigate SCI patients and characterize the topographic representation of the hand in sensorimotor cortex when asked to move their hand (which controls could do but patients could not). The authors compare some parameters of topographic map organization and conclude that they do not differ between patients and controls, whereas they find changes in the typicality of the maps that decrease with years since disease onset in patients. Whereas these initial analyses are interesting, they are not clearly related to a mechanistic model of the disorder and the underlying pathophysiology that is expected in the patients. Furthermore, additional analyses on more fine-grained map changes are needed to support the authors' claims. Finally, the major result of changed typicality in the patients is in my view not valid.
  
  Concept 1. At present, there are no clear hypotheses about the (expected or hypothesized) mechanistic changes of the sensorimotor maps in the patients. The authors refer to "altered" maps and repeatedly say that "results are mixed" (3 times in the introduction).
  
  We thank the reviewer for highlighting to us that our introduction and hypotheses were unclear and/or incomplete to them. We have restructured our Introduction to better highlight competing hypotheses on how SCI may change S1 hand representations, the reasons for our analytical approach, and elaborate on our hypotheses.
  
  Revised text Introduction:
  
  “Research in non-human primate models of chronic and complete cervical SCI has shown that the S1 hand area becomes largely unresponsive to tactile hand stimulation after the injury (Jain et al., 2008; Kambi et al., 2014; Liao et al., 2021). The surviving finger-related activity became disorganised such that a few somatotopically appropriate sites but also other somatotopically nonmatched sites were activated (Liao et al., 2021). Seminal nonhuman primate research has further demonstrated that SCI leads to extensive cortical reorganisation in S1, such that tactile stimulation of cortically adjacent body parts (e.g., of the face) activated the deprived brain territory (e.g., of the hand; Halder et al., 2018; Jain et al., 2008; Kambi et al., 2014). Although the physiological hand representation appears to largely be altered following a chronic cervical SCI in non-human primates, the anatomical isomorphs of individual fingers are unchanged (Jain et al., 1998). This suggests that while a hand representation can no longer be activated through tactile stimulation after the loss of afferent spinal pathways, a latent and somatotopic hand representation could be preserved regardless of large-scale physiological reorganisation.
  
  A similar pattern of results has been reported for human SCI patients. Transcranial magnetic stimulation (TMS) studies induced current in localised areas of SCI patients’ M1 to induce a peripheral muscle response. They found that representations of more impaired muscles retract or are absent while representations of less impaired muscles shift and expand (Fassett et al., 2018; Freund et al., 2011a; Levy et al., 1990; Streletz et al., 1995; Topka et al., 1991; Urbin et al., 2019). Similarly, human fMRI studies have shown that cortically neighbouring body part representations can shift towards, though do not invade, the deprived M1 and S1 cortex (Freund et al., 2011b; Henderson et al., 2011; Jutzeler et al., 2015; Wrigley et al., 2018, 2009). Other human fMRI studies hint at the possibility of latent somatotopic hand representations following SCI by showing that attempted movements with the paralysed and sensory deprived body part can still evoke signals in the sensorimotor system (Cramer et al., 2005; Freund et al., 2011b; Kokotilo et al., 2009; Solstrand Dahlberg et al., 2018). This attempted ‘net’ movement activity was, however, shown to substantially differ from healthy controls: Activity levels have been shown to be increased (Freund et al., 2011b; Kokotilo et al., 2009; Solstrand Dahlberg et al., 2018) or decreased (Hotz-Boendermaker et al., 2008), volumes of activation have been shown to be reduced (Cramer et al., 2005; Hotz-Boendermaker et al., 2008), activation was found in somatotopically nonmatched cortical sites (Freund et al., 2011b), and activation was poorly modulated when patients switched from attempted to imagined movements (Cramer et al., 2005). These observations have therefore mostly been attributed to abnormal and/or disorganised processing induced by the SCI. 
It remains possible though that, despite certain aspects of sensorimotor activity being altered after SCI, somatotopically typical representations of the paralysed and sensory deprived body parts can be preserved (e.g., finger somatotopy of affected hand). Such preserved representations have the potential to be exploited in a functionally meaningful manner (e.g., via neuroprosthetics).
  
  Case studies using intracortical stimulation in the S1 hand area to elicit finger sensations in SCI patients hint at such preserved somatotopic representations (Fifer et al., 2020; Flesher et al., 2016), with one exception (Armenta Salas et al., 2018). Negative results were suggested to be due to a loss of hand somatotopy and/or reorganisation in S1 of the implanted SCI patient or due to potential misplacement of the implant (Armenta Salas et al., 2018). Whether fine-grained somatotopy is generally preserved in the tetraplegic patient population remains unknown. It is also unclear what clinical, behavioural, and structural spinal cord determinants may influence such representations to be maintained. Here we used functional MRI (fMRI) and a visually cued (attempted) finger movement task in tetraplegic patients to examine whether hand somatotopy is preserved following a disconnection between the brain and the periphery. We instructed patients to perform the fMRI tasks with their most impaired upper limb and matched controls’ tested hands to patients’ tested hands. If a patient was unable to make overt finger movements due to their injury, then we carefully instructed them to make attempted (i.e., not imagined) finger movements. To see whether patients’ maps exhibited characteristics of somatotopy, we visualised finger selectivity in S1 using a travelling wave approach. To investigate whether fine-grained hand somatotopy was preserved and could be activated in S1 following SCI, we assessed inter-finger representational distance patterns using representational similarity analysis (RSA). These inter-finger distance patterns are thought to be shaped by daily life experience such that fingers used more frequently together in daily life have lower representational distances (Ejaz et al., 2015). 
RSA-based inter-finger distance patterns have been shown to depict the invariant representational structure of fingers in S1 and M1 better than the size, shape, and exact location of the areas activated by finger movements (Ejaz et al., 2015). Over the past years RSA has therefore regularly been used to investigate somatotopy of finger representations both in healthy (e.g., Akselrod et al., 2017; Ariani et al., 2020; Ejaz et al., 2015; Gooijers et al., 2021; Kieliba et al., 2021; Kolasinski et al., 2016; Liu et al., 2021; Sanders et al., 2019) and patient populations (e.g., Dempsey-Jones et al., 2019; Ejaz et al., 2016; Kikkert et al., 2016; Wesselink et al., 2019). We closely followed procedures that have previously been used to map preserved and typical somatotopic finger selectivity and inter-finger representational distance patterns of amputees’ missing hands in S1 using volitional phantom finger movements (Kikkert et al., 2016; Wesselink et al., 2019). However, in amputees, these movements generally recruit the residual arm muscles that used to control the missing limb via intact connections between the brain and spinal cord. Whether similar preserved somatotopic mapping can be observed in SCI patients with diminished or no connections between the brain and the periphery is unclear. If finger somatotopy is preserved in tetraplegic patients, then we should find typical inter-finger representational distance patterns in the S1 hand area of these patients. By measuring a group of fourteen chronic tetraplegic patients with varying amounts of spared spinal cord tissue at the lesion level (quantified by means of midsagittal tissue bridges based on sagittal T2w scans), we uniquely assessed whether preserved connections between the brain and periphery are necessary to preserve fine somatotopic mapping in S1 (Huber et al., 2017; Pfyffer et al., 2019). 
If spared connections between the periphery and the brain are not necessary for preserving hand somatotopy, then we would find typical inter-finger representational distance patterns even in patients without spared spinal tissue bridges. We also investigated what clinical and behavioural determinants may contribute to preserving S1 hand somatotopy after chronic SCI. If spared sensorimotor hand function is not necessary for preserving hand somatotopy, then we would find typical inter-finger representational distance patterns even in patients who suffer from full sensory loss and paralysis of the hand(s).”
  
  They do not in detail report which results actually have been reported before, which is a major problem, because those prior results should have motivated the analyses the authors conducted. For instance, two of the cited studies found that in SCI patients, only ONE FINGER shifted towards the malfunctioning area (i.e., the small finger) whereas all other fingers were the same. However, the authors do NOT perform single finger analyses but always average their results ACROSS fingers. This is even true in spite of some patients indeed showing MISSING FINGERS as is clearly evident in the figure, and in spite of the clearly reduced distance of the thumb in the patients as is also visible in another figure. Nothing of this is seen in the results, because the ANOVA and analyses never have the factor of "finger". Instead, the authors always average the analyses across fingers. The conclusion that the maps do not differ is therefore not justified at present. This severely reduces any conclusions that can be drawn from the data at present.
  
  We apologise for the lack of clarity. We have now added additional detail regarding studies showing altered sensorimotor processing following SCI. We also clarified that we based our analysis steps on previous studies investigating hand somatotopy following deafferentation (i.e., following arm amputation; Kikkert et al., 2016; Wesselink et al., 2019) and somatotopic reorganisation. RSA-based inter-finger distance patterns have been shown to depict the invariant representational structure of fingers in S1 and M1 better than the size, shape, and exact location of the areas activated by finger movements (Ejaz et al., 2015). Over the past years RSA has therefore regularly been used to investigate somatotopy of finger representations both in healthy (e.g., Akselrod et al., 2017; Ariani et al., 2020; Ejaz et al., 2015; Gooijers et al., 2021; Kieliba et al., 2021; Kolasinski et al., 2016; Liu et al., 2021; Sanders et al., 2019) and patient populations (e.g., Dempsey-Jones et al., 2019; Ejaz et al., 2016; Kikkert et al., 2016; Wesselink et al., 2019). It is believed to be the most appropriate measure to reliably detect subtle changes in somatotopy. We adjusted the text in our revised Introduction section to better highlight this.
  
  Please note that we do not average across fingers in our RSA typicality procedure. Instead, RSA considers how the (attempted) movement with one finger changes the activity pattern across the whole hand representation. Note that somatotopic reorganisation will change the inter-finger distance measured with this method as previously shown (Kieliba et al., 2021; Kolasinski et al., 2016; Wesselink et al., 2019).
  
  Still, as per the reviewer’s suggestion, we conducted a robust mixed ANOVA on the RSA distance measures with a within-subjects factor for finger pair (10 levels) and a between-subjects factor for group (2 levels: controls and SCI patients). We did not find a significant group effect (F(1,21.66) = 1.50, p = 0.23). There was a significant difference in distance between finger pairs (F(9,15.38) = 27.22, p < 0.001), but this was not significantly different between groups (i.e., no significant finger pair by group interaction; F(9,15.38) = 1.05, p = 0.45). When testing for group differences per finger pair, the BF only revealed inconclusive evidence (BF > 0.37 and < 1.11; note that we could not run a Bayesian ANOVA due to normality violations). We have added this analysis to the revised manuscript.
  
  Lastly, we would like to highlight that our argument is that the finger maps can be preserved in the absence of sensory and motor function, but over time they deteriorate and become less somatotopic. As such, we do not aim to state that they are unchanged overall – but rather that they can be unchanged even despite loss of sensory and motor function. We have clarified this in our abstract and manuscript to avoid confusion.
  
  Revised abstract:
  
  “Previous studies showed reorganised and/or altered activity in the primary sensorimotor cortex after a spinal cord injury (SCI), suggested to reflect abnormal processing. However, little is known about whether somatotopically-specific representations can be preserved despite alterations in net activity. In this observational study we used functional MRI and an (attempted) finger movement task in tetraplegic patients to characterise the somatotopic hand layout in primary somatosensory cortex. We further used structural MRI to assess spared spinal tissue bridges. We found that somatotopic hand representations can be preserved in absence of sensory and motor hand functioning, and no spared spinal tissue bridges. Such preserved hand somatotopy could be exploited by rehabilitation approaches that aim to establish new hand-brain functional connections after SCI (e.g., neuroprosthetics). However, over years since SCI the hand representation somatotopy deteriorated, suggesting that somatotopic hand representations are more easily targeted within the first years after SCI.”
  
  Revised text Methods:
  
  “Second, we tested whether the inter-finger distances were different between controls and patients using a robust mixed ANOVA with a within-participants factor for finger pair (10 levels) and a between-participants factor for group (2 levels: controls and patients).”
  
  Revised text Results:
  
  “We then tested whether the inter-finger distances were different across finger pairs between controls and SCI patients using a robust mixed ANOVA with a within-participants factor for finger pair (10 levels) and a between-participants factor for group (2 levels: controls and patients). We did not find a significant difference in inter-finger distances between patients and controls (F(1,21.66) = 1.50, p = 0.23). The inter-finger distances were significantly different across finger pairs, as would be expected based on somatotopic mapping (F(9,15.38) = 27.22, p < 0.001). This pattern of inter-finger distances was not significantly different between groups (i.e., no significant finger pair by group interaction; F(9,15.38) = 1.05, p = 0.45). When testing for group differences per finger pair, the BF only revealed inconclusive evidence (BF > 0.37 and < 1.11; note that we could not run a Bayesian ANOVA due to normality violations).”
  
  Revised text Discussion:
  
  “In this study we investigated whether hand somatotopy is preserved and can be activated through attempted movements following tetraplegia. We tested a heterogenous group of SCI patients to examine what clinical, behavioural, and structural spinal cord determinants contribute to preserving S1 somatotopy. Our results revealed that detailed hand somatotopy can be preserved following tetraplegia, even in the absence of sensory and motor function and a lack of spared spinal tissue bridges. However, over time since SCI these finger maps deteriorated such that the hand somatotopy became less typical.”
  
  Concept 2: This also relates to the fact that the most prominent and consistent finding of prior studies was to show changes in map AMPLITUDE in the maps of patients. It is not clear to me how amplitude was measured here, because the text says "average BOLD activity". What should be reported are standard measures of signal amplitude both across the map area and for individual fingers.
  
  We apologise for the lack of clarity. “Average BOLD activity” represented the average z-standardised activity within the S1 hand ROI. To comply with the reviewer’s comment, we adjusted this to the percent signal change underneath the S1 hand ROI and report this instead in our revised manuscript and in revised Figure 3A and revised Figure 3-figure supplement 1. Note that results were unchanged.
  
  As per the reviewer’s suggestion, we further extracted the activity levels for individual fingers under finger-specific ROIs. To create finger-specific ROIs, probability finger maps were created based on the travelling wave data of the control group, thresholded at 25% (i.e., meaning that at least 5 out of 18 control participants needed to significantly activate a vertex for this vertex to be included in the ROI), and binarised. We then used the separately acquired blocked design data to extract the corresponding finger movement activity levels underlying these finger-specific ROIs per participant. Per ROI, we then compared the activity level between groups. After correction for multiple comparisons, there was no significant difference between groups for the thumb (U = 93, p = 0.37), index (t(30) = -0.003, p = 0.99), middle (t(30) = 1.11, p = 0.35), ring (t(30) = 2.02, p = 0.13), or little finger (t(30) = 2.14, p = 0.20). We have added this analysis to Appendix 1.
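  
  The thresholding step described above can be illustrated with a minimal sketch. It assumes each control participant contributes a binary map (1 where a vertex was significantly activated for a given finger); the function name `finger_roi` and the toy data are hypothetical and are not taken from the authors' pipeline. It also shows why a 25% threshold with 18 controls corresponds to at least 5 participants (25% of 18 is 4.5).

```python
import math


def finger_roi(binary_maps, threshold=0.25):
    """Binarise a group probability map.

    binary_maps: one list of 0/1 values per participant (1 = vertex
    significantly activated). A vertex enters the ROI when the proportion
    of participants activating it reaches the threshold.
    """
    n_participants = len(binary_maps)
    n_vertices = len(binary_maps[0])
    roi = []
    for v in range(n_vertices):
        proportion = sum(m[v] for m in binary_maps) / n_participants
        roi.append(1 if proportion >= threshold else 0)
    return roi


# 25% of 18 controls is 4.5, so at least 5 participants are required.
min_participants = math.ceil(0.25 * 18)

# Toy data (hypothetical): 18 participants, 3 vertices activated by
# 4, 5 and 18 participants respectively.
toy_maps = [[1 if p < 4 else 0, 1 if p < 5 else 0, 1] for p in range(18)]
toy_roi = finger_roi(toy_maps)
```

  In this toy example the vertex activated by only 4 of 18 participants (about 22%) is excluded, while the vertices activated by 5 or more are retained.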
  
  Note that lower or higher BOLD amplitude levels do not influence our typicality scores per se. Indeed, typical inter-finger representational patterns have been shown to persist even in ipsilateral M1 that exhibited a negative BOLD response during finger movements (Berlot et al., 2019). As long as the typical inter-finger relationships are preserved, brain areas that have low amplitudes of activity can have a typical somatotopic representation.
  
  Revised text in Methods:
  
  "The percent signal change for overall task-related activity was then extracted for voxels underlying this S1 hand ROI per participant. A similar analysis was used to investigate overall task-related activity in an M1 hand ROI (see Figure 3-figure supplement 1). We further compared activity levels in finger-specific ROIs in S1 between groups and conducted a geodesic distance analysis to assess whether the finger representations of the SCI patients were aligned differently and/or shifted compared to the control participants (see Appendix 1)."
  
  Revised text in Results:
  
  “Task-related activity was quantified by extracting the percent signal change for finger movement (across all fingers) versus baseline within the contralateral S1 hand ROI (see Figure 3A). Overall, all patients were able to engage their S1 hand area by moving individual fingers (t(13)=7.46, p < 0.001; BF10=4.28e+3), as did controls (t(17)=9.92, p < 0.001; BF10=7.40e+5). Furthermore, patients’ task-related activity was not significantly different from controls (t(30)=-0.82, p=0.42; BF10=0.44), with the BF showing anecdotal evidence in favour of the null hypothesis.”
  
  Revised Appendix 1:
  
  “Percent signal change in finger-specific clusters
  
  To assess whether finger movement activity levels were different between patients and controls, we created finger-specific ROIs and extracted the activity level of the corresponding finger movement for each participant. To create the finger-specific ROIs, the probability finger surface maps that were created from the travelling wave data of the control group (see main manuscript) were thresholded at 25% (i.e., meaning that at least 5 out of 18 control participants needed to significantly activate a vertex for this vertex to be included in the ROI), and binarised. We then used the separately acquired blocked design data to extract the finger movement activity levels underlying these finger-specific ROIs. We first flipped the contrast images resulting from each participant’s fixed effects analysis (i.e., that was run to average across the 4 blocked design runs) along the x-axis for the left-hand tested participants. Each participant’s contrast maps were then resampled to the Freesurfer 2D average atlas and the averaged z-standardised activity level was extracted for each finger movement vs rest contrast underlying the finger-specific ROIs. We compared the activity levels for each finger movement in the corresponding finger ROI (i.e., thumb movement activity in the thumb ROI, index finger movement activity in the index finger ROI, etc.) between groups. After correction for multiple comparisons, there was no significant difference between groups for the thumb (U = 93, p = 0.37), index (t(30) = -0.003, p = 0.99), middle (t(30) = 1.11, p = 0.35), ring (t(30) = 2.02, p = 0.13), or little finger (t(30) = 2.14, p = 0.20).”
  
  Appendix 1-Figure 1: Finger-specific activity levels in finger-specific regions of interest. A) Finger-specific ROIs were based on the control group’s binarised 25% probability travelling wave finger selectivity maps. B) Finger movement activity levels in the corresponding finger-specific ROIs. There were no significant differences in activity levels between the SCI patient and control groups. Controls are projected in grey; SCI patients are projected in orange. Error bars show the standard error of the mean. White arrows indicate the central sulcus. A = anterior; P = posterior.
  
  Concept 3: The authors present a hypothesis on the underlying mechanisms of SCI that does not seem to reflect prior data. The argument is that changes in map alignment relate to maladaptive changes and pain. However, the literature that the authors cite does not support this claim. In fact, Freund 2011 promotes the importance of map amplitude but not alignment, whereas other studies either show no relation of activation to pain, or they even show that map shift relates to LESS pain, i.e., the reverse argument than what the authors say. My impression is that the model that the authors present is mainly a model that is used for phantom pain but not for SCI. Taking this into consideration, the findings the authors present are not surprising anymore, because in fact none of these studies claimed that the affected area should be absent in SCI patients; these papers only say that the other body parts change in location or amplitude, which is something the authors did not measure. It is important to make this clear in the text.
  
  As the reviewer states, the literature is debated regarding the relationship between reorganisation and pain in SCI patients. We did not highlight this clearly enough. To improve clarity and focus our message we have therefore removed the sentence regarding reorganisation and pain from the Introduction of our revised manuscript. Also taking comment 2.1 and 2.2 into consideration, we have restructured our Introduction.
  
  We respectfully disagree with the reviewer that our results are not novel or surprising. Whether the full fine-grained hand somatotopy is preserved following a complete motor and sensory loss through tetraplegia has not been considered before. Furthermore, to our knowledge, there is no paper that has inspected the full somatotopic layout in a heterogenous sample of SCI patients and shown that over time since injury, hand somatotopy deteriorates. We indeed cannot make claims regarding the reorganisation in S1 with regards to neighbouring cortical areas activating the hand area, as we have now clarified further in the revised Discussion. We now also clarify in our Discussion that our result does not exclude the possibility of reorganisation occurring simultaneously and that this is a topic for further investigation. As described in the Discussion, it is very possible that reorganisation and preserved somatotopy could co-occur.
  
  Revised text Discussion:
  
  “We did not probe body parts other than the hand and could therefore not investigate whether any remapping of other (neighbouring and/or intact) body part representations towards or into the deprived S1 hand cortex may have taken place. Whether reorganisation and preservation of the original function can simultaneously take place within the same cortical area therefore remains a topic for further investigation. It is possible that reorganisation and preservation of the original function could co-occur within cortical areas. Indeed, non-human primate studies demonstrated that remapping observed in S1 actually reflects reorganisation in subcortical areas of the somatosensory pathway, principally the brainstem (Chand and Jain, 2015; Kambi et al., 2014). As such, the deprived S1 area receives reorganised somatosensory inputs upon tactile stimulation of neighbouring intact body parts. This would simultaneously allow the original S1 representation of the deprived body part to be preserved, as observed in our results when we directly probed the deprived S1 hand area through attempted finger movements.”
  
  Concept 4: There is yet another more general point on the concept and related hypotheses: Why do the authors assume that immediately after SCI the finger map should disappear? This seems to me the more unlikely hypothesis compared to what the data seem to suggest: preservation and deterioration over time. In my view, there is no biological model that would suggest that a finger map suddenly disappears after input loss. How should this deterioration be mediated? By cellular loss? As already stated above, the finding is therefore much less surprising than the authors argue.
  
  We did not expect that finger maps would disappear, especially given the case studies using S1 intracortical stimulation in SCI patients and the result of preserved somatotopy of the missing hand in amputees. We are not sure which part of the manuscript might have caused this misunderstanding.
  
  With regards to the reviewer’s comment that there are no models to suggest that finger maps would disappear: there is competing research on this as we now explain in our revised Introduction. Non-human primate research has shown that the S1 hand area becomes largely unresponsive to tactile hand stimulation after an SCI (Jain et al., 2008; Kambi et al., 2014; Liao et al., 2021). The surviving finger-related activity was shown to be disorganised such that a few somatotopically appropriate sites but also other somatotopically nonmatched sites were activated (Liao et al., 2021). These finger areas in S1 became responsive to touch on the face. Furthermore, TMS studies that induce current in localised areas of M1 to induce a peripheral muscle response in SCI patients have shown that representations of more impaired muscles retract or are absent (Fassett et al., 2018; Freund et al., 2011a; Levy et al., 1990; Streletz et al., 1995; Topka et al., 1991; Urbin et al., 2019). We do not believe that this indicates that the S1 hand somatotopy is lost, but rather that tactile inputs and motor outputs no longer pass the level of injury. Indeed, non-human primate work showing immutable myelin borders between finger representations in S1 post SCI suggests that a latent hand representation may be preserved. Further hints for such preserved somatotopy come from fMRI studies showing net sensorimotor activity during attempted movements with the paralysed body part, intracortical stimulation studies in SCI patients, and preserved somatotopic maps of the missing hand in amputees. We have restructured our Introduction accordingly, also taking into consideration comments 2.1, 2.2, and 2.4.
  
  Methods & Results. The authors refer to an analysis that they call "typicality" where they say that they assess how "typical" a finger map is. Given this is not a standard measure, I was wondering how the authors decided what a "typical" finger map is. In fact, there are a few papers published on this issue where the average location of each finger in a large number of subjects is detailed. Rather than referring to this literature, the authors use another dataset from another study of themselves that was conducted on n=8 individuals and using 7T MRI (note that in the present study, 3T MRI was used) to define what "typical" is. This approach is not valid. First, this "typical" dataset is not validated for being typical (i.e., it is not compared with standard atlases on hand and finger location), second, it was assessed using a different MRI field strength, third, there were too few subjects to say that this should be a typical dataset, fourth, the group differed from the patients in terms of age and gender (i.e., non-matched group), and fifth, the authors even say that the design was different ("was defined similarly", i.e., not the same). This approach is therefore in my view not valid, particularly given the authors measured age- and gender-matched controls that should be used to compare the maps with the patients. This is a critical point because changes in typicality are the main result of the paper.
  
  We respectfully disagree with the reviewer that the typicality measure is not standard, invalid, and inaccurate. RSA-based inter-finger overlap patterns have been shown to depict the invariant representational structure of fingers better than the size, shape, and exact location of the areas activated by finger movements (Ejaz et al., 2015). RSA-based inter-finger representation measures have been shown to have more within-subject stability (both within the same session and between sessions that were 6 months apart) and less inter-subject variability (Ejaz et al., 2015) than these other measures of somatotopy. RSA-based measures are furthermore not prone to some of the problems of measurements of finger selectivity (e.g., dependence on map thresholds). Indeed, over the past years RSA has become the gold standard to investigate somatotopy of finger representations both in healthy (e.g., Akselrod et al., 2017; Ariani et al., 2020; Ejaz et al., 2015; Gooijers et al., 2021; Kieliba et al., 2021; Kolasinski et al., 2016; Liu et al., 2021; Sanders et al., 2019) and patient populations (e.g., Dempsey-Jones et al., 2019; Ejaz et al., 2016; Kikkert et al., 2016; Wesselink et al., 2019). Moreover, various papers have been published in eLife and elsewhere that used the same RSA-based typicality criteria to assess plasticity in finger representations (Dempsey-Jones et al., 2019; Ejaz et al., 2015; Kieliba et al., 2021; Wesselink et al., 2019). We now highlight this in the revised Introduction.
  
  The canonical RDM used in our study has previously been used as a canonical RDM in a 3T study exploring finger somatotopy in amputees (Wesselink et al., 2019) and was made available to us (note that we did not collect this data ourselves). We aimed to use similar measures as in Wesselink et al. (2019) and therefore felt it was most appropriate to use the same canonical RDM. One of the strengths of RSA is that it can be used to quantitatively relate brain activity measures obtained using different modalities, across different species, brain areas, brain and behavioural measures etc. (Kriegeskorte et al., 2008). As such, the fact that this canonical RDM was constructed based on data collected using 7T fMRI using a digit tapping task should not influence our results. We however agree with the reviewer that it is good to demonstrate that our results would not change when using a canonical RDM based on the average RDM of our age-, sex-, and handedness-matched control group. We therefore recalculated the typicality of all participants using the controls’ average RDM as the canonical RDM. We found a strong and highly significant correlation in typicality scores calculated using the canonical RDM from the independent dataset and the controls’ average RDM (see figure below). This was true for both the patient (rs = 0.92, p < 0.001; red dots) and control groups (rs = 0.78, p < 0.001; grey dots).
  
  We then repeated all analyses using these newly calculated typicality scores. As expected, we found the same results as when using a canonical RDM based on the independent dataset (see below for details). This analysis has been added to the revised Appendix 1 and is referred to in the main manuscript.
  
  Revised text Introduction:
  
  “To investigate whether fine-grained hand somatotopy was preserved and could be activated in S1 following SCI, we assessed inter-finger representational distance patterns using representational similarity analysis (RSA). These inter-finger distance patterns are thought to be shaped by daily life experience such that fingers used more frequently together in daily life have lower representational distances (Ejaz et al., 2015). RSA-based inter-finger distance patterns have been shown to depict the invariant representational structure of fingers in S1 and M1 better than the size, shape, and exact location of the areas activated by finger movements (Ejaz et al., 2015). Over the past years RSA has therefore regularly been used to investigate somatotopy of finger representations both in healthy (e.g., Akselrod et al., 2017; Ariani et al., 2020; Ejaz et al., 2015; Gooijers et al., 2021; Kieliba et al., 2021; Kolasinski et al., 2016; Liu et al., 2021; Sanders et al., 2019) and patient populations (e.g., Dempsey-Jones et al., 2019; Ejaz et al., 2016; Kikkert et al., 2016; Wesselink et al., 2019). We closely followed procedures that have previously been used to map preserved and typical somatotopic finger selectivity and inter-finger representational distance patterns of amputees’ missing hands in S1 using volitional phantom finger movements (Kikkert et al., 2016; Wesselink et al., 2019).”
  
  Revised text Results:
  
  “This canonical RDM was based on 7T finger movement fMRI data in an independently acquired cohort of healthy controls (n = 8). The S1 hand ROI used to calculate this canonical RDM was defined similarly as in the current study (see Wesselink and Maimon-Mor, (2017b) for details). Note that results were unchanged when calculating typicality scores using a canonical RDM based on the averaged RDM of the age-, sex-, and handedness-matched control group tested in this study (see Appendix 1).”
  
  Revised text Methods:
  
  “While the traditional traveling wave approach is powerful to uncover the somatotopic finger arrangement, a fuller description of hand representation can be obtained by taking into account the entire fine-grained activity pattern of all fingers. RSA-based inter-finger overlap patterns have been shown to depict the invariant representational structure of fingers better than the size, shape, and exact location of the areas activated by finger movements (Ejaz et al., 2015). RSA-based measures are furthermore not prone to some of the problems of measurements of finger selectivity (e.g., dependence on map thresholds).”
  
  “Third, we estimated the somatotopic typicality (or normality) of each participant’s RDM by calculating a Spearman correlation with a canonical RDM. We followed previously described procedures for calculating the typicality score (Dempsey-Jones et al., 2019; Ejaz et al., 2015; Kieliba et al., 2021; Wesselink et al., 2019). The canonical RDM was based on 7T finger movement fMRI data in an independently acquired cohort of healthy controls (n = 8). The S1 hand ROI used to calculate this canonical RDM was defined similarly as in the current study (see Wesselink and Maimon-Mor, (2017b) for details). Note that results were unchanged when calculating typicality scores using a canonical RDM based on the averaged RDM of the sex-, handedness-, and age-matched control group tested in this study (see Appendix 1).”
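  
  The typicality score described in this passage can be sketched as follows. This is a minimal illustration, assuming the score is the Spearman correlation between the 10 unique inter-finger distances of a participant's RDM and those of the canonical RDM; the toy RDMs and the helper names (`rank`, `pearson`, `typicality`) are hypothetical and do not reproduce the authors' code or data.

```python
from itertools import combinations

# Five fingers give 10 unique finger pairs (the upper triangle of the RDM).
PAIRS = list(combinations(range(5), 2))


def rank(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def typicality(rdm, canonical):
    """Spearman correlation between the pairwise distances of two 5x5 RDMs."""
    x = [rdm[i][j] for i, j in PAIRS]
    y = [canonical[i][j] for i, j in PAIRS]
    return pearson(rank(x), rank(y))


# Toy canonical RDM (hypothetical): distance grows with finger separation.
canonical = [[float(abs(i - j)) for j in range(5)] for i in range(5)]
# Same rank order, different scale: still maximally typical.
scaled = [[d ** 2 for d in row] for row in canonical]
# Reversed rank order of distances: maximally atypical.
reversed_rdm = [[0.0 if i == j else 5.0 - abs(i - j) for j in range(5)] for i in range(5)]
```

  Because the correlation is rank-based, the score is insensitive to the overall scale of the distances, which is one reason a canonical RDM acquired under different conditions can still be informative about the pattern.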
  
  Revised text Appendix 1:
  
  “Typicality analysis using a canonical RDM based on the controls’ average RDM
  
  To ensure that our typicality results did not change when using a canonical inter-finger RDM based on the age-, sex-, and handedness-matched subjects tested in this study, we recalculated the typicality scores of all participants using the averaged inter-finger RDM of our control sample as the canonical RDM. We found a strong and highly significant correlation between the typicality scores calculated using the canonical inter-finger RDM from the independent dataset (reported in the main manuscript) and the typicality scores calculated using our controls’ average RDM. This was true for both the SCI patient (rs = 0.92, p < 0.001) and control groups (rs = 0.78, p < 0.001).
  
  We then repeated all typicality analyses reported in the main manuscript. As expected, using the typicality scores calculated using our controls’ average RDM we found the same results as when using the canonical inter-finger RDM from the independent dataset: There was a significant difference in typicality between SCI patients, healthy controls, and congenital one-handers (H(2)=27.61, p < 0.001). We further found significantly higher typicality in controls compared to congenital one-handers (U=0, p < 0.001; BF10=76.11). Importantly, the typicality scores of the SCI patients were significantly higher than those of the congenital one-handers (U=2, p < 0.001; BF10=50.98), but not significantly different from those of the controls (U=94, p=0.24; BF10=0.55). Number of years since SCI significantly correlated with hand representation typicality (rs=-0.54, p=0.05) and patients with more retained GRASSP motor function of the tested upper limb had more typical hand representations in S1 (rs=0.58, p=0.03). There was no significant correlation between S1 hand representation typicality and GRASSP sensory function of the tested upper limb, spared midsagittal spinal tissue bridges at the lesion level, or cross-sectional spinal cord area (rs=0.40, p=0.15, rs=0.50, p=0.10, and rs=0.48, p=0.08, respectively). An exploratory stepwise linear regression analysis revealed that years since SCI significantly predicted hand representation typicality in S1 with R2=0.33 (F(1,10)=4.98, p=0.05). Motor function, sensory function, spared midsagittal spinal tissue bridges at the lesion level, and spinal cord area did not significantly add to the prediction (t=1.31, p=0.22, t=1.62, p=0.14, t=1.70, p=0.12, and t=1.09, p=0.30, respectively).”
  
  Methods & Results: The authors make a few unproven claims, such as saying "generally, the position, order of finger preference, and extent of the hand maps were qualitatively similar between patients and control". There are no data to support these claims.
  
  As indicated in this sentence, this claim was based on a qualitative inspection of the finger maps in Figure 2, and we indeed do not support it with quantitative analysis. We have therefore removed this sentence from the revised manuscript and instead say, as per the suggestion of reviewer 1, that overall, there were aspects of somatotopic finger selectivity in the SCI patients’ hand maps.
  
  Revised text Results:
  
  “Overall, we found aspects of somatotopic finger selectivity in the maps of SCI patients’ hands, in which neighbouring clusters showed selectivity for neighbouring fingers in contralateral S1, similar to those observed in eighteen age-, sex-, and handedness-matched healthy controls (see Figure 2A&B). A characteristic hand map shows a gradient of finger preference, progressing from the thumb (red, laterally) to the little finger (pink, medially). Notably, a characteristic hand map was even found in a patient who suffered complete paralysis and sensory deprivation of the hands (Figure 2, patient map 1; patient S01). Despite most maps (Figure 2, except patient map 3) displaying aspects of characteristic finger selectivity, some finger representations were not visible in the thresholded patient and control maps.”
  
  Methods & Results: The authors argue that the map architecture is topographic as soon as the dissimilarity between two different fingers is above 0. First, what I am really wondering about is why the authors do not provide the exact dissimilarity values in the text but only give the stats for the difference to 0 (t-value, p-value, Bayes factor). Were the dissimilarity values perhaps very low? The values should be reported. Also, when this argument that maps are topographic as long as the value of two different fingers is above 0 should hold, then the authors have to show that the value for mapping the SAME finger is indeed 0. Otherwise, this argument is not convincing.
  
  We would like to clarify that a representation is not per se topographic when the RSA dissimilarity is > 0. The dissimilarity value provided by RSA indicates the extent to which a pair of conditions is distinguished – it can be viewed as encapsulating the information content carried by the region (Kriegeskorte et al., 2008). Due to cross-validation across runs, the expected distance value would be zero (but can go below 0) if two conditions’ activity patterns are not statistically different from each other, and larger than zero if there is differentiation between the conditions (fingers’ activity patterns in the S1 hand area in our case; Kriegeskorte et al., 2008; Nili et al., 2014). The diagonal of the RDM reflects comparisons between the same fingers and therefore reflects distances between the exact same activity pattern in the same run, which are thus 0 by definition (Kriegeskorte et al., 2008; Nili et al., 2014). This was also the case in our individual participant RDMs. Since this is not a meaningful value (a distance between 2 identical activity patterns will always be 0) we chose not to report this. We have clarified the meaning of the separability measure in the revised Methods section.
  
  To investigate whether a representation is somatotopic, we have to take into account the full fine-grained inter-finger distance pattern. The full fine-grained inter-finger distance pattern is related to everyday use of our hand and has been shown to depict the invariant representational structure of fingers better than the size, shape, and exact location of the areas activated by finger movements (Ejaz et al., 2015). To determine whether a participant’s inter-finger distance pattern is somatotopic one should associate it to a canonical RDM – which is done in the typicality analysis (see also our response to comment 2.6).
  
  What can be done to demonstrate the validity of an ROI is to run RSA on a control ROI where one would not expect to find activity that is distinguishable between finger conditions. Rather than comparing the separability measure against 0, one can then compare the separability of the ROI that is expected to contain this information to that of the control ROI. We created a cerebral spinal fluid (CSF) ROI, repeated our RSA analysis in this ROI, and then compared the separability of the CSF and S1 hand area ROIs. As expected, there was a significant difference between separability (or representation strength) in the S1 hand area and CSF ROIs for both controls (W=171, p < 0.001; BF10=4059) and patients (W=105, p < 0.001; BF10=279). This analysis has been added to the revised manuscript.
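  
  As a reading aid for the W values above, the following sketch works through the arithmetic of a paired signed-rank statistic. It assumes that W denotes the sum of the ranks of positive S1-minus-CSF differences (conventions differ between statistics packages, and the response does not state which was used); the function names and toy differences are hypothetical. Under that assumption, the largest possible W for n pairs is n(n+1)/2, which equals 171 for 18 controls and 105 for 14 patients.

```python
def max_signed_rank_sum(n):
    """Largest possible sum of positive ranks for n paired differences."""
    return n * (n + 1) // 2


def signed_rank_w(differences):
    """Sum of ranks of positive differences.

    Illustration only: zero differences are dropped and tied absolute
    values are not given mean ranks.
    """
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(nonzero, key=abs)
    return sum(r for r, d in enumerate(ordered, start=1) if d > 0)


# Toy differences (hypothetical): all 18 participants show S1 > CSF.
all_positive = list(range(1, 19))
# Same data, but the smallest difference goes in the other direction.
one_negative = [-1] + list(range(2, 19))
```

  If the assumption about W holds, the reported values would be consistent with every participant in both groups showing greater separability in the S1 hand area than in the CSF ROI.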
  
  Individual participant separability values (i.e., distances averaged across fingers) are visualised in Figure 3D. Following the reviewer’s suggestion, we have included individual participant inter-finger distance plots for both the controls and SCI patients as Figure 3-figure supplement 2 and Figure 3-figure supplement 3, respectively. The inter-finger distances for each finger pair and subject can be extracted from this. We feel this is more readily readable and interpretable than a table containing the 10 inter-finger distance scores for all 32 participants. These values have instead been made available online, together with our other data, on https://osf.io/e8u95/.
  
  Revised text Methods:
  
  “If there is no information in the ROI that can statistically distinguish between the finger conditions, then due to cross-validation the expected distance measure would be 0. If there is differentiation between the finger conditions, the separability would be larger than 0 (Nili et al., 2014). Note that this does not directly indicate that this region contains topographic information, but rather that this ROI contains information that can distinguish between the finger conditions. To further ensure that our S1 hand ROI was activated distinctly for different fingers, we created a cerebral spinal fluid (CSF) ROI that would not contain finger specific information. We then repeated our RSA analysis in this ROI and statistically compared the separability of the CSF and S1 hand area ROIs.”
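The logic of the revised Methods text (a cross-validated distance is expected to be 0 when the ROI carries no condition information) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses a plain cross-validated squared Euclidean distance without noise normalization, and the function name, data layout, and simulated values are all assumptions made for illustration.

```python
import random

def crossvalidated_distance(patterns, i, j):
    """Cross-validated squared distance between conditions i and j.

    patterns[run][condition] is a list of voxel values (hypothetical layout).
    The difference pattern from one run is multiplied with the difference
    pattern from a different run, so independent noise averages out to 0.
    """
    n_runs = len(patterns)
    n_voxels = len(patterns[0][i])
    diffs = [[x - y for x, y in zip(run[i], run[j])] for run in patterns]
    total, n_pairs = 0.0, 0
    for a in range(n_runs):
        for b in range(n_runs):
            if a == b:
                continue  # never pair a run with itself
            total += sum(p * q for p, q in zip(diffs[a], diffs[b])) / n_voxels
            n_pairs += 1
    return total / n_pairs

# Simulated ROI (assumed sizes): 4 runs, 5 finger conditions, 200 voxels.
random.seed(0)
n_runs, n_conditions, n_voxels = 4, 5, 200
noise_only = [[[random.gauss(0.0, 1.0) for _ in range(n_voxels)]
               for _ in range(n_conditions)] for _ in range(n_runs)]
# Same noise, plus a consistent signal for condition 0 only.
with_signal = [[[v + (1.0 if c == 0 else 0.0) for v in cond]
                for c, cond in enumerate(run)] for run in noise_only]

d_noise = crossvalidated_distance(noise_only, 0, 1)
d_signal = crossvalidated_distance(with_signal, 0, 1)
print(f"noise-only ROI: {d_noise:.3f}, ROI with signal: {d_signal:.3f}")
```

Because each product pairs independent runs, the noise-only estimate scatters around 0 and can be negative, which is why a comparison against 0 or against a control ROI is meaningful.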
  
  Revised text Results:
  
  “We found that inter-finger separability in the S1 hand area was greater than 0 for patients (t(13) = 9.83, p < 0.001; BF10 = 6.77e+4) and controls (t(17) = 11.70, p < 0.001; BF10 = 6.92e+6), indicating that the S1 hand area in both groups contained information about individuated finger representations. Furthermore, for both controls (W = 171, p < 0.001; BF10 = 4059) and patients (W = 105, p < 0.001; BF10 = 279) there was significantly greater separability (or representation strength) in the S1 hand area than in a control cerebral spinal fluid ROI that would not be expected to contain finger specific information. We did not find a significant group difference in inter-finger separability of the S1 hand area (t(30) = 1.52, p = 0.14; BF10 = 0.81), with the BF showing anecdotal evidence in favour of the null hypothesis.”
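As a reading aid for the W values above, the following sketch computes a Wilcoxon signed-rank statistic for paired samples. It assumes that W is the sum of the ranks of the positive differences (one common convention; the response does not state which was used) and takes the group sizes from the degrees of freedom reported above (18 controls, 14 patients). The function names are hypothetical.

```python
def signed_rank_w(x, y):
    """Sum of ranks of the positive paired differences x - y.

    Zero differences are dropped and tied absolute differences share
    their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        avg_rank = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1..end+1
        for k in order[pos:end + 1]:
            ranks[k] = avg_rank
        pos = end + 1
    return sum(r for r, d in zip(ranks, diffs) if d > 0)

def max_w(n):
    """Largest possible W for n pairs: every difference is positive."""
    return n * (n + 1) // 2

print(max_w(18), max_w(14))
```

Under that convention, the largest attainable W is n(n+1)/2, so comparing the reported values with `max_w(18)` and `max_w(14)` shows how close each group came to every participant having higher separability in S1 than in CSF.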
  
  Discussion. The authors argue that spared midsagittal spinal tissue bridges are not necessary because they were not predictive of hand representation typicality. First, the measure of typicality is questionable and should not be used to make general claims about the importance of structural differences. Second, given there were only n=14 patients included, one may question generally whether predictive modelling can be done with these data. This statement should therefore be removed.
  
  We would like to clarify that, like the reviewer, we do not believe that spared midsagittal spinal tissue bridges are unimportant. Indeed, a large body of our own research focuses on the importance of spared spinal tissue bridges in recovery of sensorimotor function and pain (Huber et al., 2017; Pfyffer et al., 2021, 2019; Vallotton et al., 2019). We have added a clarification sentence regarding the importance of tissue bridges with regards to recovery of function. We agree with the reviewer that given our limited sample size, it is difficult to make conclusive claims based on non-significant predictive modelling and correlational results. In the revised manuscript we therefore focus this statement (i.e., that sensory and motor hand function and tissue bridges are not necessary to preserve hand somatotopy) on our finding that two patients without spared tissue bridges at the lesion level and with complete or near complete loss of sensory and motor hand function had a highly typical hand representation. We present our predictive modelling results as being in line with this notion and added a word of caution that it is challenging to draw definite conclusions from non-significant predictive modelling and correlation results in such a limited sample size.
  
  With regards to the reviewer’s concern about the validity of the typicality measure – please see our detailed response to comment 2.6.
  
  Revised text Discussion:
  
  “Spared spinal cord tissue bridges can be found in most patients with a clinically incomplete injury, their width being predictive of electrophysiological information flow, recovery of sensorimotor function, and neuropathic pain (Huber et al., 2017; Pfyffer et al., 2021, 2019; Vallotton et al., 2019). However, in this study, spared midsagittal spinal tissue bridges at the lesion level and sensorimotor hand function did not seem necessary to maintain and activate a somatotopic hand representation in S1. We found a highly typical hand representation in two patients (S01 and S03) who did not have any spared spinal tissue bridges at the lesion level, a complete (S01) or near complete (S03) hand paralysis, and a complete (S01) or near complete loss (S03) of hand sensory function. Our predictive modelling results were in line with this notion and showed that these behavioural and structural spinal cord determinants were not predictive of hand representation typicality. Note however that our sample size was limited, and it is challenging to draw definite conclusions from non-significant predictive modelling results.”
  
  Discussion. The authors say that hand representation is "preserved" in SCI patients. Perhaps it is better to be precise and to say that they are active during movement planning.
  
  We thank the reviewer for their suggestion and revised the Discussion accordingly.
  
  Revised text Discussion:
  
  "In this study we investigated whether hand somatotopy is preserved and can be activated through attempted movements following tetraplegia."
  
  "How may these representations be preserved over time and activated through attempted movements in the absence of peripheral information?"
  
  "Together, our findings indicate that in the first years after a tetraplegia, the somatotopic S1 hand representation is preserved and can be activated through attempted movements even in the absence of retained sensory function, motor function, and spared spinal tissue bridges."
  
  Reviewer #3:
  
  The demonstration that cortex associated with an amputated limb can be activated by other body parts after amputation has been interpreted as evidence that the deafferented cortex "reorganizes" and assumes a new function. However, other studies suggest that the somatotopic organization of somatosensory cortex in amputees is relatively spared, even when probed long after amputation. One possibility is that the stability is due to residual peripheral input. In this study, Kikkert et al. examine the somatotopic organization of somatosensory cortex in patients whose spinal cord injury has led to tetraplegia. They find that the somatotopic organization of the hand representation of somatosensory cortex is relatively spared in these patients. Surprisingly, the amount of spared sensorimotor function is a poor predictor of the stability of the patients' hand somatotopy. Nonetheless, the hand representation deteriorates over decades after the injury. These findings contribute to a developing story on how sensory representations are formed and maintained and provide a counterpoint to extreme interpretations of the "reorganization" hypothesis mentioned above. Furthermore, the stability of body maps in somatosensory cortex after spinal cord injury has implications for the development of brain-machine interfaces.
  
  I have only minor comments:
  
  1) Given the controversy in the field, the use of the phrase "take over the deprived territory" (line 45) is muddled. Perhaps a more nuanced exposition of this phenomenon is in order?
  
  We agree a more nuanced expression would be more appropriate. We have changed this sentence accordingly in the revised manuscript.
  
  Revised text Introduction:
  
  “Seminal research in nonhuman primate models of SCI has shown that this leads to extensive cortical reorganisation, such that tactile stimulation of cortically adjacent body parts (e.g. of the face) activated the deprived brain territory (e.g. of the hand; Halder et al., 2018; Jain et al., 2008; Kambi et al., 2014).”
  
  2) The statement that "results are mixed" regarding intracortical microstimulation of S1 is dubious. In only one case has the hand representation been mislocalized, out of many cases (several at CalTech, 3 at the University of Pittsburgh, one at Case Western, one at Hopkins/APL, and one at UChicago). Perhaps rephrase to "with one exception?"
  
  We agree that this sentence may give a wrong outlook on the literature and have changed the text per the reviewer’s suggestion.
  
  Revised text Introduction:
  
  “Case studies using intracortical stimulation in the S1 hand area to elicit finger sensations in SCI patients hint at such preserved somatotopic representations (Fifer et al., 2020; Flesher et al., 2016), with one exception (Armenta Salas et al., 2018).”
  
  3) The phrase "tetraplegic spinal cord injury" seems awkward.
  
  Thank you for highlighting this to us. We have corrected these instances in our revised manuscript to “tetraplegia”.
  
  4) The stability of the representation is attributed to efference copy from M1. While this is a fine speculation, somatosensory cortex is part of a circuit and is interconnected with many other brain areas, M1 being one. Perhaps the stability is maintained due to the position of somatosensory cortex within this circuit, and not solely by its relationship with M1? There seems to be an overemphasis of this hypothesis at the exclusion of others.
  
  Thank you for this comment. We agree we overemphasized the efference copy theory. We have adjusted this and now provide a more balanced description of potential circuits and interconnections that could maintain somatotopic representations after tetraplegia.
  
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.02.08.430185v1
www.biorxiv.org www.biorxiv.org

Diversification of multipotential postmitotic mouse retinal ganglion cell precursors into discrete types

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  In this report, Shekhar et al. have profiled developing retinal ganglion cells from embryonic and postnatal mouse retina to explore the diversification of this class of neurons into specific subtypes. In mature retina, scRNAseq and other methods have defined approximately 45 different subtypes of RGCs, and the authors ask whether these arise from a common postmitotic precursor, or many distinct subtypes of precursors. The overall message is that subtype diversification arises as a "gradual, asynchronous fate restriction of postmitotic multipotential precursors". The authors find that over time, clusters of cells become "decoupled" as they split into subclusters. This process of fate decoupling is associated with changes in the expression of specific transcription factors. This allows them to both predict lineage relationships among RGC subtypes and the time during development when these specification events occur. Although this conclusion is based almost entirely on a computational analysis of the relationships among cells sampled at discrete times, the evidence presented supports the overall conclusion. Future experimental validation of the proposed lineage relationships of RGC subtypes will be needed, but this report clearly outlines the overall pattern of diversification in this cell class.
  
  We thank the reviewer for their thoughtful assessment of our study.
  
  Reviewer #2 (Public Review):
  
  The manuscript "Diversification of multipotential postmitotic mouse retinal ganglion cell precursors into discrete types" by Shekhar and colleagues represents an in-depth analysis of an additional transcriptomic dataset of retinal single cells. It explores the progression of retinal ganglion cell diversity during development and describes some aspects of fate acquisition in these postmitotic neurons. Altogether the findings provide another resource on which the neural development community will be able to generate new hypotheses in the field of retinal ganglion cell differentiation. A key point that is made by the authors regards the progression of the number of ganglion cell types in the mouse retina, i.e., how and when neuronal "classes diversify into subclasses and types" (also p. 125). In particular, the authors would like to address whether postmitotic neurons follow either a predetermination or a stepwise progression (Fig. 2a). This is indeed a fascinating question, and the analysis, including the one based on the Waddington-OT method, is conceptually interesting.
  
  Comments and questions:
  
  Is the transcriptomic diversity, based on highly variable genes (the number of which is not detailed in the study), a robust proxy to assess cell types? One could argue that early on predetermined cell types are specified by a small set of determinants, both at the proteomic and transcriptomic level, and that it takes several days or weeks to generate the cascade that allows the detection of transcriptional diversity at the level of >100 gene expression levels.
  
  We had tested the dependence of our results on the number of highly variable genes (HVGs) used. This analysis, shown in Figure 2h, demonstrates that results are robust over the range tested – 1244-3003 total HVGs. Since the analysis in the paper employs 2800 HVGs (~800-1500 at each stage), we are confident that we are in comfortable excess of the number at which we would need to worry. We have expanded the discussion to avoid confusion on this point. We also address the possibility that a small set of determinants are sufficient to define cell state in a transcriptomic study. This is a common argument, but we believe it is a tenuous one. We believe that the only way a small number of genes can truly define cell state is if they are expressed at very high levels. If these are expressed at high levels, they should be detected in our data and should drive the clustering. If they are expressed at extremely low levels, then given the nature of molecular fluctuations in cells, they cannot be expected to serve as a stable scaffold for differentiation. Indeed, a small set of determinants (usually transcription factors) may be necessary to specify a cell type. However, sufficiency of specification requires the expression of a usually much larger number of downstream regulators.
  
  Since there are many RGC subsets (45) that share a great number of their gene expression, is it possible that a given RGC could transition from one subset to another between P5 and P56? Or even responding to a state linked to sustained activity? Was this possibility tested in the model?
  
  We cannot address the possibility that cells swap types postnatally so that the cells comprising type X at P5 are not the same ones that comprise type X at P56. It does seem pretty unlikely, as the cell types are well-separated in transcriptional space (~250 DE genes on average). Regarding activity, we have made some initial tests by preventing visually evoked activity from birth to P56 in three different ways (dark-rearing and two mutant lines). We find no statistically significant effect on diversification. These results are currently being prepared for publication.
  
  The authors state that early during development there is less diversity than later. This statement seems obvious, but by how much? Can this be due to differences in differentiation stage? At E16, RGCs are a mix of cells born from E11 to E16, with the latter barely located in the GCL. Does this tend to show a continuum that may be lost when the analysis is performed on cells isolated a long time after they were born (postnatal stages)? Alternatively, would it be possible to compare RGCs that have been labelled with birth dating methods?
  
  Regarding the amount of diversification, we quantified this using the Rao diversity index (Figure 2h), which suggests an overall 2-fold increase in transcriptional diversity at P56 compared to the early stages. The continuum is likely because cells at early stages are close to the precursor stage and not very differentiated. Regarding combining RNA-seq with birthdating, although elegant methods now make this combination possible, it falls beyond the scope of this study.
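For readers unfamiliar with the Rao diversity index mentioned above, a minimal sketch of Rao's quadratic entropy follows. It assumes cluster frequencies and a pairwise dissimilarity matrix between clusters as inputs; how the authors defined the dissimilarity is not stated in this response, so the toy values below are purely illustrative.

```python
def rao_quadratic_entropy(freqs, dist):
    """Rao's quadratic entropy: sum over i, j of p_i * p_j * d_ij.

    freqs: cluster sizes or relative frequencies (normalized internally).
    dist:  square matrix of pairwise dissimilarities with a zero diagonal.
    """
    total = sum(freqs)
    p = [f / total for f in freqs]
    n = len(p)
    return sum(p[i] * p[j] * dist[i][j] for i in range(n) for j in range(n))

# Toy comparison (hypothetical numbers): two similar clusters at an early
# stage versus three well-separated clusters at a later stage.
early = rao_quadratic_entropy([50, 50], [[0.0, 0.2], [0.2, 0.0]])
late = rao_quadratic_entropy(
    [40, 30, 30],
    [[0.0, 0.6, 0.7], [0.6, 0.0, 0.5], [0.7, 0.5, 0.0]],
)
print(f"early: {early:.3f}, late: {late:.3f}, ratio: {late / early:.2f}")
```

The index grows both when frequencies become more even and when clusters move further apart, so it can capture a gradual increase in diversity that a simple cluster count would miss.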
  
  Comparing data produced by different methods can be challenging. Here the authors compared transcriptomic diversity between the embryonic dataset produced with 10X Genomics (E13 to P0) and, on the other hand, the postnatal P5 dataset, which was produced using a different procedure (Drop-seq). Is it possible to control that the differences observed are not due to the different methods?
  
  It is correct that most of the P5 data was produced using Drop-seq, but that dataset also includes transcriptomes obtained by the 10X method. The relative frequency of RGC clusters and the average gene expression values obtained using either method were highly correlated (Reviewer Fig. 1). This is now pointed out in the “Methods.”
  
  Reviewer Fig. 1. Comparison between the relative frequency of types (left) and the average gene expression levels (right) at P5 between 10X data (y-axis) and Drop-seq data (x-axis). R corresponds to the Pearson correlation coefficient. The axes are plotted in the logarithmic scale.
  
  It might be important to control the conclusion that diversity is lower at E13 vs P5, given that three times fewer cells (5900 vs 18000) were analyzed at the early stage (BrdU, EdU, CFSE...). A simple downsampling prior to the analysis may help.
  
  Although we collected different numbers of cells at different ages, we noted in the text that they do not influence the number of clusters. Regarding P5 specifically, Rheaume et al. (who we now discuss) obtained very similar results to ours with only 6000 cells (3x lower).
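The downsampling control suggested by the reviewer can be sketched as follows. This is an illustrative outline, not the authors' procedure: it assumes each stage is represented by a list of cell identifiers, and the function name, seed, and toy sizes are hypothetical.

```python
import random

def downsample_to_smallest(cells_by_stage, seed=0):
    """Randomly subsample every stage to the size of the smallest stage."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    target = min(len(cells) for cells in cells_by_stage.values())
    return {stage: rng.sample(cells, target)
            for stage, cells in cells_by_stage.items()}

# Toy stand-in for per-stage cell barcodes (sizes are illustrative only).
cells_by_stage = {
    "E13": [f"E13_cell{k}" for k in range(59)],
    "P5": [f"P5_cell{k}" for k in range(180)],
}
balanced = downsample_to_smallest(cells_by_stage)
print({stage: len(cells) for stage, cells in balanced.items()})
```

Clustering would then be repeated on the balanced subsets, ideally over several seeds, to check whether the number of clusters per stage changes with cell number.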
  
  Ipsilateral RGC: It is striking that the DEGs between C-RGCs and I-RGCs reflect a strong bias, with cells scored as "ipsi" being immature RGCs while the others ("contra") are much more mature. This bias comes from the way ipsilateral RGCs were "inferred" using non-specific markers. Can the authors try the analysis again by identifying RGCs using more robust markers (e.g. EphB1)? Would it be possible to select I-RGCs and C-RGCs that share the same level of differentiation? Previous studies already identified an I-RGC signature using a more specific set-up (Wang et al., 2016 from retrogradely labelled RGCs; Lo Giudice et al., 2019 with an I-RGC specific transgenic mouse).
  
  We are not sure how the reviewer concludes that the putative I-RGCs are more immature than the putative C-RGCs. As discussed earlier, insofar as expression levels of pan-RGC markers are indicative of maturational stage, we found no evidence that clustering is driven by maturation gradients. Thus, we expect our putative I-RGCs and C-RGCs to not differ in differentiation state. Following the reviewer’s suggestion, we now include EphB1 (Ephb1) in our I-RGC signature. The impact of replacing Igfbp5 with Ephb1 on the inferred proportion of I-RGCs within each terminal type was minimal (Reviewer Fig. 2). We would like to note that to assemble our I-RGC/C-RGC signatures we relied on data presented in Wang et al. (2016). Outside of well-established markers (e.g. Zic2 and Isl2), we chose the RNA-seq hits in Wang et al. that had been validated histologically in the same paper or that are correlated with Zic2 expression in our data. This nominated Igfbp5, Zic1, Fgf12, and Igf1.
  
  Reviewer Fig. 2. Comparison of inferred I-RGC frequency within each terminal type (points) using two I-RGC signatures reported in the paper. For the y-axis we used Zic2 and EphB1.
  
  It would be important to discuss how their findings differ from the others (including Rheaume et al., 2018). To make a strong point, I-RGCs should be isolated at a stage of final maturation (P5?) and using retrograde labelling, which is a robust method to ensure the ipsilateral identity of postnatal RGCs.
  
  We cite Rheaume et al. in several places. In fact, there is good transcriptional correspondence between our dataset and theirs (Figure S1i), despite the differences in the number of cells profiled (~6000 vs ~18000) and technologies (10X vs. Drop-seq/10X). We now mention this in the text. Note also that we had compared our P56 data with Rheaume et al.’s P5 data in an earlier publication (Tran et al., 2019) and observed a similar tight correspondence between clusters. Zic1 is expressed in I-RGCs (Wang et al., 2016) at early stages, and in our dataset its expression at E13 and E14 is similar to that of Zic2 (Supplementary Fig. 8); postnatally, however, it marks W3B RGCs (Tran et al., 2019), many of which project contralaterally (Kim et al., J. Neurosci. 2010). Regarding retrograde labeling, as noted above, additional experiments would take a prohibitively long time (up to a year) to complete.
  
  It is unclear how well Zic1 and Igf1 can be used as I-RGC markers. Can the authors specify how specific to I-RGCs they are? Have they been confirmed as markers using retrograde labelling experiments?
  
  We have relied on previous work, primarily from the Mason lab, to choose I-RGC and C-RGC markers. Igf1 is a C-RGC marker that is expressed in a complementary fashion with Igfbp5, an I-RGC marker as noted in Wang et al, 2016. They also perform ISH to show that Igf1 is not expressed in the VT crescent, while Igfbp5 is (see Fig. 5 in Wang et al., 2016). Similarly, Zic1 is also cited in Wang et al. as an RNA-seq hit for I-RGCs. Although Zic1 was not validated using ISH, we found its expression pattern to be highly correlated with Zic2 at E13 (Supplementary Fig. 8c).
  
  The enrichment procedure may deplete the RGC subpopulation that express low levels of Thy1 or L1CAM. A comparison on that point could be done with the other datasets analysed in the study.
  
  We presume the reviewer is referring to the data of Lo Giudice and Clark/Blackshaw, which we show in comparison to ours in Figure S1. In both of those studies, all retinal cells were analyzed, whereas we enriched RGCs. As noted in the text, RGCs comprise a very small fraction of all retinal cells, so Lo Giudice and Clark/Blackshaw lacked the resolution to resolve RGC diversity at later time points. Indeed, there is no whole retina dataset available in which RGCs are numerous enough for comprehensive subtyping. Our approach to this issue was to collect RGCs with both Thy1 and L1 at E13, E14, E16 and P0, with the idea that the markers might have complementary strengths and weaknesses. In fact, at each age, all clusters are present in both collection types, although frequencies vary. This concordance supports the idea that neither marker excludes particular types. We now stress this point in results and in the Supplementary Fig. 2 legend.
  
  In supplemental Fig. S1e: why are cells embedded from the "Clark" datasets clustered only on the right side of the UMAP while the others are more evenly distributed?
  
  Actually, both the Clark et al. and Lo Giudice et al. datasets are predominantly clustered on the right side of the UMAP. This reflects the methodological difference noted above: they profiled the whole retina, whereas we isolated RGCs. Thus, their datasets contain a much higher abundance of RPCs and non-neurogenic precursors compared to ours. The right clusters represent RPCs due to their expression of Fgf15 and other markers, while the left clusters represent RGCs based on their expression of Nefl. Indeed, a main reason for including these plots was to illustrate the relative abundance of RGCs in our data (also see Supplementary Fig. S1h).
  
  What could explain that CD90 and L1CAM population are intermingled at E14, distinct at E16, and then more mixed at P0?
  
  We believe the reviewer is referring to Supplementary Figs. S2a-c. Given the temporal expression level changes in Thy1 and L1cam (Supplementary Fig. S1c) in RGCs, a likely possibility is that they enrich RGC precursor subsets at different relative frequencies. We now note this in the Supplementary Fig. 2 legend.
  
  On Fig. 6: the E13 RGCs seem to be segregated into early born RGCs expressing Eomes and later born RGCs expressing Neurod2. Thus, fate coupling with P5 seems to suggest that the Eomes population at P5 may have been generated first, and the Neurod2 population generated later. Is that possible?
  
  That the Eomes RGCs are specified before Neurod2 RGCs is one of our conclusions from the fate decoupling analysis (Figures 6f-h). Whether this is because the former arise from early born cells and the latter arise from later born cells is not clear. There is disagreement in the literature on whether ipRGCs are born at a different time than other RGCs, so we prefer not to make a comment.
  
  Methods: The Methods section is extensive, and yet it is presented in a rather complex manner so that it is difficult to understand for a broad audience. It would be valuable if the authors could simplify or better explain some parts (the WOT section in particular).
  
  We believe that the sections on animals, molecular biology and histology are quite straightforward, but agree that the sections describing the computational analysis are hard going. We have modified them in several places as requested. As regards better explanation of the WOT, we now precede that section with an “overview” as a way of making it easier to follow. (We had already included an overview of the clustering procedures.) We have also provided further detail on some of the reviewer’s subsequent questions on this section, including the use of HVGs, the Classifier, and the strategy for inferring I-RGCs (see below). Perhaps most important, we have worked to make the “Results” and “Discussion” sections accessible to a broad audience.
  
  *Highly variable genes (HVG) used for clustering and dimensionality reduction: how many of them and what are they? Are they the same used for each stage?
  
  Since clustering was performed at each stage independently, we determined HVGs at each stage separately using a statistical method introduced in one of our previous studies (Pandey et al., Current Biology, 2018). The total numbers of HVGs at each stage were as follows: E13: N=1094; E14: N=834; E16: N=822; P0: N=881; P5: N=1105; P56: N=1510.
  
  We note that these are not necessarily the same at each stage due to the temporal variation in gene expression. Together these correspond to 2854 unique genes (union of all HVGs). The WOT analysis was done using this full set.
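The relationship between the per-stage HVG counts and the union used for WOT can be made concrete with a short sketch. The per-stage counts and the union size are taken from the response above; the toy gene sets and the helper name are hypothetical and only show the set operation.

```python
# Per-stage HVG counts as reported in the response.
hvg_counts = {"E13": 1094, "E14": 834, "E16": 822,
              "P0": 881, "P5": 1105, "P56": 1510}
reported_union = 2854  # unique genes across all stages, as reported

# Summing the counts tallies a gene once for every stage in which it is an
# HVG, so the gap to the union size reflects genes shared between stages.
summed = sum(hvg_counts.values())
print(f"sum of per-stage counts: {summed}, reported union: {reported_union}")

def union_of_hvgs(hvgs_by_stage):
    """Union of per-stage HVG sets (each gene counted once)."""
    return set().union(*hvgs_by_stage.values())

# Toy example; the gene-to-stage assignments here are made up.
toy_hvgs = {
    "E13": {"Eomes", "Zic2", "Nefl"},
    "P5": {"Eomes", "Neurod2"},
}
print(sorted(union_of_hvgs(toy_hvgs)))
```

The summed count being more than twice the union size indicates that many HVGs recur across stages, consistent with the statement that the sets are "not necessarily the same" rather than disjoint.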
  
  *In the methods p9: "The common features G = GR ∩ GT are used to train a third classifier ClassR on the reference atlas AR. This ensures that inferred transcriptomic correspondences are based on "core" gene expression programs that underlie cell type identity rather than maturation-associated genes." Could the authors explain the relevance of using a third model and, more importantly, are there any genes that are eliminated through the procedure that could be important to drive the diversification process? If so, would it be possible to estimate their number and the relative impact?
  
  The rationale for this was as follows. Our goal is to map cells from one time point to a type at another time point. The naïve way to do this would be to use a classifier trained entirely at either of the time points. However, the features of such a classifier are likely to contain genes that are not expressed at the earlier time point, and are likely to generate spurious mappings (since the sets of cluster-specific genes are not identical). Therefore, we sought to train a classifier using genes that are part of conserved transcriptional signatures at both time points, which corresponds to the third model.
  
  When this filtering was not performed, the temporal correspondences in the supervised classification model were less specific than those reported. In particular, ARI values dropped by about 15% on average. The simple reason for this is that a cluster specific gene at E13 (for e.g.) may no longer be expressed at E14, and vice-versa. Thus, by restricting the features to a common set of cluster specific genes, we obtained the “best possible” transcriptomic correspondences between clusters at consecutive time points. We note that the correspondences obtained in this way (Figure 3) were recovered through WOT when the results of the latter were collapsed at the cluster level (Supplementary Fig. 5).
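The feature-restriction idea described above (train only on genes that are cluster-specific at both time points) can be sketched as follows. This is a toy illustration rather than the authors' classifier: a nearest-centroid rule stands in for whatever supervised model was used, cells are represented as gene-to-expression dictionaries, and all gene names, labels, and values are hypothetical.

```python
def common_features(genes_ref, genes_target):
    """G = G_R ∩ G_T, keeping the reference ordering."""
    target = set(genes_target)
    return [g for g in genes_ref if g in target]

def to_vector(cell, genes):
    """Expression values of one cell restricted to the chosen genes."""
    return [cell[g] for g in genes]

def train_centroids(cells, labels, genes):
    """Mean expression per label, computed on the restricted gene set."""
    grouped = {}
    for cell, label in zip(cells, labels):
        grouped.setdefault(label, []).append(to_vector(cell, genes))
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()}

def classify(cell, centroids, genes):
    """Assign the label whose centroid is closest in squared distance."""
    vec = to_vector(cell, genes)
    return min(centroids, key=lambda lab: sum(
        (v - c) ** 2 for v, c in zip(vec, centroids[lab])))

# Reference atlas (later time point) and a target cell (earlier time point).
ref_cells = [{"geneA": 5.0, "geneB": 1.0, "geneC": 0.0},
             {"geneA": 5.0, "geneB": 0.0, "geneC": 1.0}]
ref_labels = ["type1", "type2"]
target_cell = {"geneB": 0.9, "geneC": 0.1, "geneD": 3.0}

shared = common_features(["geneA", "geneB", "geneC"],
                         ["geneB", "geneC", "geneD"])
centroids = train_centroids(ref_cells, ref_labels, shared)
print(shared, classify(target_cell, centroids, shared))
```

Restricting to the shared genes keeps a gene such as the hypothetical `geneA`, which is absent from the target's feature set, from dominating the distance and producing a spurious mapping.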
  
  *Methods page 15: Inference of ipsilaterally-projecting RGC types. Wouldn't it be more valuable to consider more markers to distinguish RGC precursors?
  
  As indicated before, we used I-RGC genes and C-RGC genes reported in Wang et al., 2016 (Table 2), in addition to the well-known markers Zic2 and Isl2. Here, we prioritized genes that had been histologically validated (Figs. 4 and 5) and that were expressed in our data (Sema3e and Tbx20 were not considered as these were undetectable at E13 in our data). Following the reviewer’s earlier suggestion, we also noted that including Ephb1 in our signature minimally impacts the results.
  
  Discussion: *Is there some plasticity that allows the RGC subgroups to switch over time? (If we were to record the transcriptome of the same cell over time, would one observe that the cell belongs to another cluster / subgroup?)
  
  One can only speculate. Other than long-term in vivo imaging combined with vital type-specific markers we know of no way to experimentally address the possibility that cells swap types postnatally so that the cells comprising type x at P5 are not the same ones that comprise type x at P56. It does seem pretty unlikely though.
  
  *While the data appears technically rigorous, and the number of cells sequenced very high, the results seem redundant with several prior studies and the discrepancies are not sufficiently discussed.
  
  We are confused by this point, since the reviewer does not cite the papers to which s/he refers. To our knowledge there is no study at present that has described RGC diversification, so it is not clear what would be discrepant.
  
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.10.21.465277v1
www.biorxiv.org www.biorxiv.org

New submission 10/01/2023, 10:40:26

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  It is well established that valuation and value-based decision-making is context-dependent. This manuscript presents the results of six behavioral experiments specifically designed to disentangle two prominent functional forms of value normalization during reward learning: divisive normalization and range normalization. The behavioral and modeling results are clear and convincing, showing that key features of choice behavior in the current setting are incompatible with divisive normalization but are well predicted by a non-linear transformation of range-normalized values.
  
  Overall, this is an excellent study with important implications for reinforcement learning and decision-making research. The manuscript could be strengthened by examining individual variability in value normalization, as outlined below.
  
  We thank the Reviewer for the positive appreciation of our work and for the very relevant suggestions. Please find our point-by-point answer below.
  
  There is a lot of individual variation in the choice data that may potentially be explained by individual differences in normalization strategies. It would be important to examine whether there are any subgroups of subjects whose behavior is better explained by a divisive vs. range normalization process. Alternatively, it may be possible to compute an index that captures how much a given subject displays behavior compatible with divisive vs. range normalization. Seeing the distribution of such an index could provide insights into individual differences in normalization strategies.
  
  Thank you for pointing this out, it is indeed true that there is some variability. To address this, and in line with the Reviewer’s suggestion, we extracted model attributions per participant on the individual out-of-sample log-likelihood, using the VBA_toolbox in Matlab (Daunizeau et al., 2014). In experiment 1 (presented in the main text), we found that the RANGE model accounted for 79% of the participants, while the DIVISIVE model accounted for 12%. The relative difference was even higher when including the RANGEω model in the model space: the RANGE and RANGEω models account for a total of 85% of the participants, while the DIVISIVE model accounted only for 5%.
  
  In experiment 2 (presented in the supplementary materials), the results were comparable (see Figure 3-figure supplement 3: 73% vs 10%, 83% vs 2%).
  
  To provide further insights into the behavioral signatures behind inter-individual differences, we plotted the transfer choice rates for each group of participants (best explained by the RANGE, DIVISIVE, or UNBIASED models), and the results are similar to our model predictions from Figure 1C:
  
  Author Response Image 1. Behavioral data in the transfer phase, split over participants best explained by the RANGE (left), DIVISIVE (middle) or UNBIASED (right) model in experiment 1 (A) and experiment 2 (B) (versions a, b and c were pooled together).
  
  To keep things concise, we did not include this last figure in the revised manuscript, but it will be available for the interested readers in the Rebuttal letter.
  
  One possibility currently not considered by the authors is that both forms of value normalization are at work at the same time. It would be interesting to see the results from a hybrid model.
  
  Thank you for the suggestion. We fitted and simulated a hybrid model as a weighted sum between both forms of normalization:
  
  First, the HYBRID model quantitatively wins over the DIVISIVE model (oosLLHYB vs oosLLDIV : t(149)=10.19, p<.0001, d=0.41) but not over the RANGE model, which produced a marginally higher log-likelihood (oosLLHYB vs oosLLRAN : t(149)=-1.82, p=.07, d=-0.008). Second, model simulations also suggest that the model would predict a very similar (if not worse) behavior compared to the RANGE model (see figure below). This is supported by the distribution of the weight parameter over our participants: it appears that, consistently with the model attributions presented above, most participants are best explained by a range-normalization rule (weight > 0.5, 87% of the participants, see figure below). Together, these results favor the RANGE model over the DIVISIVE model in our task.
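
  The equation of the hybrid model is not reproduced in this text. One form consistent with the description above (a weight above 0.5 favoring range normalization) is sketched below. The symbols are assumptions introduced for illustration: w is the weight parameter, and R_RAN and R_DIV are the range-normalized and divisively normalized outcomes.

```latex
R_{\mathrm{HYB}} = w \, R_{\mathrm{RAN}} + (1 - w) \, R_{\mathrm{DIV}}, \qquad 0 \le w \le 1
```

  Under this reading, w = 1 recovers the RANGE model and w = 0 recovers the DIVISIVE model.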
  
  Out of curiosity, we also implemented a hybrid model as a weighted sum between absolute (UNBIASED model) and relative (RANGE model) valuations:
  
  Model fitting, simulations and comparisons slightly favored this hybrid model over the UNBIASED model (oosLLHYB vs oosLLUNB: t(149)=2.63, p=.0094, d=0.15), but also drastically favored the range normalization account (oosLLHYB vs oosLLRAN : t(149)=-3.80, p=.00021, d=-0.40, see Author Response Image 2).
  
  Author Response Image 2. Model simulations in the transfer phase for the RANGE model (left) and the HYBRID model (middle) defined as a weighted sum between divisive and range forms of normalization (top) and between unbiased (no normalization) and range normalization (bottom). The HYBRID model features an additional weight parameter, whose distribution favors the range normalization rule (right).
  
  To keep things concise, we did not include this last figure in the revised manuscript, but it will be available for the interested readers in the Rebuttal letter.
  
  Reviewer #2 (Public Review):
  
  This paper studies how relative values are encoded in a learning task, and how they are subsequently used to make a decision. This is a topic that integrates multiple disciplines (psych, neuro, economics) and has generated significant interest. The experimental setting is based on previous work from this research team that has advanced the field's understanding of value coding in learning tasks. These experiments are well-designed to distinguish some predictions of different accounts for value encoding. However, there is an additional treatment that would provide an additional (strong) test of these theories: RN would make an equivalent set of predictions if the range were equivalently adjusted downward instead (for example by adding a "68" option to "50" and "86", and then comparing to WB and WT). The predictions of DN would differ, however, because adding a low-value alternative to the normalization would not change it much. Would the behaviour of subjects be symmetric for equivalent ranges, as RN predicts? If so, this would be a compelling result, because symmetry is a very strong theoretical assumption in this setting.
  
  We thank the Reviewer for the overall positive appraisal of our work, and for the stimulating and constructive remarks that we have addressed below. At this stage, we just wanted to mention that we agree with the Reviewer that a design in which we add a "68" option to "50" and "86" would also represent an important test of our hypotheses. This is why we had, in fact, run this experiment. Unfortunately, its results were somewhat buried in the Supplementary Materials of our original submission and not correctly highlighted in the main text. We modified the manuscript in order to make them more visible:
  
  Behavioral results in three experiments (N=50 each) featuring a slightly different design, where we added a mid value option (NT68) between NT50 and NT87 converge to the same broad conclusion: the behavioral pattern in the transfer phase is largely incompatible with that predicted by outcome divisive normalization during the learning phase (Figure 2-figure supplement 2).
  
  Reviewer #3 (Public Review):
  
  Bavard & Palminteri extend their research program by devising a task that enables them to disassociate two types of normalisation: range normalisation (by which outcomes are normalised by the min and max of the options) and divisive normalisation (in which outcomes are normalised by the average of the options in one's context). By providing 4 different training contexts in which the range of outcomes and number of options vary, they successfully show using 'ex ante' simulations that different learning approaches during training (unbiased, divisive, range) should lead to different patterns of choice in a subsequent probe phase during which all options from the training are paired with one another, generating novel choice pairings. These patterns are somewhat subtle but are elegantly unpacked. They then fit participants' training choices to different learning models and test how well these models predict probe phase choices. They find evidence - both in terms of quantitative (i.e. comparing out-of-sample log-likelihood scores) and qualitative (comparing the pattern of choices observed to the pattern that would be observed under each model) fit - for the range model. This fit is further improved by adding a power parameter, which suggests that alongside being relativised via range normalisation, outcomes were also transformed non-linearly.
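
  The two normalisation rules described above can be sketched as follows. This is a minimal illustration that follows the wording of this review (division by the context average); the authors' exact definitions may differ, for example by dividing by the sum of the outcomes. The function names, the `omega` argument and the option values are assumptions made for the example.

```python
def range_normalize(outcome, context, omega=1.0):
    """Rescale an outcome by the min and max of its context.

    omega is a power parameter; omega=1.0 gives plain range normalisation.
    """
    lo, hi = min(context), max(context)
    return ((outcome - lo) / (hi - lo)) ** omega

def divisive_normalize(outcome, context):
    """Divide an outcome by the average outcome of its context."""
    return outcome / (sum(context) / len(context))

# Hypothetical contexts: a narrow and a wide range of outcomes.
narrow, wide = [10, 50], [10, 90]

# Under range normalisation the best option of each context gets the same value.
best_narrow_range = range_normalize(50, narrow)
best_wide_range = range_normalize(90, wide)

# Under divisive normalisation the two best options keep different values.
best_narrow_div = divisive_normalize(50, narrow)
best_wide_div = divisive_normalize(90, wide)
```

  The contrast between the two pairs of values is what allows a probe phase that pairs options across contexts to separate the accounts.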
  
  I thought this approach to address their research question was really successful and the methods and results were strong, credible, and robust (owing to the number of experiments conducted, the design used and combination of approaches used). I do not think the paper has any major weaknesses. The paper is very clear and well-written which aids interpretability.
  
  This is an important topic for understanding, predicting, and improving behaviour in a range of domains potentially. The findings will be of interest to researchers in interdisciplinary fields such as neuroeconomics and behavioural economics as well as reinforcement learning and cognitive psychology.
  
  We thank Prof. Garrett for his positive evaluation and supportive attitude.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.07.14.500032v2
www.biorxiv.org www.biorxiv.org

Synthetic deconstruction of hunchback regulation by Bicoid

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In this paper, Fernandes et al. take advantage of synthetic constructs to test how Bicoid (Bcd) activates its downstream target Hunchback (Hb). They explore synthetic constructs containing only Bcd, Bcd and Hb, and Bcd and Zelda binding sites. They use these to develop theoretical models for how Bcd drives Hb in the early embryo. They show that Hb sites alone are insufficient to drive further Hb expression.
  
  The paper's first half focuses on how well the synthetic constructs replicate the in vivo expression of hb. This approach is generally convincing, and the results are interesting. Consistent with previous work, they show that Bcd alone is sufficient to drive an expression profile that is similar to wild‐type, but the addition of Hb and Zelda is needed to generate precise and rapid formation of the boundaries. The experimental results are supported by modelling. The model does a nice job of encapsulating the key conclusions and clearly adds value to the analysis.
  
  In the second part of the paper, the authors use their synthetic approach to look at how the Hb boundary alters depending on Bcd dosage. This part asks whether the observed Bcd gradient is the same as the activity gradient of Bcd (i.e. the "active" part of Bcd is not a priori the same as the protein gradient). This is a very interesting problem and it is good that the authors have tried to tackle it. However, the strength of their conclusions needs to be substantially tempered as they rely on an overestimation of the Bcd gradient decay length.
  
  Comments:
  
  ‐ My major concern regards the conclusions for the final section on the activity gradient. In the Introduction it is stated: "[the Bcd gradient has] an exponential AP gradient with a decay length of L ~ 20% egg‐length (EL)". While this was the initial estimate (Houchmandzadeh et al., Nature 2002), later measurements by the Gregor lab (see Supplementary Material of Liu et al., PNAS 2013) found that "The mean length constant was reduced to 16.5 ± 0.7%EL after corrections for EGFP maturation". The original measurements by Houchmandzadeh et al. had issues with background control, that also led to the longer measured decay length. In later work, Durrieu et al., Mol Sys Biol 2018, found a similar scale for the decay length to Liu et al. Looking at Figure 5, a value of 16.5%EL for the decay length is fully consistent with the activity and protein gradients for Bcd being similar. In short, the strength of the conclusions clearly does not match the known gradient and should be substantially toned down.
  
  The reviewer is right: several studies aiming to quantitatively measure the Bicoid protein gradient ended up with quite different decay lengths.
  
  A summary of the various decay lengths measured, and the method used for these measurements is given below:
  
  As indicated, these measurements are quite variable among the different studies and the differences can potentially be attributed to different methods of detection (antibody staining on fixed samples vs fluorescent measurements on live sample) or to the type of protein detected (endogenous Bicoid vs fluorescently tagged).
  
  We agree with the reviewer that given these discrepancies, the exact value of the Bcd protein gradient decay length is not known and that we only have measurements that put it in between 16 and 25 % EL (see the Table above). Therefore, we agree that we should tone down the difference between the protein vs activity gradient and focus on the measurements of the effective activity gradient decay length allowed by our synthetic reporters. This allows us to revisit the measurement of the Hill coefficient of the transcription step‐like response, which is based on the decay‐length for the Bcd protein gradient, and assumed in previous published work to be of 20% EL (Gregor et al., Cell, 2007a; Estrada et al., 2016; Tran et al., PLoS CB, 2018). Importantly, the new Hill coefficient allows us to set the Bcd system within the limits of an equilibrium model.
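
  To illustrate why the inferred Hill coefficient depends on the assumed decay length, here is a minimal sketch. It assumes the textbook case of a Hill function reading an exponential gradient, for which the slope of the response at the boundary equals -H / (4 λ); whether the analysis in the manuscript uses exactly this relation is an assumption. The function names and the slope value are hypothetical.

```python
import math

def hill_response(x, hill, decay_length, boundary):
    """Step-like readout of an exponential gradient c(x) ~ exp(-x / decay_length).

    With the threshold concentration K reached at `boundary`, the Hill function
    c**H / (c**H + K**H) reduces to a logistic curve in position x.
    """
    return 1.0 / (1.0 + math.exp(hill * (x - boundary) / decay_length))

def hill_from_slope(boundary_slope, decay_length):
    """Hill coefficient implied by the pattern slope at the boundary.

    The derivative of hill_response at x == boundary is
    -hill / (4 * decay_length), so a given measured slope implies a Hill
    coefficient proportional to the assumed decay length.
    """
    return 4.0 * decay_length * abs(boundary_slope)

slope = -0.0875  # hypothetical boundary slope per %EL, not a measured value
hill_20 = hill_from_slope(slope, 20.0)    # decay length of 20 %EL
hill_16_5 = hill_from_slope(slope, 16.5)  # decay length of 16.5 %EL
```

  For the same measured pattern, a shorter assumed decay length therefore yields a proportionally smaller Hill coefficient.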
  
  As mentioned by the reviewer, it is possible that the decay length of the protein gradient measured using antibody staining (Houchmandzadeh et al., Nature, 2002) was not correct due to background controls. Such measurements were also performed in Xu et al. (2015), and they agree with the original measurements (Houchmandzadeh et al., Nature 2002). As indicated in the table above, all the other measurements of the Bcd protein gradient decay length were done using fluorescently tagged Bcd proteins, and we cannot exclude the possibility that the wt and tagged proteins might have different decay lengths due to potentially different diffusion coefficients or half‐lives. Before drawing any conclusion on the exact value of the endogenous Bcd protein gradient decay length, it is essential to measure it again in conditions that correct for the background issues of immuno‐staining, as was done in Liu et al., PNAS, 2013 for the Bcd‐eGFP protein. In that study, the authors only measured the decay length of the Bcd fusion protein using immuno‐staining for the Bcd protein; unfortunately, they did not measure again the decay length of the endogenous Bcd protein gradient using immuno‐staining and the same procedure for background control. Therefore, they do not firmly exclude the possibility that the endogenous and tagged Bcd proteins might have different decay lengths.
  
  We thank the reviewer for his comment, which helped us to clarify the message. In addition, as there is clearly an issue with the measurements of the Bcd protein gradient, we added a section in the SI (Section E) and a Table (Table S4) describing the various decay lengths measured for the Bcd or the Bcd‐fluorescently tagged protein gradients in previous studies. In the discussion, together with the possibility that there might be a protein vs activity gradient (as we originally proposed and believe is still a valid possibility), we also discuss the alternative possibility proposed by the reviewer, which is that the protein and activity gradients have the same decay length but that the decay length of the Bcd protein gradient was potentially not correctly evaluated.
  
  ‐ All of the experiments are performed in a background with the hb gene present. Does this impact on the readout, as the synthetic lines are essentially competing with the wild‐type genes? What controls were done to account for this?
  
  We agree with the reviewer that this concern might be particularly relevant at the hb boundary where a nucleus has been shown to only contain ~ 700 Bicoid molecules (Gregor et al., Cell, 2007b). However, ~1000 Bicoid binding regions have been identified by ChIP-seq experiments in nc14 embryos (Hannon et al., eLife, 2017) and given that several Bcd binding sites are generally clustered together in a Bcd region, the number of Bcd binding sites in the fly genome is likely larger than 1000. It is much greater than the number of Bicoid binding sites in our synthetic reporters. Therefore, we think that it is unlikely that adding the synthetic reporters (which in the case of B12 only represents at most 1/100 of the Bcd binding sites in the genome) will severely alter the competition for Bcd binding between the other Bcd binding sites in the genome. Additionally, the insertion of a BAC spanning the endogenous hb locus with all its Bcd‐dependent enhancers did not affect (as far as we can tell) the regulation of the wildtype gene (Lucas, Tran et al., 2018).
  
  We have added a sentence concerning this point in the main text (lines 108 to 111).
  
  ‐ Further, the activity of the synthetic reporters depends on the location of insertion. Erceg et al. PLoS Genetics 2014 showed that the same synthetic enhancer can have different readout depending on its genomic location. I'm aware that the authors use a landing site that appears to replicate similar hb kinetics, but did they try random insertion or other landing site? In short, how robust are their results to the specific local genome site? This should have been tested, especially given the boldly written conclusions from the work.
  
  This concern of the reviewer has been tested and is addressed in Fig. S1, where we compare two random insertions of the hb‐P2 transgene (on chromosomes II and III; Lucas, Tran et al., 2018) and the insertion at the VK33 landing site that was used for the whole study. As shown in Fig. S1, the dynamics of transcription (kymographs) are very similar. In the main text, the reference to Fig. S1 is found in the Materials and Methods section (bottom of the 1st paragraph concerning the Drosophila stocks, line 518).
  
  ‐ Related to the above, it's also not obvious that readout is linear ‐ i.e. as more binding sites are added, there could be cooperativity between binding domains. This may have been accounted for in the model but it is not clear to me how.
  
  The reviewer is totally correct. It is clear from our data that readout is not linear: comparing B6 with B9 (an increase of 1.5 X in the number of BS) leads to a 4.5 X greater activation rate, which argues against independent activation of transcription by individual bound Bcd TFs. There is almost no impact of adding 3 more sites when comparing B9 to B12 (even though it corresponds to an increase of 1.33 X in the number of BS). This issue has been rephrased in the main text (lines 200 to 203) and further developed for the modeling aspects in the SI section C and Figure S3. It is also discussed in the second paragraph of the discussion (lines 380 to 383).
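
  One way to put a number on this non-linearity is to ask which exponent k would turn the fold change in binding sites into the fold change in activation rate. The sketch below is only a descriptive summary under an assumed power-law relation, not the model used in the manuscript; the function name is hypothetical. An exponent of 1 would correspond to each site contributing independently.

```python
import math

def effective_exponent(site_ratio, rate_ratio):
    """Exponent k such that rate_ratio == site_ratio ** k."""
    return math.log(rate_ratio) / math.log(site_ratio)

# B6 -> B9: 1.5 X more binding sites, 4.5 X greater activation rate.
k_b6_to_b9 = effective_exponent(9 / 6, 4.5)
```

  An exponent well above 1 for B6 to B9 is in line with the synergy argued for in the response.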
  
  ‐ It would be good in the Introduction/Discussion to give a broader perspective on the advantages and disadvantages of the synthetic approach to study gene regulation. The intro only discusses Tran et al. Yet, there is a strong history of using this approach, which has also helped to reveal some of the approach's shortcomings. E.g. Gertz et al. Nature 2009 and Sharon et al. Nature Biotechnology 2012. Again, I may have missed it, but from my reading I cannot see any critical analysis of the pros/cons of the synthetic approach in development. This is necessary to give readers a clearer context.
  
  One sentence was added in the introduction concerning this point (lines 79 to 82).
  
  A short review concerning the synthetic approach in development has also been added at the beginning of the discussion (lines 347 to 359).
  
  Reviewer #2 (Public Review):
  
  It is known that Bicoid increases in concentration across the syncytial division cycles, the gradient length scale for Bicoid does not change, and hunchback also increases in concentration during the syncytial cycles but the sharp boundary of the hunchback gradient is constantly seen despite the change in concentration of Bicoid. This manuscript shows that by increasing the Bicoid concentration or by adding Zelda binding sites, the expression of hunchback can be recapitulated to that of a previously studied promoter for hunchback.
  
  I have the following comments to understand the implications of the study in the context of increasing concentrations of Bicoid during the syncytial division cycles:
  
  ‐ Bicoid itself is also increasing over the syncytial division cycles, how does this change in concentration of Bicoid affect the activation of the hunchback promoter given the cooperative binding of Bicoid and Bicoid and Zelda as documented by the study?
  
  We thank the reviewer for this remark about the dynamics of the Bcd gradient, which we may have taken for granted. A seminal work on the dynamics of the Bcd gradient using fluorescent‐tagged Bcd (Gregor et al, Cell, 2007a) has shown that the gradient of Bcd nuclear concentration (this nuclear concentration is the one that matters for transcription) remains stable over nuclear cycles, despite a global increase of Bcd amount in the embryo. This can be explained by the fact that Bcd molecules are imported into the nuclei and that the number of nuclei doubles at every cycle, such that both processes compensate each other. Thus, we assumed that the gradient of Bcd nuclear concentration was stable over nc11 to nc13.
  
  We have clarified this assumption in the model section in the manuscript (lines 165‐168).
  
  Supporting our assumption, when looking at the transcription dynamics regulated by Bcd, in Lucas et al, PLoS Gen, 2018, we observed very reproducible expression pattern dynamics of the hb‐P2 reporter at each cycle nc11 to nc13. Such reproducibility in the pattern dynamics was also observed in this current work for hb‐P2, B6, B9, B12 and H6B6 reporters (Fig. S6A). Also, in Lucas et al, PLoS Gen, 2018, the shift in the established boundary positions of hb‐P2 reporter between nc11 to nc13 is ~2%EL (approximately a nucleus length ~10μm) and it is thus marginal.
  
  In addition, as mentioned in the text (lines 105 to 107), we only focused our analysis on nc13 data which are statistically stronger given the higher number of nuclei analyzed. Thus, any change of Bcd nuclear concentration that would happen over nuclear cycles will not matter.
  
  Concerning Zelda: Zelda’s transcriptional activity when measured on a reporter with only 6 Zld binding sites changes drastically over the nuclear cycles, with strong activity at nc11 and much weaker activity at nc13 (Fig S4A). This indicates that the changes in expression pattern dynamics of Z2B6 from nc11 to nc13 are caused predominantly by decreasing Zelda activity: the effect of Zld on the Z2B6 promoter is very strong during nc11 and nc12. It is also very strong at the beginning of nc13 (even though the Z6 reporter is almost silent) and becomes a bit weaker in the second part of nc13 (Fig S4B‐D).
  
  ‐ Does the change in concentration of Bicoid across the nuclear cycles shift the gradient similar to the change in numbers of Bicoid binding sites?
  
  In both Lucas et al, PLoS Gen, 2018 and in this work (Fig. 1, Fig. 3 and Fig. S6A), we found that the positions of the expression boundary are very reproducible and stable in time for hb‐P2, B6, B9, B12, H6B6 during the interphase of nc12 to 13. For hb‐P2, the averaged shift of the established boundary position in nc11, 12 and 13 is within 2 %EL. This averaged shift between the cycles is of similar magnitude to the difference caused by embryo‐to‐embryo variability within nc13 (~2 %EL) (Gregor et al, Cell, 2007b, Lucas et al, PLoS Gen, 2018). This shift is much smaller than the difference between the expression boundary positions of B6 and B9 (~ 8 % EL) and between B6 and Z2B6 (~17.5 %EL) in nc13.
  
  For these reasons, we conclude that the difference between the expression patterns of B6, B9 and Z2B6 are caused predominantly by changing the TF binding site configurations of the reporters, rather than variability in the Bcd gradient.
  
  The assumption of gradient stability has been clarified in the previous answer and in the manuscript (lines 165‐168).
  
  ‐ The intensity is a little higher for B9 and B12 at the anterior in 2B? Is this statistically different? Is this likely to change the amount of Bicoid expression at the locus and lead to more robust activation?
  
  We performed statistical tests to distinguish the spot intensities at the anterior pole for every pair of reporters in Fig. 2B (hb‐P2, B6, B9 and B12). All p‐values from pair‐wise KS tests are greater than 0.067, suggesting that the spot intensities at the anterior pole are not distinguishable between these reporters.
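
  For readers unfamiliar with the test, the following is a minimal sketch of a pair-wise two-sample Kolmogorov-Smirnov comparison. It computes only the KS statistic (the largest gap between two empirical cumulative distributions), not the p-values reported above; a statistics library would be needed for those. The function names and the intensity values are hypothetical.

```python
from itertools import combinations

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare the CDFs.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def pairwise_ks(groups):
    """KS statistic for every pair of named samples."""
    return {
        (m, n): ks_statistic(groups[m], groups[n])
        for m, n in combinations(groups, 2)
    }

# Made-up anterior spot intensities, for illustration only.
intensities = {
    "hb-P2": [1.0, 1.2, 0.9, 1.1],
    "B6":    [0.9, 1.1, 1.0, 1.2],
    "B9":    [1.1, 1.3, 1.0, 1.2],
    "B12":   [1.0, 1.3, 1.1, 1.2],
}
stats = pairwise_ks(intensities)
```

  With four reporters this yields six pair-wise comparisons, so a correction for multiple testing may be worth considering when interpreting the p-values.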
  
  We have clarified this in the manuscript (line 157).
  
  ‐Is the fraction of active loci not changing across the syncytial cycles when the concentration of Bicoid also changes, and is this consistent with the synthetic promoters?
  
  To measure the reproducibility of the expression pattern dynamics in different nuclear cycles, we compared the boundary position of the fraction of active loci pattern as a function of time for all hbP2 and synthetic reporters (Fig. S6A). In this figure panel, for all reporters except Z2B6, the curves in nc12 and nc13 largely overlap, suggesting high reproducibility in the pattern dynamics between cycles and consequently low sensitivity to the subtle variation in the Bcd nuclear concentration gradient between the cycles.
  
  For Z2B6, we attributed the difference in pattern dynamics between nc12 and nc13 to the changes in Zelda activity, as validated independently with a synthetic reporter with only 6 Zld binding sites (Fig. S4A).
  
  ‐How do the numbers of Hb BS change the expression of Hb? H6B6 has 6 Hb BS whereas the Hb‐P2 has 1? Are more controls needed to compare these 2 contexts?
  
  As our goal was to determine to which mechanistic step of our model each TF (Bcd, Hb, Zld) contributed, we added BS numbers that are much higher than in the hb‐P2 promoter. The added number of Hb BS remains very low when compared to the total number of Hb binding sites in the entire genome (Kaplan et al, PLOS Gen, 2011); therefore, it is very unlikely to affect the endogenous expression of Hb protein.
  
  We clarified this in the manuscript (lines 211 to 212).
  
  Does Zelda concentration change across the syncytial division cycles? How does the change in concentration in the natural context affect the promoter activation of Hb?
  
  Zelda concentration is stable over the nuclear cycles, as observed with the fluorescently‐tagged Zld protein (Dufourt et al., Nat Com, 2018). However, Zelda’s transcriptional activity when measured on a reporter with only 6 Zld binding sites changes drastically over the nuclear cycles, with strong activity at nc11 and much weaker activity at nc13 (Fig S4A, this work).
  
  The impact of this change in Zld activity can be observed with the Z2B6 promoter, with the expression boundary moving from the posterior region toward the anterior region over the nuclear cycles (Fig. S4B‐D). However, we don’t detect any changes in the expression pattern dynamics of hb‐P2 over the nuclear cycles (Fig. S6A and in Lucas et al., PLoS Gen, 2018).
  
  We have clarified this in lines 250‐251 of the main manuscript.
  
  ‐Changing the dose of Bicoid shifts the boundary of hunchback expression. It would be nice to model or test this in the context of varying doses of Zelda or even reason this with respect to varying doses of Zelda across the syncytial division cycles.
  
  We thank the reviewer for this insight. Concerning Zelda, we did not perform any experiment reducing the amount of Zelda in the embryo. However, in a previous study (Lucas et al., PLoS Genetics, 2018), we observed that the boundary of hb was shifted towards the anterior when decreasing the amount of Zelda, consistent with the fact that the dose of Zelda is critical to set the boundary position and the threshold of Bcd concentration required for activation. However, as Zelda is distributed homogeneously along the AP axis, it cannot bring per se positional information to the system.
  
  Reviewer #3 (Public Review):
  
  I think the framing could be improved to better reflect the contribution of the work. From the abstract, for example, it's unclear to me what the authors think is the most meaningful conclusion. Is it the observations about the finer details of TF regulation (bursting dynamics), the fact that Bcd is probably the sole source of "positional information" for hb‐p2, that Bcd exists in active/inactive form, or the fact that an equilibrium model probably suffices to explain what we observe? The first sentence itself seems to suggest this paper will discuss "dynamic positional information", in which case it's somewhat misleading to say this kind of work is "largely unexplored"; Johannes Jaeger in particular has been a strong proponent of this view since at least 2004. On that note some particularly relevant recent papers in the Drosophila early embryo include:
  
  1) Jaeger and Verd (2020) Curr Topics Dev Biol
  
  2) Verd et al. (2017) PLoS Comp Biol
  
  3) Huang, Amourda, et al. and Saunders (2017) eLife
  
  4) Yang, Zhu, et al. (2020) eLife [see also the second half of Perkins (2021) PLoS Comp Biol for further discussion of that model]
  
  ‐Some reviews from James Briscoe also discuss this perspective.
  
  We agree with the reviewer that the phrasing of the abstract was not clear enough to emphasize the contribution of the work and we are also sorry if it suggested that the dynamic positional information is largely unexplored because this was not at all our intention.
  
  We rephrased the abstract aiming to better highlight the most meaningful conclusions.
  
  ‐I would also recommend modifying the title to reflect the biology found in the new results.
  
  We modified the title to better reflect the new results: “Synthetic reconstruction of the hunchback promoter specifies the role of Bicoid, Zelda and Hunchback in the dynamics of its transcription”
  
  ‐A major point that the authors should address is the design of the synthetic constructs. From table S1, the sites are often very closely linked (4‐7 base pairs). From the footprint of these proteins, we know they can cover DNA across this size (see, https://pubmed.ncbi.nlm.nih.gov/8620846/). As such, there may be direct competition/steric hindrance (see https://pubmed.ncbi.nlm.nih.gov/28052257/). What impact does this have on their interpretations? Note also that the native enhancer has spaced sites with variable identities.
  
  We completely agree with the reviewer's comment in the sense that we named our reporters according to the number (N) of Bcd binding site sequences that they contain, even though we cannot prove definitively that they can effectively be bound simultaneously by N Bcd molecules. It is thus possible that B9 is not a B9 but an effective B6 (i.e. B9 can only be bound simultaneously by 6 molecules) if, for instance, the binding of a Bcd molecule to one site would prevent the binding of another Bcd molecule to a nearby site (as proposed by the reviewer in the case of direct competition or steric hindrance).
  
  Even though we cannot exclude this possibility, we think that our use of B6, B9, B12, in reference to the 6 Bcd BS of hb‐P2 promoter, is relevant for several reasons: i) some of the Bcd BS in the hb‐P2 promoter are also very close to each other (see Table S1); ii) the design of the synthetic construct was made by multimerizing a series of 3 strong Bcd binding sites with a similar spacing as found for the closest sites in the hb‐P2 promoter (as shown in Figure 1A and Table S1); iii) the binding of the Bicoid protein has been shown in footprinting experiments in vitro to be more efficient on sites of the hb‐P2 promoter that are close to each other, and this has even been interpreted as binding cooperativity (Ma et al., 1996); iv) even though these experiments were not performed with full‐length proteins, two molecules of the paired homeodomain (from the same family of DNA binding domain as Bcd) are able to simultaneously bind to two binding sites separated by only 2 base pairs. This binding to very close sites is even cooperative, whereas when the two sites are 5 base pairs or more apart, the simultaneous binding to the two sites occurs without cooperativity (Wilson et al., 1993).
  
  Conversely, as it is very difficult to demonstrate that 9 Bcd molecules can effectively bind to our B9 promoter, it is very difficult to know exactly how many binding sites for Bcd the hb‐P2 contains, and a large debate concerning not only the number but also the identity of the Bcd sites in the hb promoter is still ongoing (Park et al., 2019; Ling et al., 2019).
  
  As we cannot exclude the possibility that B9 is an effective B6, it remains possible that B9 and hb‐P2 (which is supposed to contain only 6 sites) have the same number of effective Bcd binding sites, and this could explain why the two reporters have very similar transcription dynamics and features.
  
  Regarding other interpretations in the manuscript, we identified two other aspects that will be affected if our synthetic reporters have fewer effective sites than the number of sites they carry. The first one concerns the synergy, as the increase in the number of sites of 1.5 from B6 to B9 might be over‐estimated but this would even increase the synergistic effect given the 4.5 difference in activity of the two reporters (Fig. S3). The second one concerns the discussion on the Hill coefficient and the decay length where the effective number of binding sites (N) is required to determine the limit of concentration sensing (Fig. 5). This would particularly be important for the hb‐P2 promoter.
  
  Except for these specific points, we don’t think that the possibility that the reporters do not contain exactly as many effective binding sites as proposed has a large impact on our interpretations and the general message conveyed in this manuscript. Most importantly, it is very clear that our B6 and B9 reporters differ only by three Bcd binding sites and yet have very distinct expression dynamics: while B9 recapitulates almost all transcription features of hb‐P2, B6 is far from achieving this. Similarly, H6B6 and Z2B6 have very different transcription features from B6, and these differences have been key for understanding the mechanistic functions of the three TFs we studied.
  
  These points have been added to the Discussion (lines 400 to 414).
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.09.06.459125v3
www.biorxiv.org www.biorxiv.org

Dopamine enhances model-free credit assignment through boosting of retrospective model-based inference

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #3 (Public Review):
  
  This paper reports that levodopa administration to healthy volunteers enhances the guidance of model-free credit assignment (MFCA) by model-based (MB) inference without altering MF and MB learning per se. The issue addressed is fascinating, timely and clinically relevant; the experimental design and analysis strategy (reported previously) are complex but sophisticated and clever; and the results are tantalizing. They suggest that ldopa boosts model-based instruction about what (unobserved or inferred) state the model-free system might learn about. As such, the paper substantiates the hypothesis that dopamine plays a role specifically in the interaction between distinct model-based and model-free systems. This is really a very valuable contribution, one that my lab, and I expect many other labs, had already picked up immediately after it appeared as a preprint.
  
  Major strengths include the combination of pharmacology with a substantial sample size, clever theory-driven experimental design and application of advanced computational modeling. The key effect of ldopa on retroactive MF inference is not large, but substantiated by both model-agnostic and model-informed analyses and therefore the primary conclusion is supported by the results.
  
  The paper raises the following questions.
  
  What putative neural mechanism led the authors to predict this selective modulation of the interaction? The introduction states that "Given DA's contribution to both MF and MB systems, we set out to examine whether this aspect of MB-MF cooperation is subject to DA influence." This is vague. For the hypothesis to be plausible, it would need to be grounded in some idea about how the effect would be implemented. Where exactly does dopamine act to elicit an effect on retroactive MB inference, but not MB learning per se? If the mechanism is a modulation of working memory and/or replay itself, then shouldn't that lead to boosting of both MB learning as well as MB influences on MF learning? Addressing this involves specification of the mechanistic basis of the hypothesis in the introduction, but the question also pertains to the discussion section. Hippocampal replay is invoked, but can the authors clarify why a prefrontal working memory (retrieval) mechanism invoked in the preceding paragraph would not suffice? In any case, it seems that an effect of dopamine on replay would also alter MB choice/planning?
  
  In sum, we agree with this criticism and have now revised the relevant intro paragraph (p. 3/4).
  
  We now discuss DAergic manipulation of replay in particular (p. 24). We infer that a component of a MB influence over choice comes from the way it trains a putative MF system (something explicitly modelled in Mattar & Daw, 2018, and a new preprint from Antonov et al., 2021, referencing data from Eldar et al., 2020) – and consider what happens if this is boosted by DA manipulations. The difference between the standard two-step task and the present task is that in our task there is extra work for the MB system in order to perform inference so as to resolve uncertainty for MFCA. We later suggest that the anticorrelation we found between the effect of DA on MB influence over choice and MB guidance of MFCA arises from this extra work.
  
  The broader questions raised about (prefrontal) working memory and (hippocampal) replay pertain to recent and ongoing work, and we feel this should be part of the discussion, which we have rewritten to detail more clearly the different possible mechanistic explanations, pointing to how they might be tested in the future (p. 23/24).
  
  A second issue is that the critical drug effects seem somewhat marginally significant and the key plots (e.g. Fig 3b and Fig 4b,c, but also other plots) do not visualize relevant variability in the drug effect. I would recommend plotting differences between LDopa and placebo, allowing readers to appreciate the relevant individual variability in the drug effects.
  
  We have now replotted the data in the new Figures 4 and 5 to reflect drug-related variability.
  
  Third, I do wonder how to reconcile the lack of a drug x common reward effect (the lack of a dopamine effect on MF learning) as well as the lack of a drug effect on choice generalization with the long literature on dopamine and MF reinforcement and newer literature on dopamine effects on MB learning and inference. The authors mention this in the discussion, but do not provide an account. Can they elaborate on what makes these pure MB and MF metrics here less sensitive than in various other studies, and/or what are the implications of the lack of these effects for our understanding of dopamine's contributions to learning?
  
  Regarding a lack of a drug effect on MF learning or control, we now elaborate on this on p. 22/23:
  
  “With respect to our current task, and an established two-step task designed to dissociate MF and MB influences (Daw et al., 2011), there is as yet no compelling evidence for a direct impact of DA on MF learning or control (Deserno et al., 2015a; Kroemer et al., 2019; Sharp et al., 2016; Wunderlich et al., 2012). A commonality of our novel task and the two-step task is dynamically changing reward contingencies. As MF learning is by definition incremental, slowly accumulating reward value over extended time-periods, it follows that dynamic reward schedules may lessen the sensitivity to detect changes in MF processes (see Doll et al., 2016 for discussion). In line with this, experiments in humans indicate that value-based choices performed without feedback-based learning (for reviews, see Maia & Frank, 2011; Collins and Frank, 2014), as well as learning in stable environments (Pessiglione et al., 2006), are susceptible to DA drug influences (or genetic proxies thereof) as expected under an MF RL account. Thus, the impact of DA boosting agents may vary as a function of contextual task demands. This resonates with features of our pharmacological manipulation using levodopa, which impacts primarily on presynaptic synthesis. Thus, instead of necessarily directly altering phasic DA release, levodopa impacts on baseline storage (Kumakura and Cumming, 2009), likely reflected in overall DA tone. DA tone is proposed to encode average environmental reward rate (Mohebi et al., 2019; Niv et al., 2007), a putative environmental summary statistic that might in turn impact an arbitration between behavioural control strategies according to environmental demands (Cools, 2019).”
  
  As pointed out by the reviewer as well, in the present task we did not find an effect of levodopa on MB influences per se and now discuss this on p. 22:
  
  “In this context, a primary drug effect on prefrontal DA might result in a boosting of purely MB influences. However, we found no such influence at a group level – unlike that seen previously in tasks that used only a single measure of MB influences (Sharpe et al., 2017; Wunderlich et al., 2012). Our novel task systematically separates two MB processes: a guidance of MFCA by MB inference and pure MB control. While we found that only one of these, namely guidance of MFCA by MB inference, was sensitive to enhancement of DA levels at a group level, we did detect a negative correlation between the DA drug effects on MB guidance of MFCA and on pure MBCA. One explanation is that a DA-dependent enhancement in pure MB influences was masked by this boosting in the guidance of MFCA by MB inference. In this regard, our data is suggestive of between-subject heterogeneity in the effects of boosting DA on distinct aspects of MB influences.”
  
  Another open question remains as to why different task conditions (guidance of MFCA by MB vs. pure MB control) apparently differ in their sensitivity to the drug manipulation. We discuss this (p. 22) by proposing that a cost-benefit trade-off might play an important role (Westbrook et al., 2020).
  
  Fourth, the correlation with WM and drug effect on preferential MBCA for non-informative but not informative destination is really quite small, and while I understand that WM should be associated with preferential MBCA under placebo, it does not become clear what makes the authors predict specifically that WM predicts a dopa effect on this metric, rather than the metric taken under placebo, for example.
  
  Our initial reasoning was that MFCA based on reward at the non-informative destination should be particularly sensitive to WM, on the basis that the reward is no longer perceptually available once state uncertainty can be resolved by the MB system. However, we agree with the reviewer that this reasoning does not indicate why it should specifically affect the drug-induced change. In light of this critique, we have removed this part from the abstract, introduction and the main results but still report this relation to WM in Appendix 1 (p. 44/45, subheading “Drug effect on guidance of MFCA and working memory”, Appendix 1 - Figure 11) as an exploratory analysis as suggested in the editor’s summary.
  
  A fifth issue is that I am not quite convinced about the negative link between dopamine's effects on MBCA and on PMFCA. The rationale for including WM, informativeness as well as DA effects on MBCA in the model of DA effects on PMFCA wasn't clear to me. The reported correlation is statistically quite marginal, and given that it was probably not the first one tested and given the multiple factors involved, I am somewhat concerned about the degree to which this reflects overfitting. I also find the pattern of effects rather difficult to make sense of: in high WM individuals, the drug-effects on PMFCA and MBCA are negatively related for informative and non-informative destinations. In low WM individuals, the drug-effects on PMFCA and MBCA are negatively related for informative, but not non-informative destinations. It is unclear to me how this pattern leads to the conclusion that there is a tradeoff between PMFCA and MBCA. And even if so, why would this be the case? It would be relevant to report the simple effects, that is the pattern of correlations under placebo separately from those under ldopa.
  
  The reviewer’s critique is well taken. In connection with the working memory finding reported in the previous section of the initial manuscript, we reasoned that it would be necessary to include WM in the model as well. We still consider this analysis of inter-individual differences in drug effects across task conditions important because it connects our current work to previous work linking DA to MB control. However, we now perform a simplified analysis in which we leave out WM and instead average PMFCA across informative and non-informative destinations (since we had no prior hypothesis that these conditions should differ, p. 19/20). This results in a significant negative correlation between the drug-related changes in average PMFCA and MB control (Figure 6A, r=-.31, p=.02; Pearson r=-.30, p=.017; Spearman r=-.33, p=.009). In addition, we ran extended simulations to verify that this negative correlation does not result from correlations among model parameters (see Appendix 1 - Figure 10 for a control analysis verifying that this negative correlation survives control for parameter trade-off).
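
  The correlation between drug-related changes described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function names and the five difference scores (levodopa minus placebo) are hypothetical, and p-values are deliberately left out.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical per-participant drug effects (levodopa minus placebo).
delta_apmfca = [0.20, -0.05, 0.15, -0.10, 0.30]
delta_mbca = [-0.10, 0.20, -0.05, 0.30, -0.30]

print("Pearson r:", pearson_r(delta_apmfca, delta_mbca))
print("Spearman r:", spearman_r(delta_apmfca, delta_mbca))
```

  Reporting both coefficients, as the response does, guards against a Pearson correlation that is driven by a few extreme participants.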
  
  Figure 6. Inter-individual differences in drug effects in MBCA and in preferential MFCA, averaged across informative and non-informative destinations (aPMFCA). A) Scatter plot of the drug effects (levodopa minus placebo; ∆ aPMFCA, ∆ MBCA). Dashed regression line and Pearson correlation coefficient r. B) Drug effects in credit assignment (∆ CA) based on a median split on ∆ MBCA. Error bars correspond to SEM reflecting variability between participants.
  
  As suggested by the reviewer, we unpack this correlation further (p. 19/20) by taking the median of Δ MBCA (-0.019) and splitting the sample into lower- and higher-median groups. The higher-median group showed a positive (M= 0.197, t(30)= 4.934, p<.001) and the lower-median group a negative (M= -0.267, t(30)= -7.97, p<.001) drug effect on MBCA (Figure 6B). In a mixed effects model (see Methods), we regressed aPMFCA against drug and a group indicator of lower/higher median Δ MBCA groups. This revealed a significant drug x Δ MBCA-split interaction (b=-0.17, t(120)=-2.05, p=0.042). In the negative Δ MBCA group (Figure 6B), a significantly positive drug effect on aPMFCA was detected (simple effect: b=.18, F(120,1)=10.35, p=.002), while in the positive Δ MBCA group the drug-dependent change in aPMFCA was not significant (Figure 6B, simple effect: b=.02, F(120,1)=0.10, p=.749).
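
  The median split described above can be illustrated with a short sketch. It only forms the two groups and compares mean drug effects; it does not reproduce the mixed effects model. The function names, the six hypothetical difference scores, and the rule that a value equal to the median goes to the lower group are all assumptions.

```python
from statistics import mean, median


def median_split(delta_mbca):
    """Label each participant 'higher' or 'lower' relative to the sample median.

    Assumption: a value exactly equal to the median is assigned to 'lower'.
    """
    cut = median(delta_mbca)
    return ["higher" if d > cut else "lower" for d in delta_mbca]


def group_means(values, labels):
    """Mean of `values` within each label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in groups.items()}


# Hypothetical drug effects (levodopa minus placebo) for six participants.
delta_mbca = [-0.30, -0.10, -0.05, 0.20, 0.30, 0.10]
delta_apmfca = [0.30, 0.20, 0.15, -0.05, 0.00, 0.05]

labels = median_split(delta_mbca)
means = group_means(delta_apmfca, labels)
print(labels)
print(means)
```

  In this toy data the lower Δ MBCA group has the larger mean Δ aPMFCA, which is the direction of the pattern the response reports.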
  
  We have changed the respective section of the results accordingly (p. 19/20). Further, we have motivated this exploratory analysis more clearly in the introduction (p. 3/4) in terms of it providing a link to previous relevant studies (Deserno et al., 2015a; Groman et al., 2019; Sharp et al., 2016; Wunderlich et al., 2012). Lastly, we have endeavoured to improve the discussion on this (p. 21/22).
  
  More generally I would recommend that the authors refrain from putting too much emphasis on these between-subject correlations. Simple power calculation indicates that the sample size one would need to detect a realistically small to medium between-subject effect (that interacts with all kinds of within-subject factors) is in any case much larger than the sample size in this study.
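
  For orientation, the scale of this sample-size problem can be sketched with the standard Fisher z approximation for a simple bivariate correlation. This is an illustrative calculation, not the reviewer's own: the two-sided alpha of 0.05, the 80% power, the function name and the example effect sizes are assumptions, and the sketch ignores the interaction with within-subject factors mentioned above.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation of size r.

    Uses the Fisher z approximation for a two-sided test:
    n = ((z_{1 - alpha/2} + z_{power}) / atanh(|r|))^2 + 3
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(abs(r))) ** 2 + 3)


# Assumed "small to medium" effect sizes, for illustration only.
for r in (0.2, 0.3):
    print(r, n_for_correlation(r))
```

  Under these assumptions the approximation gives about 194 participants for r = 0.2 and about 85 for r = 0.3.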
  
  We agree with this and have, as mentioned above, substantially adjusted the section on inter-individual differences. We have moved the WM analysis to Appendix 1 (p. 44/45, subheading “Drug effect on guidance of MFCA and working memory”, Appendix 1 - Figure 11) and greatly simplified the analysis of inter-individual differences in drug effects (see previous paragraph). We also mention the overall small to moderate effects in the limitations section (p. 25/26).
  
  Another question is how worried should we be that the critical MB guidance of MFCA effect was not observed under placebo (Figure 3b)? I realize that the computational model-based analyses do speak to this issue, but here I had some questions too. Are the results from the model-informed and model-agnostic analyses otherwise consistent? Model-agnostic analyses reveal a greater effect of LDopa on informative destination for the ghost-nominated than the ghost-rejected trials and no effect for noninformative destination. Conversely, model-informed analyses reveal a nomination effect of ldopa across informative and noninformative trials. This was not addressed, or am I missing something? In fact, regarding the modeling, I am not the best person to evaluate the details of the model comparison, fitting and recovery procedures, but the question that does arise, and that I would make explicit in the current paper, is how this model space, the winning model and the modeling exercise differ (or not) from those in the previous paper by Moran et al. without LDopa administration.
  
  A detailed response to this was provided in reply to point 6 as summarized by the editor, and we provide a summary here as well.
  
  Firstly, we clearly indicate discrepancies between our model-agnostic and computational modelling analyses and acknowledge that such discrepancies may be expected when effects of interest are weak to moderate (p. 25/26, limitations).
  
  Secondly, the results from the computational model are generally statistically stronger, which is not surprising given that they are based on influences from far more trials. We now include a discussion of this in more detail in the section on limitations (p. 25/26).
  
  Thirdly, although the computational model uses a slightly different parameterization from that reported in Moran et al. (2019), it is a formal extension of that model, allowing the strength of effects for informative and uninformative destinations to differ. We now include a reference to this change in parameterization in the limitation section (p. 25/26), and include a more detailed description in Appendix 1 (p. 45-47).
  
  Finally, to test if the current models support our main conclusion from Moran et al. (2019) that retrospective MB inference guides MFCA for both the informative and non-informative destinations, we reanalysed the Moran et al. (2019) data using the current novel models and found converging support, as we now report (Appendix 1 – Figure 8).
  
  Finally, the general story that dopamine boosts model-based instruction about what the model-free system should learn is reminiscent of the previous work showing that prefrontal dopamine alters instruction biasing of reinforcement learning (Doll and Frank) and I would have thought this might deserve a little more attention, earlier on in the intro.
  
  The reviewer is indeed correct and we now reference this line of work (Doll et al., 2009, 2011) in the intro (p. 4).
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.01.15.426639v1
www.biorxiv.org www.biorxiv.org

New submission 21/03/2023, 14:20:09

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  While the mechanisms of the arms race between plants and specialist herbivores, such as detoxification of specific secondary metabolites, have been studied, the mechanisms underlying the wider diet breadth of so-called generalist herbivores have received less attention. Because host plant species are heterogeneous, experimental validation of the phylogenetic generalism of herbivores has seemed hard to conduct. The authors set out two major hypotheses about large diet breadth ("metabolic generalism" and "multi-host metabolic specialism") and carefully designed an experiment using Drosophila suzukii as a model herbivore species.
  
  By an untargeted metabolomics approach using UHPLC-MS, authors attempted to falsify the hypotheses both in qualitative- and quantitative metabolomic profiles. Intersections of four fruit (puree) samples and each diet-based fly individual samples from the qualitative data revealed that there were few ions that occur as the specific metabolite in each diet-based fly group, which could reject the "multi-host metabolic specialism" hypothesis. Quantitative data also showed results that could support the "metabolic generalism" hypothesis. Therefore, the wide diet breadth of D. suzukii seemed to be derived from the general metabolism rather than the adaptive traits of the diverse host plant species. On the other hand, the reduction of the metabolites (ions) set using GLM seemed logical and 2-D clustering from the reduced ions set showed that quantitative aspects of diet-associated ions could classify "what the flies ate". These interesting results could enhance the understanding of the diet breadth (niche) of herbivorous insects.
  
  The authors' approach seemed clear to falsify the hypotheses based on the appropriate data processing. The intersection of shared ions from the qualitative dataset could distinguish the diet-specific metabolites in flies and commonly occurring metabolites among flies and/or fruits. Also, filtering on the diet-specific ions seemed to be a logical and appropriate way. Meanwhile, the discussion about the results seemed to be focused on different points regarding the research hypotheses which were raised in the introduction part. Discussion about the results mainly focused on the metabolism of D. suzukii itself, rather than the research hypotheses and questions that were raised from the evolution of the wide diet breadth of generalist herbivores. In particular, the conclusion seems to be far from the main context of the authors' research; e.g. frugivory. It makes the implication of the study weaker.
  
  We wish to thank Reviewer #1 for their appreciation of our study. As recommended, we now focus our discussion more on the general aspect of our findings (relevant to insects, herbivores, or frugivores), and less on the peculiarities of the metabolism of D. suzukii itself. Specifically, we now only mention D. suzukii in one section (two sentences) of our Discussion, to serve as an example (l.387-396). Thanks to this comment, the Discussion may now interest a broader readership concerned with the evolution of diet breadth in generalist herbivorous species, and it offers a better understanding of the general implications of our findings.
  
  Reviewer #2 (Public Review):
  
  The manuscript: "Metabolic consequences of various fruit-based diets in a generalist insect species" by Olazcuaga et al., addresses an interesting question. Using an untargeted metabolomics approach, the authors study how diet generalism may have evolved versus diet specialization which is generally more commonly observed, at least in drosophila species. Using the phytophagous species Drosophila suzukii, and by directly comparing the metabolomes of fruit purees and the flies that fed on them, the authors found evidence for "metabolic generalism". Metabolic generalism means that individuals of a generalist species process all types of diet in a similar way, which is in contrast to "multi-host metabolic specialism" which entails the use of specific pathways to metabolize unique compounds of different diets. The authors find strong evidence for the first hypothesis, as they could easily detect the signature of each fruit diet in the flies. The authors then go on to speculate on the evolutionary ramifications of this for how potentially diet specializations may have evolved from diet generalism. Overall, the paper is well written, the experiments well documented, and the conclusions convincing.
  
  We thank Reviewer #2 for their comments and appreciation of our work.
  
  Reviewer #3 (Public Review):
  
  Laure Olazcuaga et al. investigated the metabolomes of four fruit-based diets and corresponding individuals of Drosophila suzukii that were reared on them using comparative metabolomics analysis. They observed that the four fruit-based diets are metabolically dissimilar. On the contrary, flies that fed on them are mostly similar in their metabolic response. From a quantitative point of view, they find that part of the fly metabolomes correlates well with that of the corresponding diet metabolomes, which is indicative of insect ingestive history. By further focusing on 71 metabolites derived from diet-specific fly ions and highly abundant fruit ions, the authors show that D. suzukii differentially accumulates diet metabolites in a compound-specific manner. The authors claim that the data support the metabolic generalism hypothesis while rejecting the multi-host metabolic specialism hypothesis. This study provides a valuable global chemical comparison of how diverse diet metabolites are processed by a generalist insect species.
  
  Strengths:
  
  The rapid advances in high-resolution mass spectrometry have recently accelerated the discovery of many novel post-ingestive compounds through comparative metabolomics analysis of insect/frass and plant samples. Untargeted metabolomics is thus a very powerful approach for the systematic comparison of global chemical shifts when diverse plant-derived specialized metabolites are further modified or quantitatively metabolized after ingestion by insects. The technique can be readily extended to a larger micro- or macro-evolutionary context for both generalist and specialist insects to systematically investigate how plant chemical diversity contributes to dietary generalism and specialism.
  
  We would like to thank Reviewer #3 for their insightful comments on the power of untargeted metabolomics to evaluate the fate of plant metabolites and their use by herbivores. We also agree that these techniques can be used to tackle eco-evolutionary issues, such as the origin and maintenance of dietary generalism and specialism here. We hope that our study will inspire other researchers to explore such techniques and experiments to gain a global overview of biochemistry fluxes and their evolution. We now mention it in the conclusion (L454-459).
  
  Weaknesses:
  
  The authors claim that their data support the hypothesis of metabolic generalism, however, a total analysis of insect metabolism may not generate a clean dataset for direct comparison of fruit-derived metabolites with those metabolized by D. suzukii, given that much of these metabolites would be "diluted" proportionally by insect-derived metabolites. If the insect-derived metabolites predominate, then, as the authors observed, a tight clustering of D. suzukii metabolomes in the PCA plot would be expected. It is therefore very difficult to interpret these patterns.
  
  We agree with Reviewer #3 that a careful examination of the different possible origins of metabolites should take place to distinguish between our two competing hypotheses.
  
  The only source of metabolites for insects in our experimental setup is a mixture of (i) a large proportion of fruit purees and (ii) a minor proportion of artificial medium consisting mainly of yeast. Our goal is thus to understand the fate of (i) “fruit-derived” metabolites (transformed and untransformed), while controlling for (ii) “artificial media-derived” metabolites, that constitute a nuisance signal but are necessary for a complete development in our system.
  
  By “fruit-derived” and “insect-derived” metabolites, it is our understanding that Reviewer #3 means “fruit” metabolites (when in insects, untransformed “fruit-derived” metabolites) and “artificial medium-derived” metabolites. It is true that we do wish to avoid a predominance of “artificial medium-derived” metabolites and focus on “fruit-derived” metabolites in insects. We also want to note that it is of primary importance in our study to distinguish between “fruit” metabolites that are carried as is (“fruit” metabolites present in insects, ie untransformed “fruit-derived” metabolites), and “fruit” metabolites that are used after transformation by the insect (i.e., transformed “fruit-derived” metabolites).
  
  We agree with Reviewer #3 that the presence of “artificial medium-derived” metabolites could be problematic in direct comparisons of fruits and insects (though not in comparisons among fruits or among insects).
  
  However, we took some steps to avoid such problems:
  
  We included control fly samples in our experiment: at each experimental generation, flies that developed only on artificial medium (without fruit puree) were collected and processed simultaneously with flies that developed on fruit media. Results using these artificial medium-reared flies as controls (by subtracting their ion levels and removing ions that were similar, within the corresponding generation) were similar to results using raw data, and conclusions were identical (see below).
  
  We lowered the proportion of artificial medium in our fruit media so that it was kept to a minimum, compatible with larval development and adult survival.
  
  Consistent with the low impact of this “artificial medium” component on our conclusions, we also wish to point out the presence pattern of metabolites found only in flies and never in fruits when using raw data (Figure 3, yellow stack). Even in the most conservative hypothesis of 100% of these metabolites originating from our artificial medium (which is probably not the case), we observe that it constitutes only a minor proportion of metabolites common to all flies (15.7%).
  
  For your consideration, we include below the main Figures, using both raw data and artificial medium-controlled:
  
  Figure 2, left = raw data; right = artificial-media controlled:
  
  Figure 3, left = raw data; right = artificial-media controlled:
  
  Figure 3S1, left = raw data; right = artificial-media controlled:
  
  Figure 4, above = raw data; below = artificial-media controlled:
  
  We hope that we have convinced the Editor/Reviewers that raw data and artificial-medium controlled data provide one and the same answer in all our analyses. We chose to present only raw data, to simplify the Materials & Methods section.
  
  We have, however, modified the current version of the manuscript to inform the reader that proper controls were done and that their inclusion does not modify any of our conclusions (l.110-113 and l.583-589).
  
  We also wish to point out two additional comments:
  
  As Reviewer #1 also recommended, we modified the expectations drawn in Fig1G to better consider the general comment of “insect derived” metabolites being fundamentally different from plant metabolites (even if we do show in our study that only approx. 9% of metabolites are private to flies).
  
  Our main precaution in using this global PCA is that it follows two other analyses (global intersection and comparison of intersections among fruits and among flies) and precedes another one (fly-focused PCA). We hope that all these analyses help readers get a comprehensive overview of the dataset and associated results, avoiding reliance on a single analysis.
  
  We also help readers to explore and visualize all analyses presented in our manuscript by setting up a shiny application (in addition to our available dataset and R code), at https://fruitfliesmetabo.shinyapps.io/shiny/. This is now mentioned in the main text (l.588-589).
  
  We thank the Reviewer for their comment that greatly improved the manuscript.
  
  The authors generated a qualitative dataset using the peak list produced by XCMS, which contains quantitative peak areas, but it is unclear how the threshold was selected to determine whether a peak is present or absent in a given sample. The qualitative dataset would influence the output of their data analysis.
  
  The referee is right in pointing out that the threshold used to determine if a peak is present or absent in a given sample was not clearly specified. This has now been corrected in the “Host use” section of the Materials & Methods (l.513-516). Briefly, a given replicate of a compound was considered present if the corresponding peak area following XCMS quantification was > 1000. This threshold was selected to be close to the practical quantification threshold of the Thermo Exactive mass spectrometer used in this study. This threshold was selected in order to allow the quantification of low-abundance compounds, as many plant-derived diet compounds were expected to be present in trace amounts in flies. We additionally applied a stringent rule for presence of any given compound (presence in at least 3 biological replicates).
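
  The presence rule described above can be sketched in a few lines. The threshold (peak area > 1000) and the minimum of 3 biological replicates come from the response; the function names, the dictionary layout of the peak table and the example peak areas are hypothetical and do not reflect the authors' R code.

```python
PEAK_AREA_THRESHOLD = 1000  # stated in the response: peak area must be > 1000
MIN_REPLICATES = 3          # stated in the response: at least 3 biological replicates


def is_present(peak_areas, threshold=PEAK_AREA_THRESHOLD, min_replicates=MIN_REPLICATES):
    """True if enough replicates have a peak area strictly above the threshold."""
    detected = sum(1 for area in peak_areas if area > threshold)
    return detected >= min_replicates


def presence_table(peak_table):
    """Convert {ion: {sample_group: [areas]}} into {ion: {sample_group: bool}}."""
    return {
        ion: {group: is_present(areas) for group, areas in groups.items()}
        for ion, groups in peak_table.items()
    }


# Hypothetical peak areas for two ions, four replicates per sample group.
peak_table = {
    "ion_A": {"fruit": [5000, 4200, 3900, 4100], "fly": [1200, 800, 950, 1100]},
    "ion_B": {"fruit": [0, 0, 300, 0], "fly": [2500, 2600, 2400, 2700]},
}

presence = presence_table(peak_table)
print(presence)
```

  In this toy table, ion_A counts as present in the fruit but not in the flies, because only two of its four fly replicates exceed the threshold.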
  
  The authors rely on in-source fragmentation for peak annotation when authentic standards are not available. The accuracy of the annotation thus requires further validation.
  
  Supplementary Table 1 was unfortunately omitted in the first submission of the manuscript. This oversight has now been corrected, and Supplementary Table 1 details all information used for metabolite annotation. In particular, MS/MS data comparisons with mass spectral databases as well as with published literature have been added to substantiate metabolite identifications. These MS/MS data were produced thanks to the Reviewer's comment. We also provide four more annotations from standards, bringing the total to 30 / 71 identifications validated through chemical standards.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.10.21.513142v2
www.biorxiv.org www.biorxiv.org

YAP1 Activation by Human Papillomavirus E7 Promotes Basal Cell Identity in Squamous Epithelia

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This manuscript elegantly demonstrates that the degradation of PTPN14 by human papillomavirus (HPV) 16 and 18 E7 proteins previously reported by the authors is essential for E7-mediated YAP1 activation. This is important for E7-mediated maintenance of basal cell state and presumably persistence of HPV infection. The authors use a series of innovative tissue models combined with validation in clinical samples to demonstrate the importance of YAP1 activation in high-risk HPV pathogenesis.
  
  The data are of high quality with excellent controls. The manuscript is well-written and the rationale of each experiment is easy to follow. In general, the results support the authors' conclusions. I have the following suggestion to improve the manuscript: The enhanced nuclear expression of YAP in the basal cells of epithelia expressing HPV16/18 E7 is difficult to see in the low resolution IF images shown. The magnified images do show enhanced expression compared to HFK cultures, but to remove any bias in selection of enhanced areas, could the authors include quantification of the distribution of IF signal in the basal cells, compared to the suprabasal cells, of the epithelia shown with statistical analysis? Figure 2 would also benefit from quantification as described above.
  
  We appreciate the positive feedback and constructive suggestions from Reviewer #1. We used widefield images with the goal of presenting as many cells in organotypic cultures as possible, but at low magnification. We have further analyzed the imaging data and updated the manuscript as follows:
  
  1) We assessed YAP1 intensity in basal and suprabasal layers as suggested by the reviewer. Consistent with literature reports, YAP1 is expressed predominantly in basal cells in each of our organotypic cultures, independent of E7 status (see figure below).
  
  2) Because YAP1 is always more highly expressed in basal cells than in suprabasal cells and YAP1 is regulated at the level of nuclear/cytoplasmic localization, we anticipated that quantification of YAP1 nuclear localization in our organotypic cultures may be more useful to readers than basal/suprabasal quantification.
  
  Consequently, we conducted classification-based analyses to quantify YAP1 nuclear localization (a surrogate for YAP1 activity) in the cultures. Each image to be analyzed was deidentified and assigned a coded name. Each cell in the basal layer was then classified as having either predominantly nuclear YAP1 staining, predominantly cytoplasmic YAP1 staining, or YAP1 staining that is comparably distributed between the nucleus and cytoplasm. At least three fields were analyzed per raft. We assessed YAP1 localization in 8,323 cells (average 378.3 cells/culture shown in the text for almost all cultures). The quantifications are now included in Figure 1-figure supplement 2C-E, Figure 1-Figure supplement 5A-C, and Figure 2-figure supplement 1D-F.
  
  The new quantifications do not change our interpretations of the results nor our conclusion that HPV E7 degrades PTPN14 to activate YAP1 in basal cells. We noted that HPV E6 may promote YAP1 nuclear localization to some degree and have updated the text accordingly.
  
  Reviewer #2 (Public Review):
  
  Strengths: A major strength of this report is the use of several different technical approaches, the results from which converge to provide several types of data supporting their conclusions. These various techniques include genetic knockdown/overexpression in primary keratinocytes, organotypic raft cultures, laser-capture microdissection, cell fate monitoring assays, and analysis of publicly available datasets. The manuscript is well-written and the figures are well-made. Weaknesses: Overall, there are only a few minor weaknesses related to figure quality and presentation (which will be conveyed in the private recommendations to the authors).
  
  We appreciate the positive feedback and these thoughtful comments from reviewer #2.
  
  Are claims/conclusions justified by data? Overall, the authors' conclusions are adequately justified by the data. However, there were a few interpretations I felt were somewhat overstated given the experiments performed and data provided. 1. The first issue relates to the interpretation/conclusion of the results from experiments analyzing basal cell number. In Figure 2, the basal cell number was indeed reduced in R84S compared to WT E7. However, it was not reduced to parental HFK levels, suggesting other E7 activities are involved in increasing basal cell number. A similar observation is presented in Figure 7 (E-F), where the R84S E7 mutant still had significantly higher basal cell retention than the empty vector control, albeit lower than WT E7. While their data certainly indicates that the binding and subsequent degradation of PTPN14 is an E7 function important to increasing basal cell number and retention, there are clearly other E7 functions involved. While the authors don't necessarily overinterpret these findings, the possibility that other E7 functions are involved is not explicitly acknowledged or explored in the Discussion.
  
  Indeed, cells expressing HPV18 E7 R84S retain some capacity to increase basal cell number (Figure 2) and promote basal cell retention (Figure 7). It is possible that an activity of HPV E7 in addition to PTPN14 degradation influences these phenotypes. HPV18 E7 R84S retains the capacity to bind and degrade RB1 (Hatterschide et al., 2020). The basal cells in the HPV18 E7 R84S cell fate experiment were predominantly found in clusters indicative of possible clonal expansion. We hypothesize that such clusters reflect proliferation induced by RB1 inactivation and cause the ratio of basal to suprabasal cells to remain high even in the R84S mutant condition. Our hypothesis is now described in further detail in the text.
  
  2. The second issue pertains to the findings related to the effect on differentiation upon modulation of key Hippo pathway components (Figure 4). It does not appear that the authors performed these studies in the presence of any well-known stimuli that induce the differentiation process in keratinocytes grown in 2D culture (high calcium, high serum, etc.), nor did they use these cells in organotypic rafts wherein differentiation occurs during the raft stratification process. This is particularly true in the studies exploring PTPN14 plus LATS1/2 silencing and the effect on repression of keratinocyte differentiation. Whereas it seems PTPN14 itself was serving as the differentiation stimulus in earlier experiments (Figure 4C/D), it does not appear any differentiation stimuli were provided in the experiments shown in Figures 4E-I. For these reasons, the interpretation drawn by the authors that "...inactivation of three different YAP1 inhibitors dampens differentiation gene expression" (Line 220-221) and "inactivation of LATS1 or LATS2...also repressed differentiation genes" (Lines 349-350) seems specific to endogenous levels of differentiation genes. It seems difficult to conclude that inactivation of the Hippo pathway is actively repressing the induction of differentiation if the cells are not being treated with stimuli to induce differentiation.
  
  Indeed, no differentiation stimuli were used in these experiments. We previously observed that PTPN14 knockout or E7 expression reduced differentiation gene expression both in undifferentiated cells and in cells stimulated to differentiate (Hatterschide et al., 2019, 2020). We anticipate that gene expression in unstimulated cells is reflective of gene expression in cells stimulated to differentiate. We altered the results and discussion text to emphasize that the experiment measures differentiation gene expression in unstimulated cells.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.11.10.468068v1
www.biorxiv.org www.biorxiv.org

A Unifying Mechanism Governing Inter-Brain Neural Relationship During Social Interactions

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This paper provides experimental and modeling analysis of the inter-brain coupling of socially interacting bats, and reports that coordinated brain activity evolves at a slower time scale than the activity describing the differences. Specifically, the paper finds that there is an attracting submanifold corresponding to the mean (or "common mode") of neural activity, and that the dynamics in the orthogonal eigenmode, corresponding to the difference in brain activity, decays rapidly. These rapid decays in the difference mode are referred to as "catch up" activity.
  
  There are two main findings:
  
  1) Neural activity (especially higher frequency LFP activity in the 30-150Hz range) is modulated by social context. Specifically, the averaged, moment-to-moment MEAN:DIFF ratio is much higher when the bats are in a single chamber, clearly indicating that the animals are coordinating their neural activity. This change also seems to hold, although not as strikingly, in lower-frequency LFP and spiking activity.
  
  2) The time scales of the mean vs. difference dynamics are segregated: the "difference dynamics" evolve at a faster time scale than the "similarity dynamics", a claim that seems to be well supported.
  
  The basic finding is presented in Figure 1. The rest of the paper is focused on a modeling study to garner further insight into the dynamics.
  
  Weaknesses:
  
  This is an entirely phenomenological paper, and while it claims to garner "mechanistic insight", it is unclear what that means.
  
  We regret not clarifying sufficiently what we meant by “mechanistic insight.” The insight is the following: functional across-brain coupling acts as positive feedback to the mean component of neural activity, which amplifies it and slows it down; at the same time, it acts as negative feedback to the difference component, which suppresses it and speeds it up. Thus, findings (1) and (2) in the reviewer’s summary above can be explained by the same model mechanism. As the reviewer pointed out below, the details of the model are complex, which could have made the simple mechanism above opaque. Thus, we analyzed two simplified versions of the model to make the mechanistic insight clear. This is detailed below in our response to the reviewer’s comment on model complexity.
  
  The basic idea of the model is simple and somewhat interesting, but the details are extremely complex. There are many examples of this, but the method used to "regress out" the behavior was very hard to interpret.
  
  The method for regressing out behavior was described in Materials and Methods section 3.10, and we regret having neglected to reference it in the main text. We now reference it at the first instance in the main text where this is relevant.
  
  On the face of it, the model is extremely simple: a two-state linear dynamical system. However, this simplistic description buries extreme complexity. The model is extremely complex as it involves a large number of parameters (e.g., time switching 'b' values, the values of which are completely unclear), the switching over time of these parameters based on hand-scored animal behavioral state, and the complex mix of Markovian and linear dynamical systems theoretic results.
  
  As the reviewer pointed out, the core of the model is very simple: a linear dynamical system that models neural activity coupling. The model mechanism of positive and negative feedback, which is responsible for reproducing the two experimental results summarized by the reviewer above, is contained in this core (see Materials and Methods section 3.7 for details). On top of this, the model has a layer of complexity, involving a Markov chain model of behavior and a large number of behavioral parameters. This layer of complexity is independent of the feedback mechanism at the core of the model. Thus, while it makes the model more biologically realistic, it is not required to reproduce the two main experimental results. To show this explicitly, and to better understand the dependence of model behavior on its parameters, we analyzed two reduced versions of the model. The first reduced model replaces the behavioral inputs with white noise. The original model is a linear dynamical system in which a is neural activity, C is the coupling matrix, b is behavioral modulation, and τ is a time constant. b is where the complexity lies, as it is simulated using a Markov chain and involves many parameters. To strip away this layer of complexity, we replaced b with noise having a simple structure, namely, the mean and difference components of b having identical, flat power spectra. Importantly, this noise input does not induce correlation between bats, and it amounts to inputs of the same magnitude and same timescales to the mean and difference components of a. The resulting reduced model has only two parameters, the functional self-coupling C_S and functional across-brain coupling C_I (for simplicity, τ can be absorbed into the other parameters). We are interested in the two results the reviewer summarized above: (1) the mean component of neural activity having a larger variance than the difference component; (2) the mean component having a slower timescale than the difference component. In the manuscript, these are respectively quantified using the variance ratio and the power spectral centroid ratio of the mean and difference components. The reduced model allowed us to derive analytical expressions for these two quantities (see Materials and Methods section 3.8 for details). We found that both the variance ratio (mean variance divided by difference variance) and the centroid ratio (mean centroid divided by difference centroid) have a very simple dependence on the functional coupling parameters.
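  The kind of calculation behind such expressions can be sketched as follows. The model equation is not reproduced in this response, so the form below is an assumption (a leaky linear system with a symmetric coupling matrix built from C_S and C_I); the symbols a_M, a_D, b_M and b_D for the mean and difference components are likewise introduced here for illustration only.

```latex
% Assumed core model (hypothetical form, not confirmed by the text above):
\tau \frac{d\mathbf{a}}{dt} = -\mathbf{a} + C\mathbf{a} + \mathbf{b},
\qquad
C = \begin{pmatrix} C_S & C_I \\ C_I & C_S \end{pmatrix}

% Change of variables to mean and difference components:
a_M = \tfrac{1}{2}(a_1 + a_2), \qquad a_D = \tfrac{1}{2}(a_1 - a_2)

% The two components decouple, each with its own decay rate:
\tau \frac{da_M}{dt} = -(1 - C_S - C_I)\, a_M + b_M,
\qquad
\tau \frac{da_D}{dt} = -(1 - C_S + C_I)\, a_D + b_D

% For white-noise inputs of equal intensity, each component is an
% Ornstein-Uhlenbeck process whose stationary variance is inversely
% proportional to its decay rate, so
\frac{\operatorname{Var}(a_M)}{\operatorname{Var}(a_D)}
  = \frac{1 - C_S + C_I}{1 - C_S - C_I}
```

  Under this assumed form, stability requires both decay rates to be positive, and a positive C_I lowers the decay rate of the mean component while raising that of the difference component. That matches the qualitative statement that the mean component is larger in variance and slower; the exact expressions in the manuscript may differ.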
  
  This parameter dependence is visualized below (note that the color maps are in log scale, and the white spaces are regions where the model is unstable).
  
  In the experimental data, the mean component had larger variance and lower power spectral centroid than the difference component. This corresponds to the parameter regime enclosed by dashed lines. Thus, a positive C_I acts as positive feedback to the mean component and negative feedback to the difference component, modulating their variance and timescales in opposite directions. This is consistent with the analysis of the original model in Materials and Methods section 3.7. In the revised manuscript, we’ve now added analysis of this reduced model to the Results section, and the above figure has been added as Figure 3I-J.
  
  The reviewer has stated a concern regarding the large number of parameters that set the input level according to behavioral state (b_resting, b_(social grooming), b_fighting, etc.). These parameters are important for ensuring that the model outputs realistic levels of behaviorally modulated neural activity (discussed below in our reply regarding model fit), but they are not important for the main results on variance and timescales. To demonstrate this, we studied a second reduced model. This model is identical to our original model except that, for each simulation, each of the behavior parameters (b_fighting, etc.) was independently drawn from the uniform distribution from 0 to 1. Despite the completely random behavioral parameters, this reduced model reproduces the variance and timescales results just like the original model, as shown in the figure below (compare with Figure 3E-F).
  
  To summarize, the reduced models allowed us to identify the simple parameter dependence of the modeling results, and showed that the simple linear dynamical system at the core of the original model is sufficient to reproduce the two main experimental observations.
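  A minimal numerical sketch of the white-noise reduced model may help make this concrete. It assumes the core dynamics take the form τ da/dt = −a + Ca + noise with a symmetric 2×2 coupling matrix; that form, the Euler-Maruyama integration, the parameter values, and all function names are assumptions made for illustration, not the authors' implementation. Only the variance result is checked here, not the spectral centroid.

```python
import random


def simulate_reduced_model(c_self, c_inter, tau=1.0, dt=0.05,
                           n_steps=100_000, seed=0):
    """Euler-Maruyama simulation of two coupled units driven by independent white noise.

    Assumed dynamics (hypothetical form): tau * da_i/dt = -a_i + c_self*a_i + c_inter*a_j + noise.
    """
    rng = random.Random(seed)
    noise_scale = dt ** 0.5 / tau
    a1 = a2 = 0.0
    trace = []
    for _ in range(n_steps):
        drift1 = (-a1 + c_self * a1 + c_inter * a2) / tau
        drift2 = (-a2 + c_self * a2 + c_inter * a1) / tau
        # Independent noise per unit, so the input itself induces no correlation.
        a1, a2 = (a1 + drift1 * dt + noise_scale * rng.gauss(0.0, 1.0),
                  a2 + drift2 * dt + noise_scale * rng.gauss(0.0, 1.0))
        trace.append((a1, a2))
    return trace


def _variance(values):
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def variance_ratio(trace, burn_in=1000):
    """Variance of the mean component divided by variance of the difference component."""
    kept = trace[burn_in:]
    mean_component = [(x + y) / 2 for x, y in kept]
    diff_component = [(x - y) / 2 for x, y in kept]
    return _variance(mean_component) / _variance(diff_component)


def predicted_variance_ratio(c_self, c_inter):
    """Ratio of decay rates under the assumed model form."""
    return (1 - c_self + c_inter) / (1 - c_self - c_inter)


if __name__ == "__main__":
    trace = simulate_reduced_model(c_self=0.2, c_inter=0.3)
    print("simulated:", variance_ratio(trace))
    print("predicted:", predicted_variance_ratio(0.2, 0.3))
```

  With a positive across-brain coupling the simulated ratio should exceed 1 and sit near the decay-rate ratio, which is the qualitative behavior described above; setting c_inter to 0 should bring the ratio close to 1.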
  
  Indeed, a fundamental weakness of the model is that the Markov chain is taken as an "input" to the 2-state linear systems model, as if somehow the neural state does not affect the state transitions.
  
  Yes, this is a limitation of our model. We’ve added a discussion of this limitation, as well as future directions for overcoming it, in the Discussion section. The reason we did not model neural control of behavioral transitions is that it is under-constrained by existing data. While the brain obviously controls behaviors, not every part of the brain controls every behavior. Of the 11 behaviors observed in this study, we do not know which of them is controlled by the bat frontal cortex, and we do not know how they might be controlled (i.e., what specific spatiotemporal activity patterns affect behaviors in what ways). Without this knowledge, it’s unclear how to implement neural control of behavior in the model. This knowledge requires perturbation studies (lesion, inactivation, or activity manipulation) to establish causal relationships from neural activity to specific behaviors in the bat, which will be an important future direction.
  
  On the other hand, as the reviewer stated, our model included behavioral modulation of neural activity. It is well known that in mammals, arousal and movement modulate neural activity globally across cortex (McGinley et al., 2015, Neuron). Thus, given that different behaviors in general involve different levels of arousal and movement, our model included behavior-dependent modulation of frontal cortical neural activity. Finally, for the reviewer’s convenience, we also quote below the paragraph addressing this issue in the revised Discussion. “Another limitation of our model is the “open-loop” nature of the relationship between behavior and neural activity. Specifically, we modeled neural activity as being modulated by behavior, but behavior was modeled using a Markov chain that is independent from the neural activity. In reality, neural activity and behavior form a closed-loop, with different social behaviors being controlled by the neural activity of specific neural populations in specific brain regions. Thus, an important future direction is to close the loop by incorporating neural control of social behaviors into models of the inter-brain relationship in bats. This will require future experimental studies to identify which frontal cortical regions and populations in bats are necessary or sufficient to control social behaviors, as well as the detailed causal relationship from neural activity to social behavior. Furthermore, as social interactions can occur at multiple timescales, it will be interesting to investigate how these are controlled by neural activity at different timescales, and how those timescales are shaped by functional across-brain coupling. In summary, such a closed-loop model will shed light on how inter-brain activity patterns and dynamic social interactions co-evolve and feedback onto each other.”
  
  Further, the Markov assumption is not rigorously tested.
  
  We have now tested the Markov assumption, using the following methods. We compared three models of bat behaviors: (1) the independent model, where the behavioral state at a given time point is independent from the state at other time points; (2) the 1st-order dependency model, where the behavioral state at a given time point depends on the state at the previous time point only; (3) the 2nd-order dependency model, where the behavioral state at a given time point depends on the states at the two previous time points. The Markov assumption corresponds to model (2), which is used as a part of the main model of the paper. Note that models with longer time-dependencies (≥3) were not tested because the number of parameters grows exponentially with model order and our dataset is not large enough to fit them.
  
  To compare the three models, we split the behavioral data into a training set and a test set, fitted each model on the training set (Laplace smoothing was used to avoid assigning zero probability to unobserved events), and calculated the log-likelihood of the test set under each model. The figure below shows the cross-validated likelihoods for the behavioral data of one-chamber (A) and two-chambers (B) sessions, which were fitted separately; circles and error bars are means and standard deviations across 100 random splits of the data into training and test sets.
  
  As the figure above shows, the 1st-order model had the highest likelihood on average. This does not necessarily prove that bat behavior obeys the Markov assumption (if we had a lot more data, we might be able to fit better 2nd-order and higher-order models). But this does mean that, given the amount of data we have, the best model that we can fit is the 1st-order Markov chain. Thus, this result supports our usage of the Markov chain in the main model of the paper. In the revised manuscript, the above figure is included as Figure 3—figure supplement 2A-B, and the analysis is described in Materials and Methods section 3.5.
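  The comparison of behavioral models described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names, the add-one smoothing constant, the single contiguous train/test split, and the synthetic three-state sequence are all hypothetical, whereas the response describes 100 random splits of real behavioral data.

```python
import math
import random
from collections import Counter, defaultdict


def fit_order_k(sequence, order, states, alpha=1.0):
    """Fit an order-k dependency model with Laplace (add-alpha) smoothing.

    order=0 is the independent model, order=1 the Markov chain, order=2 the
    second-order model. Returns a function prob(context, state).
    """
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        counts[context][sequence[i]] += 1

    def prob(context, state):
        seen = counts.get(context, Counter())
        total = sum(seen.values())
        # Smoothing keeps unobserved transitions from getting zero probability.
        return (seen[state] + alpha) / (total + alpha * len(states))

    return prob


def held_out_log_likelihood(prob, sequence, order, skip):
    """Log-likelihood of a test sequence, scored from position `skip` onward
    so that models of different order are compared on the same time points."""
    total = 0.0
    for i in range(skip, len(sequence)):
        context = tuple(sequence[i - order:i])
        total += math.log(prob(context, sequence[i]))
    return total


def sample_sticky_chain(n, states, stay_prob, rng):
    """Synthetic first-order chain used only to exercise the code."""
    seq = [rng.choice(states)]
    for _ in range(n - 1):
        if rng.random() < stay_prob:
            seq.append(seq[-1])
        else:
            seq.append(rng.choice([s for s in states if s != seq[-1]]))
    return seq


if __name__ == "__main__":
    rng = random.Random(1)
    states = ["resting", "social grooming", "fighting"]
    seq = sample_sticky_chain(4000, states, stay_prob=0.8, rng=rng)
    train, test = seq[:3000], seq[3000:]
    for k in (0, 1, 2):
        model = fit_order_k(train, k, states)
        print(k, held_out_log_likelihood(model, test, k, skip=2))
```

  On real data, the split would be repeated and the held-out likelihoods averaged, as in the response. Note that the number of contexts grows as the number of states raised to the model order, which is why higher orders become hard to fit on a limited dataset.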
  
  No model selection or other model validation appears to be done.
  
  To evaluate model fit, we simulated our model using experimentally observed behaviors (rather than simulating behaviors using a Markov chain), and compared the simulated neural activity with the experimentally observed activity (see Materials and Methods section 3.6 for detailed procedures). The comparison for an example experimental session is shown below, where we’ve plotted the experimentally observed neural activity and behaviors for bat 1 (A) and bat 2 (B), along with the simulated neural activity. The correlation coefficient between data and model are indicated above each plot. These are representative examples, as the average correlation over all sessions and bats is 0.72 (standard deviation is 0.10). This figure was added to the revised manuscript as Figure 3—figure supplement 1.
  
  In evaluating model fit, we realized that the model in the original manuscript produced outputs with a DC offset different from that of the data. Thus, in the revised manuscript (including the figure above), we added one more behavioral parameter (b_constant) that adjusts the DC offset, which is a parameter that reflects the effect of a baseline arousal level on neural activity (Materials and Methods section 3.4). Note that, since the only effect of this parameter is to adjust the DC offset of neural activity, it does not change any of the results in the paper.
  
  In short, the model, while very interesting, is so complex that it is literally impossible to evaluate. The authors report literally no shortcomings of their model. They do not report parameter estimation methods. They do not report fitting errors or other model validation metrics. The only evaluation is whether it can produce certain outputs that are similar to biological data. While the latter is certainly important, all models are wrong, and it is essential to have a model simple enough to understand, both in terms of how it works and how it fails.
  
  The comments on the complexity of the model and on fitting errors have been addressed above. Regarding parameter estimation methods, they were described in Materials and Methods section 3.14, and we regret having neglected to directly reference it in the original manuscript. We now reference the section in the legend of Figure 3A, which is the first place where the parameters are introduced. Briefly, the behavioral parameters (b_resting, b_fighting, etc.) were simply chosen to be the average neural activity during the respective behaviors from the data; the other parameters were chosen by hand to roughly match the levels of activity from the data, keeping within the parameter regime identified from the analyses. As we showed above, these parameters provide a reasonable fit to the data.
  
  The reason we chose the parameters heuristically in this way, rather than by minimizing some error objective, is the following. Our goal was to build a model that could qualitatively reproduce the experimental findings in a robust manner, that is, without fine-tuning of parameters. Thus, we analyzed the model to understand how model behaviors depend on the parameters, and to identify the parameter regime that reproduces the qualitative trends seen in the data (Figure 3I-J; Materials and Methods sections 3.7 and 3.8). Guided by these analyses, we chose parameters heuristically without algorithmic fine-tuning.
  
  Finally, following suggestions from reviewer 1 and reviewer 3, we have added discussions of shortcomings of the models (the last two paragraphs of the Discussion). With these discussions of model limitations, along with the presentation of simple insights into model mechanism from the reduced models above, we believe we have now presented a model that is “simple enough to understand, both in terms of how it works and how it fails.”
  
  In general, while the basic finding is fairly interesting, and the experiments and their findings are highly relevant to the field, the modeling and its explication fall short.
  
  It is not that it is wrong or bad; however, it is not clear that such a complex model increases our understanding beyond the experimental findings in Figure 1, and if it does, there has to be a major caveat that the model itself is not carefully vetted.
  
  Based on the reviewer’s comments on the model’s complexity, we have analyzed reduced versions of the model to understand its simple underlying mechanisms, as described above. This goes beyond the experimental findings in Figure 1, as it provides a computational mechanism that could give rise to those experimental findings. Moreover, based on the reviewer’s comments, we have more carefully vetted the model, by evaluating model fit and testing different behavioral models that do or do not assume the Markov property. Finally, we now discuss caveats of the model in the Discussion section, including the open-loop nature of the model as pointed out by the reviewer.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.02.446694v1
www.biorxiv.org www.biorxiv.org

New submission 27/06/2022, 11:13:20

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Overall, the science is sound and interesting, and the results are clearly presented. However, the paper falls in between describing a novel method and studying biology. As a consequence, it is a bit difficult to grasp the general flow, central story and focus point. The study does uncover several interesting phenomena, but none are really studied in much detail and the novel biological insight is therefore a bit limited and lost in the abundance of observations. Several interesting novel interactions are uncovered, in particular for the SPS sensor and GAPDH paralogs, but these are not followed up on in much detail. The same can be said for the more general observations, e.g., the fact that different types of mutations (missense vs nonsense) in different types of genes (essential vs non-essential, housekeeping vs. stress-regulated...) cause different effects.
  
  This is not to say that the paper has no merit - far from it even. But, in its current form, it is a bit chaotic. Maybe there is simply too much in the paper? To me, it would already help if the authors would explicitly state that the paper is a "methods" paper that describes a novel technique for studying the effects of mutations on protein abundance, and then goes on to demonstrate the possibilities of the technology by giving a few examples of the phenomena that can be studied. The discussion section ends in this way, but it may be helpful if this was moved to the end of the introduction.
  
  We modified the manuscript as suggested.
  
  Reviewer #2 (Public Review):
  
  Schubert et al. describe a new pooled screening strategy that combines protein abundance measurements of 11 proteins determined via FACS with genome-wide mutagenesis of stop codons and missense mutations (achieved via a base editor) in yeast. The method allows identification of genetic perturbations that affect steady state protein levels (vs transcript abundance), and in this way defines regulators of protein abundance. The authors find that perturbation of essential genes more often alters protein abundance than perturbation of nonessential genes, and that proteins with core cellular functions more often decrease in abundance in response to genetic perturbations than stress proteins. Genes whose knockouts affected the level of several of the 11 proteins were enriched in protein biosynthetic processes, while genes whose knockouts affected specific proteins were enriched for functions in transcriptional regulation. The authors also leverage the dataset to confirm known and identify new regulatory relationships, such as a link between the SPS amino acid sensor and the stress response gene Yhb1 or between Ras/PKA signalling and GAPDH isoenzymes Tdh1, 2, and 3. In addition, the paper contains a section on benchmarking of the base editor in yeast, where it has not been used before.
  
  Strengths and weaknesses of the paper
  
  The authors establish the BE3 base editor as a screening tool in S. cerevisiae and very thoroughly benchmark its functionality for single edits and in different screening formats (fitness and FACS screening). This will be very beneficial for the yeast community.
  
  The strategy established here allows measuring the effect of genetic perturbations on protein abundances in highly complex libraries. This complements capabilities for measuring effects of genetic perturbations on transcript levels, which is important as for some proteins mRNA and protein levels do not correlate well. The ability to measure proteins directly therefore promises to close an important gap in determining all their regulatory inputs. The strategy is furthermore broadly applicable beyond the current study. All experimental procedures are very well described and plasmids and scripts are openly shared, maximizing utility for the community.
  
  There is a good balance between global analyses aimed at characterizing properties of the regulatory network and more detailed analyses of interesting new regulatory relationships. Some of the key conclusions are further supported by additional experimental evidence, which includes re-making specific mutations and confirming their effects on protein levels by mass spectrometry.
  
  The conclusions of the paper are mostly well supported, but I am missing some analyses on reproducibility and potential confounders and some of the data analysis steps should be clarified.
  
  The paper starts on the premise that measuring protein levels will identify regulators and regulatory principles that would not be found by measuring transcripts, but since the findings are not discussed in light of studies looking at mRNA levels it is unclear how the current study extends knowledge regarding the regulatory inputs of each protein.
  
  See response to Comment #10.
  
  Specific comments regarding data analysis, reproducibility, confounders
  
  1) The authors use the number of unique barcodes per guide RNA rather than barcode counts to determine fold-changes. For reliable fold changes the number of unique barcodes per gRNA should then ideally be in the 100s for each guide, is that the case? It would also be important to show the distribution of the number of barcodes per gRNA and their abundances determined from read counts. I could imagine that if the distribution of barcodes per gRNA or the abundance of these barcodes is highly skewed (particularly if there are many barcodes with only few reads) that could lead to spurious differences in unique barcode number between the high and low fluorescence pool. I imagine some skew is present as is normal in pooled library experiments. The fold-changes in the control pools could show whether spurious differences are a problem, but it is not clear to me if and how these controls are used in the protein screen.
  
  Because of the large number of screens performed in this study (11 proteins, with 8 replicates for each) we had to trade off sequencing depth and power against cell sorting time and sequencing cost, resulting in lower read and barcode numbers than what might be ideally aimed for. As described further in the response to Comment #5, we added a new figure to the manuscript that shows that the correlation of fold-changes between replicates is high (Figure 3–S1A). The second figure below shows that the correlation between the number of unique barcodes and the number of reads per gRNA is highly significant (p < 2.2e-16).
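  
  As a rough illustration of the quantity under discussion, the sketch below computes a per-gRNA log2 fold-change from the number of unique barcodes in the high and low fluorescence pools, normalized by the total number of unique barcodes in each pool. It is not the authors' script: the function names, the read format, the pseudocount, and the toy reads are assumptions made for this example.

```python
import math
from collections import defaultdict


def unique_barcodes_per_guide(reads):
    """Count distinct barcodes per gRNA.

    reads: iterable of (guide_id, barcode) pairs, one pair per sequencing read.
    Repeated reads of the same barcode are counted once.
    """
    seen = defaultdict(set)
    for guide, barcode in reads:
        seen[guide].add(barcode)
    return {guide: len(barcodes) for guide, barcodes in seen.items()}


def log2_fold_change(high_reads, low_reads, pseudocount=1.0):
    """log2(high/low) of unique-barcode counts, each scaled by its pool total.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for guides that are absent from one pool.
    """
    high = unique_barcodes_per_guide(high_reads)
    low = unique_barcodes_per_guide(low_reads)
    total_high = sum(high.values())
    total_low = sum(low.values())
    result = {}
    for guide in set(high) | set(low):
        h = (high.get(guide, 0) + pseudocount) / total_high
        l = (low.get(guide, 0) + pseudocount) / total_low
        result[guide] = math.log2(h / l)
    return result


# Hypothetical toy reads: g1 has 3 unique barcodes in the high pool, 1 in the low pool.
high_reads = [("g1", "bc1"), ("g1", "bc1"), ("g1", "bc2"), ("g1", "bc3"), ("g2", "bc4")]
low_reads = [("g1", "bc5"), ("g2", "bc6"), ("g2", "bc7"), ("g2", "bc8")]
fold_changes = log2_fold_change(high_reads, low_reads)
```

  Because only distinct barcodes are counted, a barcode with many reads and a barcode with a single read contribute equally, which is why the reviewer's question about skewed barcode abundances matters for this estimator.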
  
  2) I like the idea of using an additional barcode (plasmid barcode) to distinguish between different cells with the same gRNA - this would directly allow to assess variability and serve as a sort of replicate within replicate. However, this information is not leveraged in the analysis. It would be nice to see an analysis of how well the different plasmid barcodes tagging the same gRNA agree (for fitness and protein abundance), to show how reproducible and reliable the findings are.
  
  We agree with the reviewer that this would be nice to do in principle, but our sequencing depth for the sorted cell populations was not high enough to compare the same barcode across the low/unsorted/high samples. See also our response to Comment #5 for the replicate analyses.
  
  3) From Fig 1 and previous research on base editors it is clear that mutation outcomes are often heterogeneous for the same gRNA and comprise a substantial fraction of wild-type alleles, alleles where only part of the Cs in the target window or where Cs outside the target window are edited, and non C-to-T edits. How does this reflect on the variability of phenotypic measurements, given that any barcode represents a genetically heterogeneous population of cells rather than a specific genotype? This would be important information for anyone planning to use the base editor in future.
  
  We agree with the reviewer that the heterogeneity of editing outcomes is an important point to keep in mind when working with base editors. In genetic screens, like the ones described here, often the individual edit is less important, and the overall effects of the base editor are specific/localized enough to obtain insights into the effects of mutations in the area where the gRNA targets the genome. For example, in our test screens for Canavanine resistance and fitness effects, in which we used gRNAs predicted to introduce stop codons into the CAN1 gene and into essential genes, respectively, we see the expected loss-of-function effect for a majority of the gRNAs (canavanine screen: expected effect for 67% of all gRNAs introducing stop codons into CAN1; fitness screen: expected effect for 59% of all gRNAs introducing stop codons into essential genes) (Figure 2). In the canavanine screen, we also see that gRNAs predicted to introduce missense mutations at highly conserved residues are more likely to lead to a loss-of-function effect than gRNAs predicted to introduce missense mutations at less conserved residues, further highlighting the differentiated results that can be obtained with the base editor despite the heterogeneity in editing outcomes overall. We would certainly advise anyone to confirm by sequencing the base edits in individual mutants whenever a precise mutation is desired, as we did in this study when following up on selected findings with individual mutants.
  
  4) How common are additional mutations in the genome of these cells and could they confound the measured effects? I can think of several sources of additional mutations, such as off-target editing, edits outside the target window, or when 2 gRNA plasmids are present in the same cell (both target windows obtain edits). Could some of these events explain the discrepancy in phenotype for two gRNAs that should make the same mutation (Fig S4)? Even though BE3 has been described in mammalian cells, an off-target analysis would be desirable as there can be substantial differences in off-target behavior between cell types and organisms.
  
  Generally, we are not very concerned about random off-target activity of the base editor because we would not expect this to cause a consistent signal that would be picked up in our screen as a significant effect of a particular gRNA. Reproducible off-target editing with a specific gRNA at a site other than the intended target site would be problematic, though. We limited the chance of this happening by not using gRNAs that may target similar sequences to the intended target site in the genome. Specifically, we excluded gRNAs that have more than one target in the genome when the 12 nucleotides in the seed region (directly upstream of the PAM site) are considered (DiCarlo et al., Nucleic Acids Research, 2013).
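  
  The seed-based exclusion rule described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names, the NGG PAM requirement, the search on both strands, and the toy genome are assumptions of this example.

```python
import re

SEED_LEN = 12  # seed region: the 12 nt directly upstream of the PAM


def reverse_complement(seq):
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def count_seed_sites(seed, genome):
    """Count PAM-adjacent occurrences of a seed on both strands.

    A site is the seed followed by an NGG PAM. A lookahead is used so that
    overlapping occurrences are all counted.
    """
    pattern = re.compile("(?=" + re.escape(seed) + "[ACGT]GG)")
    total = 0
    for strand in (genome, reverse_complement(genome)):
        total += sum(1 for _ in pattern.finditer(strand))
    return total


def filter_unique_guides(guides, genome):
    """Keep guides whose seed (3' end of the protospacer) has exactly one site."""
    return [g for g in guides if count_seed_sites(g[-SEED_LEN:], genome) == 1]


# Hypothetical toy data: seed_a has one PAM-adjacent site, seed_b has two.
seed_a = "ACGTACGTTGCA"
seed_b = "TTGACCATGCAA"
guide_a = "GGGGAAAA" + seed_a
guide_b = "CCCCTTTT" + seed_b
genome = ("AT" + guide_a + "TGG" + "ATATAT" + seed_b + "AGG"
          + "ATAT" + seed_b + "CGG" + "AT")
kept = filter_unique_guides([guide_a, guide_b], genome)
```

  In this toy case only `guide_a` is kept, because the seed of `guide_b` matches two PAM-adjacent sites.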
  
  We do observe some off-target editing right outside the target window, but generally at much lower frequency than the on-target editing in the target window (Figure 1B and Figure 1–S2). Since for most of our analyses we grouped perturbations per gene, such off-target edits should not affect our findings. In addition, we validated key findings with independent experiments. For our study, we used the Base Editor v3 (Komor et al., Nature, 2016); more recently, additional base editors have been developed that show improved accuracy and efficiency, and we would recommend these base editors when starting a new study (see, e.g., Anzalone et al., Nature Biotechnology, 2020).
  
  We are not concerned about cases in which one cell gets two gRNAs, since the chance that the same two gRNAs end up in one cell repeatedly is low, and such events would therefore not result in a significant signal in our screens.
  
  We don’t think that off-target mutations can explain the discrepancy between pairs of gRNAs that should introduce the same mutation (Figure 3–S1). The effect of the two gRNAs is actually well-correlated, but, often, one of the two gRNAs doesn’t pass our significance cut-off or simply doesn’t edit efficiently (i.e., most discrepancies arise from false negatives rather than false positives). We may therefore miss the effects of some mutations, but we are unlikely to draw erroneous conclusions from significant signals.
  
  5) In the protein screen normalization uses the total unique barcode counts. Does this efficiently correct for differences from sequencing (rather than total read counts or other methods)? It would be nice to see some replicate plots for the analysis of the fitness as well as the protein screen to be able to judge that.
  
  We made a new figure that shows a replicate comparison for the protein screen (see below; in the manuscript it is Figure 3–S1A) and commented on it in the manuscript. For this analysis, the eight replicates for each protein were split into two groups of four replicates each and analyzed the same way as the eight replicates. The correlation between the two groups of replicates is highly significant (p < 2.2e-16). The second figure shows that the total number of reads and the total number of unique barcodes are well correlated.
  
  For the fitness screen, we used read counts rather than barcode counts for the analysis since read counts better reflect the dropout of cells due to reduced fitness. The figure below shows a replicate comparison for the fitness screen. For this analysis, the four replicates were split into two groups of two replicates each and analyzed the same way as the four replicates. The correlation between the two groups of replicates is highly significant (p < 2.2e-16).
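  
  The split-replicate comparison described above can be sketched as follows. This is a simplified illustration under stated assumptions: each group of replicates is summarized by the mean per-gRNA fold-change (the authors instead re-ran their full analysis on each group), and the function names and toy values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_correlation(replicate_effects):
    """Correlate mean per-gRNA effects between two halves of the replicates.

    replicate_effects: list of replicates, each a list of per-gRNA log2
    fold-changes in the same gRNA order.
    """
    half = len(replicate_effects) // 2
    group_a = replicate_effects[:half]
    group_b = replicate_effects[half:]
    mean_a = [sum(vals) / len(vals) for vals in zip(*group_a)]
    mean_b = [sum(vals) / len(vals) for vals in zip(*group_b)]
    return pearson_r(mean_a, mean_b)


# Hypothetical toy data: four replicates, three gRNAs each.
replicates = [
    [1.0, 0.0, -1.0],
    [1.2, 0.1, -0.9],
    [0.9, -0.1, -1.1],
    [1.1, 0.0, -1.0],
]
r = split_half_correlation(replicates)
```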
  
  6) In the main text the authors mention very high agreement between gRNAs introducing the same mutation but this is only based on 20 or so gRNA pairs; for many more pairs that introduce the same mutation only one reaches significance, and the correlation in their effects is lower (Fig S4). It would be better to reflect this in the text directly rather than exclusively in the supplementary information.
  
  We clarified this in the manuscript main text: “For 78 of these gRNA pairs, at least one gRNA had a significant effect (FDR < 0.05) on at least one of the eleven proteins; their effects were highly correlated (Pearson’s R2 = 0.43, p < 2.2E-16) (Figure 3–S1B). For the 20 gRNA pairs for which both gRNAs had a significant effect, the correlation was even higher (Pearson’s R2 = 0.819, p = 8.8e-13) (Figure 3–S1C). These findings show that the significant gRNA effects that we identify have a low false positive rate, but they also suggest that many real gRNA effects are not detected in the screen due to limitations in statistical power.”
  
  7) When the different gRNAs for a targeted gene are combined, instead of using an averaged measure of their effects the authors use the largest fold-change. This seems not ideal to me as it is sensitive to outliers (experimental error or background mutations present in that strain).
  
  We agree that the method we used is more sensitive to outliers than averaging per gene. However, because many gRNAs have no effect either because they are not editing efficiently or because the edit doesn’t have a phenotypic consequence, an averaging method across all gRNAs targeting the same gene would be too conservative and not properly capture the effect of a perturbation of that gene.
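  
  The trade-off between the two aggregation rules can be made concrete with a small sketch. The function name, the gene labels, and the fold-change values below are made up for illustration and are not taken from the study.

```python
from collections import defaultdict


def aggregate_per_gene(guide_effects, how="max_abs"):
    """Summarize per-gRNA log2 fold-changes into one value per gene.

    guide_effects: iterable of (gene, log2_fold_change) pairs.
    how: "max_abs" keeps the fold-change with the largest magnitude,
         "mean" averages over all gRNAs targeting the gene.
    """
    by_gene = defaultdict(list)
    for gene, log2fc in guide_effects:
        by_gene[gene].append(log2fc)
    result = {}
    for gene, values in by_gene.items():
        if how == "max_abs":
            result[gene] = max(values, key=abs)
        elif how == "mean":
            result[gene] = sum(values) / len(values)
        else:
            raise ValueError("unknown aggregation: " + how)
    return result


# Hypothetical example: three of four gRNAs for GENE_A do not edit or have no effect.
effects = [
    ("GENE_A", 0.0), ("GENE_A", 0.0), ("GENE_A", 0.0), ("GENE_A", -2.0),
    ("GENE_B", 0.5), ("GENE_B", 0.5),
]
by_max = aggregate_per_gene(effects, how="max_abs")
by_mean = aggregate_per_gene(effects, how="mean")
```

  Here the single effective gRNA for `GENE_A` is preserved by the largest-effect rule but diluted fourfold by averaging, which is the authors' argument; the same property makes the rule sensitive to a single outlier, which is the reviewer's concern.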
  
  8) Phenotyping is performed directly after editing, when the base editor is still present in the cells and could still interact with target sites. I could imagine this could lead to reduced levels of the proteins targeted for mutagenesis as it could act like a CRISPRi transcriptional roadblock. Could this enhance some of the effects or alter them in case of some missense mutations?
  
  To reduce potential “CRISPRi-like” effects of the base editor on gene expression, we placed the base editor under a galactose-inducible promoter. For both the fitness and protein screens we grew the cultures in media without galactose for another 24 hours (fitness screen) or 8-9 hours (protein screens) before sampling. In the latter case, this recovery time corresponded to more than three cell divisions, after which we assume base editor levels to have strongly decreased, and therefore to no longer interfere with transcription. This is also supported by our ability to detect discordant effects of gRNAs targeting the same gene (e.g., the two mutations leading to loss-of-function and gain-of-function of RAS2), which would otherwise be overshadowed by a CRISPRi effect.
  
  9) I feel that the main text does not reflect the actual editing efficiency very well (the main numbers I noticed were 95% C to T conversion and 89% of these occurring in a specific window). More informative for interpreting the results would be to know what fraction of the alleles show an edit (vs wild-type) and how many show the 'complete' edit (as the authors assume 100% of the genotypes generated by a gRNA to be conversion of all Cs to Ts in the target window). It would be important to state in the main text how variable this is for different gRNAs and what the typical purity of editing outcomes is.
  
  We now show the editing efficiency and purity in a new figure (Figure 1B), and discuss it in the main text as follows: “We found that the target window and mutagenesis pattern are very similar to those described in human cells: 95% of edits are C-to-T transitions, and 89% of these occurred in a five-nucleotide window 13 to 17 base pairs upstream of the PAM sequence (Figure 1A; Figure 1–S2) (Komor et al., 2016). Editing efficiency was variable across the eight gRNAs and ranged from 4% to 64% if considering only cases where all Cs in the window are edited; percentages are higher if incomplete edits are considered, too (Figure 1B).”
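  
  To make the "complete edit" assumption explicit, the sketch below converts every C in the five-nucleotide window to T. It assumes a protospacer written 5' to 3' with the PAM immediately at its 3' end, and it assumes that position 1 is the base adjacent to the PAM; the function names and the example sequence are hypothetical.

```python
WINDOW_NEAR = 13  # edge of the window closest to the PAM, in bp upstream of the PAM
WINDOW_FAR = 17   # edge of the window farthest from the PAM


def edit_window_indices(protospacer_len=20):
    """0-based indices of the editing window within the protospacer.

    A base d bp upstream of the PAM sits at index protospacer_len - d.
    """
    return range(protospacer_len - WINDOW_FAR, protospacer_len - WINDOW_NEAR + 1)


def predicted_full_edit(protospacer):
    """Return the sequence with every C in the window converted to T."""
    seq = list(protospacer.upper())
    for i in edit_window_indices(len(seq)):
        if seq[i] == "C":
            seq[i] = "T"
    return "".join(seq)


# Hypothetical 20-nt protospacer with Cs inside and outside the window.
protospacer = "ACCCAGCACATTTTGGGAAA"
edited = predicted_full_edit(protospacer)
```

  For a 20-nt protospacer this window corresponds to positions 4 to 8 counted from the 5' end. As the response notes, real outcomes are a mixture of complete, partial, and unedited alleles, so this is the idealized genotype rather than the expected population.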
  
  Comments regarding findings
  
  10) It would be nice to see a comparison of the results to the effects of ~1500 yeast gene knockouts on cellular transcriptomes (https://doi.org/10.1016/j.cell.2014.02.054). This would show where the current study extends established knowledge regarding the regulatory inputs of each protein and highlight the importance of directly measuring protein levels. This would be particularly interesting for proteins whose abundance cannot be predicted well from mRNA abundance.
  
  We agree with the reviewer that it would be very interesting to compare the effect of perturbations on mRNA vs protein levels. We have compared our protein-level data to mRNA-level data from Kemmeren and colleagues (Kemmeren et al., Cell 2014), and we find very good agreement between the effects of gene perturbations on mRNA and protein levels when considering only genes with q < 0.05 and Log2FC > 0.5 in both studies (Pearson’s R = 0.79, p < 5.3e-15).
  
  Gene perturbations with effects detected only on mRNA but not protein levels are enriched in genes with a role in “chromatin organization” (FDR = 0.01; as a background for the analysis, only the 1098 genes covered in both studies were considered). This suggests that perturbations of genes involved in chromatin organization tend to affect mRNA levels but are then buffered and do not lead to altered protein levels. There was no enrichment of functional annotations among gene perturbations with effects on protein levels but not mRNA levels.
  
  We did not include these results in the manuscript because there are some limitations to the conclusions that can be drawn from these comparisons, including that our study has a relatively high number of false negatives, and that the genes perturbed in the Kemmeren et al. study were selected to play a role in gene regulation, meaning that differences in mRNA-vs-protein effects of perturbations are limited to this function, and other gene functions cannot be assessed.
  
  11) The finding that genes that affect only one or two proteins are enriched for roles in transcriptional regulation could be a consequence of 'only' looking at 10 proteins rather than a globally valid conclusion. Particularly as the 10 proteins were selected for diverse functions that are subject to distinct regulatory cascades. ('only' because I appreciate this was a lot of work.)
  
  We agree with this, and we think it is clear in the abstract and the main text of the manuscript that here we studied 11 proteins. We made this point also more explicit in the discussion, so that it is clear for readers that the findings are based on the 11 proteins and may not extrapolate to the entire yeast proteome.
  
  Reviewer #3 (Public Review):
  
  This manuscript presents two main contributions. First, the authors modified a CRISPR base editing system for use in an important model organism: budding yeast. Second, they demonstrate the utility of this system by using it to conduct an extremely high throughput study of the effects of mutation on protein abundance. This study confirms known protein regulatory relationships and detects several important new ones. It also reveals trends in the type of mutations that influence protein abundances. Overall, the findings are of high significance and the method appears to be extremely useful. I found the conclusions to be justified by the data.
  
  One potential weakness is that some of the methods are not described in the main body of the paper, so the reader has to really dive into the methods section to understand particular aspects of the study, for example, how the fitness competition was conducted.
  
  We expanded the first section for better readability.
  
  Another potential weakness is the comparison of this study (of protein abundances) to previous studies (of transcript abundances) was a little cursory, and left some open questions. For example, is it remarkable that the mutations affecting protein abundance are predominantly in genes involved in translation rather than transcription, or is this an expected result of a study focusing on protein levels?
  
  We thank the reviewer for pointing out that this paragraph requires more explanation. We expanded it as follows: “Of these 29 genes, 21 (72%) have roles in protein translation—more specifically, in ribosome biogenesis and tRNA metabolism (FDR < 8.0e-4, Figure 5C). In contrast, perturbations that affect the abundance of only one or two of the eleven proteins mostly occur in genes with roles in transcription (e.g., GO:0006351, FDR < 1.3e-5). Protein biosynthesis entails both transcription and translation, and these results suggest that perturbations of translational machinery alter protein abundance broadly, while perturbations of transcriptional machinery can tune the abundance of individual proteins. Thus, genes with post-transcriptional functions are more likely to appear as hubs in protein regulatory networks, whereas genes with transcriptional functions are likely to show fewer connections.”
  
  Overall, the strengths of this study far outweigh these weaknesses. This manuscript represents a very large amount of work and demonstrates important new insights into protein regulatory networks.
  
URL

biorxiv.org/content/10.1101/2022.03.09.483657v1
www.biorxiv.org www.biorxiv.org

New submission 30/12/2022, 17:20:53

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  The authors seek to determine how various species combine their effects on the growth of a species of interest when part of the same community.
  
  To this end, the authors carry out an impressive experiment containing what I believe must be one of the largest pairwise + third-order co-culture experiments done to date, using a high-throughput co-culture system they had co-developed in previous work. The unprecedented nature of this data is a major strength of the paper. The authors also discover that species combine their effect through "dominance", i.e. the strongest effect masks the others. This is important as it calls into question the common assumption of additivity that is implicit in the choice of using Lotka-Volterra models.
  
  A stronger claim (i.e. in the abstract) is that the joint effect of multiple species on the growth of another can be derived from the effect of individual species. Unless I am misunderstanding something, this statement may have to be qualified a little, as the authors show that a model based on pairwise dominance (i.e. the strongest pairwise effect) does a somewhat better job (lower RMSD, though granted, not by much, 0.57 vs 0.63) than a model based on single species dominance. That is, the effect of the strongest pair predicts the effect of a trio better than the effect of the strongest single species does.
  
  This issue makes one wonder whether, had the authors included higher-order combinations of species (i.e. five-member consortia or higher), the strongest-effect trio would have predicted better than the strongest-effect pair, which in turn is a better predictor than the strongest-effect species. This is important, as it would help one determine to what extent the strongest-effect model would work in more diverse communities, such as those one typically finds in nature. Indeed, the authors find that the predictive ability of the strongest effect species is much stronger for pairs than it is for trios (RMSD of 0.28 vs 0.63). Does the predictive ability of the single species model decline faster and faster as diversity grows beyond 4-member consortia?
  
  Thank you for raising this important point. It is true that in our study we see that single species predict pairs better than trios, and that pairs predict trios better than single species. As we did not perform experiments on more diverse communities (n>4), we are not sure if or how these rules will scale up. We explicitly address these caveats in our revised discussion.
  
  Reviewer #3 (Public Review):
  
  A problem in synthetic ecology is that one can't brute-force complex community design because combinatorics make it basically impossible to screen all possible communities from a bank of possible species. Therefore, we need a way to predict phenomena in complex communities from phenomena in simple communities. This paper aims to improve this predictive ability by comparing a few different simple models applied to a large dataset obtained with the use of the authors' "kchip" microfluidics device. The main question they ask is whether the effect of two species on a focal species is predicted from the mean, the sum, or the max of the effect of each single "affecting" species on the focal species. They find that the max effect is often the best predictor, in the sense of minimizing the difference between predicted effect and measured effect. They also measure single-species trait data for their library of strains, including resource niche and antibiotic resistance, and then find that Pearson correlations between distance calculations generated from these metrics and the effect of added species are weak and unpredictive. This work is largely well-done, timely and likely to be of high interest to the field, as predicting ecosystem traits from species traits is a major research aim.
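  
  The three candidate models compared in the paper can be written down in a few lines. The sketch below is illustrative only: it interprets "max" as the single-species effect with the largest magnitude, and the function names and the toy effect values are assumptions, not data from the study.

```python
import math


def predict_pair(effect_1, effect_2, model):
    """Predict the joint effect of two species from their single-species effects."""
    if model == "additive":
        return effect_1 + effect_2
    if model == "mean":
        return (effect_1 + effect_2) / 2
    if model == "strongest":
        # The effect with the larger magnitude masks the other.
        return effect_1 if abs(effect_1) >= abs(effect_2) else effect_2
    raise ValueError("unknown model: " + model)


def rmsd(predicted, observed):
    """Root-mean-square deviation between predictions and measurements."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


# Hypothetical single-species effects and measured pair effects.
single_effects = [(-1.0, -0.5), (-2.0, 0.5), (0.25, 0.5)]
observed_pairs = [-1.0, -2.0, 0.5]

scores = {}
for model in ("additive", "mean", "strongest"):
    predictions = [predict_pair(a, b, model) for a, b in single_effects]
    scores[model] = rmsd(predictions, observed_pairs)
```

  The toy measurements were chosen so that the strongest-effect model fits exactly; with real data the ranking of the models is an empirical question, which is the point of the reviewer's criticism that follows.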
  
  My main criticism is that the main take-home from the paper (fig 3B)-that the strongest effect is the best predictor-is oversold. While it is true that, averaged over their six focal species, the "strongest effect" was the best overall predictor, when one looks at the species-specific data (S9), we see that it is not the best predictor for 1/3 of their focal species, and this fraction grows to 1/2 if one considers a difference in nRMSE of 0.01 to be negligible.
  
  As suggested, we have softened our language regarding the take-home message. This matter is addressed in detail above in response to 'Essential Revisions'. Briefly, we see that the strongest model works best when both single species have qualitatively similar effects, but is slightly less accurate when effects are mixed. We also see overall less accurate predictions for positive effects. In light of these findings, we propose that the cases in which the strongest model is not the most accurate for a focal species are due to the interaction types, and are not specific to the focal species.
  
  We made substantial changes to the manuscript, including the first paragraph of the discussion which more accurately describes these findings and emphasizes the relevant caveats:
  
  "By measuring thousands of simplified microbial communities, we quantified the effects of single species, pairs, and trios on multiple focal species. The most accurate model, overall and specifically when both single species effects were negative, was the strongest effect model. This is in stark contrast to models often used in antibiotic compound combinations, despite most effects being negative, where additivity is often the default model (Bollenbach 2015). The additive model performed well for mixed effects (i.e. one negative and one positive), but only slightly better than the strongest model, and poorly when both species had effects of the same sign. When both single species’ effects were positive, the strongest model was also the best, though the difference was less pronounced and all models performed worse for these interactions. This may be due to the small effect size seen with positive effects, as when we limited negative and mixed effects to a similar range of effects strength, their accuracy dropped to similar values (Figure 3–Figure supplement 5). We posit that the difference in accuracy across species is affected mainly by the effect type dominating different focal species' interactions, rather than by inherent species traits (Figure 3–Figure supplement 6)." (Lines 288-304)
  
  The same criticism applies to the result from figure 2-that pairs of affecting species have more negative effects than single species. Considered across all focal species this is true (though minor in effect size, Fig 2A). But there is only a significant effect within two individual species. Again, this points to the effects being focal-species-specific, and perhaps not as generalizable as is currently being claimed.
  
  Upon more rigorous analysis, and with regard to changes in the dataset after filtering, we see that the more accurate statement is that effects become stronger, not necessarily more negative (in line with the accuracy of the strongest model). The overall trend is towards more negative interactions, due to the majority of interactions being negative, but as stated this is not true for each individual focal. As such the following sentence in the manuscript has been changed:
  
  "The median effect on each focal was more negative by 0.28 on average, though the difference was not significant in all cases; additionally, focals with mostly positive single species interactions showed a small increase in median effect (Fig. 2D)" (Lines 151-154)
  
  As well as the title of this section: "Joint effects of species pairs tend to be stronger than those of individual affecting species" (Lines 127-128)
  
  Another thing that points to a focal-species-specific response is Fig 2D, which shows the distributions of responses of each focal species to pairs. Two of these distributions are unimodal, one appears bimodal, and three appear tri-modal. This suggests to me that the focal species respond in categorically different ways to species addition.
  
  We believe this distribution of pair effects is related to the distribution of single species effects, and not to the way in which different focal species respond to the addition of second species. Though this may be difficult to see from the swarm plots shown in the paper, below is a split violin plot that emphasizes this point.
  
  Fig R1: Distribution of single species and pair effects. Distribution of the effect of single and pairs of affecting species for each focal species individually. Dashed lines represent the median, while dotted lines the interquartile range.
  
  These differences occur even though the focal bacteria are all from the same family. This suggests to me that the generalizability may be even less when a more phylogenetically dispersed set of focal species is used.
  
  We have added the following sentence to the discussion explicitly emphasizing the phylogenetic limitations of our study:
  
  "Lastly, it is important to note that our focal species are all from the same order (Enterobacterales), which may also limit the purview of our findings." (Lines 364-366)
  
  Considering these points together, I argue that the conclusion should be shifted from "strongest effect is the best" to "in 3 of our focal species, strongest effect was the best, but this was not universal, and with only 6 focal species, we can't know if it will always be the best across a set of focal species".
  
  As mentioned above, we have softened our language regarding the take-home message in response to these evaluations.
  
  My second main criticism is that it is hard to understand exactly how the trait data were used to predict effects. It seems like it was just Pearson correlation coefficients between interspecies niche distances (or antibiotic distances) and the effect. I'm not very surprised these correlations were unpredictive, because the underlying measurements don't seem to be relevant to the environment tested. What if, rather than using niche data across 20 nutrients, only the growth data on glucose (the carbon source in the experiments) was used? I understand that in a field experiment, for example, one might not know what resources are available, and so measuring niche across 20 resources may be the best thing to do. Here though it seems imperative to test using the most relevant data.
  
  It is true that much of the profiling data is not directly related to the experimental conditions (different carbon sources and antibiotics), but in addition to these we do use measurements from experiments carried out in the same environment as the interaction assays (i.e. growth rate and carrying capacity when growing on glucose), which also showed poor correlation with the effects on focals. Additionally, we believe that these profiles contain relevant information regarding metabolic similarity between species (similar to metabolic models often constructed computationally). To improve clarity, we added the following sentence to the figure legend of Figure 3–Figure supplement 1:
  
  "The growth rate, and maximum OD shown in panel A were measured only in M9 glucose, similar to conditions used in the interaction assays." (Lines 591-592)
  
  Additionally and relatedly, it would be valuable to show the scatterplots leading to the conclusion that trait data were uninformative. Pearson's r only works on an assumption of linearity. But there could be strong relationships between the trait data and effect that are monotonic but not linear, or even that are non-monotonic yet still strong (e.g. U-shaped). For the first case, I recommend switching to Spearman's rho over Pearson's r, because it only assumes monotonicity, not linearity. If there are observable relationships that are not monotonic, a different test should be used.
  
  Per your suggestion, we have changed the measurement of correlation in this analysis from Pearson's r to Spearman's rho. As we observed similar, and still mostly weak, correlations, we did not investigate these relationships further. See Figure 3–Figure supplement 1.
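  
  The reviewer's distinction between linear, monotonic, and non-monotonic relationships can be illustrated with a small self-contained sketch. The helper names and the toy data are assumptions of this example; Spearman's rho is computed here as the Pearson correlation of average ranks.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))


# Monotonic but nonlinear: rho detects a perfect association, r does not.
x_mono = [1, 2, 3, 4, 5]
y_mono = [math.exp(v) for v in x_mono]

# U-shaped: a strong dependence that neither coefficient detects.
x_u = [-2, -1, 0, 1, 2]
y_u = [v * v for v in x_u]
```

  The U-shaped case shows why switching to Spearman's rho addresses only the first of the reviewer's two scenarios; a non-monotonic dependence needs a different test or a visual inspection of the scatterplots.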
  
  Additionally, we generated heat maps including scatterplots mapping the data leading to these correlations. We found no notable dependency in these plots, and visually they were quite crowded and difficult to interpret. As this is not the central point of our study, we ultimately decided against adding this information to the plots.
  
  In general, I think the analyses using the trait data were too simplistic to conclude that the trait data are not predictive.
  
  We agree that more sophisticated analyses may help connect between species traits and their effects on focal species. In fact, other members of our research group have recently used machine learning to accomplish similar predictions (https://doi.org/10.1101/2022.08.02.502471). As such we have changed the wording to reflect that this correlation is difficult to find using simple analyses:
  
  "These results indicate that it may be challenging to connect the effects of single and pairs of species on a focal strain to a specific trait of the involved strains, using simple analysis." (Lines 157-159)
  
URL

biorxiv.org/content/10.1101/2022.09.01.506178v2
www.biorxiv.org www.biorxiv.org

New submission 19/12/2022, 15:09:10

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Slusarczyk et al. present a very well-written manuscript focused on understanding the mechanisms underlying aging of erythrophagocytic macrophages in the spleen (RPM) and its relationship to iron loading with age. The manuscript is diffuse with a broad swath of data elements. Importantly, the manuscript demonstrates that RPM erythrophagocytic capacity is diminished with age and restored in aged mice fed an iron-restricted diet. In addition, the mechanism for declining RPM erythrophagocytic capacity appears to be ferroptosis-mediated, insensitive to heme as it is to iron, and to occur independently of ROS generation. These are compelling findings. However, some of the conclusions rely on conjecture, and a causal association is not clearly established. The main conclusion of the manuscript points to the accumulation of unavailable insoluble forms of iron as both causing and resulting from decreased RPM erythrophagocytic capacity.
  
  We are proposing that intracellular iron accumulation progresses first and leads to global proteotoxic damage and increased lipid peroxidation. This eventually triggers the death of a fraction of aging RPMs, thus promoting the formation of extracellular iron-rich protein aggregates. More explanation can be found below. Besides, iron loading suppresses the erythrophagocytic activity of RPMs, hence further contributing to their functional impairment during aging.
  
  In addition, the finding that IR diet leads to increased TF saturation in aged mice is surprising.
  
  We believe that this observation implies better mobilization of splenic iron stores, and corroborates our conclusion that mice that age on an iron-reduced diet benefit from higher iron bioavailability, although these differences are relatively mild. More explanation can be found in our replies to Reviewer #2.
  
  Furthermore, whether the finding in RPMs is intrinsic or related to RBC-related changes with aging is not addressed.
  
  We have now addressed this issue by characterizing in more detail both iron and ROS levels in RBCs.
  
  Finally, these findings in a single strain and only female mice are intriguing but warrant tempered conclusions.
  
  We tempered the conclusions and provided a basic characterization of the RPM aging phenotype in Balb/c female mice.
  
  Major points:
  
  1) The main concern is that there is no clear explanation of why iron increases during aging although the authors appear to be saying that iron accumulation is both the cause of and a consequence of decreased RPM erythrophagocytic capacity. This requires more clarification of the main hypothesis on Page 4, line 17-18.
  
  We thank the reviewer for this comment. It was previously reported that iron accumulates substantially in the spleen during aging, especially in female mice (Altamura et al., 2014). Since RPMs are the cells that process most of the iron in the spleen, we aimed to explore the relationship between iron accumulation and RPM functions during aging. This investigation led us to uncover that iron accumulation is indeed both the cause and the consequence of RPM dysfunction. Specifically, we propose that intracellular iron loading of RPMs precedes extracellular deposition of iron in the form of protein-rich aggregates, driven by RPM damage. To support this, we now show that the proteome of RPMs overlaps with those proteins that are present in the age-triggered aggregates (Fig. 3F). Furthermore, corroborating our model, we now demonstrate that transient iron loading of RPMs via iron-dextran injection (new Fig. 3G) leads to the formation of protein-rich aggregates, closely resembling those present in aged spleens (new Fig. 3H). This implies that high iron content in RPMs is indeed a major driving factor that leads to aggregation of their proteome and cell damage. Importantly, we now supported this model with studies using iRPMs. We demonstrated that iron loading and blockage of ferroportin by synthetic mini-hepcidin (PR73) (Stefanova et al., 2018) cause protein aggregation in iRPMs and lead to their decreased viability only in cells that were exposed to heat shock, a well-established trigger of proteotoxicity (new Fig. 5K and L). We propose that these two factors, namely age-triggered decrease in protein homeostasis and exposure to excessive iron levels, act in concert and render RPMs particularly sensitive to damage during aging (see also Discussion, p. 16).
  
  In parallel, our data imply that the increased iron content in aged RPMs drives their decreased erythrophagocytic activity, as we now better documented by more extensive in vitro experiments in iRPMs (new Fig 6E-H). We cannot exclude that some of the senescent splenic RBCs that are retained in the red pulp and evade erythrophagocytosis due to RPM defects in aging, may also contribute to the formation of the aggregates. This is supported by the fact that mice that lack RPMs as well exhibit iron loading in the spleen (Kohyama et al., 2009; Okreglicka et al., 2021), and that the proteome of aggregates overlaps to some extent with the proteome of erythrocytes (new Fig. 3F).
  
  We believe that during aging intracellular iron accumulation is chiefly driven by ferroportin downregulation, as also suggested by Reviewer #3. We now show that ferroportin drops significantly already in mice aged 4 and 5 months (new Fig. 4H), preceding most of the other impairments. This drop coincides with the increase in hepcidin expression, but whether this is the sole reason for ferroportin suppression during early aging would require further investigation outside the scope of the present manuscript.
  
  In sum, to address this comment, we now modified the fragment of the introduction that refers to our hypothesis and major findings to be more clear (p. 4), we improved our manuscript by providing new data mentioned above and we added more explanation in the corresponding sections of the Results and Discussion.
  
  2) It is unclear if RPMs are in limited supply. Based on the introduction (page 4, line 13-15), they have limited self-renewal capacity and blood monocytes only partially replenished. Fig 4D suggests that there is a decrease in RPMs from aged mice. The %RPM from CD45+ compartment suggests that there may just be relatively more neutrophils or fewer monocytes recruited. There is not enough clarity on the meaning of this data point.
  
  Thank you for this comment. We fully agree that %RPMs of CD45+ splenocytes, although well-accepted in literature (Kohyama et al., 2009; Okreglicka et al., 2021), is only a relative number. Hence, we now included additional data and explanations regarding the loss of RPMs during aging.
  
  It was reported that the proportion of RPMs derived from bone marrow monocytes increases mildly but progressively during aging (Liu et al., 2019). This implies that due to the loss of the total RPM population, as illustrated by our data, the cells of embryonic origin are likely even more affected. We could confirm this assumption by re-analysis of the data from Liu et al. that we now included in the manuscript as Fig. 5E. These data clearly show that the representation of embryonically-derived RPMs drops more drastically than the percent of total RPMs, whereas the replenishment rate from monocytes is not affected significantly during aging. Consistent with this, we have not observed any robust change in the population of monocytes (F4/80-low, CD11b-high) or pre-RPMs (F4/80-high, CD11b-high) in the spleen at the age of 10 months (Figure 5-figure supplement 2A and B). We also have detected a mild decrease, not an increase, in the number of granulocytes (new Figure 5-figure supplement 2C). Furthermore, we measured in situ apoptosis marker and found a clear sign of apoptosis in the aged spleen (especially in the red pulp area), a phenotype that is less pronounced in mice on an IR diet (new Fig. 5O). This is consistent with the observation that apoptosis markers can be elevated in tissues upon ferroptosis induction (Friedmann Angeli et al., 2014) and that the proteotoxic stress in aged RPMs, which we now emphasized better in our manuscript, may also lead to apoptosis (Brancolini & Iuliano, 2020). Taken together, we strongly believe that the functional defect of embryonically-derived RPMs chiefly contributes to their shortage during aging.
  
  3) Anemia of aging is complex and poorly understood mechanistically. In general, it is considered similar to anemia of chronic inflammation with increased Epo, mild drop in Hb, and erythroid expansion, similar to ineffective erythropoiesis / low Epo responsiveness. It is not surprising that IR diet did not impact this mild anemia. However, was the MCV or MCH altered in aged and IR aged mice?
  
  We now included the data for hematocrit, RBC counts, MCV, and MCH in Figure 1-figure supplement 5. Hematocrit shows a similar tendency as hemoglobin levels, but the values for RBC counts, MCV, and MCH seem not to be altered. We also show now that the erythropoietic activity in the bone marrow is not affected in aged versus young mice. Taken together, the anemic phenotype in female C57BL/6J mice at this age is very mild, which we emphasized in the main text, and is likely affected by other factors than serum iron levels (p. 6).
  
  4) Page 6, line 23 onward: the conclusion is that KC compensate for the decreased function of RPM in the spleen, based on the expansion of KC fraction in the liver. Is there evidence that KCs are engaged in more erythrophagocytosis in aged mice? Furthermore, iron accumulation in the liver with age does not demonstrate specifically enhanced erythrophagocytosis of KC. Please clarify why liver iron accumulation would not be simply a consequence of increased parenchymal iron similar to increased splenic iron with age, independent of erythrophagocytic activity in resident macrophages in either organ.
  
  Thanks for these questions. For the quantification of the erythrophagocytosis rate in KC, we show, as for the RPMs (Fig. 1K), the % of PKH67-positive macrophages, following transfusion of PKH67-stained stressed RBCs (Fig. 1M). The data imply a mild (not statistically significant) drop (of approx. 30%) in EP activity. We believe that it is overridden by a more pronounced (on average, 2-fold) increase in the representation of KCs (Fig. 1N). The mechanisms of iron accumulation in the spleen and the liver are very different. In the liver, we observed iron deposition in the parenchymal cells (not non-parenchymal, new Fig. 1P) that we are currently characterizing in more detail in a parallel manuscript. Our data demonstrate a drop in transferrin saturation in aged mice. Hence, it is highly unlikely that aging would be hallmarked by the presence of circulating non-transferrin-bound iron that would be sequestered by hepatocytes, as shown previously (Jenkitkasemwong et al., 2015). Thus, the iron released locally by KCs is the most likely contributor to progressive hepatocytic iron loading during aging. The mechanism of iron delivery to hepatocytes from erythrophagocytosing KCs was demonstrated by Theurl et al. (2016), and we propose that it may be operational, although on a much more prolonged time scale, during aging. We now discussed this part better in our Results sections (p. 7).
  
  5) Unclear whether the effect on RPMs is intrinsic or extrinsic. Would be helpful to evaluate aged iRPMs using young RBC vs. young iRPMs using old RBCs.
  
  We are skeptical that the generation of iRPM cells from aged mice would be helpful: these cells are a specific type of primary macrophage culture, derived from bone marrow monocytes with MCSF1, and exposed additionally to heme and IL-33 for 4 days. We do not expect that bone marrow monocytes are heavily affected by aging, or that they would recapitulate some aspects of aged RPMs from the spleen, especially after 8-day in vitro culture. However, to address the concerns of the reviewer, we now provide additional data regarding RBC fitness. Consistent with the life-span experiment (Fig. 2A), we show that oxidative stress is only increased in splenic, but not circulating, RBCs (new Fig. 2C, replacing the old Fig. 2B and C). In addition, we show no signs of age-triggered iron loading in RBCs, either in the spleen (new Fig. 2F) or in the circulation (new Fig. 2B). Hence, we do not envision a possibility that RPMs become iron-loaded during aging as a result of erythrophagocytosis of iron-loaded RBCs. In support of this, we also have observed that during aging first RPMs’ FPN levels drop, afterward erythrophagocytosis rate decreases, and lastly, RBCs start to exhibit significantly increased oxidative stress (presented now in new Fig. 4H, J and K).
  
  6) Discussion of aggregates in the spleen of aged mice (Fig 2G-2K and Fig 3) is very descriptive and non-specific. For example, if the iron-rich aggregates are hemosiderin, a hemosiderin-specific stain would be helpful. This data specifically is correlatory and difficult to extract value from.
  
  Thanks for these comments. To the best of our knowledge, Prussian blue Perls’ staining (Fig. 2J) is considered a hemosiderin staining. Our investigations aimed to better understand the nature and the origin of splenic iron deposits that to some extent are referred to as hemosiderin. Most importantly, as mentioned in our reply to point 1 of Reviewer #1, to assign causality to our data we have now demonstrated that iron accumulation in RPMs in response to iron-dextran (Fig. 3G) increases lipid peroxidation (Fig. 5F), tends to provoke RPM depletion (Fig. 5G) and triggers the formation of protein-rich aggregates (new Fig. 3H). Of note, we assume that the loss of embryonically-derived RPMs in this model may be masked by simultaneous replenishment of the niche from monocytes, a phenomenon that may be addressed by future studies using Ms4a3-driven reporter mice (as shown for aged mice in our new Fig. 5E).
  
  7) The aging phenotype in RPMs appears to be initiated sometime after 2 months of age. However, there is some reversal of the phenotype with increasing age, e.g. Fig 4B with decreased lipid peroxidation in 9 month old relative to 6 month old RPMs. What does this mean? Why is there a partial spontaneous normalization?
  
  Thanks for this comment and questions. Indeed, the degree of lipid peroxidation exhibits some kinetics, suggestive of partial normalization. Of note, such a tendency is not evident for other aging phenotypes of RPMs, hence, we did not emphasize this in the original manuscript. However, in a revised version of the manuscript, we now present the re-analysis of the published data which implies that the number of embryonically-derived RPMs drops substantially between mice at 20 weeks and 36 weeks (new Fig. 5E). We think that the higher proportion of monocyte-derived RPMs in total RPM population later in aging (9 months) might be responsible for the partial alleviation of lipid peroxidation. We now discussed this possibility in the Results sections (p. 12).
  
  8) Does the aging phenotype in RPMs respond to ferristatin? It appears that NAC, which is a glutathione generator and can reverse ferroptosis, does not reverse the decreased RPM erythrophagocytic capacity observed with age yet the authors still propose that ferroptosis is involved. A response to ferristatin is a standard and acceptable approach to evaluating ferroptosis.
  
  We fully agree with the Reviewer that using ferristatin or Liproxstatin-1 would be very helpful to fully characterize the mechanism of RPM depletion in mice. However, previous in vivo studies involving Liproxstatin-1 administration required daily injections of this ferroptosis inhibitor (Friedmann Angeli et al., 2014). This would be hardly feasible during aging. Regarding the experiments involving iron-dextran injection, using Liproxstatin-1 would require additional permission from the ethical committee, which takes time to be processed and received. However, to address this question we now provide data from iRPM cell cultures (new Fig. 5K-L). In essence, our results imply that both proteotoxic stress and iron overload act in concert to trigger cytotoxicity in the RPM in vitro model. Interestingly, this phenomenon does not depend solely on the increased lipid peroxidation, but when we neutralize the latter with Liproxstatin-1, the cytotoxic effect is diminished (please see also Results on p. 13 and Discussion p. 15/16).
  
  9) The possible central role for HO-1 in the pathophysiology of decreased RPM erythrophagocytic capacity with age is interesting. However, it is not clear how the authors arrived at this hypothesis and would be useful to evaluate in the least whether RBCs in young vs. aged mice have more hemoglobin as these changes may be primary drivers of how much HO-1 is needed during erythrophagocytosis.
  
  Thanks for this comment. We became interested in HO-1 levels based on the RNA sequencing data, which detected lower Hmox-1 expression in aged RPMs (Figure 3-figure supplement 1). We now show that the content of hemoglobin is not significantly altered in aged RBCs (MCH parameter, Figure 1-figure supplement 5E), hence we do not think that this is the major driver for Hmox-1 downregulation. Likewise, the levels of the Bach1 message, a gene encoding Hmox-1 transcriptional repressor, are not significantly altered according to RNAseq data. Hence, the reason for the transcriptional downregulation of Hmox-1 is not clear. Of note, HO-1 protein levels in the total spleen are higher in aged versus young mice, and we also detected a clear appearance of its nuclear truncated and enzymatically-inactive form (see the figure below; we opted not to include this in the manuscript for better clarity). The appearance of truncated HO-1 seems to be partially rescued by the IR diet. It is well established that the nuclear form of HO-1 emerges via proteolytic cleavage and migrates to the nucleus under conditions of oxidative stress (Mascaro et al., 2021). This additionally confirms that the aging spleen is hallmarked by an increased burden of ROS. Moreover, we also detected HO-1 as one of the components of the protein iron-rich aggregates. Thus, we propose that the low levels of the cytoplasmic enzymatically active form of HO-1 in RPMs (that we preferentially detect with our intracellular staining and flow cytometry) may be underlain by its nuclear translocation and sequestration in protein aggregates that evade antibody binding [this is also supported by our observation that the protein aggregates, despite the high content of ferritin (as indicated by MS analysis), are negative for L-ferritin staining]. Of note, we also cannot exclude that other cell types in the aging spleen (e.g., lymphocytes) express higher levels of HO-1 in response to splenic oxidative stress.
  
  Fig. Total splenic levels of HO-1 in young, aged IR and aged mice.
  
  Reviewer #2 (Public Review):
  
  Slusarczyk et al. investigate the functional impairment of red pulp macrophages (RPMs) during aging. When red blood cells (RBCs) become senescent, they are recycled by RPMs via erythrophagocytosis (EP). This leads to an increase in intracellular heme and iron both of which are cytotoxic. The authors hypothesize that the continuous processing of iron by RPMs could alter their functions in an age-dependent manner. The authors used a wide variety of models: in vivo model using female mice with standard (200ppm) and restricted (25ppm) iron diet, ex vivo model using EP with splenocytes, and in vitro model with EP using iRPMs. The authors found iron accumulation in organs but markers for serum iron deficiency. They show that during aging, RPMs have a higher labile iron pool (LIP), decreased lysosomal activity with a concomitant reduction in EP. Furthermore, aging RPMs undergo ferroptosis resulting in a non-bioavailable iron deposition as intra and extracellular aggregates. Aged mice fed with an iron restricted diet restore most of the iron-recycling capacity of RPMs even though the mild-anemia remains unchanged.
  
  Overall, I find the manuscript to be of significant potential interest. But there are important discrepancies that need to be first resolved. The proposed model is that during aging both EP and HO-1 expression decreases in RPMs but iron and ferroportin levels are elevated. In their model, the authors show intracellular iron-rich proteinaceous aggregates. But if HO-1 levels decrease, intracellular heme levels should increase. If Fpn levels increase, intracellular iron levels should decrease. How does LIP stay high in RPMs under these conditions? I find these to be major conflicting questions in the model.
  
  We thank the Reviewer for her/his valuable feedback. As we mentioned in our replies, we can only assume that a small misunderstanding in the interpretation of the presented data underlies this comment. We show that ferroportin levels in RPMs (Fig. 1F) are modulated in a manner that fully reflects the iron status of these cells (both labile and total iron levels, Figs. 1H and I). FPN levels drop in aged RPMs and are rescued when mice are maintained on a reduced iron diet. As pointed out by Reviewer #3, and explained in our replies, we believe that ferroportin levels are critical for the observed phenotypes in aging. We now described our data more clearly to avoid any potential misinterpretation (p. 6).
  
  Reviewer #3 (Public Review):
  
  This is a comprehensive study of the effects of aging on the function of red pulp macrophages (RPM) involved in iron recycling from erythrocytes. The authors document that insoluble iron accumulates in the spleen, that RPM become functionally impaired, and that these effects can be ameliorated by an iron-restricted diet. The study is well written, carefully done, extensively documented, and its conclusions are well supported. It is a useful and important addition for at least three distinct fields: aging, iron and macrophage biology.
  
  The authors do not explain why an iron-restricted diet has such a strong beneficial effect on RPM aging. This is not at all obvious. I assume that the number of erythrocytes that are recycled in the spleen, and are by far the largest source of splenic iron, is not changed much by iron restriction. Is the iron retention time in macrophages changed by the diet, i.e. the recycled iron is retained for a short time when diet is iron-restricted (making hepcidin low and ferroportin high), and long time when iron is sufficient (making hepcidin high and ferroportin low)? Longer iron retention could increase damage and account for the effect. Possibly, macrophages may not empty completely of iron before having to ingest another senescent erythrocyte, and so gradually accumulate iron.
  
  We are very grateful to this Reviewer for emphasizing the importance of the iron export capacity of RPMs as a possible driver of the observed phenotypes. Indeed, as mentioned above, we now show in the revised version of the manuscript that ferroportin drops early during aging (revised Fig. 4). Importantly, we now also observed that iron loading and limitation of iron export from iRPMs via ferroportin aggravate the impact of heat shock (a well-accepted trigger of proteotoxicity) on both protein aggregation and cell viability (new Fig. 5K and L). Physiologically, recent findings show that aging promotes a global decrease in protein solubility [BioRxiv manuscript (Sui X. et al., 2022)], and it is very likely that the constant exposure of RPMs to high iron fluxes renders these specialized cells particularly sensitive to proteome instability. This could be further aggravated by a build-up of iron due to the drop of ferroportin early during aging, ultimately leading to the appearance of the protein aggregates as early as 5 months of age in C57BL/6J females. Based on the new data, we emphasized this model in the revised version of the manuscript (please see Discussion on p. 16).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.01.16.476518v4
www.biorxiv.org www.biorxiv.org

Neuronal Activity in Dorsal Anterior Cingulate Cortex during Economic Choices under Variable Action Costs

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #2:
  
  Cai & Padoa-Schioppa recorded from macaque dorsal anterior cingulate cortex (ACCd) while requiring animals to choose between different juice types offered in variable amounts and with different action costs. Authors compared neural activity in ACCd (present study) with previous, directly comparable, findings on this same task when recording in macaque orbitofrontal cortex. The behavioral task is very powerful and the analyses of both the choice behavior and neural data are rigorous. Authors conclude that ACCd is unique in representing more post-decision variables and in its encoding of chosen value and binary outcome in several reference frames (chosen juice, chosen cost, and chosen action), not offer value, like OFC. Indeed, the encoding of choice outcomes in ACCd was skewed toward a cost-based reference frame. Overall, this is important new information about primate ACCd. I have only a few suggestions to enhance clarity. Figures 5 and 7 are maximally informative, but it is not clear that Figure 6 adds much to the reported Results. It is also suggested to abbreviate the comparison with Hosokawa et al. as it presently takes up 3 paragraphs in the Discussion: it is clear the methods and task designs were different enough to not be so easily compared with the present study. An additional suggestion would be to include mention of the comparison with OFC in the abstract and possibly also in the title, since the finding and direct comparison in Figure 7 are some of the most novel and interesting effects of the paper. Other suggestions are minor, and have to do with definition of time windows, variables, and additional papers that authors may cite for a well-rounded Discussion.
  
  Please refer to Essential Revisions point #4. And we added “In contrast to the OFC” in the abstract to highlight the difference between these two regions.
  
  Essential Revisions Point #4 Response:
  
  We shortened the discussion from 3 paragraphs to 1 paragraph as follows.
  
  "In another study, Hosokawa, Kennerley et al. (2013) compared the neuronal coding in ACCd and OFC in a choice task involving cost-benefit tradeoff. Our findings differ in two aspects. First, Hosokawa et al. (2013) reported contralateral action value coding in ACCd while we did not discover significant offer value coding in either spatial- or action-based reference frames in our ACCd recordings. Second, they reported that there was no action-based value representation in the OFC and therefore concluded that OFC does not integrate action cost in economic choice. Two elements may help explain the discrepancies between our findings in ACCd and OFC (Cai and Padoa-Schioppa 2019) and those of Hosokawa et al. (2013). First, we recall that Hosokawa et al. (2013) only tested value-related variables such as the benefit, cost and discounted value in an action-based reference frame. Most importantly, they did not test the variable that is related to the saccade direction, which is highly correlated with the spatial value signal. As a consequence, the contralateral value signal may not be significant if chosen target location was included in their regression analysis. Indeed, in our analysis, saccade direction (or chosen target location) was identified as one of the variables that explained a significant portion of neuronal activity in ACCd (Cai and Padoa-Schioppa 2012, Cai and Padoa-Schioppa 2019). The second and often overlooked aspect is that value may be encoded in schemes other than the action-based reference frame. In their study, each unique combination of reward quantity and cost was presented by a unique picture. Thus, information on good attributes was conveyed to the animal with an “integrated” visual representation. Accordingly, a distinct group of neurons may have been recruited to encode the reward and cost conjunctively represented by a unique fractal, which would result in 16 groups of offer value coding neurons."
  
  Reviewer #3:
  
  Cai and Padoa-Schioppa present a paper titled 'Neuronal Activity in Dorsal Anterior Cingulate Cortex during Economic Choices under Variable Action Costs'. They used a binary choice task where both offers indicated the reward type, reward amount, and the action cost (but not the specific action.) Variable action costs were then operationalized by placing targets on concentric circles of different radius. Here, and in a previous study that included OFC recordings (Cai and Padoa-Schioppa, 2019), monkeys integrated action costs into their decisions. Single-unit recordings in ACCd revealed that neurons predominantly coded for post-decision variables, such as cost of the chosen target and the juice type of the chosen offer, but not pre-decision variables, such as offer values. Given this finding, the authors compared the percentage of neurons in OFC and ACCd that coded for decision variables. In OFC neurons, the activity was mostly restricted to the offer presentation phase, whereas ACCd neurons showed sustained coding of chosen value and costs that lasted until the appearance of the saccade targets. Overall, this is an interesting study that provides evidence that decision-related signals evolve from coding offer values in the OFC to representing chosen costs in the ACC. This finding could highlight the roles of ACC neurons in learning and decision making. We have only a few questions.
  
  1) Do any of the variables used in this study correlate with a conflict? When the authors previously studied ACC, they discarded the conflict monitoring hypothesis - a hypothesis that is well established for ACC hemodynamic responses - for ACC single cell activity based on neural data from 'difficult' decisions (Cai and Padoa-Schioppa, 2012). The definition of difficulty they used, then, was descriptive and based on reaction times (RTs). They defined the most difficult trials as those trials with the longest RTs and discovered that those trials had options with similar offer values. This definition of choice difficulty appears to be contrived from evidence accumulation models/tasks, where normatively harder judgments elicit longer RTs. However, there is no normative economic reason that trials with similar offer values are more difficult or should cause conflict. After all, according to theory, choosing between two options with the same value is as easy as flipping a coin. Here, it seems like the authors could have a more fitting definition of conflict. For example, conflict can be operationalized by considering trials when the animal must choose between a high value/high-cost option and a low-value/low-cost option. In that case, the costs and benefits are in conflict. What do the RTs look like? Do the RTs indicate conflict resolution? If so, is this reflected in neuronal responses?
  
  We thank the reviewer for raising this important point. First, we would like to clarify that both in this study and in our previous study of ACC (Cai and Padoa-Schioppa 2012) we imposed a delay between offer presentation and the go signal. Such delay is critical to disentangle value comparison from action selection. However, the delay effectively dissociates reaction times from the decision difficulty. Normally, we operationalize the decision difficulty (or conflict) with the variable value ratio = chosen value / unchosen value. In an early behavioral study conducted in capuchin monkeys, where no delay was imposed between offer presentation and the go signal, we found that reaction times were strongly correlated with the value ratio, as one would naturally expect (Padoa-Schioppa, Jandolo et al. 2006). In the previous study of ACC (Cai and Padoa-Schioppa 2012) we referenced that earlier result but, again, we did not analyze reaction times.
  
  Coming to the present study, we addressed this question by including in the variable selection analyses the two variables value ratio and cost/benefit conflict = cost of A * sign(offer value A – offer value B) (see also Table 2). The results of the updated analysis are illustrated in the new Figure 4, which we include here below. In essence, including these two variables did not affect the results of the variable selection analysis. That is, both the stepwise and best-subset methods selected the variables chosen value, chosen cost, chosen juice, chosen offer location only and chosen target location only.
  
  Figure 4. Population summary of ANCOVA (all time windows). (A) Explained responses. Rows and columns represent, respectively, time windows and variables. In each location, the number indicates the number of responses explained by the corresponding variable in that time window. For example, chosen value (juice) explained 34 responses in the post-offer time window. The same numbers are also represented in gray scale. Note that each response could be explained by more than one variable and thus could contribute to multiple bins in this panel. (B) Best fit. In each location, the number indicates the number of responses for which the corresponding variable provided the best fit (highest R² in that time window). For example, chosen value (juice) provided the best fit for 40 responses in the late-delay time window. The numerical values are also represented in gray scale. In this plot, each response contributes to at most one bin.
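  
  The two variables added to the variable selection analysis, value ratio and cost/benefit conflict, follow directly from the definitions given in the response above. The following is a minimal Python sketch of those two definitions only; the function names, argument names, and the handling of a zero unchosen value are illustrative assumptions and are not taken from the authors' analysis code.
  
  ```python
  def sign(x):
      """Return -1, 0, or 1 according to the sign of x."""
      return (x > 0) - (x < 0)
  
  
  def value_ratio(chosen_value, unchosen_value):
      """Value ratio = chosen value / unchosen value (as defined in the response).
  
      Raising on a zero unchosen value is an assumption of this sketch;
      the response does not say how such trials are handled.
      """
      if unchosen_value == 0:
          raise ValueError("unchosen value must be non-zero")
      return chosen_value / unchosen_value
  
  
  def cost_benefit_conflict(cost_a, offer_value_a, offer_value_b):
      """Cost/benefit conflict = cost of A * sign(offer value A - offer value B)."""
      return cost_a * sign(offer_value_a - offer_value_b)
  ```
  
  Under this definition the conflict variable is positive when offer A is both the costlier and the more valuable option, negative when A is less valuable, and zero when the two offer values are equal.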
  
  2) The authors claimed that the ACCd neurons integrated juice identity, juice quantity and action costs later in the trial. As they acknowledge, the evidence for this claim is marginal. The conclusion the authors made in line 211, therefore, could be moderated. Given that the model containing cost-related variables is more complex, it is equally valid and more appropriate to write '… we cannot reject the null hypothesis that action cost was not integrated by chosen value responses later in the trial.'
  
  We acknowledge the complexity of this claim. However, results from previous studies (Kennerley, Dahmubed et al. 2009, Kennerley and Wallis 2009, Hosokawa, Kennerley et al. 2013) are in favor of establishing a null hypothesis of integration rather than non-integration. Therefore, we feel that it is more appropriate to keep the null hypothesis of cost integration while in the meantime acknowledging that in our study the evidence for cost integration is rather weak.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.14.448291v2
www.biorxiv.org www.biorxiv.org

New submission 07/02/2023, 10:46:57

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  1) It would be helpful to include some sort of comparison in Fig. 4, e.g. the regressions shown in Fig 3, to indicate to what extent the ICCl data corresponds to the "control range" of frequency tuning.
  
  Figure 4 was modified to show the frequency range typically found in the ICCls. This range is based on results from Wagner et al., 2007, which extensively surveyed ICCls responses. This modification shows that our ICCls recordings in the ruff-removed owls cover the normal frequency hearing range of the owl.
  
  2) A central hypothesis of the study is that the frequency preference of the high-frequency neurons is lower in ruff-removed owls because of the lowered reliability caused by a lack of the ruff. Yet, while lower, the frequency range of many neurons in juvenile and ruff-removed owls seems sufficiently high to be still responsive at 7-8 kHz. I think it would be important to know to what extent neurons are still ITD sensitive at the "unreliable high frequencies" even if the CFs are lower since the "optimization" according to reliability depends not on the best frequency of each neuron per se, but whether neurons are less ITD sensitive at the higher, less reliable frequencies.
  
  The concern regarding the frequency range that elicits responsivity was largely addressed above. Specifically, Figure L1 showing frequency tuning of frontally tuned ICx neurons in ruff-removed owls indicates that while there is some variability of tuning across neurons, there is little responsivity above 6 kHz. In contrast, equivalent analysis in juvenile owls (Figure L3), shows there is much more responsiveness and variability across neurons to high and low frequencies. This evidence supports our hypothesis that the juvenile owl brain is still highly plastic, which facilitates learning during development. Although the underlying data was already reported in Figure 7 of our previously submitted manuscript, we can include Figures L1 and L2, potentially as supplemental figures, if considered useful by editors and reviewers. Nevertheless, this argumentation was further expanded in the revised text (Line 229).
  
  Figure L1. Frequency tuning of frontally-tuned ICx neurons in ruff-removed owls. Tuning curves are normalized by the max response. Thick black line indicates the average tuning curve. Dashed black line indicates basal response.
  
  Figure L2. ITD sensitivity across frequencies in ruff-removed owl. Two example neurons shown in a and b. ITD tuning for tones (colored) and broadband (black) plotted by firing rate (non-normalized). Solid colored lines indicate responses to frequencies that are within the neuron’s preferred frequency range (i.e. above the half-height, see Methods), dashed lines indicate frequencies outside of the neuron’s frequency range.
  
  Figure L3. Frequency tuning of frontally-tuned ICx neurons in juvenile owls. Tuning curves are normalized by the max response. Thick black line indicates the average tuning curve. Dashed black line indicates basal response.
  
  3) It would be interesting to have an estimate of the time scale of experience dependency that induces tuning changes. Do the authors have any data on this question? I appreciate the authors' notion that the quantifications in Fig 7 might indicate that juvenile owls are already "beginning to be shaped by ITD reliability" (line 323 in Discussion). How many days after hearing onset would this correspond to? Does this mean that a few days will already induce changes?
  
  While tracking changes induced by ruff-removal over development were outside of the scope of this study, many other studies have assessed experience-dependent plasticity in the barn owl. The recordings in this study were performed approximately 20 days after hearing onset, suggesting that the juveniles had ample time to begin learning. These points were expanded upon in the discussion (Lines 254, 280-283).
  
  Reviewer #2 (Public Review):
  
  1) Why is IPD variability plotted instead of ITD variability (or indeed spatial reliability)? The relationship between these measures is likely to vary across frequency, which makes it difficult to compare ITD variability across frequency when IPDs are plotted. Normalizing data across frequencies also makes it difficult to compare different locations and acoustical conditions. For example, in Fig.1a and Fig.1b, the data shown for 3 kHz at ~160 degrees seems quantitatively and visually quite different, but the difference (in Fig.1c) appears to be negligible.
  
  Justification for using IPD variability as an estimate of ITD variability was added to the introduction (Lines 55-60), results (Line 100) and methods (Lines 371-374) sections of the manuscript. Because ITD detection is based on phase locking by auditory nerve and ITD detector neurons tuned to narrow frequency bands, the responses of ITD detector neurons forwarded to downstream midbrain regions are determined by IPD variability. Additionally, ITD is calculated by dividing IPD by frequency, which makes comparisons of ITD reliability across frequency mathematically uninformative.
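
  To make the frequency dependence of the IPD-to-ITD conversion concrete, here is a small sketch. It assumes IPD is expressed in radians, so that ITD = IPD / (2π · frequency); the function name and the example values are hypothetical and are not taken from the manuscript.

```python
import math


def ipd_to_itd_us(ipd_rad, freq_hz):
    """Convert an interaural phase difference (radians) at a given
    frequency (Hz) to an interaural time difference in microseconds.

    Assumes ITD = IPD / (2 * pi * f).
    """
    return ipd_rad / (2.0 * math.pi * freq_hz) * 1e6


if __name__ == "__main__":
    # The same phase spread (a quarter cycle) maps onto different
    # time spreads depending on frequency.
    quarter_cycle = math.pi / 2
    for freq in (2500.0, 5000.0):
        print(freq, ipd_to_itd_us(quarter_cycle, freq))
```

  Under this assumption, an identical spread in phase yields an ITD spread that shrinks in proportion to frequency, which is why ITD variability is hard to compare across frequency bands.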
  
  2) How well do the measures of ITD reliability used reflect real-world listening? For example, the model used to calculate ITD reliability appears to assume the same (flat) spectral profile for targets and distractors, which are presented simultaneously with the same temporal envelope, and a uniform spatial distribution of sounds across space. It is therefore unclear how robust the study's results are to violations of these assumptions.
  
  While we agree that our analysis cannot completely capture real-world listening for the barn owl, a general analysis using similar flat spectral profiles for targets and concurrent sounds provides a broad assessment of reliability of ITD cues. While a full recapitulation of real-world listening is beyond the scope of this study (i.e. recording natural scenes from the ear canals of wild barn owls), we included additional analyses of ITD reliability in Figure 1-figure supplement 1, described above.
  
  3) Does facial ruff removal produce an isolated effect on ITD variability or does it also produce changes in directional gain, and the relationship between spatial cues and sound location? Although the study considers this issue in some places (e.g. Fig.2, Fig.5), a clearer presentation of the acoustical effects of facial ruff removal and their implications (for all locations, not just those to the front), as well as an attempt to understand how these acoustical changes lead to the observed changes in ITD reliability, would greatly strengthen the study. In addition, Fig.1 shows average ITD reliability across owls, but it would be helpful to know how consistent these measures are across owls, given individual variability in Head-Related Transfer Functions (HRTFs). This potentially has implications for the electrophysiological experiments, if the HRTFs of those animals were not measured. One specific question that is potentially very relevant is whether the facial ruff attenuates sounds presented behind the animal and whether it does so in a frequency-dependent way. In addition, if facial ruff removal enables ILDs to be used for azimuth, then ITDs may also become less necessary at higher frequencies, even if their reliability remains unchanged.
  
  Additional analysis was conducted to represent the changes in directional gain induced by ruff removal, and was added as a new figure (Fig 5). This analysis shows that changes in gain following ruff-removal are largely frequency-independent: there is a de-attenuation of peripherally and rearwardly located sounds, but the highest gain remains for high frequencies in frontal space. Although there is an additional increase in gain for high frequencies from rearward space, these changes would not explain the changes in frequency tuning we report. As mentioned in new additions to the manuscript, the changes at the most rearward-located auditory spatial locations are unlikely to have an effect on the auditory midbrain. No studies in the barn owl have found neurons in the ICx or optic tectum tuned to >120° (Knudsen, 1982; Knudsen, 1984; Cazettes et al., 2014). In addition, variability of IPD reliability across owls was analyzed and reported in the amended Figure 1, which shows very little change across owls. In this analysis, we did realize that the file of one of the HRTFs obtained from von Campenhausen et al. 2006 was mislabeled, which explains slight differences in revised Fig 1b. Nevertheless, the added analysis of IPD reliability across owls indicates that the pattern in ITD reliability is stable across owls (Fig. 1d,e), which supports our decision to not record HRTFs from owls used in this study. Finally, we added text to the discussion clarifying that the use of ILD for azimuth would not provide the same resolution as ITD would (Lines 295-303). We also do not believe that the use of ILD for azimuth would make “ITDs… less necessary at higher frequencies”, given that the ICCls is still computing ITD at these high frequencies (Fig 4), and that ILDs also have higher resolution at higher frequencies, with and without the facial ruff (Olsen et al, 1989; Keller et al., 1998; von Campenhausen et al., 2006).
  
  1) It is unclear why some analyses (Fig.5, Fig.7) are focused on frontal locations and frontally-tuned neurons. It is also unclear why neurons with a best ITD of 0 are described as frontally tuned since locations behind the animal produce an ITD of 0 also. Related to this, in Fig.1, facial ruff removal appears to reduce IPD variability at low frequencies for locations to the rear (~160 degrees), where the ITD is likely to be close to 0. Neurons with a best ITD of 0 might therefore be expected to adjust their frequency tuning in opposite directions depending on whether they are tuned to frontal or rearward locations.
  
  An extensive explanation was added to the methods detailing why we do not believe the neurons recorded in this study are tuned to the rear. Namely, studies mapping the barn owl’s ICx and optic tectum have not reported neurons tuned to locations >120°, with the number of neurons representing a given spatial location decreasing with eccentricity (Knudsen, 1982; Knudsen, 1984; Cazettes et al., 2014). While we agree that there does seem to be a change in ITD reliability at ~160° following ruff-removal, the result is largely similar to the change that occurs in frontal space (Fig 1b), which is consistent with the ruff-removed head functioning as a sphere. Thus, we wouldn’t expect rearwardly-tuned neurons, if they could be readily found, to adjust their frequency tuning to higher frequencies. Finally, we want to clarify that we focused our analyses on frontally-tuned neurons because frontal space is where we observed the largest change in ITD reliability. Text was added to the Discussion section to clarify this point (Lines 313-321).
  
  2) The study suggests that information about high-frequency ITDs is not passed on to the ICX if the ICX does not contain neurons that have a high best frequency. However, neurons might be sensitive to ITDs at frequencies other than the best frequency, particularly if their frequency tuning is broader. It is also unclear whether the best frequency of a neuron always corresponds to the frequency that provides the most reliable ITD information, which the study implicitly assumes.
  
  The concern about ITD sensitivity at non-preferred frequencies was addressed under the essential revision #3, as well as under Reviewer 1’s concerns.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.09.29.510116v1
www.biorxiv.org www.biorxiv.org

New submission 03/09/2022, 16:23:41

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This manuscript reports a systematic study of the cortical propagation patterns of human beta bursts (~13-35Hz) generated around simple finger movements (index and middle finger button presses).
  
  The authors deployed a sophisticated and original methodology to measure the anatomical and dynamical characteristics of the cortical propagation of these transient events. MEG data from another study (visual discrimination task) was repurposed for the present investigation. The data sample is small (8 participants). However, beta bursts were extracted over a +/- 2s time window about each button press, from single trials, yielding the detection and analysis of hundreds of such events of interest. The main finding consists of the demonstration that the cortical activity at the source of movement related beta bursts follows two main propagation patterns: one along an anteroposterior direction (predominantly originating from pre central motor regions), and the other along a medio-lateral (i.e., dorso-lateral) direction (predominantly originating from post central sensory regions). Some differences are reported, post-hoc, in terms of amplitude/cortical spread/propagation velocity between pre and post-movement beta bursts. Several control tests are conducted to ascertain the veracity of those findings, accounting for expected variations of signal-to-noise ratio across participants and sessions, cortical mesh characteristics and signal leakage expected from MEG source imaging.
  
  One major perceived weakness is the purely descriptive nature of the reported findings: no meaningful difference was found between bursts traveling along the two different principal modes of propagation, and importantly, no relation with behavior (response time) was found. The same stands for pre vs. post motor bursts, except for the expected finding that post-motor bursts are more frequent and tend to be of greater amplitude (yielding the observation of a so-called beta rebound, on average across trials).
  
  Overall, and despite substantial methodological explorations and the description of two modes of propagation, the study falls short of advancing our understanding of the functional role of movement related beta bursts.
  
  For these reasons, the expected impact of the study on the field may be limited. The data is also relatively limited (simple button presses), in terms of behavioral features that could be related to the neurophysiological observations. One missed opportunity to explain the functional role of the distinct propagation patterns reported would have been, for instance, to measure the cortical "destination" of their respective trajectories.
  
  In response to this comment, we would like to highlight two important points.
  
  First, our work constitutes the first non-invasive human confirmation of invasive work in animals (Balasubramanian et al., 2020; Best et al., 2016; Roberts et al., 2019; Rubino et al., 2006; Rule et al., 2018; Takahashi et al., 2011, 2015) and patients (Takahashi et al., 2011). Thus, these results bridge recordings limited to the size of multielectrode arrays (roughly 0.16 cm²; Balasubramanian et al., 2020; Best et al., 2016; Rubino et al., 2006; Takahashi et al., 2011, 2015) and human EEG recordings spanning large areas of the cortex and several functionally distinct regions (Alexander et al., 2016; Stolk et al., 2019). The ability to access these neural signatures non-invasively is important for cross-species comparison. This further enables us to provide an in-depth analysis of the spatiotemporal diversity of human MEG signals and a detailed characterisation of the two propagation directions, which significantly extends previous reports. We note that their functional role remains undetermined also in these animal studies, but being able to identify these signals now in humans can provide a steppingstone for identifying their role.
  
  Second, and related, the reviewers are correct that we did not observe distinct propagation directions between pre- and post-movement bursts, nor a relationship with reaction time. However, such a null result would be relevant, in our view, towards understanding what the functional relevance of these signals, if any, might be. Recent work in macaques indicates that the spatiotemporal patterns of high-gamma activity carry kinematic information about the upcoming movement (Liang et al 2023). The functional role of beta may therefore be more complex and not relate to reaction times or kinematics in a straightforward manner. We believe this is a relevant observation, and in keeping with the continued efforts to identify how sensorimotor beta relates to behaviour. It is increasingly clear that spatiotemporal diversity in animal recordings and human E/MEG and intracranial recordings can constitute a substantial proportion of the measured dynamics. As such, our report is relevant in narrowing down what these signals may reflect.
  
  Together, we think that our work provides new insights into the multidimensional and propagating features of burst activity. This is important for the entire electrophysiology community, as it transforms how we commonly analyse and interpret these important brain signals. We anticipate that our work will guide and inspire future work on the mechanistic underpinnings of these dominant neural signals. We are confident that our article has the scope to reach out to the diverse readership of eLife.
  
  Reviewer #2 (Public Review):
  
  The authors devised novel and interesting experiments using high precision human MEG to demonstrate the propagation of beta oscillation events along two axes in the brain. Using careful analysis, they show different properties of beta events pre- and post movement, including changes in amplitude. Due to beta's prominent role in motor system dynamics, these changes are therefore linked to behavior and offer insights into the mechanisms leading to movement. The linking of wave-like phenomena and transient dynamics in the brain offers new insight into two paradigms about neural dynamics, offering new ways to think about each phenomenon on its own.
  
  Although there is a substantial, and recent, body of literature supporting the conclusions that beta and other neural oscillations are transient, care must be taken when analyzing the data and the resulting conclusions about beta properties in both time and space. For example, modifying the threshold at which beta events are detected could alter their reported properties and expression in space and time. The authors should therefore perform parameter sweeps on e.g. the thresholds for detection of oscillation bursts to determine whether their conclusions on beta properties and propagation hold. If this additional analysis does not change their story, it would lend confidence in the results/conclusions.
  
  We thank the reviewing team for this comment. As suggested, we evaluated the effect of different burst thresholds on the burst parameters.
  
  The threshold in the main analysis was determined empirically from the data, as in previous work (Little et al., 2019). Specifically, trial-wise power was correlated with the burst probability across a range of different threshold values (from median to median plus seven standard deviations (std), in steps of 0.25, see Figure 6-figure supplement 1). The threshold value that retained the highest correlation between trial-wise power and burst probability was used to binarize the data.
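
  The empirical threshold selection described above can be sketched roughly as follows. This is an illustrative Python sketch, not the MATLAB code used for the paper. It assumes a beta amplitude array of shape (trials, time points), defines trial-wise power as the mean squared amplitude, defines burst probability as the fraction of samples above threshold, and uses a Pearson correlation; the function name and these operational choices are assumptions.

```python
import numpy as np


def select_burst_threshold(amplitude, max_std=7.0, step=0.25):
    """Pick the threshold (median + k * std) whose trial-wise burst
    probability correlates best with trial-wise power.

    amplitude: array of shape (n_trials, n_times), beta amplitude envelope.
    Returns (k, threshold, correlation).
    """
    amplitude = np.asarray(amplitude, dtype=float)
    median = np.median(amplitude)
    std = np.std(amplitude)
    trial_power = np.mean(amplitude ** 2, axis=1)

    best = None
    # Candidate thresholds: median to median + max_std * std in equal steps.
    for k in np.arange(0.0, max_std + step, step):
        threshold = median + k * std
        burst_prob = np.mean(amplitude > threshold, axis=1)
        if np.std(burst_prob) == 0:
            continue  # correlation is undefined when no trial differs
        r = np.corrcoef(trial_power, burst_prob)[0, 1]
        if best is None or r > best[2]:
            best = (float(k), float(threshold), float(r))

    if best is None:
        raise ValueError("no threshold produced a defined correlation")
    return best


if __name__ == "__main__":
    # Synthetic data only: trials differ in overall amplitude scale.
    rng = np.random.default_rng(0)
    scale = rng.uniform(0.5, 2.0, size=(50, 1))
    demo = np.abs(rng.normal(size=(50, 200))) * scale
    print(select_burst_threshold(demo))
```

  The returned threshold would then be used to binarize the amplitude data; a robustness check amounts to repeating the downstream analysis with thresholds shifted by fixed fractions of a standard deviation.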
  
  We repeated our original analysis using four additional thresholds, i.e., original threshold - 0.5 std, -0.25 std, +0.25 std, +0.5 std. As one would expect, burst threshold is negatively related to the number of bursts (i.e., higher thresholds yield fewer bursts, Figure R4a [top]), and positively related to burst amplitude (i.e., higher thresholds yield higher burst amplitudes, Figure R4a [bottom]).
  
  Similarly, the temporal duration of bursts and apparent spatial width are modulated by the burst threshold: lowering the threshold leads to longer temporal duration and larger apparent spatial width, while increasing the threshold leads to shorter temporal duration and smaller apparent spatial width (Figure R4b). Note that for the temporal and spectral burst characteristics, the difference to the original threshold can be numerically zero, i.e., changing the burst threshold did not lead to changes exceeding the temporal and spectral resolution of the applied time-frequency transformation (i.e., 200 ms and 1 Hz, respectively).
  
  Importantly, across these threshold values, the propagation direction and propagation speed remain comparable.
  
  We now include this result as Figure 6-figure supplement 2 and refer to this analysis in the manuscript (page 28 line 717).
  
  “To explore the robustness of the results analyses were repeated using a range of thresholds (Figure 6-figure supplement 2).”
  
  Determining the generators of beta events at different locations is a tricky issue. The authors mentioned a single generator that is responsible for propagating beta along the two axes described. However, it is not clear through what mechanism the beta events could travel along the neural substrate without additional local generators along the way. Previous work on beta events examined how a sequence of synaptic inputs to supra and infragranular layers would contribute to a typical beta event waveform. Although it is possible other mechanisms exist, how might this work as the beta events propagate through space? Some further explanation/investigation on these issues is therefore warranted.
  
  Based on this and other comments (i.e., comments 7 and 8) we re-evaluated the use of the term ‘generator’ in this manuscript.
  
  While the term generator can be used across scales, from micro- to macroscale, for the purpose of the present paper, we believe one should differentiate at least two concepts: a) generator of beta bursts, and b) generator of travelling waves.
  
  We realised that in the previous version of the manuscript the term ‘generator’ was at times used without context. We removed the term where no longer necessary.
  
  Further, the previous version of the manuscript discussed putative generators of travelling waves (page 19f.) but not generators of beta bursts. We now address this as follows:
  
  “Studies using biophysical modelling have proposed that beta bursts are generated by a broad infragranular excitatory synaptic drive temporally aligned with a strong supragranular synaptic drive (Law et al., 2022; Neymotin et al., 2020; Sherman et al., 2016; Shin et al., 2017) whereby layer specific inhibition acts to stabilise beta bursts in the temporal domain (West et al., 2023). The supragranular drive is thought to originate in the thalamus (E. G. Jones, 1998, 2001; Mo & Sherman, 2019; Seedat et al., 2020), indicating thalamocortical mechanisms (page 22f).”
  
  Once the mechanisms have been better understood, a question arises of how much the results generalize to other oscillation frequencies and other brain areas. On the first question of other oscillation frequencies, the authors could easily test whether nearby frequency bands (alpha and low gamma) have similar properties. This would help to determine whether the observations/conclusions are unique to beta, or more generally applicable to transient bursts/waves in the brain. On the second issue of applicability to other brain areas, the authors could relate their work to transient bursts and waves recorded using ECoG and/or iEEG. Some recent work on traveling waves at the brain-wide level would be relevant for such comparisons.
  
  We appreciate the enthusiasm and the suggestions. To comment on the frequency specificity of the observed effects we conducted the same analysis focusing on the gamma frequency range (60-90 Hz). For computational reasons, we limited this analysis to one subject. Figure R1 shows the polar probability histogram for the beta frequency range (left) and the gamma frequency range (right). In contrast to the beta frequency range, no dominant directions were observed for the gamma range and von Mises functions did not converge. These preliminary results suggest some frequency specificity of the spatiotemporal pattern in sensorimotor beta activity. We believe this paves the way for future analysis mapping propagation direction across frequency and space.
  
  Here we did not investigate the spatial specificity of the effects, as the beta frequency range is dominant in sensorimotor areas. Investigating beta bursts in other cortical areas would have likely resulted in very few bursts. We discuss our results across spatial scales in the section: Distinct anatomical propagation axes of sensorimotor beta activity. However, please note that most of the previous literature operates on a different spatial scale (roughly 4mm; Balasubramanian et al., 2020; Best et al., 2016; Rubino et al., 2006; Rule et al., 2018; Takahashi et al., 2011, 2015) and different species (e.g., non-human primates). Non-invasive recordings in humans capture temporospatial patterns of a very different scale, i.e., often across the whole cortex (Alexander et al., 2016; Roberts et al., 2019). Comparing spatiotemporal patterns across different spatial scales is inherently difficult. Work investigating different spatial scales simultaneously, such as Sreekumar et al. 2020, is required to fully unpack the relationship between mesoscopic and macroscopic spatiotemporal patterns.
  
  Figure R1: Spatiotemporal organisation for the beta (β, 13-30 Hz) and gamma (γ, 60-90 Hz) frequency range for one exemplar subject. Same as Figure 4a, but for one exemplar subject.
  
  If the source code could be provided on github along with documentation and a standard "notebook" on use other researchers would benefit greatly.
  
  All analyses are performed using freely available tools in MATLAB. The code carrying out the analysis in this paper can be found here: [link provided upon acceptance]. The 3D burst analyses can be very computationally intensive even on a modern computer system. The analyses in this paper were computed on a MacBook Pro with a 2.6 GHz 6-Core Intel Core i7 and 32 GB of RAM. Details on the installation and setup of the dependencies can be found in the README.md file in the main study repository.
  
  This information has been added to the paper in the methods section on page 35.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.19.492617v1
www.biorxiv.org www.biorxiv.org

New submission 15/10/2023, 18:48:42

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This manuscript provides a comprehensive investigation of the effects of the genetic ablation of three different transcription factors (Srf, Mrtfa, and Mrtfb) in the inner ear hair cells. Based on the published data, the authors hypothesized that these transcription factors may be involved in the regulation of the genes essential for building the actin-rich structures at the apex of hair cells, the mechanosensory stereocilia and their mechanical support - the cuticular plate. Indeed, the authors found that two of these transcription factors (Srf and Mrtfb) are essential for the proper formation and/or maintenance of these structures in the auditory hair cells. Surprisingly, Srf- and Mrtfb-deficient hair cells exhibited somewhat similar abnormalities in the stereocilia and in the cuticular plates even though these transcription factors have very different effects on the hair cell transcriptome. Another interesting finding of this study is that the hair cell abnormalities in Srf-deficient mice could be rescued by AAV-mediated delivery of Cnn2, one of the downstream targets of Srf. However, despite a rather comprehensive assessment of the novel mouse models, the authors do not yet have any experimentally testable mechanistic model of how exactly Srf and Mrtfb contribute to the formation of actin cytoskeleton in the hair cells. The lack of any specific working model linking Srf and/or Mrtfb with stereocilia formation decreases the potential impact of this study.
  
  Major comments:
  
  Figures 1 & 3: The conclusion on abnormalities in the actin meshwork of the cuticular plate was based largely on the comparison of the intensities of phalloidin staining in separate samples from different groups. In general, any comparison of the intensity of fluorescence between different samples is unreliable, no matter how carefully one could try matching sample preparation and imaging conditions. In this case, two other techniques would be more convincing: 1) quantification of the volume of the cuticular plates from fluorescent images; and 2) direct examination of the cuticular plates by transmission electron microscopy (TEM).
  
  In fact, the manuscript provides no single TEM image of the F-actin abnormalities either in the cuticular plate or in the stereocilia, even though these abnormalities seem to be the major focus of the study. Overall, it is still unclear what exactly Srf or Mrtfb deficiencies do with F-actin in the hair cells.
  
  Yes, we agree. As suggested by the reviewer, to directly examine the defects in F-actin organization within the cuticular plate of mutant mice, we conducted Transmission Electron Microscopy (TEM) analyses. The results, as presented in the revised Figures 1 and 4 (panels F, G, and E, F, respectively), provide crucial insights into the structural changes in the cuticular plate. Meanwhile, the comparison of the volume of the phalloidin labeled cuticular plate after 3-D reconstruction using Imaris software was conducted and shown in Author response image 1. The results of the cuticular plate (CP) volume were consistent with the relative F-actin intensity change of the cuticular plate in the revised Figures 1B and 4B. For the TEM analysis of the stereocilia, we regret that due to time constraints, we were unable to collect TEM images of stereocilia with sufficient quality for a meaningful comparison. However, we believe that the data we have presented sufficiently addresses the primary concerns, and we appreciate the reviewers’ understanding of these limitations.
  
  Author response image 1.
  
  Figures 2 & 4 represent another example of how deceiving a simple comparison of the intensity of fluorescence between the genotypes can be. It is not clear whether the reduced immunofluorescence of the investigated molecules (ESPN1, EPS8, GNAI3, or FSCN2) results from their mis-localization or represents a simple consequence of the fact that a thinner stereocilium would always have a smaller signal of the protein of interest, even though the ratio of this protein to the number of actin filaments remains unchanged. According to my examination of the representative images of these figures, loss of Srf produces mis-localization of the investigated proteins and irregular labeling in different stereocilia of the same bundle, while loss of Mrtfb does not. Obviously, a simple quantification of the intensity of fluorescence conceals these important differences.
  
  Yes, we agree. In addition to the quantification of tip protein intensity, we have added a few more analyses in the revised Figure 3 and Figure 6, such as the percentage of row 1 tip stereocilia with tip protein staining and the percentage of IHCs with tip protein staining on row 2 tip. Using the results mentioned above, the differences in the expression level, the row-specific distribution and the irregular labeling of tip proteins between the control and the mutants can be analyzed more thoroughly.
  
  Reviewer #2 (Public Review):
  
  The analysis of bundle morphology using both confocal and SEM imaging is a strength of the paper and the authors have some nice images, especially with SEM. Still, the main weakness is that it is unclear how significant their findings are in terms of understanding bundle development; the mouse phenotypes are not distinct enough to make it clear that they serve different functions so the reader is left wondering what the main takeaway is.
  
  Based on the reviewer’s comments, in this revised manuscript, we put more emphasis on describing the effects of SRF and MRTFB on key tip proteins’ localization pattern during stereocilia development, represented by ESPN1, EPS8 and GNAI3, as well as the effects of SRF and MRTFB on the F-actin organization of cuticular plate using TEM. We have made substantial efforts to interpret the mechanistic underpinnings of the roles of SRF and MRTFB in hair cells. This is reflected in the revised Figures 1, 3, 4, 6, and 10, where we provide more comprehensive insights into the mechanisms at play.
  
  We interpret our data as indicating that SRF and MRTFB regulate the development and maintenance of the hair cell’s actin cytoskeleton in a complementary manner. Deletion of either gene thus results in somewhat similar phenotypes in hair cell morphology, despite the surprising lack of overlap of SRF and MRTFB downstream targets in the hair cell.
  
  In Figure 1 and 3, changes in bundle morphology clearly don't occur until after P5. Widening still occurs to some extent but lengthening does not and instead the stereocilia appear to shrink in length. EPS8 levels appear to be the most reduced of all the tip proteins (Srf mutants) so I wonder if these mutants are just similar to an EPS8 KO if the loss of EPS8 occurred postnatally (P0-P5).
  
  To address this question, we performed EPS8 staining on control and Srf cKO hair cells at P4 and P10. We found that the dramatic decrease of the row 1 tip signal for EPS8 began at P4 in Srf cKO IHCs. Although the major hair bundle phenotypes of the Eps8 KO, including defective lengthening of row 1 stereocilia and additional rows of short stereocilia, also appeared in Srf cKO IHCs, there are still some differences in bundle morphology between Eps8 KO and Srf cKO. First, both Eps8 KO OHCs and IHCs showed additional rows of short stereocilia, but we only observed additional rows of short stereocilia in Srf cKO IHCs. Second, in the study by Zampini et al., SEM and TEM images did not show an obvious reduction of row 2 stereocilia widening (P18-P35), while our analysis of SEM images confirmed that the width of row 2 IHC stereocilia was drastically reduced, by 40%, in Srf cKO (P15). Overall, we think that although Srf cKO hair bundles are somewhat similar to those of the Eps8 KO, the Srf cKO hair bundle phenotype might be governed by multiple candidate genes acting cooperatively.
  
  Reference:
  
  Valeria Zampini, et al. Eps8 regulates hair bundle length and functional maturation of mammalian auditory hair cells. PLoS Biol. 2011 Apr;9(4): e1001048.
  
  A major shortcoming is that there are few details on how the image analyses were done. Were SEM images corrected for shrinkage? How was each immunocytochemistry quantitation (e.g., cuticular plates for phalloidin and tip staining for antibodies) done? There are multiple ways of doing this but there are few indications in the manuscript.
  
  We apologize for not describing the image analysis procedure clearly enough. As described in the study by Nicolas Grillet’s group, live and mildly-fixed IHC stereocilia have similar dimensions, while SEM preparation results in a hair bundle at a 2:3 scale compared to the live preparation. In our study, the hair cells selected for SEM imaging and measurements were located in the basal turn (30-32kHz), while the hair cells selected for fluorescence-based imaging and measurements were located in the middle turn (20-24kHz) or the basal turn (32-36kHz). Although our SEM imaging and fluorescence-based imaging of basal-turn hair bundles were not from exactly the same area, row 1 stereocilia in control hair bundles imaged by SEM were 10%-20% shorter than those in control hair bundles imaged by fluorescence (revised Figure 2 and Figure 5). Overall, our stereocilia dimension data show the shrinkage expected from SEM preparation.
  
  Recognizing the need for clarity, we have provided a detailed description of our image quantification and analysis procedures in the “Materials and Methods” section, specifically under “Immunocytochemistry.” This will aid readers in understanding our methodologies and ensure transparency in our approach.
  
  Reference:
  
  Katharine K Miller, et al. Dimensions of a Living Cochlear Hair Bundle. Front Cell Dev Biol. 2021 Nov 25:9:742529.
  
  The tip protein analysis in Figs 2 and 4 is nice but it would be nice for the authors to show the protein staining separately from the phalloidin so you could see how restricted to the tips it is (each in grayscale). This is especially true for the CNN2 labeling in Fig 7 as it does not look particularly tip specific in the x-y panels. It would be especially important to see the antibody staining in the reslices separate from phalloidin.
  
  Thank you for the suggestions. We have shown tip protein staining in grayscale separately from the phalloidin in the revised Figure 3 and Figure 6. To clearly show the tip-specific localization of CNN2, we conducted CNN2 staining at different ages during hair bundle development and showed CNN2 labeling in grayscale and in reslices in revised Figure 9-figure supplement 1B.
  
  In Fig 6, why was the transcriptome analysis at P2 given that the phenotype in these mice occurs much later? While redoing the transcriptome analysis is probably not an option, an alternative would be to show more examples of EPS8/GNAI/CNN2 staining in the KO, but at younger ages closer to the time of PCR analysis, such as at P5. Pinpointing when the tip protein intensities start to decrease in the KOs would be useful rather than just showing one age (P10).
  
  We agree with the reviewer. To address this question, we have performed ESPN1, EPS8 and GNAI3 staining on control and mutant hair cells at P4, P10 and P15 (the revised Figures 3 and 6). According to the new results, the dramatic decreases of the row 1 tip signal for ESPN1 and EPS8 began at P4 in Srf cKO IHCs, which is consistent with the appearance of the mild reduction of row 1 stereocilia length in P5 Srf cKO IHCs. For Mrtfb cKO hair cells, an obvious reduction of the row 1 tip signal for ESPN1 was not observed until P10. However, a few genes related to cell adhesion and regulation of the actin cytoskeleton were significantly down-regulated in the P2 Mrtfb-deficient hair cell transcriptome. We think that in hair cells MRTFB may not play a major role in the regulation of stereocilia development, so the morphological defects of stereocilia appeared much later in the Mrtfb mutant than in the Srf mutant.
  
  While it is certainly interesting if it turns out CNN2 is indeed at tips in this phase, the experiments do not tell us that much about what role CNN2 may be playing. It is notable that in Fig 7E in the control+GFP panel, CNN2 does not appear to be at the tips. Those images are at P11 whereas the images in panel A are at P6 so perhaps CNN2 decreases after the widening phase. An important missing control is the Anc80L65-Cnn2 AAV in a wild-type cochlea.
  
  We agree with the reviewer. We have conducted more immunostaining experiments to confirm the expression pattern of CNN2 during the stereocilia development, from P0 to P11. The results were included in the revised Figure 9-figure supplement 1B. As the reviewer suggested, CNN2 expression pattern in control cochlea injected with Anc80L65-Cnn2 AAV has also been provided in revised Figure 9E.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.06.26.546585v1
www.biorxiv.org www.biorxiv.org

New submission 09/09/2022, 13:33:27

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This is an awesome comprehensive manuscript. Authors start by sorting putative stromal cell-containing BM non-hematopoietic (CD235a-/CD45-) plus additional CD271+/CD235a-/CD45- populations to identify nine individual stromal identities by scRNA-seq. The dual sorting strategy is a clever trick as it enriches for rare stromal (progenitor) cell signals but may suffer a certain bias towards CD271+ stromal progenitors. The lack of readable signatures already among CD235a-/CD45- sorts might argue against this fear. This reviewer would appreciate a brief discussion on number & phenotype of putative additional MSSC phenotypes in light of the fact that the majority of 'blood lineage(s)'-negative scRNA-seq signatures identified blood cell progenitor identities (glycophorin A-negative & leukocyte common antigen-negative). The nine stromal cell entities share the CXCL12, VCAN, LEPR main signature. Perhaps the authors could speculate if future studies using VCAN or LEPR-based sort strategies could identify additional stromal progenitor identities?
  
  We would like to thank the reviewer for critically evaluating our work and for the generally positive evaluation of the paper. We apologize for the delayed resubmission, as it took a long time for a specific antibody to arrive so that we could complete the confocal microscopy analyses.
  
  The reviewer asks for a brief discussion on the cell numbers and phenotypes of MSSC phenotypes. The cell numbers and percentages of MSSC in sorted CD45low/-CD235a- and CD45low/-CD235a-CD271+ cells can be found in Supplementary File 3 and we have added a summary of the phenotypes of MSSC in the new Supplementary File 7.
  
  Due to the extremely low frequency of stromal cells in human bone marrow, we chose a sorting strategy that also included CD45low cells (Fig 1A) to ensure that no stromal cells were excluded from the analysis. Although stromal elements are certainly enriched using this approach, the CD45low population contains several different hematopoietic cell types. These include CD34+ HSPCs which are characterized by low CD45 expression2, as well as the CD45low-expressing fractions of other hematopoietic cell populations such as B cells, T cells, NK cells, megakaryocytes, monocytes, dendritic cells, and granulocytes. Furthermore, CD235a- late-stage erythroid progenitors, which are negative for CD45, are represented as well. Of note, our data are consistent with previously reported murine studies showing the presence of a number of hematopoietic populations in CD45- cells, which accounted for the majority of CD45-Ter119-CD31- murine BM cells3,4. However, despite a certain enrichment of stromal elements in the CD45low cell fraction, frequencies were still too low to allow for a detailed analysis of this important bone marrow compartment. This prompted us to adopt the stromal cell-enrichment strategy as described in the manuscript to achieve a better resolution of the stromal compartment. In fact, sorting based on CD45low/-CD235a-CD271+ allowed us to sufficiently enrich bone marrow stromal cells to be clearly detectable in scRNAseq analysis. According to the reviewer’s suggestion, a brief discussion on this issue is now included in the Discussion (page 28, lines 10-15).
  
  The reviewer also suggested using VCAN or LEPR-based sorting strategy to identify additional stromal identities in future studies.
  
  However, as an extracellular matrix protein, FACS analysis of cellular VCAN expression can only be achieved based on its intracellular expression after fixation and permeabilization5,6. Additionally, while VCAN is highly and ubiquitously expressed by stromal clusters, VCAN is also expressed by monocytes (cluster 36). Therefore, VCAN is not an optimal marker to isolate viable stromal cells.
  
  LEPR is the marker that was reported to identify the majority of colony-forming cells in adult murine bone marrow7. We have previously reported that the majority of human adult bone marrow CFU-Fs is contained in the LEPR+ fraction 8. In our current scRNAseq surface marker profiling analysis, group A cells showed high expression of several canonical stromal markers including VCAM1, PDGFRB, ENG (CD105), as well as LEPR (Fig. 4A). However, the four stromal clusters in Group A could not be separated based on the expression of LEPR. Therefore, we chose not to use LEPR as a marker to prospectively isolate the different stromal cell types.
  
  The authors furthermore localized CD271+, CD81+ and NCAM/CD56+ cells in BM sections in situ. Finally, referring to the strong background of the group in HSC research, in silico prediction by CellPhoneDB identified a wide range of interactions between stromal cells and hematopoietic cells. Evidence for the functional interdependence of CFU-F-forming cells completes the novel and clearer picture of bone marrow stromal cells.
  
  We thank the reviewer for the positive comments.
  
  An illustrative abstract naming the top9 stromal identities in their top4 clusters by their "top10 markers" + functions would be highly appreciated.
  
  We thank the reviewer for the suggestion. A summary of the characteristics of stromal clusters is now shown in the new Supplementary File 7, which we hope matches the reviewer’s expectations.
  
  Reviewer #2 (Public Review):
  
  Knowledge about composition and function of the different subpopulations of the hematopoietic niche of the BM is limited. Although such knowledge about the mouse BM has been accumulating in recent years, a thorough study of the human BM still needs to be performed. The present manuscript of Li and coworkers fills this gap by performing single cell RNA sequencing (scRNAseq) on control BM as well as CD271+ BM cells enriched for non-hematopoietic niche cells.
  
  We apologize for the delayed resubmission, as it took a long time for a specific antibody to arrive so that we could complete the confocal microscopy analyses. We thank the reviewer for the critical expert review and overall positive comments.
  
  Based on their scRNAseq, the authors propose 41 different BM cell populations, ten of which represented non-hematopoietic cells, including one endothelial cell cluster. The nine remaining skeletal subpopulations were subdivided into multipotent stromal stem cells (MSSC), four distinct populations of osteoprogenitors, one cluster of osteoblasts and three clusters of pre-fibroblasts. Using bioinformatic tools, the authors then compare their results and divisions of subpopulations to some previously published work from others and attempt to delineate lineage relationships using RNA velocity analyses. From these, they propose different paths from which MSSC enter the progenitor stages, and might differentiate into pre-osteoblasts and -fibroblasts.
  
  It is of interest to note that apparently adipo-primed cells may also differentiate into osteolineage cells, something that should be further explored or validated. Furthermore, although this analysis yields a large adipo-primed population, pre-adipocytes and mature adipocytes appear not to be included in the data set the authors used, which should also be explained.
  
  We thank the reviewer for this comment. We chose to annotate Cluster 5 as the adipo-primed cluster based on the higher expression of adipogenic differentiation markers as well as a group of stress-related transcription factors (FOS, FOSB, JUNB, EGR1) (Fig. 2B-C, Figure 2-figure supplement 1C), some of which had been shown to mark bone marrow adipogenic progenitors1. Although at considerably lower levels compared to adipogenic genes, osteogenic genes were also expressed in cluster 5 cells (Fig. 2B and D), indicating the multi-potent potential of this cluster. Therefore, our initial annotation of these cells as adipo-primed progenitors was too narrow as it did not include the possible osteogenic differentiation potential. We apologize for the confusion caused by the inappropriate annotation and, in order to avoid any further confusion, cluster 5 has now been re-annotated as ‘highly adipocytic gene-expressing progenitors’ (HAGEPs), which we believe is a better representation of the cells. We furthermore agree with the reviewer that in-vivo differentiation needs to be performed to address potential differentiation capacities in future studies.
  
  With regard to the lack of adipocytes in our data set, we described in the Materials and Methods section that human bone marrow cells were isolated based on density gradient centrifugation. After centrifugation, the mononuclear cell-containing monolayers were harvested for further analysis. However, the resulting supernatant containing mature adipocytic cells was discarded14. Therefore, adipocyte clusters were not identified in our dataset. We have amended the manuscript accordingly (page 5, line 7).
  
  Regarding the pre-adipocytes, we are not aware of any specific markers for pre-adipocytes in the bone marrow. We examined the only known markers (ICAM1, PPARG, FABP4) that have been shown to mark committed pre-adipocytes in human adipose tissue15. As illustrated in Fig. R1 (below), low expression of all three markers was not restricted to a single distinct cluster but could be found in almost all stromal clusters. These data thus allow us to neither confirm nor exclude the presence of pre-adipocytes in the dataset. Due to the lack of specific markers for pre-adipocytes and the absence of mature adipocytes in the current dataset, it is therefore difficult to identify a well-defined pre-adipocyte cluster.
  
  Figure R1. UMAP illustration of the normalized expression of the markers for pre-adipocytes in stromal clusters.
  
  In addition, based on a separate analysis of surface molecules, the authors propose new markers that could be used to prospectively isolate different human subpopulations of BM niche cells by using CD52, CD81 and NCAM1 (=CD56). Indeed, these analyses yield six different populations with differential abilities to form fibroblast-like colonies and differentiate into adipo-, osteo-, and chondrogenic lineages. To explore how the scRNAseq data may help to understand regulatory processes within the BM, the authors predict possible interactions between hematopoietic and non-hematopoietic subpopulations in the BM. These should be further validated, to support statements as the suggestion in the abstract that separate CXCL12- and SPP1-regulated BM niches might exist.
  
  We agree with the reviewer that functional validation of the CellPhoneDB results, using for example in vivo humanized mouse models, would be needed to demonstrate the presence of different niches in the bone marrow. At this point in time we only put forward the hypothesis that different niche types exist, and we will work on providing experimental proof in our future studies.
  
  The scRNAseq analysis is indeed a strong and important resource, also for later studies meant to increase knowledge about the hematopoietic niche of the BM. Although the analyses using different bioinformatic tools are very helpful, they remain mostly speculative, since validation experiments, as already mentioned, are missing. As such, I feel the authors did not succeed in achieving their goals of understanding how non-hematopoietic cells of the BM regulate the different hematopoietic processes within the BM. Nevertheless, they have created valuable resources, both in the scRNAseq data they generated, as well as the different predictions about different cell populations, their lineage relationships, and how they might interact with hematopoietic cells.
  
  We thank the reviewer for the appreciation of the value of this dataset. We agree with the reviewer that it is of great importance to validate the contribution of potential driver genes for stromal cell differentiation and verify the in vitro data and in-silico prediction using in-vivo models. As the main goal of the current study was to formulate hypotheses based on the scRNAseq data for future studies, we believe that in vivo validation experiments using engineered human bone marrow models or humanized bone marrow ossicles are out of the scope of the current study, but certainly need to be performed in the future.
  
  The impact of this work is difficult to envision, since validations still need to be performed. Also, it has to be borne in mind that humans are not mice, which can be studied in neat homogeneous inbred populations. Human populations, on the other hand, are quite diverse, so that the data generated in this manuscript and others will probably have to be combined to extrapolate data relevant to the whole of the human population. However, as it is equally difficult to generate reliable scRNAseq data from human BM, it seems likely that the data will indeed be an important resource when more data from different donors become available.
  
  We thank the reviewer for the generally positive evaluation of this study.
  
  Taken at face value, the authors provide evidence that human counterparts exist to several BM populations described in mice. In my opinion, the lineage relationships predicted using the RNA velocity analyses need more substance, as it seems the differentiation paths may diverge from what is known from mice. If so, this issue should be studied more stringently. Similarly, the paper would have been strengthened considerably if a relevant experimental validation had been attempted, perhaps by using genetically modified (knockdown) MSSC, similar to Battula et al. (doi: 10.1182/blood-2012-06-437988).
  
  In the study from Welner’s group, the stromal differentiation trajectory was inferred based on scRNAseq analysis of murine bone marrow cells using Velocyto16. Velocyto identified MSCs as the ‘source’ cell state with pre-adipocytes, pro-osteoblasts, and pro-chondrocytes being end states. In our study, the MSSC population was predicted to be at the apex of the trajectory and the pre-osteoblast cluster was placed close to the terminal state of differentiation, which is consistent with the murine study. However, different stromal cell types were identified in mice compared with humans. For example, we have identified pre-fibroblasts in our dataset which are absent in the murine study, while a well-defined murine pre-adipocyte population was not identified in our human dataset. Therefore, it is not surprising to find some discrepancies between human and murine stromal differentiation trajectories. Of course, and as mentioned before, critical in-vivo functional validations need to be carried out to address these important issues in the future.
  
  In summary, this is a very interesting but also descriptive paper with highly important resources. However, to prospectively identify or isolate human non-hematopoietic/non-endothelial niche populations, more stringent validations should have been performed to strengthen the validity of the different analyses that have been performed. As such, it remains an open question which niche subpopulation has the most impact on the different hematopoietic processes important for normal and stress hematopoiesis, as well as malignancies.
  
  Thank you for this comment. We completely agree that more stringent validations are necessary but are outside of the aim of our current hypothesis-generating study. Accordingly, we are planning functional verification studies using genetically manipulated stromal cells in combination with in-vivo humanized ossicles. Furthermore, other groups will hopefully use our database and contribute with functional studies in model systems that are currently not available to us, e.g. iPS-derived bone marrow in-vitro proxies.
  
  Specific remarks
  
  • Since CD45, CD235a, and CD271 are used as distinguishing markers in the sample preparation of the scRNAseq, it would be helpful to highlight these markers in the different analyses (Figures 1D, 2B, 2C-F, and 4A), and restrict the analyses to those cells that also do not express CD45 and CD235a (why use CD71?) and highly express CD271.
  
  Thank you for this comment. As shown in Fig. R2, we have modified figures Fig. 1D, 2B, and 4A showing now also the expression of PTPRC (CD45), GYPA (CD235a), and NGFR (CD271) on the top (Fig. 1D and 2B) or right (Fig. 4A) panel of the figures. To complement Fig. 2C-F, we have generated new stacked violin plots showing the expression level of three markers by all 9 stromal clusters (Fig. R2B). As we believe that including these three markers in the figures does not provide a better strategy to improve the analyses, we decided to leave the original figures unchanged in this respect.
  
  Figure R2. (A) Modified Fig. 1D, 2B and 4A with PTPRC (CD45), GYPA (CD235a) and NGFR (CD271) expression. (B) Stacked violin plots of PTPRC, GYPA and NGFR expressed by stromal clusters to complement Fig. 2C-F.
  
  With regard to cell exclusion based on CD45, as shown in the modified Figure corresponding to Fig 1A in the manuscript (Fig R2A), CD45 gene expression is observed also in the endothelial cluster, basal cluster, and neuronal cluster (Fig. R2A). These clusters represent non-hematopoietic clusters that we would like to keep in our dataset for further analysis, such as cell-cell interaction. Therefore, we choose to not restrict the analysis to solely CD45 non-expressing cells.
  
  With regard to CD235a (GYPA), expression of CD235a is not detected in any of the non-hematopoietic clusters. Thus, CD235a-expressing cell exclusion is not necessary.
  
  For CD271, according to our previous results (own unpublished data, belonging to a dataset of which only significantly expressed genes were reported in Li et al.8), protein expression of CD271 is not necessarily reflected by gene expression. In other words, stromal cells with CD271 protein expression do not always have high mRNA expression. A significant fraction of stromal cells would be excluded if we restricted the analyses only to those cells that show high CD271 gene expression, which would not reflect the real cellular composition of human bone marrow stroma. In order to not risk losing stromal cells, we therefore kept our previous analyses which included stromal cells with various CD271 expression levels.
  
  With regard to using CD71 as an exclusion marker, please see also the comments to reviewer 1. Briefly, according to our data, CD71 (TFRC)-expressing erythroid precursors could still be found after excluding CD45 and CD235a positive cells (Figure 1-figure supplement 1B and R3). As furthermore shown in Figure 1-figure supplement 1G and R2, CD71 expression in the stromal clusters is negligible. Therefore, we believe that this justifies the use of CD71 as an additional marker to exclude erythroid cells. We have amended the discussion to address this issue (page 19, lines 7-8).
  
  Figure R3. FACS plots illustrating the expression of (A) CD71 (TFRC) vs CD271 in CD45- CD235a- cells and (B) FSC-A vs CD81 in CD45-CD235a-CD271+CD71+ cells following exclusion of doublets and dead cells.
  
  • Despite a distinct neuronal cluster (39), there does not seem to be a distinctive marker for these cells. Is this true?
  
  Yes, the reviewer is correct that there is no significantly-expressed distinctive marker for neuronal cells. Multiple markers indicating the presence of different cell types were identified in cluster 39 (Supplementary File 4). Among them, several neuronal markers (NEUROD1, CHGB, ELAVL2, ELAVL3, ELAVL4, STMN2, INSM1, ZIC2, NNAT) were found to be enriched in this cluster (Supplementary File 4 and Fig. 1D) with higher fold changes compared to other identified genes. However, the expression of these genes was not statistically significant, which is mainly due to the heterogeneity of the cluster and thus does not allow us to draw any firm conclusions.
  
  Several genes including MALAT1, HNRNPH1, AC010970.1, and AD000090.1 were identified to be statistically highly expressed by cluster 39 (Supplementary File 4). The expression of these genes is not restricted to any specific cell type. It is therefore impossible to annotate the cluster based on this and our data thus indicated that cluster 39 is a heterogeneous population containing multiple cell types. Based on the expression of neuronal markers, we nevertheless chose to annotate Cluster 39 as “neuronal” as the prominent expression of neuronal markers indicated the presence of neurons in this cluster. To be more accurate, the annotation of cluster 39 has been changed to ‘neuronal cell-containing cluster’ to correctly reflect the presence of non-neuronal gene expressing cells as well (page 29, lines 3-8).
  
  • Since based on 2C and 2D, the authors are unable to distinguish adipo- from osteogenic cells, would the authors use the same molecules to distinguish different populations of 2C-D, or would they use other markers, if so which and why.
  
  We agree with the reviewer that at first glance adipo-primed (cluster 5, now annotated as “highly adipocytic gene-expressing progenitors”, HAGEPs), balanced progenitors (cluster 16), and pre-osteoblasts (cluster 38) shared a similar expression pattern according to the violin plots in Fig. 2C and 2D. However, as illustrated in the heatmap (Fig. 2B), the expression patterns of adipo-primed (HAGEP) and balanced progenitors were quite different in terms of their expression of adipogenic and osteogenic markers. Both adipogenic and osteogenic marker expression was detected in HAGEPs, balanced progenitors, and pre-osteoblasts. Thus, as violin plots summarize the overall expression levels of a certain marker in a certain cluster, these plots tend to make it more difficult to detect differential expression patterns between different clusters. In this case, the heatmap shown in Fig. 2B is a good complement to the violin plots as it shows the different expression patterns of every cell in the different stromal clusters.
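  
  The difference between a per-marker summary and a per-cell view can be sketched with a toy example. The following minimal Python sketch is illustrative only: it is not the analysis performed in the study, and the two clusters, the binary expression values, and all function names are hypothetical. It builds two made-up clusters whose per-marker distributions (what a violin plot displays) are identical, while the per-cell marker combinations (what a per-cell heatmap displays) differ.

```python
from collections import Counter


def make_cluster(n_cells, coexpress):
    """Build a hypothetical cluster as a list of (adipo, osteo) values per cell.

    Half of the cells are 'high' (1.0) and half 'low' (0.0) for each marker,
    so each marker's distribution is the same whatever `coexpress` is.
    """
    half = n_cells // 2
    adipo = [1.0] * half + [0.0] * half
    if coexpress:
        osteo = list(adipo)               # the same cells are high for both markers
    else:
        osteo = [1.0 - a for a in adipo]  # each cell is high for only one marker
    return list(zip(adipo, osteo))


def violin_summary(cells):
    """Per-marker value distributions; the pairing within a cell is discarded."""
    adipo = sorted(a for a, _ in cells)
    osteo = sorted(o for _, o in cells)
    return adipo, osteo


def heatmap_patterns(cells):
    """Counts of per-cell marker combinations; the pairing within a cell is kept."""
    return Counter(cells)


cluster_x = make_cluster(100, coexpress=True)   # made-up cluster X
cluster_y = make_cluster(100, coexpress=False)  # made-up cluster Y

same_violins = violin_summary(cluster_x) == violin_summary(cluster_y)
same_patterns = heatmap_patterns(cluster_x) == heatmap_patterns(cluster_y)

print("identical per-marker distributions:", same_violins)
print("identical per-cell patterns:", same_patterns)
```

  In this toy setting the two clusters cannot be told apart from the per-marker summaries alone, whereas the per-cell combinations separate them; real expression data are continuous and noisy, so the contrast would be less clear-cut.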
  
  Additionally, cluster 5 showed the expression of a group of stress-related transcription factors (FOS, FOSB, JUNB, EGR1) (Fig. 2B and Figure 2-figure supplement 1C), some of which had been shown to mark bone marrow adipogenic progenitors1. The expression of the abovementioned stress-related transcription factors (putative adipogenic progenitor markers) was generally lower in cluster 38 compared to cluster 5, further demonstrating that clusters were different.
  
  Furthermore, there was a gradual upregulation of more mature osteogenic markers such as RUNX1, CDH11, EBF1, and EBF3 from cluster 5 to cluster 16 and finally cluster 38. As shown in Fig. 2D, the expression of these markers was higher in cluster 38 compared to cluster 5. Therefore, cluster 38 was annotated as pre-osteoblasts.
  
  Most of the stromal clusters form a continuum (Fig. 2A), which correlates very well with the gradual transition of different cellular states during stromal cell development. It is highly unlikely that abrupt and dramatic gene expression changes would occur during the cellular state transition of cells of the same lineage. Therefore, it is not surprising that the gene expression profiles of the different stromal clusters share a certain level of similarity.
  
  In summary, we rely on several factors to distinguish different stromal clusters, which include canonical adipo-, osteo- and chondrogenic markers, stress markers, heatmap, violin plots, and the gradual up-regulation of certain lineage-specific markers.
  
  To directly answer the reviewer’s question, we believe that we are able to distinguish different stromal clusters based on our data.
  
  • In de Jong et al., an inflammatory MSC population (iMSC) is defined. Since the Schneider group showed that inflammatory S100A8 and A9 are expressed by inflamed MSC, is it possible that some of the designated pre-fibroblasts actually correspond to these S100A8/A9-expressing iMSC?
  
  We thank the reviewer for raising this interesting question.
  
  First of all, we would like to point out that scRNAseq was performed using viably frozen bone marrow aspirates in de Jong’s study while freshly isolated bone marrows were used in our study. There might be discrepancies between frozen and fresh bone marrow samples in terms of cellular composition including stromal composition and, importantly, processing-induced stress-related gene expression profiles.
  
  To investigate whether the designated pre-fibroblasts actually correspond to iMSCs as suggested by the reviewer, we have re-examined the expression of some of the key iMSC genes reported by de Jong et al. (17). As shown in Fig. R6, the markers that distinguish iMSCs from other MSC clusters in the de Jong et al. study were expressed not only by pre-fibroblasts but also by other stromal cell types, including HAGEPs, balanced progenitors, and pre-osteoblasts.
  
  In the study by R. Schneider’s group (18), significant upregulation of S100A8/S100A9 was observed in stromal cells from patients with myelofibrosis. Furthermore, baseline expression of S100A8/A9 was also observed in the fibroblast clusters in the control group, which correlates very well with our data on S100A8/9 expression in pre-fibroblasts in normal donors (Fig. 2F). Our data thus indicate, in line with Schneider’s findings, that there is a baseline level of S100A8/9 expression in fibroblasts in hematologically normal samples and that the expression of S100A8/9 is not restricted to inflamed MSCs.
  
  In summary, the gene expression profiles observed in our study do not indicate the presence of iMSC in the healthy bone marrow.
  
  • Figure 3A: Do human adipo-primed cells (cluster 5) indeed differentiate into osteogenic cells (clusters 6, 38, and 39)? This would be highly unexpected. Can the authors substantiate this "reliable outcome of the RNA velocity analysis"?
  
  Please refer to our previous responses regarding this topic. Briefly, as shown in Fig. 2B and D, both osteogenic and adipogenic genes are expressed in cluster 5, indicating the multipotent potential of this cluster. Although the cluster was initially annotated as adipo-primed progenitors, this was not intended to exclude the osteogenic differentiation potential of these progenitors. Nevertheless, this annotation did not correctly reflect the differentiation potential and might thus have caused confusion, for which we apologize. To describe the characteristics of these cells more accurately, cluster 5 has now been reannotated as ‘highly adipocytic gene-expressing progenitors (HAGEPs)’.
  
  In general, the outcome of the RNA velocity analysis needs to be corroborated by in-vivo differentiation experiments. But we believe that functional verification, which would be extensive, is out of the scope of the current study and we will address these questions in future studies.
  
  • How statistically certain are the authors, that the populations in Figure 4B as defined by flow cytometry, correspond to MSSC, adipo-primed cells, osteoprogenitors, etc., as defined by scRNAseq?
  
  To address this question, we sorted the A1-A4 populations and performed RT-PCR to examine the CD81 expression level in each cluster. As shown in Figure 4-figure supplement 1B, CD81 expression levels were higher in A1 and A2 compared with A3 and A4, which is consistent with the scRNAseq data that showed the highest CD81 expression in MSSCs compared to other clusters (Supplementary File 4).
  
  The phenotypes defined in this study allowed us to isolate different stromal cell types which demonstrated significant functional differences as described in the manuscript (page 19, lines 17-25; page 20, lines 1-11). These results, in combination with the quantitative real-time PCR results (Figure 4-figure supplement 1B), demonstrated that the A1-A4 subsets in FACS are functionally distinct populations and are likely to be, at least in large part, identical or equivalent to the transcriptionally identified clusters in group A stromal cells. However, at this point, we have not performed the experiments (scRNAseq of sorted cells) that would be required to confirm this statement statistically.
  
  • The immunohistochemistry results shown do not allow distinct conclusions as the colors give unequivocal mix-colors, and surface expression cannot be distinguished from intracellular expression. Please use a 3D (confocal) method for such statements.
  
  We thank the reviewer for the suggestion and we have performed additional confocal microscopy analysis of human bone marrow biopsies as suggested by the reviewer. Representative confocal images are now presented in the middle and right panel of Fig. 6E. We also include a separate file (Supplemental confocal image file). Here, confocal scans of all marker combinations are shown as ortho views, in addition to detailed intensity profile analyses of the cells of interest that clearly distinguish surface staining from intracellular staining.
  
  Confocal analysis of bone marrow biopsies confirmed our findings presented in the manuscript. As observed in the scanning images, CD271-expressing cells were negative for CD45 and were located in perivascular, endosteal, and peri-adipocytic regions. CD271/CD81 double-positive cells could be found either in the peri-adipocytic regions or perivascular regions while CD271/NCAM1 double-positive cells were exclusively situated at the bone-lining endosteal regions. The results of the confocal analysis have been added to the revised manuscript (page 21, lines 15-17).
  
  • Figure 5A: as all cells seem to interact with all other cells, this figure does not convey relevant information about BM regions using for instance CXCL12 or SPP1. Please reanalyze to show specificity of the interactions of the single clusters. Also, since it is unlikely the CellPhoneDB2-predicted interactions are restricted to hematopoietic responders, please also describe the possible interactions between non-hematopoietic cells.
  
  Fig. 5A was used to demonstrate the complexity of the interactions between hematopoietic cells and stromal cells.
  
  To gain a more detailed understanding of the interactions, we also performed an analysis with the top-listed ligand-receptor pairs as shown in Fig. 5B-C and Figure 5-figure supplement 1B. Here, each dot represents the interaction of a specific ligand-receptor pair listed on the x-axis between the two individual clusters indicated in the y-axis, which we believe shows what the reviewer is asking for.
  
  The specificity of the interactions between single clusters was shown in Fig. 5B-C and Figure 5-figure supplement 1B. The CXCL12- and SPP1-mediated interactions between MSSC/OC and hematopoietic clusters clearly suggested stromal cell type-specific interactions.
  
  Regarding non-hematopoietic cells, both inter- and intra-stromal interactions were identified to be operative between different stromal subsets as well as within the same stromal cell population as shown in Figure 5-figure supplement 3B. In addition, we have also analyzed the interaction pattern between endothelial cells and hematopoietic cells as shown in Fig. 7A, and thus we believe that we have sufficiently described these interactions as requested by the reviewer.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.01.26.477664v3
www.biorxiv.org www.biorxiv.org

New submission 25/10/2022, 12:31:10

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Point 1: The transcriptomic analysis of E12.5 endocardial cushion cells in the various mouse models is informative in the extraction of Igf2- and H19-specific gene functions. In Fig. 6D, a huge sex effect is obvious with many more DEGs in female embryos compared to males. How can this be explained given that Igf2/H19 reside on Chr7 and do not primarily affect gene expression on the X chromosome? Is any chromosomal bias observed in the genomic distribution of DEGs?
  
  We examined the chromosomal distribution of DEGs between WT and +/hIC1 (Supplemental Figure 6D) and did not see any bias on the X chromosome. We described this result on lines 278-280: “Although the number of +/hIC1-specific DEGs largely differed between males and females, there was no sex-specific bias on the X chromosome (Supplemental Figure 6D).” Additionally, we agree with the reviewer that it is noteworthy that the dysregulated H19/Igf2 expression affected the transcriptome in a sex-specific manner, especially when the mutation is located on an autosome. Although investigating the role of hormones versus sex chromosomes in these effects would be quite interesting, it is beyond the scope of the current study.
  
  Point 2: A separate issue is raised by Fig. 6E that shows a most dramatic dysregulation of a single gene in the delta3.8/hIC1 "rescue" model. Interestingly, this gene is Shh. Hence, these embryos should exhibit some dramatic skeletal abnormalities or other defects linked to sonic hedgehog function.
  
  The reason why Shh appeared to be differentially expressed between wild-type and d3.8/hIC1 samples was that Shh expression was 0 across all the samples except for two wild-type samples. In order to detect all the DEGs that might be lowly expressed, we did not want to filter DEGs based on the level of total expression. As a result, Shh was represented as significantly differentially expressed in d3.8/hIC1 samples, although its expression in our samples appears to be too low to have any significant effect on development. This explanation was added to lines 310-312. To confirm that this was an exceptional case, we analyzed the expression of DEGs obtained from other pairwise comparisons. In the volcano plots below, genes whose expression is not statistically different between two groups are marked grey. Genes whose expression is statistically different and detected in both groups are marked red. Genes whose expression is statistically different but not detected at all in one group, such as Shh, are marked blue (Figure G). It is clear that almost all of our DEGs are expressed consistently across the groups, and genes with no expression detected in one group are very rare.
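
  The grey/red/blue scheme described above amounts to a small classification rule. The following is a minimal sketch for illustration only, not the authors' pipeline: the function name, the per-sample count lists, and the 0.05 adjusted p-value threshold are all assumptions that do not appear in the response.

```python
def classify_deg(padj, counts_a, counts_b, alpha=0.05):
    """Assign a volcano-plot colour to one gene.

    padj     -- adjusted p-value for the pairwise comparison (hypothetical input)
    counts_a -- per-sample expression values in the first group
    counts_b -- per-sample expression values in the second group
    alpha    -- significance threshold (assumed; not stated in the response)
    """
    if padj >= alpha:
        return "grey"   # expression not statistically different
    detected_a = any(c > 0 for c in counts_a)
    detected_b = any(c > 0 for c in counts_b)
    if detected_a and detected_b:
        return "red"    # different and detected in both groups
    return "blue"       # different but undetected in one group (the Shh case)


# Toy values only: a Shh-like gene detected in two wild-type samples and nowhere else.
shh_like = classify_deg(0.01, [0, 3, 0, 2], [0, 0, 0, 0])
```

  Counting how many genes fall into the "blue" class for each pairwise comparison would give the rarity figure the response refers to.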
  
  Point 3: The placental analysis needs to be strengthened. Placentas should be consistently positioned with the decidua facing up, and the chorionic plate down. The placentas in Fig. 3F are sectioned at an angle and the chorionic plate is missing. These images must be replaced with better histological sections.
  
  As requested, we have replaced placental images with better representative sections (Figure 3F and 4E). In addition, we have improved alignment of placental histology figures.
  
  Point 4: The CD34 staining has not worked and does not show any fetal vasculature, in particular not in the WT sample.
  
  As requested, we have replaced the CD34 vascular stained images with those that better represent fetal vasculature (Figure 3G).
  
  Point 5: The "thrombi" highlighted in Fig. 4E are well within the normal range, to make the point that these are persistent abnormalities more thorough measurements would need to be performed (number, size, etc).
  
  As requested, we measured the number and relative size of the thrombi that are found in dH19/hIC1 placentas with lesions. No thrombi were found in wild-type placentas whereas an average of 1.3 thrombi were found in six dH19/hIC1 placentas. The size of the thrombi varied widely, but they occupied an average of 2.58% of the labyrinth zone where these lesions were found (Supplemental Figure 4D). Additionally, we replaced the image in Figure 4E with a section that better represents the lesion.
  
  Point 6: The statement that H19 is disproportionately contributing to the labyrinth phenotype (lines 154/155) is not warranted as Igf2 expression is reduced to virtually nothing in these mice. Even though there is more H19 in the labyrinth than in the junctional zone, the phenotype may still be driven by a loss of Igf2. Given the quasi Igf2-null situation in +/hIC1 mice, is the glycogen cell type phenotype recapitulated in these mice, and how do glycogen numbers compare in the other mouse models?
  
  The sentence was edited in line 157. We performed Periodic acid Schiff (PAS) staining on +/hIC1 placentas to address whether glycogen cells are affected by abnormal H19/Igf2 expression (Supplemental Figure 1E). In contrast to previous reports where Igf2-null mice had lower placental glycogen concentration (Lopez et al., 1996) and H19 deletion led to increased placental glycogen storage (Esquiliano et al., 2009), our quantification of PAS-stained images showed that the glycogen content is not significantly different between wild-type and +/hIC1 placentas. We have described this result in lines 166-168.
  
  Point 7: How do delta3.8/+ and delta3.8/hIC1 mice with a VSD survive? Is it resolved some time after birth such that heart function is compatible with postnatal viability? And more importantly, do H19 expression levels correlate with phenotype severity on an individual basis?
  
  Our study was limited to phenotypes prior to birth, so postnatal/adult phenotypes were not examined. Because the VSD showed only partial penetrance in these mice, we cannot state that the d3.8/+ or d3.8/hIC1 mice with VSDs survive. It has also been previously reported in another mouse model with incomplete penetrance of a VSD that the mice which survived to adulthood did not have the VSDs (Sakata et al., 2002). We find it highly unlikely that either mouse model would survive significantly past the postnatal timepoint with a VSD. We have examined two PN0 d3.8/hIC1 neonates, and neither had a VSD.
  
  Regarding the second point, the only way to quantitatively address this question would be to do qPCR or RNA-seq on individual hearts, which then makes it impossible for those hearts to be examined for histology to confirm the VSD. Thus, hearts used to identify VSDs via histology could not also be used for quantitative H19 measurements. One thing to note is that the H19/Igf2 expression in independent replicates of d3.8/hIC1 cardiac ECs used in our RNA-seq experiment is quite variable, and these replicates do not cluster together, in contrast to the other mouse models used in this study (Fig. 6A). Such a wide range of variability in the extent of H19/Igf2 dysregulation suggests that H19/Igf2 levels could have an impact on the penetrance or the severity of the VSD phenotype in d3.8/hIC1 embryos.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.28.486058v1
www.biorxiv.org www.biorxiv.org

New submission 02/01/2023, 13:58:07

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Weaknesses:
  
  1) The relevance of the LPS-induced calvarial osteolysis model is not clear. Calvaria is mostly composed of cortical bone-like structures lacking marrow space, though small marrow space exists near the suture. Osteolysis appears to occur in areas apart from where marrow is located. The authors did not show in the manuscript which cells Adipoq-Cre marks in the calvaria.
  
  We have shown in a recent publication that MALPs exist in the calvarial bone marrow (2). As shown in Fig. R1A, Td+ cells are present in the calvarial bone marrow, which is surrounded by a thin layer of cortical bone (Fig. R1B, blue arrows). In WT mice, after LPS injection, the normal bone structure, including suture and cortical bone, was mostly eroded and filled with inflammatory cells (green arrows). Thus, osteolysis does occur in the area where bone marrow is originally located. In contrast, calvarial bone structure was preserved in the CKO mice, demonstrating that Csf1 deficiency in MALPs suppresses LPS-induced osteolysis. We included the H&E staining data in the revised manuscript:
  
  "H&E staining showed that calvarial bone marrow is surrounded by a thin layer of cortical bone (Fig. 5C). After the LPS injection, normal calvarial structure, including suture and cortical bone, were mostly eroded and filled with inflammatory cells in WT mice, but unaltered in CKO mice."
  
  Figure R1. Calvarial bone marrow structure. (A) Representative coronal section of 1.5-month-old Adipoq/Td mouse calvaria. Bone surfaces are outlined by dashed lines. Boxed areas in the low magnification image (top) are enlarged to show periosteum (bottom left), suture (bottom middle), and bone marrow (BM, bottom right) regions. Red: Td; Blue: DAPI. Adapted from our previous publication (2). (B) H&E staining of coronal sections of WT and Csf1 CKOAdipoq mice after LPS injection. Blue arrows point to bone marrow space close to suture (indicated by *). Green arrows point to the osteolytic lesion where cortical bone was eroded and the space was filled with inflammatory cells.
  
  2) Although the contrast between the two Csf1 conditional deletion models (Adipoq-Cre and Prx1-Cre) is very interesting, the relationship between these two cell populations is not well described. The authors did not clarify if MALPs are also targeted by Prx1-Cre, or these two cell types are from different cell lineages. "Other mesenchymal lineage cells" in the subtitle is not extremely helpful to place this finding in context.
  
  We thank the Reviewer for this comment. The original article describing the construction of the Prx1-Cre mouse line demonstrates that Prx1-Cre targets all mesenchymal cells in the limb bud as early as 10.5 dpc (10). This early expression pattern ensures that all bone marrow mesenchymal lineage cells, including MALPs, are targeted by Prx1-Cre. In addition, based on our scRNA-seq data (1), Adipoq is mainly expressed in MALPs, while Prrx1 (Prx1) is highly expressed not only in MALPs but also in EMPs, IMPs, LMPs, LCPs, and OBs (Fig. R2). Thus, the fact that Prx1-Cre driven CKO mice have much more severe bone phenotypes than Adipoq-Cre driven CKO mice indicates that mesenchymal lineage cells other than MALPs also contribute Csf1 to regulate bone resorption. To avoid confusion, we changed the title and the first sentence in the Results section about Prx1 mice to the following:
  
  "Csf1 from mesenchymal lineage cells other than MALPs regulate bone structure.
  
  To explore whether Csf1 from MALPs plays a dominant role in regulating bone structure, we generated Prx1-Cre Csf1flox/flox (Csf1 CKOPrx1) mice to knockout Csf1 in all mesenchymal lineage cells in bone (10), including MALPs."
  
  Figure R2. Dotplot of Prrx1 and Adipoq expression in bone marrow mesenchymal lineage cells based on our scRNA-seq analysis of 1-month-old mice.
  
  3) The data supporting defective bone marrow hematopoiesis in Csf1 CKO mice are not particularly strong. They observed a reduction in bone marrow cellularity, but this was only associated with an expected reduction in macrophages and a mild reduction in overall HSPC populations. More in-depth analyses might be required to define mechanisms underlying reduced bone marrow cellularity in CKO mice.
  
  We thank the Reviewer for this constructive comment. Accordingly, we performed a thorough analysis of bone marrow hematopoietic compartments and observed significant decreases of monocytes and erythroid progenitors in CKO mice compared to WT mice. These results are now included as Fig. 6E.
  
  4) Some of the phenotypic analyses are still incomplete. The authors did not report whether CHet (Adipoq-Cre Csf1(flox/+)) showed any bone phenotype. Further, the authors did not report whether Csf1 mRNA or M-Csf protein is indeed expressed by MALPs, with current evidence solely reliant on scRNAseq and qPCR data of bulk-isolated cells. More specific histological methods will be helpful to support the premise of the study.
  
  A pilot microCT study revealed the same femoral trabecular bone structure in WT and Adipoq-Cre Csf1flox/+ (Csf1 Het) mice at 3 months of age (Fig. R3). While the sample number for Het is low, we are confident about this conclusion.
  
  Figure R3. MicroCT measurement of trabecular bone structural parameters from WT and Csf1 Het mice. BV/TV: bone volume fraction; BMD: bone mineral density; Tb.N: trabecular number; Tb.Th: trabecular thickness; Tb.Sp: trabecular separation; SMI: structural model index. n=3-8 mice/group.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.07.27.501742v1
www.biorxiv.org www.biorxiv.org

Integration of overlapping sequences emerges with consolidation through mPFC neural ensembles and hippocampal-cortical connectivity

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author response:
  
  Reviewer #1 (Public Review):
  
  In this paper, Tompary & Davachi present work looking at how memories become integrated over time in the brain, and relating those mechanisms to responses on a priming task as a behavioral measure of memory linkage. They find that remotely but not recently formed memories are behaviorally linked and that this is associated with a change in the neural representation in mPFC. They also find that the same behavioral outcomes are associated with the increased coupling of the posterior hippocampus with category-sensitive parts of the neocortex (LOC) during a post-learning rest period-again only for remotely learned information. There was also correspondence in rest connectivity (posterior hippocampus-LOC) and representational change (mPFC) such that for remote memories specifically, the initial post-learning connectivity enhancement during rest related to longer-term mPFC representational change.
  
  This work has many strengths. The topic of this paper is very interesting, and the data provide a really nice package in terms of providing a mechanistic account of how memories become integrated over a delay. The paper is also exceptionally well-written and a pleasure to read. There are two studies, including one large behavioral study, and the findings replicate in the smaller fMRI sample. I do however have two fairly substantive concerns about the analytic approach, where more data will be required before we can know whether the interpretations are an appropriate reflection of the findings. These and other concerns are described below.
  
  Thank you for the positive comments! We are proud of this work, and we feel that the paper is greatly strengthened by the revisions we made in response to your feedback. Please see below for specific changes that we’ve made.
  
  1) One major concern relates to the lack of a pre-encoding baseline scan prior to recent learning.
  
  a) First, I think it would be helpful if the authors could clarify why there was no pre-learning rest scan dedicated to the recent condition. Was this simply a feasibility consideration, or were there theoretical reasons why this would be less "clean"? Including this information in the paper would be helpful for context. Apologies if I missed this detail in the paper.
  
  This is a great point and something that we struggled with when developing this experiment. We considered several factors when deciding whether to include a pre-learning baseline on day two. First, the day 2 scan session was longer than that of day 1 because it included the recognition priming and explicit memory tasks. Adding a baseline scan would have made the session longer than a typical scan session (about 2 hours in the scanner in total), and we were concerned that participant engagement would be difficult to sustain across a longer session. Second, we anticipated that the pre-learning scan would not have been a ‘clean’ measure of baseline processing, but rather would include signal related to post-learning processing of the day 1 sequences, as multivariate reactivation of learned stimuli has been observed in rest scans collected 24 hours after learning (Schlichting & Preston, 2014). We have added these considerations to the Discussion (page 39, lines 1047-1070).
  
  b) Second, I was hoping the authors could speak to what they think is reflected in the post-encoding "recent" scan. Is it possible that these data could also reflect the processing of the remote memories? I think, though am not positive, that the authors may be alluding to this in the penultimate paragraph of the discussion (p. 33) when noting the LOC-mPFC connectivity findings. Could there be the reinstatement of the old memories due to being back in the same experimental context and so forth? I wonder the extent to which the authors think the data from this scan can be reflected as strictly reflecting recent memories, particularly given it is relative to the pre-encoding baseline from before the remote memories, as well (and therefore in theory could reflect both the remote + recent). (I should also acknowledge that, if it is the case that the authors think there might be some remote memory processing during the recent learning session in general, a pre-learning rest scan might not have been "clean" either, in that it could have reflected some processing of the remote memories-i.e., perhaps a clean pre-learning scan for the recent learning session related to point 1a is simply not possible.)
  
  We propose that theoretically, the post-learning recent scan could indeed reflect a mixture of remote and recent sequences. This is one of the drawbacks of splitting encoding into two sessions rather than combining encoding into one session and splitting retrieval into an immediate and delayed session; any rest scans that are collected on Day 2 may have signal that relates to processing of the Day 1 remote sequences, which is why we decided against the pre-learning baseline for Day 2, as you had noted.
  
  You are correct that we alluded to this in our original submission when discussing the LOC-mPFC coupling result, and we have taken steps to discuss it more explicitly. In brief, we find greater LOC-mPFC connectivity only after recent learning relative to the pre-learning baseline, and cortical-cortical connectivity could be indicative of processing memories that already have undergone some consolidation (Takashima et al., 2009; Smith et al., 2010). From another vantage point, the mPFC representation of Day 1 learning may have led to increased connectivity with LOC on Day 2 due to Day 1 learning beginning to resemble consolidated prior knowledge (van Kesteren et al., 2010). While this effect is consistent with prior literature and theory, it's unclear why we would find evidence of processing of the remote memories and not the recent memories. Furthermore, the change in LOC-mPFC connectivity in this scan did not correlate with memory behaviors from either learning session, which could be because signal from this scan reflects a mix of processing of the two different learning sessions. With these ideas in mind, we have fleshed out the discussion of the post-encoding ‘recent’ scan in the Discussion (page 38-39, lines 1039-1044).
  
  c) Third, I am thinking about how both of the above issues might relate to the authors' findings, and would love to see more added to the paper to address this point. Specifically, I assume there are fluctuations in baseline connectivity profile across days within a person, such that the pre-learning connectivity on day 1 might be different from on day 2. Given that, and the lack of a pre-learning connectivity measure on day 2, it would logically follow that the measure of connectivity change from pre- to post-learning is going to be cleaner for the remote memories. In other words, could the lack of connectivity change observed for the recent scan simply be due to the lack of a within-day baseline? Given that otherwise, the post-learning rest should be the same in that it is an immediate reflection of how connectivity changes as a function of learning (depending on whether the authors think that the "recent" scan is actually reflecting "recent + remote"), it seems odd that they both don't show the same corresponding increase in connectivity-which makes me think it may be a baseline difference. I am not sure if this is what the authors are implying when they talk about how day 1 is most similar to prior investigation on p. 20, but if so it might be helpful to state that directly.
  
  We agree that it is puzzling that hippocampal-LOC connectivity does not also increase after recent learning, as it does after remote learning. However, the fact that mPFC-LOC connectivity increases from baseline rest to post-recent rest suggests that the issue is not the baseline, but rather that the post-recent learning scan reflects processing of the remote memories (although, as a caveat, there is no relationship with priming).
  
  On what is now page 23, we were referring to the notion that the Day 1 procedure (baseline rest, learning, post-learning rest) is the most straightforward replication of past work that finds a relationship between hippocampal-cortical coupling and later memory. In contrast, the Day 2 learning and rest scan are less ‘clean’ of a replication in that they are taking place in the shadow of Day 1 learning. We have clarified this in the Results (page 23, lines 597-598).
  
  d) Fourth and very related to my point 1c, I wonder if the lack of correlations for the recent scan with behavior is interpretable, or if it might just be that this is a noisy measure due to imperfect baseline correction. Do the authors have any data or logic they might be able to provide that could speak to these points? One thing that comes to mind is seeing whether the raw post-learning connectivity values (separately for both recent and remote) show the same pattern as the different scores. However, the authors may come up with other clever ways to address this point. If not, it might be worth acknowledging this interpretive challenge in the Discussion.
  
  We thought of three different approaches that could help us to understand whether the lack of correlations between coupling and behavior in the recent scan was due to noise. First, we correlated recognition priming with raw hippocampal-LOC coupling separately for pre- and post-learning scans, as in Author response image 1:
  
  Author response image 1.
  
  Note that the post-learning chart depicts the relationship between post-remote coupling and remote priming and between post-recent coupling and recent priming (middle). Essentially, post-recent learning coupling did not relate to priming of recently learned sequences (middle; green) while there remains a trend for a relationship between post-remote coupling and priming for remotely learned sequences (middle; blue). However, the significant relationship between coupling and priming that we reported in the paper (right, blue) is driven both by the initial negative relationship that is observed in the pre-learning scan and the positive relationship in the post-remote learning scan. This highlights the importance of using a change score, as there may be spurious initial relationships between connectivity profiles and to-be-learned information that would then mask any learning- and consolidation-related changes.
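
  The logic of the change score can be illustrated with a few lines of code. This is a minimal sketch with invented numbers, not the authors' data or analysis code; the function names and the toy values are hypothetical. It shows how a negative pre-learning relationship and a positive post-learning relationship both feed into the correlation between the change score and priming.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def change_scores(pre, post):
    """Post-learning minus pre-learning coupling, per participant."""
    return [b - a for a, b in zip(pre, post)]


# Toy values for four hypothetical participants (not from the study).
priming = [1.0, 2.0, 3.0, 4.0]
pre_coupling = [0.30, 0.20, 0.10, 0.00]    # negatively related to priming
post_coupling = [0.05, 0.10, 0.15, 0.20]   # positively related to priming

r_pre = pearson_r(pre_coupling, priming)
r_post = pearson_r(post_coupling, priming)
r_change = pearson_r(change_scores(pre_coupling, post_coupling), priming)
```

  In real data the three correlations would of course be far noisier than in this idealized example.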
  
  We also reasoned that if comparisons between the post-recent learning scan and the baseline scan are noisier than between the post-remote learning and baseline scan, there may be differences in the variance of the change scores across participants, such that changes in coupling from baseline to post-recent rest may be more variable than changes from baseline to post-remote rest. We conducted an F-test to compare the variance of these two changes in hippocampal-LOC coupling and found no reliable difference (ratio of variances: F(22, 22) = 0.811, p = .63).
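
  A variance-ratio F-test of this kind can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the p-value is taken as two-sided, and the F distribution is evaluated by simple numerical integration, which is adequate only when both degrees of freedom are at least 2.

```python
import math
from statistics import variance


def incomplete_beta(x, a, b, steps=2000):
    """Regularized incomplete beta I_x(a, b) via Simpson's rule (assumes a, b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    norm = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

    def density(t):
        return norm * t ** (a - 1) * (1.0 - t) ** (b - 1)

    h = x / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0


def f_two_sided_p(f_stat, df1, df2):
    """Two-sided p-value for a variance ratio with (df1, df2) degrees of freedom."""
    x = df1 * f_stat / (df1 * f_stat + df2)
    cdf = incomplete_beta(x, df1 / 2.0, df2 / 2.0)
    return 2.0 * min(cdf, 1.0 - cdf)


def variance_ratio_test(sample_a, sample_b):
    """Return (F, p) comparing the sample variances of two groups of change scores."""
    f_stat = variance(sample_a) / variance(sample_b)
    return f_stat, f_two_sided_p(f_stat, len(sample_a) - 1, len(sample_b) - 1)
```

  Calling `f_two_sided_p(0.811, 22, 22)` evaluates the reported statistic under these assumptions; a dedicated statistics package would normally be preferred for production analyses.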
  
  Finally, we explored whether hippocampal-LOC coupling is more stable across participants if compared across two rest scans within the same imaging session (baseline and post-remote) versus across two scans across two separate sessions (baseline and post-recent). Interestingly, coupling was not reliably correlated across scans in either case (baseline/post-remote: r = 0.03, p = 0.89; baseline/post-recent: r = 0.07, p = .74).
  
  Finally, we evaluated whether hippocampal-LOC coupling was correlated across different rest scans (see Author response image 2). We reasoned that if such coupling was more correlated across baseline and post-remote scans relative to baseline and post-recent scans, that would indicate a within-session stability of participants’ connectivity profiles. At the same time, less correlation of coupling across baseline and post-recent scans would be an indication of a noisier change measure as the measure would additionally include a change in individuals’ connectivity profile over time. We found that there was no difference in the correlation of hippocampal-LOC coupling across sessions, and the correlation was not reliably significant for either session (baseline/post-remote: r = 0.03, p = 0.89; baseline/post-recent: r = 0.07, p = .74; difference: Steiger’s t = 0.12, p = 0.9).
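One common way to compare two dependent correlations that share a variable (here, baseline coupling) is Williams's t, the form Steiger (1980) recommends for this case. The sketch below assumes this is the variant meant by "Steiger's t"; the response does not say which formula was used. The correlation between the two non-shared variables (`r23`) is not reported, so the value below is a placeholder, and n = 23 is inferred from the F(22, 22) degrees of freedom.

```python
import math


def williams_t(r12, r13, r23, n):
    """Williams's t for two dependent correlations sharing variable 1.

    r12, r13: the two correlations being compared.
    r23: correlation between the two non-shared variables.
    n: number of participants. Returns (t, degrees_of_freedom).
    """
    if n <= 3:
        raise ValueError("n must exceed 3")
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r23) ** 3
    t = (r12 - r13) * math.sqrt((n - 1) * (1 + r23) / denom)
    return t, n - 3


# r12 and r13 are the reported correlations; r23 = 0.0 is a hypothetical
# placeholder because the response does not report it.
t_stat, df = williams_t(r12=0.03, r13=0.07, r23=0.0, n=23)
print(f"t({df}) = {t_stat:.2f}")
```

The statistic depends on `r23`, so the printed value should not be expected to reproduce the reported t exactly.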
  
  Author response image 2.
  
  We have included the raw correlations with priming (page 25, lines 654-661, Supplemental Figure 6) as well as text describing the comparison of variances (page 25, lines 642-653). We did not add the comparison of hippocampal-LOC coupling across scans to the current manuscript, as an evaluation of stability of such coupling in the context of learning and reactivation seems out of scope of the current focus of the experiment, but we find this result to be worthy of follow-up in future work.
  
  In summary, further analysis of our data did not reveal any indication that a comparison of rest connectivity across scan sessions inserted noise into the change score between baseline and post-recent learning scans. However, these analyses cannot fully rule that possibility out, and the current analyses do not provide concrete evidence that the post-recent learning scan comprises signals that are a mixture of processing of recent and remote sequences. We discuss these drawbacks in the Discussion (page 39, lines 1047-1070).
  
  2) My second major concern is how the authors have operationalized integration and differentiation. The pattern similarity analysis uses an overall correspondence between the neural similarity and a predicted model as the main metric. In the predicted model, C items that are indirectly associated are more similar to one another than they are to C items that are entirely unrelated. The authors are then looking at a change in correspondence (correlation) between the neural data and that prediction model from pre- to post-learning. However, a change in the degree of correspondence with the predicted matrix could be driven by either the unrelated items becoming less similar or the related ones becoming more similar (or both!). Since the interpretation in the paper focuses on change to indirectly related C items, it would be important to report those values directly. For instance, as evidence of differentiation, it would be important to show that there is a greater decrease in similarity for indirectly associated C items than there is for unrelated C items (or even a smaller increase) from pre to post, or that C items that are indirectly related are less similar than are unrelated C items post but not pre-learning. Performing this analysis would confirm that the pattern of results matches the authors' interpretation. This would also impact the interpretation of the subsequent analyses that involve the neural integration measures (e.g., correlation analyses like those on p. 16, which may or may not be driven by increased similarity among overlapping C pairs). I should add that given the specificity to the remote learning in mPFC versus recent in LOC and anterior hippocampus, it is clearly the case that something interesting is going on. However, I think we need more data to understand fully what that "something" is.
  
  We recognize the importance of understanding whether model fits (and changes to them) are driven by similarity of overlapping pairs or non-overlapping pairs. We have modified all figures that visualize model fits to the neural integration model to separately show fits for pre- and post-learning (Figure 3 for mPFC, Supp. Figure 5 for LOC, Supp. Figure 9 for AB similarity in anterior hippocampus & LOC). We have additionally added supplemental figures to show the complete breakdown of similarity in each region in a 2 (pre/post) x 2 (overlapping/non-overlapping sequence) x 2 (recent/remote) chart. We decided against showing only these latter charts in place of the model fits, since the model fits strike a good balance between information and readability. We have also modified text in various sections to focus on these new results.
  
  In brief, the decrease in model fit for mPFC for the remote sequences was driven primarily by a decrease in similarity for the overlapping C items and not the non-overlapping ones (Supplementary Figure 3, page 18, lines 468-472).
  
  Interestingly, in LOC, all C items grew more similar after learning, regardless of their overlap or learning session, but the increase in model fit for C items in the recent condition was driven by a larger increase in similarity for overlapping pairs relative to non-overlapping ones (Supp. Figure 5, page 21, lines 533-536).
  
  We also visualized AB similarity in the anterior hippocampus and LOC in a similar fashion (Supplementary Figure 9).
  
  We have also edited the Methods sections with updated details of these analyses (page 52, lines 1392-1397). We think that including these results considerably strengthens our claims and we are pleased to have them included.
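The ambiguity raised in point 2 can be made concrete with a small sketch. It assumes the model fit is a Pearson correlation between pairwise neural similarity and a binary model (1 for overlapping pairs, 0 for non-overlapping pairs). The function names and similarity values are hypothetical and are not the study's data.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def summarize(similarity, overlapping):
    """Return the model fit and the mean similarity for each pair type."""
    model = [1.0 if flag else 0.0 for flag in overlapping]
    over = [s for s, flag in zip(similarity, overlapping) if flag]
    non = [s for s, flag in zip(similarity, overlapping) if not flag]
    return {
        "fit": pearson(similarity, model),
        "overlapping": mean(over),
        "non_overlapping": mean(non),
    }


# Hypothetical similarity values for six C-item pairs.
overlapping = [True, True, True, False, False, False]
pre = summarize([0.20, 0.25, 0.15, 0.10, 0.12, 0.08], overlapping)
# Scenario A: overlapping pairs become less similar.
post_a = summarize([0.05, 0.10, 0.00, 0.10, 0.12, 0.08], overlapping)
# Scenario B: non-overlapping pairs become more similar.
post_b = summarize([0.20, 0.25, 0.15, 0.25, 0.27, 0.23], overlapping)

for label, result in (("pre", pre), ("post A", post_a), ("post B", post_b)):
    print(label, {key: round(value, 3) for key, value in result.items()})
```

Both scenarios lower the model fit relative to pre-learning, but only scenario A reflects a change in the overlapping pairs, which is why the per-condition breakdown is needed to interpret a change in fit.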
  
  3) The priming task occurred before the post-learning exposure phase and could have impacted the representations. More consideration of this in the paper would be useful. Most critically, since the priming task involves seeing the related C items back-to-back, it would be important to consider whether this experience could have conceivably impacted the neural integration indices. I believe it never would have been the case that unrelated C items were presented sequentially during the priming task, i.e., that related C items always appeared together in this task. I think again the specificity of the remote condition is key and perhaps the authors can leverage this to support their interpretation. Can the authors consider this possibility in the Discussion?
  
  It's true that only C items from the same sequence were presented back-to-back during the priming task, and that this presentation may interfere with observations from the post-learning exposure scan that followed it. We agree that it is worth considering this caveat and have added language in the Discussion (page 40, lines 1071-1086). When designing the study, we reasoned that it was more important for the behavioral priming task to come before the exposure scans, as all items were shown only once in that task, whereas they were shown 4-5 times in a random order in the post-learning exposure phase. Because of this difference in presentation times, and because behavioral priming findings tend to be very sensitive, we concluded that it was more important to protect the priming task from the exposure scan instead of the reverse.
  
  We reasoned, however, that the additional presentation of the C items in the recognition priming task would not substantially override the sequence learning, as C items were each presented 16 times in their sequence (ABC1 and ABC2 16 times each). Furthermore, as this reviewer suggests, the order of C items during recognition was the same for recent and remote conditions, so the fact that we find a selective change in neural representation for the remote condition and don’t also see that change for the recent condition is additional assurance that the recognition priming order did not substantially impact the representations.
  
  4) For the priming task, based on the Figure 2A caption it seems as though every sequence contributes to both the control and primed conditions, but (I believe) this means that the control transition always happens first (and they are always back-to-back). Is this a concern? If RTs are changing over time (getting faster), it would be helpful to know whether the priming effects hold after controlling for trial numbers. I do not think this is a big issue because if it were, you would not expect to see the specificity of the remotely learned information. However, it would be helpful to know given the order of these conditions has to be fixed in their design.
  
  This is a correct understanding of the trial orders in the recognition priming task. We chose to involve the baseline items in the control condition to boost power – this way, priming of each sequence could be tested, while only presenting each item once in this task, as repetition in the recognition phase would have further facilitated response times and potentially masked any priming effects. We agree that accounting for trial order would be useful here, so we ran a mixed-effects linear model to examine response times both as a function of trial number and of priming condition (primed/control). While there is indeed a large effect of trial number such that participants got faster over time, the priming effect originally observed in the remote condition still holds at the same time. We now report this analysis in the Results section (page 14, lines 337-349 for Expt 1 and pages 14-15, lines 360-362 for Expt 2).
  
  5) The authors should be cautious about the general conclusion that memories with overlapping temporal regularities become neurally integrated - given their findings in MPFC are more consistent with overall differentiation (though as noted above, I think we need more data on this to know for sure what is going on).
  
  We realize this conclusion was overly simplistic and, in several places, have revised the general conclusions to be more specific about the nuanced similarity findings.
  
  6) It would be worth stating a few more details and perhaps providing additional logic or justification in the main text about how the pre- and post-exposure phases were set up and why. How many times was each object presented pre and post, and how was the sequencing determined (were any constraints put in place, e.g., such that C1 and C2 did not appear close in time)? What was the cover task (I think this is important to the interpretation & so belongs in the main paper)? Were there considerations involving the fact that this is a different sequence of the same objects the participants would later be learning - e.g., interference, etc.?
  
  These details can be found in the Methods section (pages 50-51, lines 1337-1353) and we’ve added a new summary of that section in the Results (page 17, lines 424-425 and 432-435). In brief, a visual hash tag appeared on a small subset of images and participants pressed a button when this occurred, and C1 and C2 objects were presented in separate scans (as were A and B objects) to minimize inflated neural similarity due to temporal proximity.
  
  Reviewer #2 (Public Review):
  
  The manuscript by Tompary & Davachi presents results from two experiments, one behavior only and one fMRI plus behavior. They examine the important question of how two separate object memories (C1 and C2) that are never experienced together in time become linked by shared predictive cues in a sequence (A followed by B followed by one of the C items). The authors developed an implicit priming task that provides a novel behavioral metric for such integration. They find significant C1-C2 priming for sequences that were learned 24h prior to the test, but not for recently learned sequences, suggesting that associative links between the two originally separate memories emerge over an extended period of consolidation. The fMRI study relates this behavioral integration effect to two neural metrics: pattern similarity changes in the medial prefrontal cortex (mPFC) as a measure of neural integration, and changes in hippocampal-LOC connectivity as a measure of post-learning consolidation. While fMRI patterns in mPFC overall show differentiation rather than integration (i.e., C1-C2 representational distances become larger), the authors find a robust correlation such that increasing pattern similarity in mPFC relates to stronger integration in the priming test, and this relationship is again specific to remote memories. Moreover, connectivity between the posterior hippocampus and LOC during post-learning rest is positively related to the behavioral integration effect as well as the mPFC neural similarity index, again specifically for remote memories. Overall, this is a coherent set of findings with interesting theoretical implications for consolidation theories, which will be of broad interest to the memory, learning, and predictive coding communities.
  
  Strengths:
  
  1) The implicit associative priming task designed for this study provides a promising new tool for assessing the formation of mnemonic links that influence behavior without explicit retrieval demands. The authors find an interesting dissociation between this implicit measure of memory integration and more commonly used explicit inference measures: a priming effect on the implicit task only evolved after a 24h consolidation period, while the ability to explicitly link the two critical object memories is present immediately after learning. While speculative at this point, these two measures thus appear to tap into neocortical and hippocampal learning processes, respectively, and this potential dissociation will be of interest to future studies investigating time-dependent integration processes in memory.
  
  2) The experimental task is well designed for isolating pre- vs post-learning changes in neural similarity and connectivity, including important controls of baseline neural similarity and connectivity.
  
  3) The main claim of a consolidation-dependent effect is supported by a coherent set of findings that relate behavioral integration to neural changes. The specificity of the effects on remote memories makes the results particularly interesting and compelling.
  
  4) The authors are transparent about unexpected results, for example, the finding that overall similarity in mPFC is consistent with a differentiation rather than an integration model.
  
  Thank you for the positive comments!
  
  Weaknesses:
  
  1) The sequence learning and recognition priming tasks are cleverly designed to isolate the effects of interest while controlling for potential order effects. However, due to the complex nature of the task, it is difficult for the reader to infer all the transition probabilities between item types and how they may influence the behavioral priming results. For example, baseline items (BL) are interspersed between repeated sequences during learning, and thus presumably can only occur before an A item or after a C item. This seems to create non-random predictive relationships such that C is often followed by BL, and BL by A items. If this relationship is reversed during the recognition priming task, where the sequence is always BL-C1-C2, this violation of expectations might slow down reaction times and deflate the baseline measure. It would be helpful if the manuscript explicitly reported transition probabilities for each relevant item type in the priming task relative to the sequence learning task and discussed how a match vs mismatch may influence the observed priming effects.
  
  We have added a table of transition probabilities across the learning, recognition priming, and exposure scans (now Table 1, page 48). We have also included some additional description of the change in transition probabilities across different tasks in the Methods section. Specifically, if participants are indeed learning item types and rules about their order, then both the control and the primed conditions would violate that order. Since C1 and C2 items never appeared together, viewing C1 would give rise to an expectation of seeing a BL item, which would also be violated. This suggests that our priming effects are driven by sequence-specific relationships rather than learning of the probabilities of different item types. We’ve added this consideration to the Methods section (page 45, lines 1212-1221).
  
  Another critical point to consider (and that the transition probabilities do not reflect) is that during learning, while C is followed either by A or BL, it is followed by different A or BL items. In contrast, a given A is always followed by the same B object, which is always followed by one of two C objects. While the order of item types is semi-predictable, the order of the objects (specific items) themselves is not. This can be seen in the response times during learning, such that response times for A and BL items are always slower than for B and C items. We have explained this nuance in the figure text for Table 1.
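Type-level transition probabilities of the kind summarized in Table 1 can be estimated by counting adjacent pairs of item types. The sketch below is illustrative only: the two label streams are hypothetical and much shorter than the real task orders, and the function name is an assumption.

```python
from collections import Counter, defaultdict


def transition_probabilities(item_types):
    """Estimate P(next type | current type) from an ordered list of labels."""
    counts = defaultdict(Counter)
    for current, nxt in zip(item_types, item_types[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for current, following in counts.items()
    }


# Hypothetical streams of item types (not the actual trial orders).
learning = ["A", "B", "C", "BL", "A", "B", "C", "A", "B", "C", "BL", "A", "B", "C"]
priming = ["BL", "C", "C", "BL", "C", "C"]

probs_learning = transition_probabilities(learning)
probs_priming = transition_probabilities(priming)
print("learning:", probs_learning)
print("priming:", probs_priming)
```

In this toy example a C item is never followed by another C item during learning, whereas it is in the priming stream. Because the counts operate on item types only, they say nothing about which specific object follows, which is the nuance noted above.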
  
  2) The choice of what regions of interest to include in the different sets of analyses could be better motivated. For example, even though briefly discussed in the intro, it remains unclear why the posterior but not the anterior hippocampus is of interest for the connectivity analyses, and why the main target is LOC, not mPFC, given past results including from this group (Tompary & Davachi, 2017). Moreover, for readers not familiar with this literature, it would help if references were provided to suggest that a predictable > unpredictable contrast is well suited for functionally defining mPFC, as done in the present study.
  
  We have clarified our reasoning for each of these choices throughout the manuscript and believe that our logic is now much more transparent. For an expanded reasoning of why we were motivated to look at posterior and not anterior hippocampus, see pages 6-7, lines 135-159, and our response to R2. In brief, past research focusing on post-encoding connectivity with the hippocampus suggests that the posterior aspect is more likely to couple with category-selective cortex after learning neutral, non-rewarded objects much like the stimuli used in the present study.
  
  We also clarify our reasoning for LOC over mPFC. While theoretically, mPFC is thought to be a candidate region for coupling with the hippocampus during consolidation, the bulk of empirical work to date has revealed post-encoding connectivity between the hippocampus and category-selective cortex in the ventral and occipital lobes (page 6, lines 123-134).
  
  As for the use of the predictable > unpredictable contrast for functionally defining cortical regions, we reasoned that cortical regions that were sensitive to the temporal regularities generated by the sequences may be further involved in their offline consolidation and long-term storage (Danker & Anderson, 2010; Davachi & Danker, 2013; McClelland et al., 1995). We have added this justification to the Methods section (page 18, lines 454-460).
  
  3) Relatedly, multiple comparison corrections should be applied in the fMRI integration and connectivity analyses whenever the same contrast is performed on multiple regions in an exploratory manner.
  
  We now correct for multiple comparisons using Bonferroni correction, and this correction depends on the number of regions in which each analysis is conducted. Please see page 55, lines 1483-1490, in the Methods section for details of each analysis.
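As a brief illustration of how the Bonferroni threshold scales with the number of regions tested, the sketch below adjusts a set of p-values. The region labels and p-values are hypothetical, and alpha = 0.05 is an assumed family-wise error rate.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, is_significant) for each p-value.

    Each p-value is multiplied by the number of tests (capped at 1.0) and
    compared against alpha divided by the number of tests.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    threshold = alpha / m
    return [(min(p * m, 1.0), p < threshold) for p in p_values]


# Hypothetical uncorrected p-values for one contrast tested in three regions.
regions = ["mPFC", "LOC", "anterior hippocampus"]
results = bonferroni([0.01, 0.04, 0.50])
for region, (adjusted, significant) in zip(regions, results):
    print(f"{region}: adjusted p = {adjusted:.2f}, significant = {significant}")
```

With three regions the per-test threshold becomes 0.05 / 3, so an uncorrected p of 0.04 no longer passes.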
  
  Reviewer #3 (Public Review):
  
  The authors of this manuscript sought to illuminate a link between a behavioral measure of integration and neural markers of cortical integration associated with systems consolidation (post-encoding connectivity, change in representational neural overlap). To that aim, participants incidentally encoded sequences of objects in the fMRI scanner. Unbeknownst to participants, the first two objects of the presented ABC triplet sequences overlapped for a given pair of sequences. This allowed the authors to probe the integration of unique C objects that were never directly presented in the same sequence, but which shared the same preceding A and B objects. They encoded one set of objects on Day 1 (remote condition), another set of objects 24 hours later (recent condition) and tested implicit and explicit memory for the learned sequences on Day 2. They additionally collected baseline and post-encoding resting-state scans. As their measure of behavioral integration, the authors examined reaction time during an Old/New judgement task for C objects depending on if they were preceded by a C object from an overlapping sequence (primed condition) versus a baseline object. They found faster reaction times for the primed objects compared to the control condition for remote but not recently learned objects, suggesting that the C objects from overlapping sequences became integrated over time. They then examined pattern similarity in a priori ROIs as a measure of neural integration and found that participants showing evidence of integration of C objects from overlapping sequences in the medial prefrontal cortex for remotely learned objects also showed a stronger implicit priming effect between those C objects over time. 
  When they examined the change in connectivity between their ROIs after encoding, they also found that connectivity between the posterior hippocampus and lateral occipital cortex correlated with larger priming effects for remotely learned objects, and that lateral occipital connectivity with the medial prefrontal cortex was related to neural integration of remote objects from overlapping sequences.
  
  The authors' aim to provide evidence of a relationship between behavioral and neural measures of integration with consolidation is interesting, important, and difficult to achieve given the longitudinal nature of studies required to answer this question. Strengths of this study include a creative behavioral task, and solid modelling approaches for fMRI data with careful control for several known confounds such as the influence of BOLD activation on pattern analysis results, motion, and physiological noise. The authors replicate their behavioral observations across two separate experiments, one of which included a large sample size, and found similar results that speak to the reliability of the observed behavioral phenomenon. In addition, they document several correlations between neural measures and task performance, lending functional significance to their neural findings.
  
  Thank you for this positive assessment of our study!
  
  However, this study is not without notable weaknesses that limit the strength of the manuscript. The authors report a behavioral priming effect suggestive of integration of remote but not recent memories, leading to the interpretation that the priming effect emerges with consolidation. However, they did not observe a reliable interaction between the priming condition and learning session (recent/remote) on reaction times, meaning that the priming effect for remote memories was not reliably greater than that observed for recent. In addition, the emergence of a priming effect for remote memories does not appear to be due to faster reaction times for primed targets over time (the condition of interest), but rather, slower reaction times for control items in the remote condition compared to recent. These issues limit the strength of the claim that the priming effect observed is due to C items of interest being integrated in a consolidation-dependent manner.
  
  We acknowledge that the lack of a day by condition interaction in the behavioral priming effect should be discussed and now discuss these data in a more nuanced manner. While it’s true that the priming effect emerges due to a slowing of the control items over time, this slowing is consistent with classic time-dependent effects demonstrating slower response times for more delayed memories. The fact that the response times in the primed condition do not show this slowing can be interpreted as a protection against this slowing that would otherwise occur. Please see page 29, lines 758-766, for this added discussion.
  
  Similarly, the interactions between neural variables of interest and learning session needed to strongly show a significant consolidation-related effect in the brain were sometimes tenuous. There was no reliable difference in neural representational pattern analysis fit to a model of neural integration between the short and long delays in the medial prefrontal cortex or lateral occipital cortex, nor was the posterior hippocampus-lateral occipital cortex post-encoding connectivity correlation with subsequent priming significantly different for recent and remote memories. While the relationship between integration model fit in the medial prefrontal cortex and subsequent priming (which was significantly different from that occurring for recent memories) was one of the stronger findings of the paper in favor of a consolidation-related effect on behavior, is it possible that lack of a behavioral priming effect for recent memories due to possible issues with the control condition could mask a correlation between neural and behavioral integration in the recent memory condition?
  
  While we acknowledge the lack of a statistically reliable interaction between neural measures and behavioral priming in many cases, we are heartened by the reliable difference in the relationship between mPFC similarity and priming over time, which was our main planned prediction. In addition to adding caveats in the discussion about the neural measures and behavioral findings in the recent condition (see our response to R1.1 and R1.4 for more details), we have added language throughout the manuscript noting the need to interpret these data with caution.
  
  These limitations are especially notable when one considers that priming does not classically require a period of prolonged consolidation to occur, and prominent models of systems consolidation rather pertain to explicit memory. While the authors have provided evidence that neural integration in the medial prefrontal cortex, as well as post-encoding coupling between the lateral occipital cortex and posterior hippocampus, are related to faster reaction times for primed objects of overlapping sequences compared to their control condition, more work is needed to verify that the observed findings indeed reflect consolidation dependent integration as proposed.
  
  We agree that more work is needed to provide converging evidence for these novel findings. However, we wish to counter the notion that systems consolidation models are relevant only for explicit memories. Although models of systems consolidation often mention transformations from episodic to semantic memory, the critical mechanisms that define the models involve changes in the neural ensembles of a memory that is initially laid down in the hippocampus and is taught to cortex over time. This transformation of neural traces is not specific to explicit/declarative forms of memory. For example, implicit statistical learning initially depends on intact hippocampal function (Schapiro et al., 2014) and improves over consolidation (Durrant et al., 2011, 2013; Kóbor et al., 2017).
  
  Second, while there are many classical findings of priming during or immediately after learning, there are several instances of priming used to measure consolidation-related changes to newly learned information. For instance, priming has been used as a measure of lexical integration, demonstrating that new word learning benefits from a night of sleep (Wang et al., 2017; Gaskell et al., 2019) or a 1-week delay (Tamminen & Gaskell, 2013). The issue is not whether priming can occur immediately, it is whether priming increases with a delay.
  
  Finally, it is helpful to think about models of memory systems that divide memory representations not by their explicit/implicit nature, but along other important dimensions such as their neural bases, their flexibility vs rigidity, and their capacity for rapid vs slow learning (Henke, 2010). Considering this evidence, we suggest that systems consolidation models are most useful when considering how transformations in the underlying neural memory representation affect its behavioral expression, rather than focusing on the extent to which the memory representation is explicit or implicit.
  
  With all this said, we have added text to the discussion reminding the reader that there was no statistically significant difference in priming as a function of the delay (page 29, lines 764-766). However, we are encouraged by the fact that the relationship between priming and mPFC neural similarity was significantly stronger for remotely learned objects relative to recently learned ones, as this is directly in line with systems consolidation theories.
  
  References
  
  Abolghasem, Z., Teng, T. H.-T., Nexha, E., Zhu, C., Jean, C. S., Castrillon, M., Che, E., Di Nallo, E. V., & Schlichting, M. L. (2023). Learning strategy differentially impacts memory connections in children and adults. Developmental Science, 26(4), e13371. https://doi.org/10.1111/desc.13371
  
  Dobbins, I. G., Schnyer, D. M., Verfaellie, M., & Schacter, D. L. (2004). Cortical activity reductions during repetition priming can result from rapid response learning. Nature, 428(6980), 316–319. https://doi.org/10.1038/nature02400
  
  Durrant, S. J., Cairney, S. A., & Lewis, P. A. (2013). Overnight consolidation aids the transfer of statistical knowledge from the medial temporal lobe to the striatum. Cerebral Cortex, 23(10), 2467–2478. https://doi.org/10.1093/cercor/bhs244
  
  Durrant, S. J., Taylor, C., Cairney, S., & Lewis, P. A. (2011). Sleep-dependent consolidation of statistical learning. Neuropsychologia, 49(5), 1322–1331. https://doi.org/10.1016/j.neuropsychologia.2011.02.015
  
  Gaskell, M. G., Cairney, S. A., & Rodd, J. M. (2019). Contextual priming of word meanings is stabilized over sleep. Cognition, 182, 109–126. https://doi.org/10.1016/j.cognition.2018.09.007
  
  Henke, K. (2010). A model for memory systems based on processing modes rather than consciousness. Nature Reviews Neuroscience, 11(7), 523–532. https://doi.org/10.1038/nrn2850
  
  Kóbor, A., Janacsek, K., Takács, Á., & Nemeth, D. (2017). Statistical learning leads to persistent memory: Evidence for one-year consolidation. Scientific Reports, 7(1), 760. https://doi.org/10.1038/s41598-017-00807-3
  
  Kuhl, B. A., & Chun, M. M. (2014). Successful remembering elicits event-specific activity patterns in lateral parietal cortex. The Journal of Neuroscience, 34(23), 8051–8060. https://doi.org/10.1523/JNEUROSCI.4328-13.2014
  
  Richter, F. R., Chanales, A. J. H., & Kuhl, B. A. (2016). Predicting the integration of overlapping memories by decoding mnemonic processing states during learning. NeuroImage, 124, Part A, 323–335. https://doi.org/10.1016/j.neuroimage.2015.08.051
  
  Schapiro, A. C., Gregory, E., Landau, B., McCloskey, M., & Turk-Browne, N. B. (2014). The necessity of the medial-temporal lobe for statistical learning. Journal of Cognitive Neuroscience, 1–12. https://doi.org/10.1162/jocn_a_00578
  
  Schlichting, M. L., & Preston, A. R. (2014). Memory reactivation during rest supports upcoming learning of related content. Proceedings of the National Academy of Sciences, 111(44), 15845–15850. https://doi.org/10.1073/pnas.1404396111
  
  Smith, J. F., Alexander, G. E., Chen, K., Husain, F. T., Kim, J., Pajor, N., & Horwitz, B. (2010). Imaging systems level consolidation of novel associate memories: A longitudinal neuroimaging study. NeuroImage, 50(2), 826–836. https://doi.org/10.1016/j.neuroimage.2009.11.053
  
  Takashima, A., Nieuwenhuis, I. L. C., Jensen, O., Talamini, L. M., Rijpkema, M., & Fernández, G. (2009). Shift from hippocampal to neocortical centered retrieval network with consolidation. The Journal of Neuroscience, 29(32), 10087–10093. https://doi.org/10.1523/JNEUROSCI.0799-09.2009
  
  Tamminen, J., & Gaskell, M. G. (2013). Novel word integration in the mental lexicon: Evidence from unmasked and masked semantic priming. The Quarterly Journal of Experimental Psychology, 66(5), 1001–1025. https://doi.org/10.1080/17470218.2012.724694
  
  van Kesteren, M. T. R., Fernández, G., Norris, D. G., & Hermans, E. J. (2010). Persistent schema-dependent hippocampal-neocortical connectivity during memory encoding and postencoding rest in humans. Proceedings of the National Academy of Sciences, 107(16), 7550–7555. https://doi.org/10.1073/pnas.0914892107
  
  Wang, H.-C., Savage, G., Gaskell, M. G., Paulin, T., Robidoux, S., & Castles, A. (2017). Bedding down new words: Sleep promotes the emergence of lexical competition in visual word recognition. Psychonomic Bulletin & Review, 24(4), 1186–1193. https://doi.org/10.3758/s13423-016-1182-7
  
  AuthorResponse

URL

biorxiv.org/content/10.1101/2022.10.20.513126v1
www.biorxiv.org

Structure of mycobacterial cytochrome bcc in complex with Q203 and TB47, two anti-TB drug candidates

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #3:
  
  The authors modified a previously reported hybrid cytochrome bcc-aa3 supercomplex, consisting of bcc from M. tuberculosis and aa3 from M. smegmatis, (Kim et al 2015) by appending an affinity tag facilitating purification. The cryo-EM experiments are based on the authors' earlier work (Gong et al. 2018) on the structure of the bcc-aa3 supercomplex from M. smegmatis. The authors then determine the structure of the bcc part alone and in complex with Q203 and TB47.
  
  The manuscript is well written and the obtained results are presented in a concise, clear-cut manner. In general, the data support the conclusions drawn.
  
  We thank the reviewer for this evaluation.
  
  To this reviewer, the following points are unclear:
  
  The purified enzyme elutes from the gel filtration column as one peak, but there seems to be no information given on the subunit composition and the enzymatic activity of the purified hybrid cytochrome bcc-aa3 supercomplex.
  
  See answers to Question 1 from the major Essential Revisions and Question 1 from the minor Essential Revisions.
  
  "We have now shown that the purified chimeric supercomplex is a functional assembly with a turnover number of 23.3 ± 2.4 e⁻ s⁻¹ (mean ± s.d., n = 4), in agreement with the previous study showing that M. tuberculosis CIII can functionally complement native M. smegmatis CIII and maintain the growth of M. smegmatis (Kim et al., 2015). The in vitro inhibition of this enzyme by Q203 and TB47 was determined by means of a DMNQH2/oxygen oxidoreductase activity assay. In the assay, 500 nM Q203 or TB47 was chosen, which is close to the median inhibitory concentration (IC50) obtained from the menadiol-induced oxygen consumption in our previous study (Gong et al., 2018). After addition of Q203 and TB47, the turnover number of the hybrid supercomplex is reduced from 23.3 ± 2.4 e⁻ s⁻¹ to 5.8 ± 2.4 e⁻ s⁻¹ (Figure 4-figure supplement 4) and 5.1 ± 2.9 e⁻ s⁻¹ (Figure 5-figure supplement 4), respectively. We have incorporated this new data into the text (lines 90-93, 187-189, 206-209)."
  
  "The subunit composition of the purified enzyme has now been provided in Figure 2-figure supplement 1."
  
  It is unclear what the conclusion of the structure comparison (Fig 6) is regarding the affinity of Q203 for M. smegmatis.
  
  The structural comparison indicates that Q203 should have a similar binding mechanism and a similar effect on the activity of cytochrome bcc from M. smegmatis and M. tuberculosis. This is in good agreement with previous antimycobacterial activity data and inhibition data for the bcc complexes from M. smegmatis and M. tuberculosis (Gong et al., 2018; Lu et al., 2018a). These points have now been incorporated into the revised manuscript (lines 223-227).
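  The turnover numbers quoted earlier in this response (23.3 e⁻ s⁻¹ without inhibitor; 5.8 and 5.1 e⁻ s⁻¹ with 500 nM Q203 and TB47) can be expressed as percent inhibition. The following is a minimal arithmetic sketch using only the reported means; the function name is illustrative, and the reported standard deviations are not propagated.

```python
# Mean turnover numbers (e- per second) as quoted in the author response.
UNINHIBITED = 23.3
WITH_INHIBITOR = {"Q203": 5.8, "TB47": 5.1}  # each measured at 500 nM


def percent_inhibition(inhibited: float, uninhibited: float) -> float:
    """Percentage of activity lost relative to the uninhibited enzyme."""
    return 100.0 * (1.0 - inhibited / uninhibited)


for name, rate in WITH_INHIBITOR.items():
    print(f"{name}: {percent_inhibition(rate, UNINHIBITED):.1f}% inhibition")
```

  On these means, both compounds remove roughly three quarters of the measured activity at 500 nM.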
  
  scietyType:AuthorResponse

URL

biorxiv.org/content/10.1101/2021.06.15.448498v1
www.biorxiv.org

Experience Transforms Crossmodal Object Representations in the Anterior Temporal Lobes

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This study used a multi-day learning paradigm combined with fMRI to reveal neural changes reflecting the learning of new (arbitrary) shape-sound associations. In the scanner, the shapes and sounds are presented separately and together, both before and after learning. When they are presented together, they can be either consistent or inconsistent with the learned associations. The analyses focus on auditory and visual cortices, as well as the object-selective cortex (LOC) and anterior temporal lobe regions (temporal pole (TP) and perirhinal cortex (PRC)). Results revealed several learning-induced changes, particularly in the anterior temporal lobe regions. First, the LOC and PRC showed a reduced bias to shapes vs sounds (presented separately) after learning. Second, the TP responded more strongly to incongruent than congruent shape-sound pairs after learning. Third, the similarity of TP activity patterns to sounds and shapes (presented separately) was increased for non-matching shape-sound comparisons after learning. Fourth, when comparing the pattern similarity of individual features to combined shape-sound stimuli, the PRC showed a reduced bias towards visual features after learning. Finally, comparing patterns to combined shape-sound stimuli before and after learning revealed a reduced (and negative) similarity for incongruent combinations in PRC. These results are all interpreted as evidence for an explicit integrative code of newly learned multimodal objects, in which the whole is different from the sum of the parts.
  
  The study has many strengths. It addresses a fundamental question that is of broad interest, the learning paradigm is well-designed and controlled, and the stimuli are real 3D stimuli that participants interact with. The manuscript is well written and the figures are very informative, clearly illustrating the analyses performed.
  
  There are also some weaknesses. The sample size (N=17) is small for detecting the subtle effects of learning. Most of the statistical analyses are not corrected for multiple comparisons (ROIs), and the specificity of the key results to specific regions is also not tested. Furthermore, the evidence for an integrative representation is rather indirect, and alternative interpretations for these results are not considered.
  
  We thank the reviewer for their careful reading and the positive comments on our manuscript. As suggested, we have conducted additional analyses of theoretically-motivated ROIs and have found that temporal pole and perirhinal cortex are the only regions to show the key experience-dependent transformations. We are much more cautious with respect to multiple comparisons, and have removed a series of post hoc across-ROI comparisons that were irrelevant to the key questions of the present manuscript. The revised manuscript now includes much more discussion about alternative interpretations as suggested by the reviewer (and also by the other reviewers).
  
  Additionally, we looked into scanning more participants, but our scanner has since had a full upgrade and the sequence used in the current study is no longer supported. However, we note that while most analyses contain 17 participants, we employed a within-subject learning design that is not typically used in fMRI experiments and that increases our power to detect an effect. This is supported by the robust effect size of the behavioural data, whereby 17 out of 18 participants showed a learning effect (Cohen’s d = 1.28), a result that was replicated in a follow-up experiment with a larger sample size.
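  For readers unfamiliar with effect sizes in within-subject designs, one common variant of Cohen's d divides the mean of the paired differences by their standard deviation. The sketch below assumes that variant; the response does not state which formula was used, and the scores are hypothetical, not the study's data.

```python
from statistics import mean, stdev


def cohens_d_paired(before, after):
    """Cohen's d for paired scores: mean difference / SD of the differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / stdev(diffs)


# Hypothetical ratings for five participants (NOT the study's data).
before_learning = [3.0, 2.0, 4.0, 3.0, 2.0]
after_learning = [4.0, 4.0, 7.0, 5.0, 4.0]

print(round(cohens_d_paired(before_learning, after_learning), 2))
```

  Because each participant serves as their own control, between-participant variability drops out of the denominator, which is the sense in which a within-subject design can increase power.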
  
  We address the other reviewer comments point by point below.
  
  Reviewer #2 (Public Review):
  
  Li et al. used a four-day fMRI design to investigate how unimodal feature information is combined, integrated, or abstracted to form a multimodal object representation. The experimental question is of great interest and understanding how the human brain combines featural information to form complex representations is relevant for a wide range of researchers in neuroscience, cognitive science, and AI. While most fMRI research on object representations is limited to visual information, the authors examined how visual and auditory information is integrated to form a multimodal object representation. The experimental design is elegant and clever. Three visual shapes and three auditory sounds were used as the unimodal features; the visual shapes were used to create 3D-printed objects. On Day 1, the participants interacted with the 3D objects to learn the visual features, but the objects were not paired with the auditory features, which were played separately. On Day 2, participants were scanned with fMRI while they were exposed to the unimodal visual and auditory features as well as pairs of visual-auditory cues. On Day 3, participants again interacted with the 3D objects but now each was paired with one of the three sounds that played from an internal speaker. On Day 4, participants completed the same fMRI scanning runs they completed on Day 2, except now some visual-auditory feature pairs corresponded with Congruent (learned) objects, and some with Incongruent (unlearned) objects. Using the same fMRI design on Days 2 and 4 enables a well-controlled comparison between feature- and object-evoked neural representations before and after learning. The notable results corresponded to findings in the perirhinal cortex and temporal pole. The authors report (1) that a visual bias on Day 2 for unimodal features in the perirhinal cortex was attenuated after learning on Day 4, (2) a decreased univariate response to congruent vs. incongruent visual-auditory objects in the temporal pole on Day 4, (3) decreased pattern similarity between congruent vs. incongruent pairs of visual and auditory unimodal features in the temporal pole on Day 4, (4) in the perirhinal cortex, visual unimodal features on Day 2 do not correlate with their respective visual-auditory objects on Day 4, and (5) in the perirhinal cortex, multimodal object representations across Days 2 and 4 are uncorrelated for congruent objects and anticorrelated for incongruent. The authors claim that each of these results supports the theory that multimodal objects are represented in an "explicit integrative" code separate from feature representations. While these data are valuable and the results are interesting, the authors' claims are not well supported by their findings.
  
  We thank the reviewer for the careful reading of our manuscript and positive comments. Overall, we now stay closer to the data when describing the results and provide our interpretation of these results in the discussion section while remaining open to alternative interpretations (as also suggested by Reviewer 1).
  
  (1) In the introduction, the authors contrast two theories: (a) multimodal objects are represented in the co-activation of unimodal features, and (b) multimodal objects are represented in an explicit integrative code such that the whole is different than the sum of its parts. However, the distinction between these two theories is not straightforward. An explanation of what is precisely meant by "explicit" and "integrative" would clarify the authors' theoretical stance. Perhaps we can assume that an "explicit" representation is a new representation that is created to represent a multimodal object. What is meant by "integrative" is more ambiguous: unimodal features could be integrated within a representation in a manner that preserves the decodability of the unimodal features, or alternatively the multimodal representation could be completely abstracted away from the constituent features such that the features are no longer decodable. Even if the object representation is "explicit" and distinct from the unimodal feature representations, it can in theory still contain featural information, though perhaps warped or transformed. The authors do not clearly commit to a degree of featural abstraction in their theory of "explicit integrative" multimodal object representations, which makes it difficult to assess the validity of their claims.
  
  Due to its ambiguity, we removed the term “explicit” and now make it clear that our central question was whether crossmodal object representations require only unimodal feature-level representations (e.g., frogs are created from only the combination of shape and sound) or whether crossmodal object representations also rely on an integrative code distinct from the unimodal features (e.g., there is something more to “frog” than its original shape and sound). We now clarify this in the revised manuscript.
  
  “One theoretical view from the cognitive sciences suggests that crossmodal objects are built from component unimodal features represented across distributed sensory regions.8 Under this view, when a child thinks about “frog”, the visual cortex represents the appearance of the shape of the frog whereas the auditory cortex represents the croaking sound. Alternatively, other theoretical views predict that multisensory objects are not only built from their component unimodal sensory features, but that there is also a crossmodal integrative code that is different from the sum of these parts.9,10,11,12,13 These latter views propose that anterior temporal lobe structures can act as a polymodal “hub” that combines separate features into integrated wholes.9,11,14,15” – pg. 4
  
  For this reason, we designed our paradigm to equate the unimodal representations, such that neural differences between the congruent and incongruent conditions provide evidence for a crossmodal integrative code different from the unimodal features (because the unimodal features are equated by default in the design).
  
  “Critically, our four-day learning task allowed us to isolate any neural activity associated with integrative coding in anterior temporal lobe structures that emerges with experience and differs from the neural patterns recorded at baseline. The learned and non-learned crossmodal objects were constructed from the same set of three validated shape and sound features, ensuring that factors such as familiarity with the unimodal features, subjective similarity, and feature identity were tightly controlled (Figure 2). If the mind represented crossmodal objects entirely as the reactivation of unimodal shapes and sounds (i.e., objects are constructed from their parts), then there should be no difference between the learned and non-learned objects (because they were created from the same three shapes and sounds). By contrast, if the mind represented crossmodal objects as something over and above their component features (i.e., representations for crossmodal objects rely on integrative coding that is different from the sum of their parts), then there should be behavioral and neural differences between learned and non-learned crossmodal objects (because the only difference across the objects is the learned relationship between the parts). Furthermore, this design allowed us to determine the relationship between the object representation acquired after crossmodal learning and the unimodal feature representations acquired before crossmodal learning. That is, we could examine whether learning led to abstraction of the object representations such that it no longer resembled the unimodal feature representations.” – pg. 5
  
  Furthermore, we agree with the reviewer that our definition and methodological design do not directly capture the structure of the integrative code. With experience, the unimodal feature representations may be completely abstracted away, warped, or changed in a nonlinear transformation. We suggest that crossmodal learning forms an integrative code that is different from the original unimodal representations in the anterior temporal lobes; however, we agree that future work is needed to more directly capture the structure of the integrative code that emerges with experience.
  
  “In our task, participants had to differentiate congruent and incongruent objects constructed from the same three shape and sound features (Figure 2). An efficient way to solve this task would be to form distinct object-level outputs from the overlapping unimodal feature-level inputs such that congruent objects are made to be orthogonal from the representations before learning (i.e., measured as pattern similarity equal to 0 in the perirhinal cortex; Figure 5b, 6, Supplemental Figure S5), whereas non-learned incongruent objects could be made to be dissimilar from the representations before learning (i.e., anticorrelation, measured as pattern similarity less than 0 in the perirhinal cortex; Figure 6). Because our paradigm could decouple neural responses to the learned object representations (on Day 4) from the original component unimodal features at baseline (on Day 2), these results could be taken as evidence of pattern separation in the human perirhinal cortex.11,12 However, our pattern of results could also be explained by other types of crossmodal integrative coding. For example, incongruent object representations may be less stable than congruent object representations, such that incongruent object representations are warped to a greater extent than congruent object representations (Figure 6).” – pg. 18
  
  “As one solution to the crossmodal binding problem, we suggest that the temporal pole and perirhinal cortex form unique crossmodal object representations that are different from the distributed features in sensory cortex (Figure 4, 5, 6, Supplemental Figure S5). However, the nature by which the integrative code is structured and formed in the temporal pole and perirhinal cortex following crossmodal experience – such as through transformations, warping, or other factors – is an open question and an important area for future investigation.” – pg. 18
  
  (2) After participants learned the multimodal objects, the authors report a decreased univariate response to congruent visual-auditory objects relative to incongruent objects in the temporal pole. This is claimed to support the existence of an explicit, integrative code for multimodal objects. Given the number of alternative explanations for this finding, this claim seems unwarranted. A simpler interpretation of these results is that the temporal pole is responding to the novelty of the incongruent visual-auditory objects. If there is in fact an explicit, integrative multimodal object representation in the temporal pole, it is unclear why this would manifest in a decreased univariate response.
  
  We thank the reviewer for identifying this issue. Our behavioural design controls unimodal feature-level novelty but allows object-level novelty to differ. Thus, neural differences between the congruent and incongruent conditions reflect sensitivity to the object-level differences between the combinations of shape and sound. However, we agree that there are multiple interpretations regarding the nature of how the integrative code is structured in the temporal pole and perirhinal cortex. We have removed the interpretation highlighted by the reviewer from the results. Instead, we now provide our preferred interpretation in the discussion, while acknowledging the other possibilities that the reviewer mentions.
  
  As one possibility, these results in temporal pole may reflect “conceptual combination”: “hummingbird” – a congruent pairing – may require fewer neural resources than an incongruent pairing such as “barking-frog”.
  
  “Furthermore, these distinct anterior temporal lobe structures may be involved with integrative coding in different ways. For example, the crossmodal object representations measured after learning were found to be related to the component unimodal feature representations measured before learning in the temporal pole but not the perirhinal cortex (Figure 5, 6, Supplemental Figure S5). Moreover, pattern similarity for congruent shape-sound pairs was lower than the pattern similarity for incongruent shape-sound pairs after crossmodal learning in the temporal pole but not the perirhinal cortex (Figure 4b, Supplemental Figure S3a). As one interpretation of this pattern of results, the temporal pole may represent new crossmodal objects by combining previously learned knowledge.8,9,10,11,13,14,15,33 Specifically, research into conceptual combination has linked the anterior temporal lobes to compound object concepts such as “hummingbird”.34,35,36 For example, participants during our task may have represented the sound-based “humming” concept and visually-based “bird” concept on Day 1, forming the crossmodal “hummingbird” concept on Day 3 (Figure 1, 2), which may recruit less activity in temporal pole than an incongruent pairing such as “barking-frog”. For these reasons, the temporal pole may form a crossmodal object code based on pre-existing knowledge, resulting in reduced neural activity (Figure 3d) and pattern similarity towards features associated with learned objects (Figure 4b).” – pg. 18
  
  (3) The authors ran a neural pattern similarity analysis on the unimodal features before and after multimodal object learning. They found that the similarity between visual and auditory features that composed congruent objects decreased in the temporal pole after multimodal object learning. This was interpreted to reflect an explicit integrative code for multimodal objects, though it is not clear why. First, behavioral data show that participants reported increased similarity between the visual and auditory unimodal features within congruent objects after learning, the opposite of what was found in the temporal pole. Second, it is unclear why an analysis of the unimodal features would be interpreted to reflect the nature of the multimodal object representations. Since the same features corresponded with both congruent and incongruent objects, the nature of the feature representations cannot be interpreted to reflect the nature of the object representations per se. Third, using unimodal feature representations to make claims about object representations seems to contradict the theoretical claim that explicit, integrative object representations are distinct from unimodal features. If the learned multimodal object representation exists separately from the unimodal feature representations, there is no reason why the unimodal features themselves would be influenced by the formation of the object representation. Instead, these results seem to more strongly support the theory that multimodal object learning results in a transformation or warping of feature space.
  
  We apologize for the lack of clarity. We have now overhauled this aspect of our manuscript in an attempt to better highlight key aspects of our experimental design. In particular, because the unimodal features composing the congruent and incongruent objects were equated, neural differences between these conditions would provide evidence for an experience-dependent crossmodal integrative code that is different from its component unimodal features.
  
  Related to the second and third points, we were looking at the extent to which the original unimodal representations change with crossmodal learning. Before crossmodal learning, we found that the perirhinal cortex tracked the similarity between the individual visual shape features and the crossmodal objects that were composed of those visual shapes – however, there was no evidence that perirhinal cortex was tracking the unimodal sound features on those crossmodal objects. After crossmodal learning, we see that this visual shape bias in perirhinal cortex was no longer present – that is, the representation in perirhinal cortex started to look less like the visual features that comprise the objects. Thus, crossmodal learning transformed the perirhinal representations so that they were no longer predominantly grounded in a single visual modality, which may be a mechanism by which object concepts gain their abstraction. We have now tried to be clearer about this interpretation throughout the paper.
  
  Notably, we suggest that experience may change both the crossmodal object representations, as well as the unimodal feature representations. For example, we have previously shown that unimodal visual features are influenced by experience in parallel with the representation of the conjunction (e.g., Liang et al., 2020; Cerebral Cortex). Nevertheless, we remain open to the myriad possible structures of the integrative code that might emerge with experience.
  
  We now clarify these points throughout the manuscript. For example:
  
  “We then examined whether the original representations would change after participants learned how the features were paired together to make specific crossmodal objects, conducting the same analysis described above after crossmodal learning had taken place (Figure 5b). With this analysis, we sought to measure the relationship between the representation for the learned crossmodal object and the original baseline representation for the unimodal features. More specifically, the voxel-wise activity for unimodal feature runs before crossmodal learning was correlated to the voxel-wise activity for crossmodal object runs after crossmodal learning (Figure 5b). Another linear mixed model which included modality as a fixed factor within each ROI revealed that the perirhinal cortex was no longer biased towards visual shape after crossmodal learning (F(1,32) = 0.12, p = 0.73), whereas the temporal pole, LOC, V1, and A1 remained biased towards either visual shape or sound (F(1,30-32) between 16.20 and 73.42, all p < 0.001, η² between 0.35 and 0.70).” – pg. 14
  
  “To investigate this effect in perirhinal cortex more specifically, we conducted a linear mixed model to directly compare the change in the visual bias of perirhinal representations from before crossmodal learning to after crossmodal learning (green regions in Figure 5a vs. 5b). Specifically, the linear mixed model included learning day (before vs. after crossmodal learning) and modality (visual feature match to crossmodal object vs. sound feature match to crossmodal object). Results revealed a significant interaction between learning day and modality in the perirhinal cortex (F(1,775) = 5.56, p = 0.019, η² = 0.071), meaning that the baseline visual shape bias observed in perirhinal cortex (green region of Figure 5a) was significantly attenuated with experience (green region of Figure 5b). After crossmodal learning, a given shape no longer invoked significant pattern similarity between objects that had the same shape but differed in terms of what they sounded like. Taken together, these results suggest that prior to learning the crossmodal objects, the perirhinal cortex had a default bias toward representing the visual shape information and was not representing sound information of the crossmodal objects. After crossmodal learning, however, the visual shape bias in perirhinal cortex was no longer present. That is, with crossmodal learning, the representations within perirhinal cortex started to look less like the visual features that comprised the crossmodal objects, providing evidence that the perirhinal representations were no longer predominantly grounded in the visual modality.” – pg. 13
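  The learning day × modality interaction described above asks whether the visual-minus-sound bias differs before and after learning. The sketch below computes only that descriptive difference of differences. It is not the linear mixed model reported in the manuscript, and the similarity values and function names are hypothetical.

```python
from statistics import mean

# Hypothetical z-transformed pattern-similarity values, one per participant
# (NOT the study's data).
SIMILARITY = {
    ("before", "visual"): [0.30, 0.20, 0.25],
    ("before", "sound"): [0.05, 0.00, 0.10],
    ("after", "visual"): [0.05, 0.10, 0.00],
    ("after", "sound"): [0.05, 0.00, 0.10],
}


def modality_bias(day, data=SIMILARITY):
    """Mean visual-feature similarity minus mean sound-feature similarity."""
    return mean(data[(day, "visual")]) - mean(data[(day, "sound")])


def interaction_contrast(data=SIMILARITY):
    """Change in modality bias from before to after learning."""
    return modality_bias("before", data) - modality_bias("after", data)


print(f"bias before: {modality_bias('before'):.2f}")
print(f"bias after:  {modality_bias('after'):.2f}")
print(f"interaction contrast: {interaction_contrast():.2f}")
```

  A positive contrast corresponds to an attenuated visual bias after learning; whether such a contrast is reliable is what the mixed model's interaction term tests.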
  
  “Importantly, the initial visual shape bias observed in the perirhinal cortex was attenuated by experience (Figure 5, Supplemental Figure S5), suggesting that the perirhinal representations had become abstracted and were no longer predominantly grounded in a single modality after crossmodal learning. One possibility may be that the perirhinal cortex is by default visually driven as an extension to the ventral visual stream,10,11,12 but can act as a polymodal “hub” region for additional crossmodal input following learning.” – pg. 19
  
  (4) The most compelling evidence the authors provide for their theoretical claims is the finding that, in the perirhinal cortex, the unimodal feature representations on Day 2 do not correlate with the multimodal objects they comprise on Day 4. This suggests that the learned multimodal object representations are not combinations of their unimodal features. If unimodal features are not decodable within the congruent object representations, this would support the authors' explicit integrative hypothesis. However, the analyses provided do not go all the way in convincing the reader of this claim. First, the analyses reported do not differentiate between congruent and incongruent objects. If this result in the perirhinal cortex reflects the formation of new multimodal object representations, it should only be true for congruent objects but not incongruent objects. Since the analyses combine congruent and incongruent objects it is not possible to know whether this was the case. Second, just because feature representations on Day 2 do not correlate with multimodal object patterns on Day 4 does not mean that the object representations on Day 4 do not contain featural information. This could be directly tested by correlating feature representations on Day 4 with congruent vs. incongruent object representations on Day 4. It could be that representations in the perirhinal cortex are not stable over time and all representations, including unimodal feature representations, shift between sessions, which could explain these results yet not entail the existence of abstracted object representations.
  
  We thank the reviewer for this suggestion and have conducted the two additional analyses. Specifically, we split the congruent and incongruent conditions and also investigated correlations between unimodal representations on Day 4 with crossmodal object representations on Day 4. There was no significant interaction between modality and congruency in any ROI across or within learning days. One possible explanation for these findings is that both congruent and incongruent crossmodal objects are represented differently from their underlying unimodal features, and all of these representations can transform with experience.
  
  However, the new analyses also revealed that perirhinal cortex was the only region without a modality-specific bias after crossmodal learning (e.g., Day 4 Unimodal Feature runs x Day 4 Crossmodal Object runs; now shown in Supplemental Figure S5). Overall, these results are consistent with the notion of a crossmodal integrative code in perirhinal cortex that has changed with experience and is different from the component unimodal features. Nevertheless, we explore alternative interpretations for how the crossmodal code emerges with experience in the discussion.
  
  “To examine whether these results differed by congruency (i.e., whether any modality-specific biases differed as a function of whether the object was congruent or incongruent), we conducted exploratory linear mixed models for each of the five a priori ROIs across learning days. More specifically, we correlated: 1) the voxel-wise activity for Unimodal Feature Runs before crossmodal learning to the voxel-wise activity for Crossmodal Object Runs before crossmodal learning (Day 2 vs. Day 2), 2) the voxel-wise activity for Unimodal Feature Runs before crossmodal learning to the voxel-wise activity for Crossmodal Object Runs after crossmodal learning (Day 2 vs. Day 4), and 3) the voxel-wise activity for Unimodal Feature Runs after crossmodal learning to the voxel-wise activity for Crossmodal Object Runs after crossmodal learning (Day 4 vs. Day 4). For each of the three analyses described, we then conducted separate linear mixed models which included modality (visual feature match to crossmodal object vs. sound feature match to crossmodal object) and congruency (congruent vs. incongruent)… There was no significant relationship between modality and congruency in any ROI between Day 2 and Day 2 (F(1,346-368) between 0.00 and 1.06, p between 0.30 and 0.99), between Day 2 and Day 4 (F(1,346-368) between 0.021 and 0.91, p between 0.34 and 0.89), or between Day 4 and Day 4 (F(1,346-368) between 0.01 and 3.05, p between 0.082 and 0.93). However, exploratory analyses revealed that perirhinal cortex was the only region without a modality-specific bias and where the unimodal feature runs were not significantly correlated to the crossmodal object runs after crossmodal learning (Supplemental Figure S5).” – pg. 14
  
  “Taken together, the overall pattern of results suggests that representations of the crossmodal objects in perirhinal cortex were heavily influenced by their consistent visual features before crossmodal learning. However, the crossmodal object representations were no longer influenced by the component visual features after crossmodal learning (Figure 5, Supplemental Figure S5). Additional exploratory analyses did not find evidence of experience-dependent changes in the hippocampus or inferior parietal lobes (Supplemental Figure S4c-e).” – pg. 14
  
  “The voxel-wise matrix for Unimodal Feature runs on Day 4 was correlated to the voxel-wise matrix for Crossmodal Object runs on Day 4 (see Figure 5 in the main text for an example). We compared the average pattern similarity (z-transformed Pearson correlation) between shape (blue) and sound (orange) features specifically after crossmodal learning. Consistent with Figure 5b, perirhinal cortex was the only region without a modality-specific bias. Furthermore, perirhinal cortex was the only region where the representations of both the visual and sound features were not significantly correlated to the crossmodal objects. By contrast, every other region maintained a modality-specific bias for either the visual or sound features. These results suggest that perirhinal cortex representations were transformed with experience, such that the initial visual shape representations (Figure 5a) were no longer grounded in a single modality after crossmodal learning. Furthermore, these results suggest that crossmodal learning formed an integrative code different from the unimodal features in perirhinal cortex, as the visual and sound features were not significantly correlated with the crossmodal objects. * p < 0.05, ** p < 0.01, *** p < 0.001. Horizontal lines within brain regions indicate a significant main effect of modality. Vertical asterisks denote pattern similarity comparisons relative to 0.” – Supplemental Figure S5
  
  “We found that the temporal pole and perirhinal cortex – two anterior temporal lobe structures – came to represent new crossmodal object concepts with learning, such that the acquired crossmodal object representations were different from the representation of the constituent unimodal features (Figure 5, 6). Intriguingly, the perirhinal cortex was by default biased towards visual shape, but this initial visual bias was attenuated with experience (Figure 3c, 5, Supplemental Figure S5). Within the perirhinal cortex, the acquired crossmodal object concepts (measured after crossmodal learning) became less similar to their original component unimodal features (measured at baseline before crossmodal learning; Figure 5, 6, Supplemental Figure S5). This is consistent with the idea that object representations in perirhinal cortex integrate the component sensory features into a whole that is different from the sum of the component parts, which might be a mechanism by which object concepts obtain their abstraction…. As one solution to the crossmodal binding problem, we suggest that the temporal pole and perirhinal cortex form unique crossmodal object representations that are different from the distributed features in sensory cortex (Figure 4, 5, 6, Supplemental Figure S5). However, the nature by which the integrative code is structured and formed in the temporal pole and perirhinal cortex following crossmodal experience – such as through transformations, warping, or other factors – is an open question and an important area for future investigation.” – pg. 18
  
  In sum, the authors have collected a fantastic dataset that has the potential to answer questions about the formation of multimodal object representations in the brain. A more precise delineation of different theoretical accounts and additional analyses are needed to provide convincing support for the theory that “explicit integrative” multimodal object representations are formed during learning.
  
  We thank the reviewer for the positive comments and helpful feedback. We hope that our changes to our wording and clarifications to our methodology now more clearly support the central goal of our study: to find evidence of crossmodal integrative coding different from the original unimodal feature parts in anterior temporal lobe structures. We furthermore agree that future research is needed to delineate the structure of the integrative code that emerges with experience in the anterior temporal lobes.
  
  Reviewer #3 (Public Review):
  
  This paper uses behavior and functional brain imaging to understand how neural and cognitive representations of visual and auditory stimuli change as participants learn associations among them. Prior work suggests that areas in the anterior temporal lobe (ATL) and perirhinal cortex play an important role in learning/representing cross-modal associations, but the hypothesis has not been directly tested by evaluating behavior and functional imaging before and after learning cross-modal associations. The results show that such learning changes both the perceived similarities amongst stimuli and the neural responses generated within ATL and perirhinal regions, providing novel support for the view that cross-modal learning leads to a representational change in these regions.
  
  This work has several strengths. It tackles an important question for current theories of object representation in the mind and brain in a novel and quite direct fashion, by studying how these representations change with cross-modal learning. As the authors note, little work has directly assessed representational change in ATL following such learning, despite the widespread view that ATL is critical for such representation. Indeed, such direct assessment poses several methodological challenges, which the authors have met with an ingenious experimental design. The experiment allows the authors to maintain tight control over both the familiarity and the perceived similarities amongst the shapes and sounds that comprise their stimuli so that the observed changes across sessions must reflect learned cross-modal associations among these. I especially appreciated the creation of physical objects that participants can explore and the approach to learning in which shapes and sounds are initially experienced independently and later in an associated fashion. In using multi-echo MRI to resolve signals in ventral ATL, the authors have minimized a key challenge facing much work in this area (namely the poor SNR yielded by standard acquisition sequences in ventral ATL). The use of both univariate and multivariate techniques was well-motivated and helpful in testing the central questions. The manuscript is, for the most part, clearly written, and nicely connects the current work to important questions in two literatures, specifically (1) the hypothesized role of the perirhinal cortex in representing/learning complex conjunctions of features and (2) the tension between purely embodied approaches to semantic representation vs the view that ATL regions encode important amodal/crossmodal structure.
  
  There are some places in the manuscript that would benefit from further explanation and methodological detail. I also had some questions about the results themselves and what they signify about the roles of ATL and the perirhinal cortex in object representation.
  
  We thank the reviewer for their positive feedback and address the comments in the below point-by-point responses.
  
  (A) I found the terms "features" and "objects" to be confusing as used throughout the manuscript, and sometimes inconsistent. I think by "features" the authors mean the shape and sound stimuli in their experiment. I think by "object" the authors usually mean the conjunction of a shape with a sound---for instance, when a shape and sound are simultaneously experienced in the scanner, or when the participant presses a button on the shape and hears the sound. The confusion comes partly because shapes are often described as being composed of features, not features in and of themselves. (The same is sometimes true of sounds). So when reading "features" I kept thinking the paper referred to the elements that went together to comprise a shape. It also comes from ambiguous use of the word object, which might refer to (a) the 3D-printed item that people play with, which is an object, or (b) a visually-presented shape (for instance, the localizer involved comparing an "object" to a "phase-scrambled" stimulus---here I assume "object" refers to an intact visual stimulus and not the joint presentation of visual and auditory items). I think the design, stimuli, and results would be easier for a naive reader to follow if the authors used the terms "unimodal representation" to refer to cases where only visual or auditory input is presented, and "cross-modal" or "conjoint" representation when both are present.
  
  We thank the reviewer for this suggestion and agree. We have replaced the terms “features” and “objects” with “unimodal” and “crossmodal” in the title, text, and figures throughout the manuscript for consistency (i.e., “crossmodal binding problem”). To simplify the terminology, we have also removed the localizer results.
  
  (B) There are a few places where I wasn't sure what exactly was done, and where the methods lacked sufficient detail for another scientist to replicate what was done. Specifically:
  
  (1) The behavioral study assessing perceptual similarity between visual and auditory stimuli was unclear. The procedure, stimuli, number of trials, etc, should be explained in sufficient detail in methods to allow replication. The results of the study should also minimally be reported in the supplementary information. Without an understanding of how these studies were carried out, it was very difficult to understand the observed pattern of behavioral change. For instance, I initially thought separate behavioral blocks were carried out for visual versus auditory stimuli, each presented in isolation; however, the effects contrast congruent and incongruent stimuli, which suggests these decisions must have been made for the conjoint presentation of both modalities. I'm still not sure how this worked. Additionally, the manuscript makes a brief mention that similarity judgments were made in the context of "all stimuli," but I didn't understand what that meant. Similarity ratings are hugely sensitive to the contrast set with which items appear, so clarity on these points is pretty important. A strength of the design is the contention that shape and sound stimuli were psychophysically matched, so it is important to show the reader how this was done and what the results were.
  
  We agree and apologize for the lack of sufficient detail in the original manuscript. We now include much more detail about the similarity rating task. The methodology and results of the behavioral rating experiments are now shown in Supplemental Figure S1. In Figure S1a, the similarity ratings are visualized on a multidimensional scaling plot. The triangular geometry for shape (blue) and sound (orange) indicates that the subjective similarity was equated within each unimodal feature across individual participants. Quantitatively, there was no difference in similarity between the congruent and incongruent pairings in Figure S1b and Figure S1c prior to crossmodal learning. In addition to providing more information on these methods in the Supplemental Information, we also now provide a more detailed description of the task in the manuscript itself. For convenience, we reproduce these sections below.
  
  “Pairwise Similarity Task. Using the same task as the stimulus validation procedure (Supplemental Figure S1a), participants provided similarity ratings for all combinations of the 3 validated shapes and 3 validated sounds (each of the six features was rated in the context of every other feature in the set, with 4 repeats of the same feature, for a total of 72 trials). More specifically, three stimuli were displayed on each trial, with one at the top and two at the bottom of the screen in the same procedure as we have used previously27. The 3D shapes were visually displayed as a photo, whereas sounds were displayed on screen in a box that could be played over headphones when clicked with the mouse. The participant made an initial judgment by selecting the more similar stimulus on the bottom relative to the stimulus on the top. Afterwards, the participant made a similarity rating between each bottom stimulus with the top stimulus from 0 being no similarity to 5 being identical. This procedure ensured that ratings were made relative to all other stimuli in the set.” – pg. 28
  
  “Pairwise similarity task and results. In the initial stimulus validation experiment, participants provided pairwise ratings for 5 sounds and 3 shapes. The shapes, which had been selected from a well-characterized perceptually uniform stimulus space27, were equated in their subjective similarity, and the pairwise ratings followed the same procedure as described in ref 27. Based on this initial experiment, we then selected the 3 sounds that were most closely equated in their subjective similarity. (a) 3D-printed shapes were displayed as images, whereas sounds were displayed in a box that could be played when clicked by the participant. Ratings were averaged to produce a similarity matrix for each participant, and then averaged to produce a group-level similarity matrix. Shown as triangular representational geometries recovered from multidimensional scaling in the above, shapes (blue) and sounds (orange) were approximately equated in their subjective similarity. These features were then used in the four-day crossmodal learning task. (b) Behavioral results from the four-day crossmodal learning task paired with multi-echo fMRI described in the main text. Before crossmodal learning, there was no difference in similarity between shape and sound features associated with congruent objects compared to incongruent objects – indicating that similarity was controlled at the unimodal feature-level. After crossmodal learning, we observed a robust shift in the magnitude of similarity. The shape and sound features associated with congruent objects were now significantly more similar than the same shape and sound features associated with incongruent objects (p < 0.001), evidence that crossmodal learning changed how participants experienced the unimodal features (observed in 17/18 participants). (c) We replicated this learning-related shift in pattern similarity with a larger sample size (n = 44; observed in 38/44 participants). *** denotes p < 0.001. 
  Horizontal lines denote the comparison of congruent vs. incongruent conditions.” – Supplemental Figure S1
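  The averaging and scaling steps named in the Supplemental Figure S1 caption can be illustrated with a minimal sketch. It is not the authors' code: the conversion from similarity to dissimilarity (`MAX_RATING - similarity`), the choice of classical (Torgerson) multidimensional scaling, the function names, and the toy ratings are all assumptions made for illustration, since the response does not state which scaling variant was used.

```python
import numpy as np

MAX_RATING = 5.0  # ratings ran from 0 (no similarity) to 5 (identical)


def group_similarity(ratings):
    """Average per-participant similarity matrices into one group matrix."""
    ratings = np.asarray(ratings, dtype=float)  # shape: (participants, items, items)
    return ratings.mean(axis=0)


def classical_mds(dissimilarity, n_dims=2):
    """Classical (Torgerson) MDS: embed items so distances match dissimilarities."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy ratings (hypothetical, not from the study): two participants, three items.
toy_ratings = np.array([
    [[5, 3, 3], [3, 5, 3], [3, 3, 5]],
    [[5, 2, 2], [2, 5, 2], [2, 2, 5]],
])
similarity = group_similarity(toy_ratings)
dissimilarity = MAX_RATING - similarity  # assumed conversion, see lead-in
coords = classical_mds(dissimilarity)
side_lengths = [
    float(np.linalg.norm(coords[i] - coords[j]))
    for i, j in [(0, 1), (0, 2), (1, 2)]
]
```

  When the three items are rated as equally similar to one another, the recovered side lengths are equal, which is the equilateral "triangular geometry" the caption refers to.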
  
  (2) The experiences through which participants learned/experienced the shapes and sounds were unclear. The methods mention that they had one minute to explore/palpate each shape and that these experiences were interleaved with other tasks, but it is not clear what the other tasks were, how many such exploration experiences occurred, or how long the total learning time was. The manuscript also mentions that participants learn the shape-sound associations with 100% accuracy but it isn't clear how that was assessed. These details are important partly because it seems like very minimal experience to change neural representations in the cortex.
  
  We apologize for the lack of detail and agree with the reviewer’s suggestions – we now include much more information in the methods section. Each behavioral day required about 1 hour of total time to complete, and indeed, participants rapidly learned their associations with minimal experience. For example:
  
  “Behavioral Tasks. On each behavioral day (Day 1 and Day 3; Figure 2), participants completed the following tasks, in this order: Exploration Phase, one Unimodal Feature 1-back run (26 trials), Exploration Phase, one Crossmodal 1-back run (26 trials), Exploration Phase, Pairwise Similarity Task (24 trials), Exploration Phase, Pairwise Similarity Task (24 trials), Exploration Phase, Pairwise Similarity Task (24 trials), and finally, Exploration Phase. To verify learning on Day 3, participants also additionally completed a Learning Verification Task at the end of the session.” – pg. 27
  
  “The overall procedure ensured that participants extensively explored the unimodal features on Day 1 and the crossmodal objects on Day 3. The Unimodal Feature and the Crossmodal Object 1-back runs administered on Day 1 and Day 3 served as practice for the neuroimaging sessions on Day 2 and Day 4, during which these 1-back tasks were completed. Each behavioral session required less than 1 hour of total time to complete.” – pg. 27
  
  “Learning Verification Task (Day 3 only). As the final task on Day 3, participants completed a task to ensure that participants successfully formed their crossmodal pairing. All three shapes and sounds were randomly displayed in 6 boxes on a display. Photos of the 3D shapes were shown, and sounds were played by clicking the box with the mouse cursor. The participant was cued with either a shape or sound, and then selected the corresponding paired feature. At the end of Day 3, we found that all participants reached 100% accuracy on this task (10 trials).” – pg. 29
  
  (3) I didn't understand the similarity metric used in the multivariate imaging analyses. The manuscript mentions Z-scored Pearson's r, but I didn't know if this meant (a) many Pearson coefficients were computed and these were then Z-scored, so that 0 indicates a value equal to the mean Pearson correlation and 1 is equal to the standard deviation of the correlations, or (b) whether a Fisher Z transform was applied to each r (so that 0 means r was also around 0). From the interpretation of some results, I think the latter is the approach taken, but in general, it would be helpful to see, in Methods or Supplementary information, exactly how similarity scores were computed, and why that approach was adopted. This is particularly important since it is hard to understand the direction of some key effects.
  
  The reviewer is correct that the Fisher Z transform was applied to each individual r before averaging the correlations. This approach is generally recommended when averaging correlations (see Corey, Dunlap, & Burke, 1998). We are now clearer on this point in the manuscript:
  
  “The z-transformed Pearson’s correlation coefficient was used as the distance metric for all pattern similarity analyses. More specifically, each individual Pearson correlation was Fisher z-transformed and then averaged (see 61).” – pg. 32
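  To make the distinction raised in comment (3) concrete, the following is a minimal sketch of the approach the response describes: each Pearson correlation is Fisher z-transformed individually and the z values are then averaged, so a mean of 0 corresponds to r around 0. The function names and the example correlations are hypothetical and serve only as an illustration; this is not the authors' analysis code.

```python
import numpy as np


def pattern_similarity(pattern_a, pattern_b):
    """Pearson correlation between two voxel-wise activity patterns."""
    return float(np.corrcoef(pattern_a, pattern_b)[0, 1])


def mean_fisher_z(r_values):
    """Fisher z-transform each correlation (arctanh), then average the z values.

    Note: arctanh is infinite at r = +1 or r = -1, so exact unit
    correlations need separate handling.
    """
    r = np.asarray(r_values, dtype=float)
    return float(np.arctanh(r).mean())


# Hypothetical correlations, for illustration only.
r_values = [0.2, 0.5, 0.8]
mean_z = mean_fisher_z(r_values)   # averaged in z space
mean_r = float(np.tanh(mean_z))    # optional back-transform to the r scale
naive_mean_r = float(np.mean(r_values))
```

  Averaging in z space and back-transforming generally gives a slightly different value than averaging the raw r values, which is why the order of operations matters when interpreting the sign and size of an effect.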
  
  (C) From Figure 3D, the temporal pole mask appears to exclude the anterior fusiform cortex (or the ventral surface of the ATL generally). If so, this is a shame, since that appears to be the locus most important to cross-modal integration in the "hub and spokes" model of semantic representation in the brain. The observation in the paper that the perirhinal cortex seems initially biased toward visual structure while more superior ATL is biased toward auditory structure appears generally consistent with the "graded hub" view expressed, for instance, in our group's 2017 review paper (Lambon Ralph et al., Nature Reviews Neuroscience). Visual versus auditory sensitivity in that work appears balanced in the anterior fusiform, just a little lateral to the anterior perirhinal cortex. It would be helpful to know if the same pattern is observed for this area specifically in the current dataset.
  
  We thank the reviewer for this suggestion. After close inspection of Lambon Ralph et al. (2017), we believe that our perirhinal cortex mask appears to be overlapping with the ventral ATL/anterior fusiform region that the reviewer mentions. See Author response image 1 for a visual comparison:
  
  Author response image 1.
  
  The top four figures are sampled from Lambon Ralph et al (2017), whereas the bottom two figures visualize our perirhinal cortex mask (white) and temporal pole mask (dark green) relative to the fusiform cortex. The ROIs visualized were defined from the Harvard-Oxford atlas.
  
  We now mention this area of overlap in our manuscript and link it to the hub and spokes model:
  
  “Notably, our perirhinal cortex mask overlaps with a key region of the ventral anterior temporal lobe thought to be the central locus of crossmodal integration in the “hub and spokes” model of semantic representations.9,50” – pg. 20
  
  (D) While most effects seem robust from the information presented, I'm not so sure about the analysis of the perirhinal cortex shown in Figure 5. This compares (I think) the neural similarity evoked by a unimodal stimulus ("feature") to that evoked by the same stimulus when paired with its congruent stimulus in the other modality ("object"). These similarities show an interaction with modality prior to cross-modal association, but no interaction afterward, leading the authors to suggest that the perirhinal cortex has become less biased toward visual structure following learning. But the plots in Figures 4a and b are shown against different scales on the y-axes, obscuring the fact that all of the similarities are smaller in the after-learning comparison. Since the perirhinal interaction was already the smallest effect in the pre-learning analysis, it isn't really surprising that it drops below significance when all the effects diminish in the second comparison. A more rigorous test would assess the reliability of the interaction of comparison (pre- or post-learning) with modality. The possibility that perirhinal representations become less "visual" following cross-modal learning is potentially important so a post hoc contrast of that kind would be helpful.
  
  We apologize for the lack of clarity. We conducted a linear mixed model to assess the interaction between modality and crossmodal learning day (before and after crossmodal learning) in the perirhinal cortex as described by the reviewer. The critical interaction was significant, which is now clarified in the text as well as in the rescaled figure plots.
  
  “To investigate this effect in perirhinal cortex more specifically, we conducted a linear mixed model to directly compare the change in the visual bias of perirhinal representations from before crossmodal learning to after crossmodal learning (green regions in Figure 5a vs. 5b). Specifically, the linear mixed model included learning day (before vs. after crossmodal learning) and modality (visual feature match to crossmodal object vs. sound feature match to crossmodal object). Results revealed a significant interaction between learning day and modality in the perirhinal cortex (F1,775 = 5.56, p = 0.019, η2 = 0.071), meaning that the baseline visual shape bias observed in perirhinal cortex (green region of Figure 5a) was significantly attenuated with experience (green region of Figure 5b). After crossmodal learning, a given shape no longer invoked significant pattern similarity between objects that had the same shape but differed in terms of what they sounded like. Taken together, these results suggest that prior to learning the crossmodal objects, the perirhinal cortex had a default bias toward representing the visual shape information and was not representing sound information of the crossmodal objects. After crossmodal learning, however, the visual shape bias in perirhinal cortex was no longer present. That is, with crossmodal learning, the representations within perirhinal cortex started to look less like the visual features that comprised the crossmodal objects, providing evidence that the perirhinal representations were no longer predominantly grounded in the visual modality.” – pg. 13
  
  We note that not all effects drop in Figure 5b (even in regions with a similar numerical pattern similarity to PRC, like the hippocampus – also see Supplemental Figure S5 for a comparison for patterns only on Day 4), suggesting that the change in visual bias in PRC is not simply due to noise.
  
  “Importantly, the change in pattern similarity in the perirhinal cortex across learning days (Figure 5) is unlikely to be driven by noise, poor alignment of patterns across sessions, or generally reduced responses. Other regions with numerically similar pattern similarity to perirhinal cortex did not change across learning days (e.g., visual features x crossmodal objects in A1 in Figure 5; the exploratory ROI hippocampus with numerically similar pattern similarity to perirhinal cortex also did not change in Supplemental Figure S4c-d).” – pg. 14
  
  (E) Is there a reason the authors did not look at representation and change in the hippocampus? As a rapid-learning, widely-connected feature-binding mechanism, and given the fairly minimal amount of learning experience, it seems like the hippocampus would be a key area of potential import for the cross-modal association. It also looks as though the hippocampus is implicated in the localizer scan (Figure 3c).
  
  We thank the reviewer for this suggestion and now include additional analyses for the hippocampus. We found no evidence of crossmodal integrative coding different from the unimodal features. Rather, the hippocampus seems to represent the convergence of unimodal features. We provide these results in the Supplemental Information and describe them in the main text:
  
  “Analyses for the hippocampus (HPC) and inferior parietal lobe (IPL). (a) In the visual vs. auditory univariate analysis, there was no visual or sound bias in HPC, but there was a bias towards sounds that increased numerically after crossmodal learning in the IPL. (b) Pattern similarity analyses between unimodal features associated with congruent objects and incongruent objects. Similar to Supplemental Figure S3, there was no main effect of congruency in either region. (c) When we looked at the pattern similarity between Unimodal Feature runs on Day 2 to Crossmodal Object runs on Day 2, we found that there was significant pattern similarity when there was a match between the unimodal feature and the crossmodal object (e.g., pattern similarity > 0). This pattern of results held when (d) correlating the Unimodal Feature runs on Day 2 to Crossmodal Object runs on Day 4, and (e) correlating the Unimodal Feature runs on Day 4 to Crossmodal Object runs on Day 4. Finally, (f) there was no significant pattern similarity between Crossmodal Object runs before learning correlated to Crossmodal Object after learning in HPC, but there was significant pattern similarity in IPL (p < 0.001). Taken together, these results suggest that both HPC and IPL are sensitive to visual and sound content, as the (c, d, e) unimodal feature-level representations were correlated to the crossmodal object representations irrespective of learning day. However, there was no difference between congruent and incongruent pairings in any analysis, suggesting that HPC and IPL did not represent crossmodal objects differently from the component unimodal features. 
For these reasons, HPC and IPL may represent the convergence of unimodal feature representations (i.e., because HPC and IPL were sensitive to both visual and sound features), but our results do not seem to support these regions in forming crossmodal integrative coding distinct from the unimodal features (i.e., because representations in HPC and IPL did not differentiate the congruent and incongruent conditions and did not change with experience). * p < 0.05, ** p < 0.01, *** p < 0.001. Asterisks above or below bars indicate a significant difference from zero. Horizontal lines within brain regions in (a) reflect an interaction between modality and learning day, whereas horizontal lines within brain regions reflect main effects of (b) learning day, (c-e) modality, or (f) congruency.” – Supplemental Figure S4.
  
  “Notably, our perirhinal cortex mask overlaps with a key region of the ventral anterior temporal lobe thought to be the central locus of crossmodal integration in the “hub and spokes” model of semantic representations.9,50 However, additional work has also linked other brain regions to the convergence of unimodal representations, such as the hippocampus51,52,53 and inferior parietal lobes.54,55 This past work on the hippocampus and inferior parietal lobe does not necessarily address the crossmodal binding problem that was the main focus of our present study, as previous findings often do not differentiate between crossmodal integrative coding and the convergence of unimodal feature representations per se. Furthermore, previous studies in the literature typically do not control for stimulus-based factors such as experience with unimodal features, subjective similarity, or feature identity that may complicate the interpretation of results when determining regions important for crossmodal integration. Indeed, we found evidence consistent with the convergence of unimodal feature-based representations in both the hippocampus and inferior parietal lobes (Supplemental Figure S4), but no evidence of crossmodal integrative coding different from the unimodal features. The hippocampus and inferior parietal lobes were both sensitive to visual and sound features before and after crossmodal learning (see Supplemental Figure S4c-e). Yet the hippocampus and inferior parietal lobes did not differentiate between the congruent and incongruent conditions or change with experience (see Supplemental Figure S4).” – pg. 20
  
  (F) The direction of the neural effects was difficult to track and understand. I think the key observation is that TP and PRh both show changes related to cross-modal congruency - but still it would be helpful if the authors could articulate, perhaps via a schematic illustration, how they think representations in each key area are changing with the cross-modal association. Why does the temporal pole come to activate less for congruent than incongruent stimuli (Figure 3)? And why do TP responses grow less similar to one another for congruent relative to incongruent stimuli after learning (Figure 4)? Why are incongruent stimulus similarities anticorrelated in their perirhinal responses following cross-modal learning (Figure 6)?
  
  We thank the reviewer for identifying this issue, which was also raised by the other reviewers. The reviewer is correct that the key observation is that the TP and PRC both show changes related to crossmodal congruency (given that the unimodal features were equated in the methodological design). However, the structure of the integrative code is less clear, which we now emphasize in the main text. Our findings provide evidence of a crossmodal integrative code that is different from the unimodal features, and future studies are needed to better understand the structure of how such a code might emerge. We now more clearly highlight this distinction throughout the paper:
  
  “By contrast, perirhinal cortex may be involved in pattern separation following crossmodal experience. In our task, participants had to differentiate congruent and incongruent objects constructed from the same three shape and sound features (Figure 2). An efficient way to solve this task would be to form distinct object-level outputs from the overlapping unimodal feature-level inputs such that congruent objects are made to be orthogonal from the representations before learning (i.e., measured as pattern similarity equal to 0 in the perirhinal cortex; Figure 5b, 6, Supplemental Figure S5), whereas non-learned incongruent objects could be made to be dissimilar from the representations before learning (i.e., anticorrelation, measured as pattern similarity less than 0 in the perirhinal cortex; Figure 6). Because our paradigm could decouple neural responses to the learned object representations (on Day 4) from the original component unimodal features at baseline (on Day 2), these results could be taken as evidence of pattern separation in the human perirhinal cortex.11,12 However, our pattern of results could also be explained by other types of crossmodal integrative coding. For example, incongruent object representations may be less stable than congruent object representations, such that incongruent object representations are warped to a greater extent than congruent object representations (Figure 6).” – pg. 18
  
  “As one solution to the crossmodal binding problem, we suggest that the temporal pole and perirhinal cortex form unique crossmodal object representations that are different from the distributed features in sensory cortex (Figure 4, 5, 6, Supplemental Figure S5). However, the nature by which the integrative code is structured and formed in the temporal pole and perirhinal cortex following crossmodal experience – such as through transformations, warping, or other factors – is an open question and an important area for future investigation. Furthermore, these anterior temporal lobe structures may be involved with integrative coding in different ways. For example, the crossmodal object representations measured after learning were found to be related to the component unimodal feature representations measured before learning in the temporal pole but not the perirhinal cortex (Figure 5, 6, Supplemental Figure S5). Moreover, pattern similarity for congruent shape-sound pairs was lower than the pattern similarity for incongruent shape-sound pairs after crossmodal learning in the temporal pole but not the perirhinal cortex (Figure 4b, Supplemental Figure S3a). As one interpretation of this pattern of results, the temporal pole may represent new crossmodal objects by combining previously learned knowledge.8,9,10,11,13,14,15,33 Specifically, research into conceptual combination has linked the anterior temporal lobes to compound object concepts such as “hummingbird”.34,35,36 For example, participants during our task may have represented the sound-based “humming” concept and visually-based “bird” concept on Day 1, forming the crossmodal “hummingbird” concept on Day 3 (Figure 1, 2), which may recruit less activity in temporal pole than an incongruent pairing such as “barking-frog”. For these reasons, the temporal pole may form a crossmodal object code based on pre-existing knowledge, resulting in reduced neural activity (Figure 3d) and pattern similarity towards features associated with learned objects (Figure 4b).” – pg. 18
  
  This work represents a key step in our advancing understanding of object representations in the brain. The experimental design provides a useful template for studying neural change related to the cross-modal association that may prove useful to others in the field. Given the broad variety of open questions and potential alternative analyses, an open dataset from this study would also likely be a considerable contribution to the field.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.31.504599v1
www.biorxiv.org www.biorxiv.org

Unfolding and identification of membrane proteins in situ

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  Gavanetto et al. propose an interesting method to identify membrane proteins based on the analysis of single-molecule AFM (smAFM) force-extension traces obtained from native plasma membranes. In the proposed pipeline, the authors use smAFM to non-specifically probe isolated plasma membranes by recording a large number (millions) of force-extension traces. While, as expected, most of them lack any binding or represent spurious events, the authors use an unsupervised clustering algorithm to identify groups of force-extension curves with a similar mechanical pattern, suggesting that each cluster corresponds to a unique protein species that can be fingerprinted by its specific force-extension pattern. By implementing a Bayesian framework, the authors contrast the identified groups with proteomics databases, which provide the most likely proteins that correspond to the identified force-extension clusters. A set of control experiments complements the manuscript to validate the proposed methodology, such as the application of their pipeline using purified samples or overexpressing a specific protein species to enrich its population.
  
  The primary strength of the manuscript is its originality, as it proposes a novel application of smAFM as a protein-detection method that can be applied in native samples. This methodology combines ingredients from conventional mass spectrometry and cryoEM; the contour length released upon extending a protein is a direct measure of its sequence extension (related to its mass), but the force pattern contains insightful information about the protein's structure. In this sense, the authors' proposal is very smart. However, the relationship between protein structure and mechanics is far from straightforward, and here perhaps lies one of the main limitations of the proposed method. This is particularly true for the case of membrane proteins, where we cannot talk about protein unfolding in its classical sense but rather about pullout events which is likely what each peak corresponds to (indeed, the authors speak throughout the paper about unfolding events, which I believe is not the correct term).
  
  We fully agree with the semantics concern of reviewer #3 about the term unfolding. A membrane protein, when pulled with the tip of the AFM, is pulled out of the membrane (see 2 in the image below) and, simultaneously, the segment that is pulled out unfolds (see 3). To our knowledge, force peaks corresponding to a contour length equal to 2 were not consistently observed or reported (when e.g. a transmembrane alpha helix is out of the membrane but folded).
  
  Since the field evolved with the practice of using the term ‘unfolding’ even for membrane proteins (see for instance (Kessler and Gaub, 2006; Oesterhelt et al., 2000; Yu et al., 2017) and many others), we would prefer to stick with this term.
  
  In the context of membrane proteins, the term unfolding therefore refers to at least the tertiary structure of the protein, because it is not clear when and at which timescale the secondary structures really unfold.
  
  We pointed this out in Line 131 (and following Lines).
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/732933v4
www.biorxiv.org www.biorxiv.org

New submission 29/09/2022, 10:03:17

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This is a well performed study to demonstrate the antiviral function and viral antagonism of the dynein activating adapter NINL. The results are clearly presented to support the conclusions.
  
  This reviewer has only one minor suggestion to improve the manuscript.
  
  Add a discussion of (1) why the folds of reduction among VSV, SinV and CVB3 were different in the NINL KO cells and (2) why the folds of reduction of VSV differed between the NINL KO A549 and U-2 OS cells.
  
  Thank you for this suggestion. We have amended the results section to include additional information about these observations and possible explanations for these results.
  
  Reviewer #2 (Public Review):
  
  This manuscript is of interest to readers interested in host-viral co-evolution. This study has identified a novel human-virus interaction point, NINL-viral 3C protease, where NINL is actively evolving under the selection pressure of viral infection and viral 3Cpro cleavage. This study demonstrates that the viral 3Cpro-mediated cleavage of host NINL disrupts its adaptor function in dynein motor-mediated cargo transportation to the centrosome, and this disruption is both host- and virus-specific. In addition, this paper indicates the role of NINL in the IFN signaling pathway. Data shown in this manuscript support the major claims.
  
  In this paper, the authors have identified a novel host-viral interaction, where viral 3C proteases (3Cpro) cleave at specific sites on a host activating adaptor of dynein intracellular transportation machinery, ninein-like protein (NINL or NLP in short) and inhibit its role in the antiviral innate immune response.
  
  The authors first found that, unlike other activating adaptors of dynein intracellular transportation machinery, NINL (or NLP) is rapidly evolving. Thus, the authors hypothesized that this rapid evolution of NINL was caused by its interaction with viral infection. The authors found that viruses replicated to higher levels in NINL knock-out (KO) cells than in wild-type (WT) cells and the replication level was not attenuated upon IFNa treatment in NINL KO cells, unlike in WT cells. Next, the authors investigated the role of NINL in type I IFN-mediated immune response and found that the induction of Janus kinase/signal transducer and activator of transcription (JAK/STAT) genes was attenuated in NINL KO cells upon IFNa treatment. The authors further showed that the reduction in replication of an IFNa-sensitive Vaccinia virus mutant upon IFNa treatment was decreased in NINL KO A549 cells compared to WT cells. The authors further showed that the virus antagonized NINL function by cleaving it with viral 3Cpro at its specific cleavage sites. A NINL-peroxisome ligation-based cargo trafficking visualization assay showed that the redistribution of immobile membrane-bound peroxisomes was disrupted by cleavage of NINL or viral infection.
  
  This paper has revealed a novel host-virus interaction, and an antiviral function of a rapidly evolving activating adaptor of dynein intracellular transportation machinery, NINL. The major conclusions of this paper are well supported by data, but several aspects can be improved.
  
  1) It would be necessary to include a couple of other pathways involved in innate immune response besides JAK/STAT pathway.
  
  We are very interested in this question as well. Our RNAseq data (Supplementary file 4 and Figure 3 – Figure supplement 4) suggest that there are several transcriptional changes that result from NINL KO. Our goal in this manuscript was to focus on IFN signaling in order to understand this specific effect of NINL KO since it might have wide-ranging consequences on viral replication. While we agree that broadening our studies to other signaling pathways, including other pathways involved in innate immune response, is a good idea, we feel that those experiments would take longer than two months to perform and therefore fall outside of the scope of this paper.
  
  2) The in-cell cleavages of NINL by viral 3Cpros were well demonstrated and supported by data of high quality. A direct biochemical demonstration of the cleavage is needed with purified proteins.
  
  We agree with the reviewer that a direct biochemical cleavage assay would further demonstrate that viral 3Cpros cleave NINL specifically. However, our attempts to purify full-length NINL have been unsuccessful due to solubility issues (see example gel below), which is not surprising given that NINL is a >150 kDa human protein that has multiple surfaces that bind to other human proteins. As such, we focused our efforts on in-cell cleavage assays using specificity controls for cleavage. Specifically, we used catalytically inactive CVB3 3Cpro to show a dependence on protease catalytic activity and a variety of NINL constructs in which the glutamine in the P1 position is replaced by an arginine to show site specificity of cleavage. Notably, the cleavage sites in NINL that we mapped using this mutagenesis were predicted bioinformatically from known sites of 3Cpro cleavage in viral polyproteins, further indicating that cleavage is 3Cpro-dependent. We believe these results thus demonstrate that cleavage of NINL is dependent on viral protease activity and occurs in a sequence-specific manner. In light of the difficulty of purifying full-length NINL that would make biochemical experiments very challenging and likely take longer than two months to perform, we believe that our in cell data should be sufficient to demonstrate activity-dependent site-specific cleavage of NINL by viral 3Cpros.
  
  Sypro stained SDS-PAGE gel showing supernatant (S) and insoluble pellet (P) fractions across multiple purifications with altered buffer conditions.
  
  3) The author used different cell types in different assays. Explain the rationale with a sentence for each assay.
  
  Throughout this work, we chose to use a variety of cell lines for specific purposes. A549 cells were chosen as our main cell line as they are widely used in virology, are susceptible to the viruses we used, are responsive to interferon, and express both NINL and our control NIN at moderate levels. In the case of our virology and ISG expression data, we performed the same experiments with NINL KOs in other cell lines to confirm that the phenotypes we observed in A549 cells could be attributed to the absence of NINL rather than off-target CRISPR perturbations or cell-line specific effects. All cleavage experiments were performed in HEK293T for their ease of transfection and protein expression. The inducible peroxisome trafficking assays were performed in U-2 OS cells as their morphology is ideal for observing the spatial organization of peroxisomes via confocal microscopy, and based on the fact that we had recapitulated the virology results and ISG expression results in those cells. At the suggestion of the reviewer, we have amended the text to include rationales where appropriate.
  
  4) While cell-based assays well support the conclusions in this paper, further demonstration in vivo would be helpful to provide an implication on the pathogenicity impact of NINL.
  
  We agree. However, we believe that examining the impact of the loss of or antagonism of NINL on the pathogenesis of infectious diseases in an in vivo model is outside the scope of this study.
  
  In summary, this manuscript contributes to a novel antiviral target. In addition, it is important to understand the host-virus co-evolution. The use of the evolution signatures to identify the "conflict point" between host and virus is novel.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.07.11.499552v1
www.medrxiv.org www.medrxiv.org

Correlation between leukocyte phenotypes and prognosis of amyotrophic lateral sclerosis: a longitudinal cohort study

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Cui and colleagues have performed a longitudinal analysis of blood cell counts in a cohort of ALS patients. The major findings include increases in neutrophils and monocytes that negatively correlated with ALSFRS-R score, but not disease progression rate. Increases in NK and central memory TH2 T cells correlated with a lower risk of death, while increased CD4 CD45RA effector memory and CD8 T cells were correlated with a higher risk of death.
  
  Strengths of the study include the sample size and effort to broadly include data.
  
  Thank you for the positive comment.
  
  Limitations of the study include indication bias, as the authors acknowledge, because the timing of the blood draws is not predefined. The specific review for possibility of infection does not, in this reviewer's opinion, sufficiently address this potential for bias. Also concerning is the fact that half the subjects have only a single measurement, and how well the findings generalize to more or later measurements is not clear. Similarly, the number of later measurements driving some of the main findings is much lower, further raising concern about the potential bias. Given these issues, one really would want to see disease controls, and how the different cell counts change in another disease. Finally, there is no discussion about how or whether treatments, or changes in treatment, could influence observed counts.
  
  We agree with the reviewer regarding indication bias and that is precisely why we performed the sensitivity analyses including 1) restricting the analysis to the first cell measure of each patient and 2) excluding cell measures with signs of ongoing infection at the time of blood draw. Reassuringly, both analyses provided rather similar results as those of the main analysis. We also agree with the reviewer regarding the varying numbers of measurements between patients. This is an unavoidable challenge to any longitudinal study of ALS patients, primarily due to the high mortality rate of this patient group. We have now added this limitation to the discussion:
  
  “First, the main cohort was heterogeneous in terms of the numbers of cell measurements and the time intervals between measurements, as the timing of blood sampling was not predefined. Indication bias due to, for example, ongoing infections might therefore be a concern. The sensitivity analysis excluding all samples taken at the time of infections provided however rather similar results. Further, the longitudinal analysis of cell counts should be interpreted with caution because not all patients contributed repeated cell measurements. This is however an unavoidable problem for any longitudinal study of ALS patients, given the high mortality rate of this patient group. Regardless, when focusing on the first cell measures, we obtained similar results as in the main analysis.”
  
  We further agree with the reviewer regarding the use of disease control. We have access to a cohort of patients with relapsing-remitting MS (RRMS) treated by rituximab (n=34), who had been measured with all the studied cell populations at the start of treatment and the 6-month follow-up. These cell measurements were processed during the same time-period using the identical setup at Karolinska University Hospital as the ones studied in the present study. In brief, we found different longitudinal changes of the studied immune cell populations between RRMS patients and ALS patients (please see below figure for details). The declining B cells are most likely due to rituximab treatment.
  
  Given the largely different disease mechanisms, phenotypes, and treatments between RRMS and ALS, we are not confident that RRMS would be a good disease control for the present study. We are certainly willing to reconsider our position if the reviewer and editors would disagree with us. We have regardless now added discussion about this in the manuscript:
  
  “It would therefore be interesting to compare ALS with other diseases, especially other neurodegenerative diseases, regarding the studied cell counts, in terms of both their longitudinal trajectories during disease course and their prognostic values in predicting patient outcome.”
  
  Finally, we agree that it is interesting to consider treatment in the analysis of cell counts. Among the ALS patients of the main cohort, the majority (89.6%) were treated with Riluzole. We have now added a supplementary figure to demonstrate the leukocyte counts before and after start of Riluzole treatment. The corresponding analysis is however not possible for the FlowC cohort as the majority of the patients started Riluzole treatment around time of diagnosis and almost all measurements were taken after Riluzole treatment.
  
  [Figure panel labels: Th17 of CD4+ CM cells; CD4+ EMRA cells; CD8+ T cells; Naïve CD8+ T cells; CD8+ EM cells; CD8+ CM cells; CD8+ EMRA cells; CD4+ HLA-DR+ CD38- cells; CD4+ HLA-DR+ CD38+ cells; CD8+ HLA-DR+ CD38- cells; CD8+ HLA-DR+ CD38+ cells]
  
  We have now added this analysis to Methods and Results, including a new Figure 1—figure supplement 2.
  
  “To evaluate whether ALS treatment would influence the cell counts, we further visualized the temporal patterns of differential leukocyte counts before and after Riluzole treatment.”
  
  “The levels of leukocytes, neutrophils and monocytes increased, whereas the levels of lymphocytes decreased, after Riluzole treatment, compared with before such treatment (Figure 1—figure supplement 2).”
  
  Reviewer #2 (Public Review):
  
  Cui et al. investigated the correlation of immune profiles in ALS patients to functional status (by ALSFRS-R score), disease progression (rate of ALSFRS-R decline) and/or risk of death (or invasive ventilation use). The study longitudinally assessed basic immune profiles from a large cohort of ALS patients (n=288). Additionally, they deeply immunophenotyped a subset of ALS patients (n=92) to examine the relationship of immune cell subtypes to ALS status, progression rate, and survival. The longitudinal design, deep immunophenotyping, and large cohort are significant strengths. Using various statistical models, the authors found leukocyte, neutrophil, and monocyte counts increased gradually over time as ALSFRS-R score declined. Within lymphocyte subpopulations, increasing natural killer cells and Th2-differentiated CD4+ central memory T cell counts correlated with a lower risk of death. Increasing CD4+ effector memory cells re-expressing CD45RA T cell and CD8+ T cell levels were associated with a higher risk of death. These findings have broad implications for ALS pathogenesis and the development of immune-based ALS therapies tailored to specific immune cell populations.
  
  Thank you for the very positive comments.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2021.10.05.21264570v1
www.medrxiv.org www.medrxiv.org

Impaired HA-specific T follicular helper cell and antibody responses to influenza vaccination are linked to inflammation in humans

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  In this manuscript, Hill et al. analyze immune responses to vaccination of adults with the seasonal influenza vaccine. They perform a detailed analysis of the hemagglutinin-specific binding antibody responses against several different strains of influenza, and antigen-specific CD4+ T cells/T follicular cells, and cytokines in the plasma. Their analysis reveals: (i) tetramer positive, HA-specific T follicular cells induced 7 days post vaccination correlate with the binding Ab response measured 42 days later; (ii) the HA-specific Tfh have a diverse TCR repertoire; (iii) impaired differentiation of HA-specific Tfh in the elderly; and (iv) identification of an "inflammatory" gene signature within Tfh in the elderly, which is associated with the impaired development of HA-specific Tfh.
  
  The paper addresses a topic of considerable interest in the fields of human immunology and vaccinology. In general the experiments appear well performed, and support the conclusions. However, the following points should be addressed to enhance the clarity of the paper, and add support to the key conclusions drawn.
  
  We thank the reviewer for their supportive evaluation of the manuscript, and have provided the details of how we have addressed each of the points raised below.
  
  1) Abstract: "(cTfh) cells are the best predictor of high titre antibody responses." Since the authors have not done any blind prediction using machine learning tools with an independent cohort, the sentence should be rephrased thus: "(cTfh) cells were associated with high titre antibody responses."
  
  We agree that this phrasing better reflects the presented data. The sentence in the abstract (page 2) now reads “we show that formation of circulating T follicular helper (cTfh) cells was associated with high titre antibody responses.”
  
  2) Figure 1A: Please indicate the age range of the subjects.
  
  Figure 1 has been updated to include the age range of the subjects.
  
  3) Almost all the data in the paper show binding Ab titers. Yet, typically HAI titers or MN titers are used to assess Ab responses to influenza. Fig 1C shows HAI titers against the H1N1 Cal 09 strain. Can the authors show HAI titers for Cal 09 and the other A and B strains contained in the 2 vaccine cohorts? Do such HAI titers correlate with the tetramer positive cells, similar to the correlations shown in Fig 2e?
  
  In this manuscript we have deliberately focussed on the immune response to the H1N1 Cal09 strain, as it is the only influenza strain in the vaccine common to both cohorts. The HAI titre for this strain is now shown as supplementary figure 4. In addition, the class II tetramers were specifically selected to recognise unique epitopes in the Cal 09 strain (J. Yang, {..} W. W. Kwok, CD4+ T cells recognize unique and conserved 2009 H1N1 influenza hemagglutinin epitopes after natural infection and vaccination. Int Immunol 25, 447-457, 2013). Because of this, we do not think it is appropriate to correlate HAI titres for the non-Cal 09 strains with tetramer positive cells. We agree that showing the correlation of cTfh and other immune parameters with the HAI titres for Cal 09 is important and have included this as supplementary figure 7. The new data and text are presented below:
  
  Figure 1-figure supplement 4: HAI responses before and after vaccination A) Log2 HAI titres at baseline (d0), d7 and d42 for cohort 1 (n=16) and B) cohort 2 (n = 21). C) Correlation between HAI and A.Cali09 IgG as measured by Luminex assay for cohort 1 and 2 combined. p-values determined using paired Wilcoxon signed rank-test, and Pearson’s correlation.
  
  Text changes. Page 4. “The increase in anti-HA antibody titre was coupled with an increase in hemagglutination inhibitory antibodies to A.Cali09, the one influenza A strain contained in the TIVs that was shared across the two cohorts and showed a positive correlation with the A.Cali09 IgG titres measured by Luminex assay (Fig. 1C, Figure 1-figure supplement 4).”
  
  Figure 2-figure supplement 1: Correlations between HAI assay titres and selected immune parameters. Correlation between vaccine-induced A.Cali09 HAI titres at d42 with selected immune parameters in both Cohort 1 and Cohort 2 (n=37). Dot color corresponds to the cohort (black = Cohort 1, grey = Cohort 2). Coefficient (Rho) and p-value determined using Spearman’s correlation, and line represents linear regression fit.
  
  Results text changes: Page 5. “Similar trends were seen when these immune parameters were correlated to HAI titres against A/Cali09 (Figure 2-figure supplement 1).”
  
  4) Fig 2d to i: what % of all bulk activated Tfh at day 7 are tetramer positive? The tetramer positive T cells constitute roughly 0.094% of all CD4 T cells (Fig 2d), of which 1/3rd are CXCR5+, PD1+ (i.e. ~0.03% of CD4 T cells). What fraction of all activated Tfh is this subset of tetramer positive cells? Presumably, there will also be Tfh generated against other viral proteins in the vaccine, and these will constitute a significant fraction of all activated Tfh.
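  
  The reviewer's back-of-envelope estimate above can be reproduced in a few lines. This is only an illustrative sketch: the 0.094% and one-third figures are the ones quoted in the comment, while the bulk cTfh frequency (`assumed_bulk_ctfh_pct`) and the helper name `tet_share_of_bulk_ctfh` are hypothetical and are not taken from the manuscript.

```python
# Values quoted in the reviewer's comment.
tet_pct_of_cd4 = 0.094        # Tet+ cells as % of all CD4+ T cells (Fig 2d)
tfh_fraction_of_tet = 1 / 3   # share of Tet+ cells that are CXCR5+ PD-1+

# Tet+ cTfh cells expressed as a percentage of all CD4+ T cells.
tet_tfh_pct_of_cd4 = tet_pct_of_cd4 * tfh_fraction_of_tet


def tet_share_of_bulk_ctfh(tet_tfh_pct, bulk_ctfh_pct):
    """Tet+ cTfh as a percentage of bulk cTfh.

    Both inputs must be expressed as % of CD4+ T cells.
    """
    return 100 * tet_tfh_pct / bulk_ctfh_pct


# HYPOTHETICAL bulk cTfh frequency (% of CD4+ T cells), chosen only to
# show how the requested fraction would be computed.
assumed_bulk_ctfh_pct = 3.0

print(f"Tet+ cTfh as % of CD4+ T cells: {tet_tfh_pct_of_cd4:.3f}")
share = tet_share_of_bulk_ctfh(tet_tfh_pct_of_cd4, assumed_bulk_ctfh_pct)
print(f"Tet+ cTfh as % of bulk cTfh (assumed denominator): {share:.2f}")
```

  The first quantity matches the roughly 0.03% given in the comment; the second depends entirely on the assumed denominator and is shown only to make the requested ratio explicit.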
  
  This is an important point, as the tetramers only recognise one peptide epitope of the Cal.09 HA protein, so there will be many other influenza reactive CD4+ T cells that are responding to other Cal 09 epitopes as well as other proteins in the vaccine. The analysis suggested by the reviewer shows that the frequency of Tet+ cells amongst bulk cTfh cells ranges from 0.14%-1.52% in cohort 1, and from 0.022-2.7% in cohort 2. These data have been included as Figure 1-figure supplement 6C, D in the revised manuscript. In addition, Tet+ cells as a percentage of bulk cTfh cells were reduced in older people compared to younger adults. These data have been included in Figure 5-figure supplement 1C in the revised manuscript.
  
  Figure 1-figure supplement 6: Percentage of cTfh cells that are Tet+ and CXCR3 and CCR6 expression on HA-specific CD4+ T cells. A) Representative flow cytometry gating strategy for CXCR5+PD-1+ cTfh cells on CD4+CD45RA- T cells, and the proportion of HA-specific Tet+ cells within the CXCR5+PD-1+ cTfh cell gate. B) Percentage Tet+ cells within the CXCR5+PD-1+ cTfh cell population. Within-cohort age group differences were determined using the Mann-Whitney U test.
  
  Results text, page 4: These antigen-specific T cells had upregulated ICOS after immunisation, indicating that they have been activated by vaccination (Fig. 1F, G). In addition, a median of one third of HA-specific T cells upregulated the Tfh markers CXCR5 and PD1 on d7 after immunisation (Fig. 1H, I). The tetramer binding cells represented between 0.022-2.7% of the total CXCR5+PD-1+ bulk population (Figure 1-figure supplement 6A, B).
  
  Figure 5-figure supplement 1C: Age-related differences in cytokines and HA-specific CD4+ T cell parameters. C) Percentage Tet+ cells within the CXCR5+PD-1+ cTfh cell population. Within-cohort age group differences were determined using the Mann-Whitney U test.
  
  Results text, page 8: Across both cohorts, the only CD4+ T cell parameters consistently reduced in older individuals at d7 were the frequency of polyclonal cTfh cells and HA-specific Tet+ cTfh cells, with the strongest effect within the antigen-specific cTfh cell compartment (Fig. 5H-J, Figure 5-figure supplement 1C).
  
  Reviewer #2:
  
  Hill and colleagues present a comprehensive dataset describing the recall and expansion of HA-specific cTFH cells following influenza immunisation in two cohorts. Using class II tetramers, IgG titres against a large panel of HA antigens, and quantification of plasma cytokines, they find that activated and HA-specific cTFH cells were a strong predictor of the IgG response against the vaccine after 6 weeks. Using RNAseq and TCR clonotype analysis, they find that, in 10/15 individuals, the HA-specific cTFH response at day 7 post-vaccination is recalled from the available CD4 T cell memory pool present prior to vaccination. Post-vaccination HA-specific cTFH cells exhibited a transcriptional profile consistent with lymph node-derived GC TFH, as well as evidence of downregulation of IL-2 signaling pathways relative to pre-vaccine CD4 memory cells.
  
  The authors then apply these findings to a comparison of vaccine immunogenicity between younger (18-36) and older (>65) adults. As expected, they found lower levels of vaccine-specific IgG responses among the older cohort. Analysis of HA-specific T cell responses indicated that tet+ cTFH fail to properly develop in the older cohort following vaccination. Further analysis suggests that development of HA-specific cTFH in older individuals is not caused by a lack of TCR diversity, but is associated with higher expression of inflammation-associated transcripts in tet+ cTFH.
  
  Overall this is an impressive study that provides clarity around the recall of HA-specific CD4 T cell memory, and the burst of HA-specific cTFH cells observed 7 days post-vaccination. The association between defective cTFH recall and lower IgG titres post-vaccination in older individuals provides new targets for improving influenza vaccine efficacy in this age group. However, as currently presented, the model of impaired cTFH differentiation in the older cohort and the link to inflammation is somewhat unclear. There are several issues that could be clarified to improve the manuscript in its current form:
  
  We thank the reviewer for their supportive and comprehensive summary of our work. We agree that the link between inflammation and impaired cTfh differentiation is correlative. We have added new data to address this, including mechanistic data to support chronic IL-2 signalling as antagonistic to cTfh development, as well as providing new analyses to address the other points raised.
  
  1) It is somewhat unclear the extent to which the reduction in HA-specific cTFH in the older cohort is also related to an overall reduction in T cell expansion - cohort 1 shows a significant reduction in total tet+ CD4 T cells post-vaccination as well as in the cTFH compartment, and while this difference may not reach statistical significance, a similar trend is shown for cohort 2.
  
  We agree that a possible interpretation is a global failure in T cell expansion in the older individuals. To determine whether there is a relationship between the degree of Tet+ CD4+ T cell expansion and cTfh cell differentiation with age, we performed correlation analyses. There is no correlation between the expansion of Tet+ cells and the frequency of cTfh cells formed seven days after immunisation in either age group. This suggests that the impaired cTfh cell differentiation in older persons is most likely caused by factors other than the capacity of CD4+ T cells to expand after vaccination. These data have been added as Figure 5-figure supplement 1D, and included in the results text on page 8.
  
  Figure 5-figure supplement 1D: Age-related differences in cytokines and HA-specific CD4+ T cell parameters. D) Correlation between Tet+ cells (d7-d0, % of CD4+) and cTfh (d7-d0, % of TET+) in both cohorts for each age-group (18-36 y.o. n=37, 65+ y.o. n=39). Dot color corresponds to the cohort (black = Cohort 1, grey = Cohort 2). Coefficient (Rho) and p-value determined using Spearman’s correlation, and line represents linear regression fit.
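
  The correlation analysis described above can be illustrated with a minimal sketch. It computes Spearman's rho as the Pearson correlation of average ranks, using only the Python standard library. The function names and the per-donor values are hypothetical and chosen for illustration; they are not the study's code or data, and no p-value is computed.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-donor values (NOT data from the study).
tet_expansion = [0.02, 0.10, 0.05, 0.30, 0.08, 0.15]  # Tet+ d7-d0, % of CD4+
ctfh_change = [12.0, 3.0, 20.0, 8.0, 15.0, 5.0]       # cTfh d7-d0, % of Tet+
rho = spearman_rho(tet_expansion, ctfh_change)
print(f"Spearman rho = {rho:.3f}")
```

  A rank-based coefficient is a natural choice here because it does not assume a linear relationship between expansion and differentiation.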
  
  Text changes, Page 8: There was no consistent difference in the total d7 Tet+ HA-specific T cell population with age for both cohorts (Fig. 5H) and we observed no age-related correlation between the ability of an individual to differentiate Tet+ cells into a cTfh cell and the overall expansion of Tet+ HA-specific T cell population (Figure 5-figure supplement 1D). Thus, our data suggest that the poor vaccine antibody responses in older individuals are impacted by impaired cTfh cell differentiation (Fig. 5J) rather than size of the vaccine-specific CD4+ T cell pool.
  
  2) Transcriptomic analysis indicates that HA-specific cTFH in the older cohort show impaired downregulation of inflammation, TNF and IL-2-related signaling pathways. The authors therefore conclude that excess inflammation can limit the response to vaccination. In its current presentation, the data does not necessarily support this conclusion. While it is clear that downregulation of TNF and IL-2 signalling pathways occur during cTFH/TFH differentiation, there is no evidence presented to support the idea that (a) vaccination results in increased pro-inflammatory cytokine production in lymphoid organs in older individuals or that (b) these pro-inflammatory cytokines actively promote CXCR5-, rather than cTFH, differentiation of existing memory T cells.
  
  We agree with the reviewer that the data presented in Figure 7 are correlative, rather than causative. Unfortunately, we do not have access to secondary lymphoid tissues from younger and older people after vaccination to test point (a) above. In order to test the hypothesis that increased inflammatory cytokine production in lymphoid organs limits Tfh cell differentiation, we have used Il2cre/+; Rosa26stop-flox-Il2/+ transgenic mice. In this mouse model, IL-2-dependent cre-recombinase activity facilitates the expression of low levels of IL-2 in cells that have previously expressed IL-2. This creates a scenario in which cells that physiologically express IL-2 cannot turn its expression off, thereby increasing expression of IL-2 after antigenic stimulation (mice reported in Whyte et al., bioRxiv, 2020, doi: https://doi.org/10.1101/2020.12.18.423431).
  
  Twelve days after influenza A infection, Il2cre/+; Rosa26stop-flox-Il2/+ transgenic mice have fewer Tfh cells in the draining mediastinal lymph node and in the spleen (Fig. 8A-C); this is accompanied by a reduction in the magnitude of the GC B cell response (Fig. 8D-E). These data provide a proof of concept that sustained IL-2 production limits the formation of Tfh cells, consistent with the negative correlation of an IL-2 signalling gene signature and cTfh cell formation in humans (Figure 7). These new data support the conclusion that excess IL-2 signalling can limit the Tfh cell response. These data are presented in Figure 8, and are discussed on page 12 in the results, and pages 12-13 in the discussion.
  
  Figure 8: Increased IL-2 production impairs Tfh cell formation and the germinal centre response. Assessment of the Tfh cell and germinal centre response in Il2cre/+; Rosa26stop-flox-Il2/+ transgenic mice that do not switch off IL-2 production, and Il2cre/+; Rosa26+/+ control mice 12 days after influenza A infection. Flow cytometric contour plots (A) and quantification of the percentage of CXCR5highPD-1highFoxp3-CD4+ Tfh cells in the mediastinal lymph node (B) and spleen (C). Flow cytometric contour plots (D) and quantification of the percentage of Bcl6+Ki67+B220+ germinal centre B cells in the mediastinal lymph node (E) and spleen (F). The height of the bars indicates the median, each symbol represents one mouse, data are pooled from two independent experiments. P-values calculated between genotype-groups by Mann Whitney U test.
  
  Results text, page 12: Sustained IL-2 production inhibits Tfh cell frequency and the germinal centre response. To test the hypothesis that cytokine signalling needs to be curtailed to facilitate Tfh cell differentiation, we turned to a genetically modified mouse model in which cells that have initiated IL-2 production cannot switch it off, Il2cre/+; Rosa26stop-flox-Il2/+ mice (37). Twelve days after influenza infection, Il2cre/+; Rosa26stop-flox-Il2/+ mice have fewer Tfh cells in the draining lymph node and spleen (Fig. 8A-C), which is associated with a reduced frequency of germinal center B cells (Fig. 8D-F). This provides a proof of concept that proinflammatory cytokine production needs to be limited to enable full Tfh cell differentiation in secondary lymphoid organs.
  
  Discussion text, pages 12, 13: These enhanced inflammatory signatures associated with poor antibody titre in an independent cohort of influenza vaccinees. The dampening of Tfh cell formation by enhanced cytokine production was confirmed by the use of genetically modified mice where IL-2 production is restricted to the appropriate anatomical and cellular compartments, but once initiated cannot be inactivated. Together, this suggests that formation of antigen-specific Tfh cells is essential for high titre antibody responses, and that excessive inflammatory factors can contribute to poor cTfh cell responses.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2021.04.07.21255038v1
www.biorxiv.org www.biorxiv.org

New submission 31/10/2022, 09:52:39

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In the article "Neuroendocrinology of the lung revealed by single cell RNA sequencing", Kuo et al. described various aspects of pulmonary neuroendocrine cells (PNECs) including the scRNA-seq profile of one human lung carcinoid sample. Overall, although this manuscript does not have any specific storyline, it is informative and would be an asset for researchers exploring various new roles of PNECs.
  
  Thank you for appreciating the significance of the data presented. Our storyline focuses on the newly uncovered molecular diversity of PNECs and the extraordinary repertoire of peptidergic signals they express and cell types these signals can directly target in (and outside) the lung, in mice and human, and in health and disease (human carcinoid tumor).
  
  Major comments:
  
  The major concern about the work is most results are preliminary, and at a descriptive level, conclusions or sub-conclusions are derived from scRNA-seq analysis only, lacking in-depth functional analysis and validation in other methods or systems. There are many open-end results that have been predicted by the authors based on their scRNA-seq data analysis without functional validation. In order to give them a constructive roadmap, it would be better to investigate literature and put them in a potential or probable hypothesis by citing the available literature. This should be done in each section of the result part. The paper lacks a main theme or specific biology question to address. In addition, the description about the human lung carcinoid by scRNA-seq is somehow disconnected from the main study line. Also, these results are derived from the study on only one single patient, lacking statistical power.
  
  We agree that much of the data and analysis presented in the paper is descriptive and hypothesis-generating for PNECs; however, we do not consider it preliminary. We focused on validating two key conclusions from the scRNA-seq analysis: PNECs are extraordinarily diverse molecularly (as validated by multiplex in situ hybridization and immunostaining) and they express many different combinations of peptidergic signals (and appear to package them in separate vesicles). From the lung expression profiles of the cognate receptors, we also predicted the direct lung targets of the dozens of new PNEC peptidergic signals we uncovered, and validated the cell target (PSN4, a recently identified subtype of pulmonary sensory neuron) of one of the newly identified PNEC signals (the classic hormone angiotensin) by confirming expression of the cognate receptor gene in PSN4 neurons that innervate PNECs and showing that the hormone can directly activate PSN4 neurons. The characterized human carcinoid provided evidence that during tumorigenesis, the amplified PNECs retain a memory (albeit imperfect) of the molecular subtype of PNEC from which they originated. As suggested by the Reviewer, we have provided more background in Results by adding additional citations from the literature to clarify the rationale for each analysis and what was known prior to the analysis. We feel that our paper provides a broad foundation for exploring the diversity and signaling functions of PNECs, and although each molecular type of PNEC and new PNEC peptidergic signal we uncovered and potential target cell in (and outside) the lung warrants follow up (as do the sensory and other properties of PNECs we inferred from their expression profiles), such studies will require the effort of many individuals in many labs studying both normal and disease physiology in mouse and human, and exploiting the data, hypotheses, approaches, and framework we provide.
  
  Reviewer #2 (Public Review):
  
  Pulmonary neuroendocrine cells (PNECs) are known to monitor oxygen levels in the airway and can serve as stem cells that repair the lung epithelium after injury. Due to their rarity, however, their functions are still poorly understood. To identify potential sensory functions of PNECs, the authors have used single-cell RNA-sequencing (scRNA-seq) to profile hundreds of mouse and human PNECs. They report that PNECs express over 40 distinct peptidergic genes, and over 150 distinct combinations of these genes can be detected. Receptors for these neuropeptides and peptide hormones are expressed in a wide range of lung cell types, suggesting that PNECs may have mechanical, thermal, acid, and oxygen sensory roles, among others. However, since some of these cognate receptors are not expressed in the lung, PNECs may also have systemic endocrine functions. Although these data are largely descriptive, the results represent a significant resource for understanding the potential roles of PNECs in normal biology as well as in pulmonary diseases and cancer and are likely to be relevant for understanding neuroendocrine cells in other tissue contexts.
  
  However, there are several aspects of the data analysis that are unclear and require clarification, most notably the definition of a neuroendocrine cell (points #1 and #2 below).
  
  1) Figure S1 shows the sorting strategy used for isolation of putative PNECs from Ascl1CreER/+; Rosa26ZsGreen/+ mice, and distinguishes neuroendocrine cells defined as ZsGreen+ EpCAM+ and "neural" cells defined as ZsGreen+ EpCAM-; the figure legend also refers to the ZsGreen+ EpCAM- cells as "control" cells. However, the table shown in panel D indicates that the NE population combines 112 ZsGreen+ EpCAM+ cells together with 64 ZsGreen+ EpCAM- cells to generate the 176 cells used for subsequent analyses. Why are these ZsGreen+ EpCAM- cells initially labeled as neural or control, but are then defined as neuroendocrine? If these do not express an epithelial marker, can they be rigorously considered as neuroendocrine?
  
  As explained above in the response to Essential Revision point 1, we define pulmonary neuroendocrine cells (PNECs) throughout the paper by their transcriptomic clustering and signatures, which includes the dozens of newly identified PNEC markers as well as the few extant marker genes available before this study (listed in Table S2). The confusion here arises from the two previously known markers (Ascl1 lineage marker ZsGreen, EpCAM) we used for flow sorting to enrich for these rare cells for transcriptomic profiling (Fig. S1). Although most of the cells with PNEC transcriptomic profiles were from the ZsGreenhi EpCAMhi sorted population (as expected), some were from the ZsGreenhi EpCAMlo sorted population. The latter resulted from the high EpCAM gating threshold we used during flow sorting, which excluded some PNECs with intermediate levels of surface EpCAM. Indeed, nearly all PNECs (> 95%) expressed EpCAM by scRNA-seq, and there was no difference in EpCAM transcript levels or transcriptomic clustering of PNECs that were from the ZsGreenhi EpCAMhi vs. ZsGreenhi EpCAMlo sorted populations, as we now show in the new panels (C', C'') added to Fig. S1C. This point is now clarified in the legend to Fig. S1C, and it nicely demonstrates that transcriptomic profiling is a more robust method of identifying PNECs than flow sorting based on two classical markers.
  
  2) Similarly, in the human scRNA-seq analysis, how were PNECs defined? The methods description states that these cells were identified by their expression of CALCA and ASCL1, but does not indicate whether they also expressed epithelial markers.
  
  Human PNECs were identified in the single cell transcriptomic analysis by the same strategy described above for mouse PNECs: by their transcriptomic clustering and signatures, which includes the dozens of newly identified PNEC markers as well as the few extant marker genes available before this study (listed in Table S2). In addition to expression of classic and new markers, the human PNEC cluster defined by scRNA-seq indeed showed the expected expression of epithelial markers (e.g., EPCAM, see dotplot below), like other epithelial cells.
  
  3) The presentation of sensitivity and specificity in Figure 1 is confusing and potentially misleading. According to Figure 1B, Pcsk1 and Nov are two of the top-ranked differentially expressed genes in PNECs with respect to both sensitivity and specificity. However, the specificity of these two genes appears to be lower than that of Scg5, Chgb, and several other genes, as suggested in Figure 1C and Figure S1E. In contrast, Chgb appears to have higher specificity and sensitivity than Pcsk1 in Figures 1C and E but is not shown in the list of markers in Figure 1B.
  
  As explained above in the response to Essential Revision point 2, because different marker features are important for different applications, we have provided several different graphical formats (Figs. 1B,C, Fig. S1E) and a table (Table S1) to aid in selection of the optimal markers for each application. Fig. 1B shows the most sensitive and specific PNEC markers identified by ratio of the natural logs of the average expression of the marker in PNECs vs. non-PNEC epithelial cells (Table S1), and we have added a two-dimensional plot of this sensitivity and specificity for a large set of PNEC markers (new panel E of Fig. S1). The violin plots in Fig. 1C allow visual comparison of expression of selected markers across PNECs and 40 other lung cell types including non-epithelial cells (from our extensive mouse lung atlas in Travaglini, Nabhan et al, Nature 2020). Pcsk1 and Nov score high in the analysis of Fig. 1B because they are highly sensitive and specific markers within the pulmonary epithelium, and they are also valuable markers because they are highly expressed in PNECs. However, they appear slightly less specific in the violin plots of Fig. 1C (Pcsk1) and Fig. S1F (Nov) because of expression (though at much lower levels) in individual lung cell types outside the epithelium: Pcsk1 is expressed also at low levels in some Alox5+ lymphocytes, and Nov is expressed at low levels in some smooth muscle cells. Chgb is a new PNEC marker that did not make the cutoff for the list in Fig. 1B because it is expressed in a slightly higher percentage of non-PNEC epithelial cells than the markers shown, which ranked slightly above it by this metric (see Table S1).
  
  4) The expression of serotonin biosynthetic genes in mouse versus human PNECs deserves some comment. The authors fail to detect the expression of Tph1 and Tph2 in any of the mouse PNECs analyzed, but TPH1 is expressed in 76% of the human PNECs (Table S8). Is it possible that Tph1 and Tph2 are not detected in the mouse scRNA-seq data due to gene drop-out? If serotonin signaling by mouse PNECs is due to protein reuptake, as implied on p. 5, is there a discrepancy between serotonin expression as detected by smFISH versus immunostaining?
  
  It is always possible that the failure to detect expression of Tph1 and Tph2 in the mouse scRNA-seq dataset is due to technical dropout. However, when we analyzed this in our other mouse PNEC scRNA-seq dataset obtained using a microfluidic platform and also deeply sequenced (Ouadah et al, Cell 2019), we found similar values as in the previously analyzed dataset: no Tph2 expression was detected and only 3% (3 of 92) of PNECs had detected Tph1 expression, whereas 24% (22 of 92) had detected expression of serotonin re-uptake transporter Slc6a4. Because our mouse and human scRNA-seq datasets were prepared similarly and sequenced to a similar depth (10^5 to 10^6 reads/cell), the difference observed in Tph1/TPH1 expression between mouse (0-3% PNECs) and human (76% PNECs) is more likely a true biological difference. We also analyzed serotonin levels in mouse PNECs by immunohistochemistry (not shown) and detected serotonin in nearly all (~90%) embryonic PNECs but only ~10% of adult PNECs. Systematic follow up studies will be necessary to resolve the mechanism of serotonin biogenesis and uptake in PNECs, and the potential stage and species-specific differences in these processes suggested by this initial data.
  
  5) The smFISH and immunostaining analyses are often presented without any indication of the number of independent replicate samples analyzed (e.g., Figure 2B, Figure 3F, G).
  
  The numbers of samples analyzed have been added (the values for Fig. 2B are given in the legend to Fig. 2C, the quantification of Fig. 2B).
  
  6) It would be helpful to provide a statistical analysis of the similarities and differences shown in the graphs in Figures 1E and G.
  
  We added a statistical analysis (Fisher's exact test, two-sided) of Fig. 1E comparing expression of each examined gene in the two scRNA-seq datasets (Table S4). We added a similar statistical analysis of Fig. 1G comparing the expression values of each examined gene by scRNA-seq vs smFISH (see Fig. 1G legend).
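
  For readers unfamiliar with the test named above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of cells with and without detected expression of a gene in two datasets. It sums the hypergeometric probabilities of all tables that are no more probable than the observed one, which is one common definition of the two-sided p-value. The function name and the counts are hypothetical; this is not the authors' analysis code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two datasets; columns are cells with / without detected
    expression of the gene. All margins are treated as fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that row 1 contains x expressing cells.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as probable as, or less probable than, the
    # observed one; the small tolerance guards against rounding error.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (NOT from the study): a gene detected in 40 of 100
# cells in one dataset and in 20 of 90 cells in the other.
p_value = fisher_exact_two_sided(40, 60, 20, 70)
print(f"two-sided p = {p_value:.4f}")
```

  An exact test is a reasonable choice for this kind of comparison because some genes are detected in only a handful of cells, where large-sample approximations can be unreliable.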
  
  AuthorResponse
Visit annotations in context

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.08.483399v1
www.medrxiv.org www.medrxiv.org

Structural differences in adolescent brains can predict alcohol misuse

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Responses
  
  Reviewer #1 (Public Review):
  
  This study uses a nice longitudinal dataset and performs relatively thorough methodological comparisons. I also appreciate the systematic literature review presented in the introduction. The discussion of confound control is interesting and it is great that a leave-one-site-out test was included. However, the prediction accuracy drops in these important leave-one-site-out analyses, which should be assessed and discussed further.
  
  Furthermore, I think there is a missed opportunity to test longitudinal prediction using only pre-onset individuals to gain clearer causal insights. Please find specific comments below, approximately in order of importance.
  
  We thank the reviewers for their positive remarks and for providing important suggestions to improve the analysis. Please see our detailed comments below.
  
  1) The leave-one-site-out results fail to achieve significant prediction accuracy for any of the phenotypes. This reveals a lack of cross-site generalizability of all results in this work. The authors discuss that this variance could be caused by distributed sample sizes across sites resulting in uneven folds or site-specific variance. It should be possible to test these hypotheses by looking at the relative performance across CV folds. The site-specific variance hypothesis may be likely because for the other results confounds are addressed using oversampling (i.e., sampling with replacement) which creates a large sample with lower variance than a random sample of the same size. This is an important null finding that may have important implications, so I do not think that it is cause for rejection. However, it is a key element of this paper and I think it should be assessed further and discussed more widely in the abstract and conclusion.
  
  We thank the reviewer for raising this point and providing specific suggestions. As mentioned by the reviewer, the leave-one-site-out results showed high variance across sites, that is, across cross validation (CV) folds. Therefore, as suggested by the reviewer, we further investigated the source of this variance by observing how the model accuracies correlate with each site and its sample size, ratio of AAM-to-controls, and sex distribution. We ranked the sites from low to high accuracy and observed different performance metrics such as sensitivity and specificity:
  
  As shown, the models performed close-to-chance for sites ‘Dublin’, ‘Paris’ and ‘Berlin’ (<60% mean balanced accuracy) in the leave-one-site-out experiment, across all time-points and metrics. Notably, the order of the performance at each site does not correspond to the sample sizes (please refer to the ‘counts’ column in the above figure). It also does not correspond to the ratio of AAM-to-controls, or to the sex distribution.
  
  To further investigate this, we performed an additional leave-one-site-out experiment with all 8 sites. Here, we repeated the ML (Machine Learning) exploration by using the entire data, including the data from the Nottingham site that was kept aside as the holdout. Since there are 8 sites now, we used an 8-fold cross validation and observed how the model accuracy varied across each site:
  
  The results were comparable to the original leave-one-site-out experiment. Along with ‘Dublin’ and ‘Berlin’, the models additionally performed poorly on the ‘Nottingham’ site. Results on ‘London’ and ‘Paris’ also fell below 60% mean balanced accuracy.
  
  Finally, we compared the above two results to the main experiment from the paper where the test samples were randomly sampled across all sites. The performance on test subjects from each site was compared:
  
  As seen, the models struggled with subjects from ‘Dublin’, followed by ‘Nottingham’, ‘London’ and ‘Berlin’, respectively, and performed well on subjects from ‘Dresden’, ‘Mannheim’, ‘Hamburg’ and ‘Paris’.
  
  Across all three results discussed above, the models consistently struggle to generalize to subjects from ‘Dublin’ and ‘Nottingham’ in particular. As already pointed out by the reviewer, the variance in the main experiment in the manuscript is lower because of the random sampling of the test set across all sites. Since these results have important implications, we have included them in the manuscript and also provided these figures in the Appendix.
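
  The per-site evaluation discussed above can be sketched as follows. Each site is held out in turn, a model is fit on the remaining sites, and balanced accuracy (the mean of sensitivity and specificity) is reported for the held-out site. The one-feature threshold classifier, the site labels and the data are all hypothetical stand-ins chosen to keep the example self-contained; they are not the models or data used in the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    return (tp / positives + tn / negatives) / 2


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices), one fold per site."""
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    for site, test_idx in by_site.items():
        train_idx = [i for i in range(len(sites)) if sites[i] != site]
        yield site, train_idx, test_idx


def fit_threshold(x, y):
    """Toy classifier: midpoint between the class means of one feature."""
    cases = [v for v, label in zip(x, y) if label == 1]
    controls = [v for v, label in zip(x, y) if label == 0]
    return (sum(cases) / len(cases) + sum(controls) / len(controls)) / 2


def per_site_balanced_accuracy(sites, x, y):
    """Train on all other sites, then score each held-out site separately."""
    scores = {}
    for site, train_idx, test_idx in leave_one_site_out(sites):
        threshold = fit_threshold([x[i] for i in train_idx],
                                  [y[i] for i in train_idx])
        y_pred = [1 if x[i] > threshold else 0 for i in test_idx]
        scores[site] = balanced_accuracy([y[i] for i in test_idx], y_pred)
    return scores


# Hypothetical data: site "C" has a shifted feature distribution.
sites = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
x = [1.0, 1.2, 2.0, 2.2, 0.9, 1.1, 2.1, 1.9, 1.8, 2.0, 1.9, 2.1]
y = [0, 0, 1, 1] * 3
for site, score in per_site_balanced_accuracy(sites, x, y).items():
    print(site, score)
```

  In this toy setting the shifted site scores at chance while the other two are classified perfectly, which is the kind of site-specific variance that a pooled, randomly sampled test set would average away.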
  
  2) The authors state that "83.3% of subjects reported having no or just one binge drinking experience until age 14". To gain clearer insights into the causality, I recommend repeating the MRIage14 → AAMage22 prediction using only these 83% of subjects.
  
  We thank the reviewer for this valuable comment. As suggested by the reviewer, we have now repeated the MRIage14 → AAMage22 analysis by including (a) only the subjects who had no binge drinking experiences (n=477) by age 14 and (b) subjects who had one or fewer binge drinking experiences (n=565). The results are shown below. The balanced accuracies on the holdout set were 72.9 +/- 2% and 71.1 +/- 2.3% respectively, which are comparable to the main result of 73.1 +/- 2%.
  
  These results provide further evidence that a certain form of cerebral predisposition might precede the observed alcohol misuse behavior in the IMAGEN dataset. We now discuss these results in the Results section and the 2nd paragraph of the Discussion.
  
  3) The feature importance results for brain regions are quite inconsistent across time points. As such, the study doesn't really address one of the main challenges with previous work discussed in the introduction: "brain regions reported were not consistent between these studies either and do not tell a coherent story". This would be worth looking into further, for example by looking at other indices of feature importance such as permutation-based measures and/or investigating the stability of feature importance across bootstrapped CV folds.
  
  The feature importance results shown in Figure 9 are intended to be illustrative and show where the most informative structural features are mainly clustered in the brain, for each time point. We acknowledge that this figure could be a bit confusing. Hence, we have now provided an exhaustive table in the Appendix, consisting of all important features and their respective SHAP scores obtained across the seven repeated runs. In addition, we address the inconsistencies across time points in the 3rd paragraph in the Discussion chapter and contrast our findings with previous studies. These claims can now be verified from the table of features provided in the Appendix.
  
  Addressing the reviewer's suggestions, we would like to point out that SHAP is itself a type of permutation-based measure of feature importance. Since it derives from the theoretically sound Shapley values, is model agnostic, and has already been applied in biomedical applications, we believe that running another permutation-based analysis would not be beneficial. We have also investigated the stability of our feature importance scores by repeating the SHAP estimation with different random permutations. This process is explained in the Methods section Model Interpretation.
  
  Additionally, the SHAP scores across the seven repetitions are now also provided in Appendix Table 6 for verification.
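
  To make the link between Shapley values and feature orderings concrete, here is a minimal sketch that computes exact Shapley values for a toy model by averaging each feature's marginal contribution over every ordering of the features. Absent features are set to a baseline value, which is one simple convention among several. The toy model, the baseline and the function names are hypothetical; the SHAP library relies on approximations rather than this exhaustive enumeration, which is only feasible for a few features.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline.

    Features not yet 'switched on' take their baseline value. The marginal
    contribution of a feature is the change in model output when it is
    switched on, averaged over all orderings of the features.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for j in order:
            current[j] = x[j]
            new_output = model(current)
            totals[j] += new_output - previous_output
            previous_output = new_output
    return [t / len(orderings) for t in totals]


def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * features[0] + features[1] * features[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
# The values sum to the difference between the prediction and the baseline
# prediction (the 'efficiency' property of Shapley values).
print(sum(phi), toy_model(x) - toy_model(baseline))
```

  In this example the interaction term is shared equally between the two features that produce it. Estimates based on a random subset of orderings vary from run to run, which is why repeating the estimation with different permutations is a sensible stability check.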
  
  scietyType:AuthorResponse
Visit annotations in context

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2022.01.31.22269833v1
www.biorxiv.org www.biorxiv.org

New submission 01/11/2022, 12:37:54

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This paper tests the hypothesis that 1/f exponent of LFP power spectrum reflects E-I balance in a rodent model and Parkinson's patients. The authors suggest that their findings fit with this hypothesis, but there are concerns about confirmation bias (elaborated on below) and potential methodological issues, despite the strength of incorporating data from both animal model and neurological patients.
  
  First, the frequency band used to fit the 1/f exponent varies between experiments and analyses, inviting concerns about potentially cherry-picking the data to fit with the prior hypothesis. The frequency band used for fitting the exponent was 30-100 Hz in Experiment 1 (rodent model), 40-90 Hz in Experiment 2 (PD, levodopa), and 10-50 Hz in Experiment 3 (PD, DBS). Ad-hoc reasons were given to justify these choices, such as "to avoid a spectral plateau starting > 50 Hz" in Experiment 3. However, at least in Experiment 3 (Fig. 3), if the frequency range was shifted to 1-10 Hz, the authors would have uncovered the opposite effect, where the exponent is smaller for DBS-on condition.
  
  We agree that parameter choice is crucial, in particular, choice of the fitting range. In addition to the 40-90 Hz range (Figure 2C), we have performed aperiodic fitting for five other frequency ranges to test to what extent the reported results are sensitive to the selected frequency range (Figure S2A). This analysis showed that the results are robust when a broad frequency range from 30 to 95 Hz was chosen, which is consistent with what has been suggested by Gao et al., 2017 to make inferences on the E/I ratio.
  
  Accordingly, we have now repeated the analyses for the animal data with the same fitting range used for the ON-OFF medication comparison in humans. Along with Figure S2A, where different frequency ranges were tested for data used in Figure 2, this shows that the results in Figures 1 and 2 hold up, with higher aperiodic exponents when STN spiking is low and vice versa. Therefore, a broad fitting range from 30 to 90 Hz (excluding harmonics of mains interference) generates consistent results for both human and animal data.
  
  We opted against a fitting range from 1-10 Hz because of two restraints highlighted in Gerster et al., 2022. First, a fitting range starting at 1 Hz could have a larger y-intercept due to the presence of low-frequency oscillations. This could lead to a larger aperiodic exponent and could be misinterpreted as stronger neural inhibition. Therefore, the lower fitting bound should be chosen to best avoid known oscillations in the delta/theta range (Gerster et al., 2022). Second, frequencies should be chosen to avoid oscillations crossing fitting range limits. In Figure 3A, oscillations in the theta/alpha band both ON and OFF stimulation would complicate parameterisation and would likely result in spurious fits.
  
  We also tested the effect of changing the peak threshold, peak width limits and the aperiodic fitting mode on FOOOF parameterisation. Increasing and decreasing the peak threshold from its default value (at 2 standard deviations) did not change results (Figure S2B). Similarly, adapting the peak width limits did not affect the exponent difference between medication states (Figure S2C). Finally, choosing the ‘knee’ mode instead of ‘fixed’ resulted in fundamentally different aperiodic fits that did not differ anymore with medication (Figure S2D). This is most likely a consequence of the near linear PSD in log-log space from 40 to 90 Hz (Figure 2B). If there is no bend in the PSD, the FOOOF algorithm will be forced to assign a ‘random’ knee and the aperiodic fit will then mostly reflect the slope of the spectrum above the knee point.
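
  The dependence of the aperiodic exponent on the fitting range and on the presence of a knee can be illustrated with a simplified sketch. In 'fixed' mode the aperiodic component is a straight line in log-log space, log10(P) = offset - exponent * log10(f); with a knee it takes the form log10(P) = offset - log10(knee + f^exponent). The code below fits the fixed form by ordinary least squares over a chosen frequency range. It is a stand-in under stated assumptions and not the FOOOF algorithm: there is no peak detection or removal, and the synthetic spectrum and its parameters are hypothetical.

```python
from math import log10


def fit_aperiodic_fixed(freqs, power, f_min, f_max):
    """Least-squares fit of log10(P) = offset - exponent * log10(f).

    Only frequencies with f_min <= f <= f_max are used.
    Returns (offset, exponent).
    """
    points = [(log10(f), log10(p))
              for f, p in zip(freqs, power) if f_min <= f <= f_max]
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    slope = (sum((px - mean_x) * (py - mean_y) for px, py in points)
             / sum((px - mean_x) ** 2 for px, _ in points))
    offset = mean_y - slope * mean_x
    return offset, -slope


def knee_spectrum(f, offset, knee, exponent):
    """Aperiodic power with a knee: P(f) = 10**offset / (knee + f**exponent)."""
    return 10 ** offset / (knee + f ** exponent)


# Hypothetical synthetic spectrum with a bend (no oscillatory peaks, no noise).
freqs = [float(f) for f in range(1, 101)]
power = [knee_spectrum(f, offset=2.0, knee=1000.0, exponent=2.0) for f in freqs]

_, exponent_low = fit_aperiodic_fixed(freqs, power, 1, 10)    # below the bend
_, exponent_high = fit_aperiodic_fixed(freqs, power, 40, 90)  # above the bend
print(f"1-10 Hz exponent:  {exponent_low:.2f}")
print(f"40-90 Hz exponent: {exponent_high:.2f}")
```

  On this synthetic spectrum the same straight-line model returns a near-zero exponent below the bend and a much steeper one above it, so the fitting range and the aperiodic mode need to be stated and justified together.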
  
  Second, there are important, fine-grained features in the spectra that are ignored in the analyses, which confounds the interpretation.
  
  One salient example of this is Fig. 2, where based on the plots in B, one would expect that the power of beta-band oscillations to be higher in the Med-On condition, as the oscillatory peaks rise higher above the 1/f floor and reach the same amplitude level as the Med-OFF condition (in other words, similar total power is subtracted by a smaller 1/f power in the Med-ON condition). But this impression is opposite to the model-fitting results in C, where beta power is lower in the Med-ON condition.
  
  We agree that PSDs over a broad frequency range (e.g. 5-90 Hz) typically do not have a single 1/f property. Instead, there can be multiple oscillatory peaks and ‘knees/bends’ in the aperiodic component. For these cases, fitting should be performed using the knee mode. To extract periodic beta power, we parameterise the PSD between 5 and 90 Hz and select the largest oscillatory component between 8 and 35 Hz (this range was extended to include the large oscillatory peaks in hemispheres 27 and 28 at ~ 10 Hz, see Figure R1). We now use the knee mode to model the aperiodic component between 5 and 90 Hz when periodic beta power is calculated (see our previous comments). Figure R1 provides an overview of all PSDs ON and OFF medication, the aperiodic fits (5-90 Hz (knee) and 40-90 Hz (fixed)) and the detected beta peaks. In spite of this modification in our pipeline, periodic beta power is still larger OFF medication (Figure 2C), in keeping with previous studies (Kim et al., 2022; Kühn et al., 2006; Neumann et al., 2017; Ray et al., 2008). We acknowledge the reviewer’s point that the average spectra in Figure 2B are misleading in that respect and, for clarity, provide here all 30 spectra in both conditions. Note that the calculation of aperiodic exponents between 40 and 90 Hz is not affected by this change in our pipeline. Figures 2B, D+E were revised accordingly.
  
  We have repeated the analysis of our animal data using the ‘knee mode’ with a fitting range from 30 to 100 Hz. However, using the knee mode did not improve the goodness of fit or fitting error and, in fact, made them slightly worse (Figure S5). Based on this, we think the fixed mode would provide a more holistic model for the PSDs used in this analysis. We have now added this comparison in Figure S5 to justify the choice of the fixed mode.
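
  The difference between the two aperiodic modes discussed above can be illustrated with a minimal sketch. It assumes the commonly used log-power forms (fixed: offset − exponent·log10 f; knee: offset − log10(knee + f^exponent)) and fits only the fixed mode, by ordinary least squares on a synthetic, noise-free spectrum. The function names, the synthetic values and the least-squares fit are illustrative assumptions, not the authors' pipeline or toolbox.

```python
import math

def aperiodic_fixed(freq, offset, exponent):
    """log10 power of a pure 1/f^exponent aperiodic component (no knee)."""
    return offset - exponent * math.log10(freq)

def aperiodic_knee(freq, offset, knee, exponent):
    """log10 power with a knee term: flatter below the bend, 1/f^exponent above."""
    return offset - math.log10(knee + freq ** exponent)

def fit_fixed(freqs, log_power):
    """Least-squares line through (log10 f, log10 P); returns (offset, exponent)."""
    xs = [math.log10(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(log_power) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, log_power))
    slope = sxy / sxx
    return mean_y - slope * mean_x, -slope

# Synthetic spectrum on a 40-90 Hz range (values are hypothetical).
freqs = list(range(40, 91))
synthetic = [aperiodic_fixed(f, 1.5, 2.0) for f in freqs]
fit_offset, fit_exponent = fit_fixed(freqs, synthetic)
```

  With the knee parameter set to zero the knee form reduces to the fixed form, which is one way to see why the extra parameter need not improve the fit over a narrow band without a visible bend.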
  
  Figure R1. PSDs from all 30 hemispheres ON and OFF medication. Aperiodic fits are shown between 5-90 Hz (knee mode), which was used to calculate the power of beta peaks, and between 40-90 Hz (fixed mode), which was used to estimate the aperiodic exponent of the spectrum.
  
  Another example is Fig. 1C, where the spectra for high and low STN spiking epochs are identical between 10 and 20 Hz, and the difference in higher frequency range could be well-explained by an overall increase of broadband gamma power (e.g. as observed in Manning et al., J Neurosci 2012, Ray & Maunsell PLoS Biol 2011). This increase of broadband gamma power is trivially expected, as broadband gamma power is tightly coupled with population spiking rate, which was used to define the two conditions.
  
  We agree with the reviewer that in Figure 1C, high and low STN spiking states could well be separated by average gamma power (Figure 1E), too. However, the difference of aperiodic exponents is more prominent between both conditions (Figure 1D+E, based on p-values). What is more, in human LFP data recorded from clinical macroelectrodes, medication states can be reasonably well distinguished using the aperiodic exponent between 40-90 Hz (Figure 2C), but average gamma power does not separate both states (Figure S3A). This suggests that the aperiodic exponent reflects more than just power differences in the high gamma regions. In addition, power changes do not inevitably change the aperiodic exponent and vice versa as elaborated in (Donoghue et al., 2020).
  
  Manning et al., 2009 show that the power spectrum is shifted to higher power values at all observed frequencies (2-150 Hz) as firing rates increase. As the reviewer points out, power spectra of our data are almost identical between 10-20 Hz (despite the marked spiking differences) and only drift apart from > 20 Hz (Figure 1C). This is a relevant difference between our study and Manning et al., 2009 and suggests that power differences in the gamma range are not solely explained by differences in spiking. This is confirmed when cortical activity at different spikes/sec is modelled (Miller et al., 2009). The entire spectrum is shifted to higher power values if spiking rates increase.
  
  Ray & Maunsell, 2011 reported low (30-80 Hz) and high (> 80 Hz) gamma activity in the macaque visual cortex, with a positive correlation between spiking activity and high gamma activity. However, activity in the low gamma range (30-80 Hz), which largely overlaps with the frequency range in our study, does not necessarily correlate with firing rates.
  
  In conclusion, the link between gamma power and spiking activity is not as strong as suggested. Even if a change in spiking activity can lead to changes in both gamma power and the aperiodic exponent, the aperiodic exponent would still constitute a measure to separate E/I levels and medication states.
  
  The above consideration also speaks to a major weakness of the general approach of considering the 1/f spectrum a monolithic spectrum that can be captured by a single exponent. As the authors' Fig. 1C shows, there are distinct frequency regions within the 1/f spectrum that have different slopes. Indeed, this tripartite shape of the 1/f spectrum, including a "knee" feature around 40-70 Hz which is well visible here, was described in multiple previous papers (Miller et al., PLoS Comput Biol 2009; He et al., Neuron 2010), and has been successfully modeled with a neural network model using biologically plausible mechanisms (Chaudhuri et al., Cereb Cortex, 2017). The neglect of these fine-grained features confounds the authors' model fitting, because an overall increase in the broadband gamma power - which can be explained straightforwardly by the change in population firing rates - can cause the exponent, fit over a larger spectral frequency region, to decrease. However, this is not due to the exponent actually changing, but to the overall increase of power in a specific sub-frequency-region of the broadband 1/f activity.
  
  We have now used the knee mode for aperiodic fits between 5 and 90 Hz when periodic beta power is calculated. We agree that this broad frequency range is unlikely to have a single 1/f component.
  
  We have also repeated the analysis of our animal data using the knee mode for aperiodic fits between 30 and 100 Hz (Figure S5). However, the goodness of fit barely changed. In fact, the R² and error became slightly worse. In addition, the knee parameter complicates interpretation of the aperiodic exponent and has to be considered along with the knee frequency. What is more, we do not see this bend around 40-70 Hz in all subjects. We show PSDs of representative LFP channels in Figure R2 and note that the knee around 40-70 Hz is not a robust finding in our data set. Therefore, we chose the fixed mode for parameterisation within this frequency band.
  
  Please see our answer to the previous comment regarding the link between broad gamma power and changes in population firing rates.
  
  Figure R2. PSDs of representative LFP channels for each animal (data used in Figure 1C). The knee around 40-70 Hz is not a robust finding in all PSDs.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.23.504923v1
www.biorxiv.org www.biorxiv.org

First-principles model of optimal translation factors stoichiometry

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  The manuscript by Lalanne and Li aims to provide an intuitive and quantitative understanding of the expression of translation factors (TFs) from first principles. The authors first find that the steady-state solutions for translation sub-processes are largely independent at optimality. With a coarse-grained model, the authors derive the optimal expression of translation factors for all important sub-processes. The authors show that intuitive scaling factors can explain the differential expression of translation factors.
  
  The results are impressive. However, as detailed in the major comments, the choice of some important parameters is not sufficiently justified in the current version. In particular, it is not clear to what extent parameter choice and rescaling was biased toward achieving a good agreement with the experimental data.
  
  Major comments:
  
  1) The work assumes that reaction times per TF are constant. That may be true at the highest growth rates, but it might not hold for conditions with lower growth rates. The data of Schmidt et al. (Nat. Biotechnol. 34, 104 (2016)) would allow the predictions to be compared with proteome partitioning in E. coli across growth rates. It is ok to restrict the present work to maximal growth rates, but then this caveat should be made explicit. This last point also concerns ignoring the offset in the bacterial growth laws, which is only permissible at fast growth; that also should be stated more prominently in the manuscript; see also the legend of Fig. 1, "Our framework of flux optimization under proteome allocation constraint addresses what ribosome and translation factor abundances maximize growth rate".
  
  We see two distinct but related points made by the reviewer, which we address in turn.
  
  First, we thank the reviewer for highlighting the important and interesting point of the growth rate dependence of expression in components of the translational machinery, which encouraged us to investigate this aspect further. Leveraging other existing ribosome profiling datasets (which provide better quantitation than mass spectrometry data, see response to minor point #6 below) across multiple growth conditions and species, we compared the predicted optimal translation factor abundance in these conditions (using the same formula for the optima). The new conditions and species now include E. coli at much slower growth rates, C. crescentus in two different media, and others. We found similar degrees of agreement between predicted and observed levels (shown in Figure 4-Figure supplement 1). One exception is aaRS in C. crescentus, and the discrepancy likely arises from a lack of quantification of tRNA abundance, which is a parameter we use to predict the optimal aaRS levels.
  
  These additional data also provided another way to examine the model predictions. Specifically, we assessed the predicted square-root scaling of translation factor abundance with growth rate. While the expression stoichiometry remains constant across growth rates (see response to minor point #6 below), the overall abundance decreases following our predicted scaling (Figure 4-Figure supplement 2B). We now describe these new analyses and results in the main text (p. 7, line 216):
  
  "Analysis of tlF expression across slower growth conditions supports the derived square root dependence (Figure 4-Figure supplement 2)."
  
  The second point made by the reviewer pertains to the “offset in bacterial growth law” that corresponds to inactive ribosomes, which make up a substantial fraction of ribosomes at very slow growth rates. We note that the derivation of the optimality condition, equation 5, does not rely on all ribosomes being active. What is necessary is that there is a direct proteomic trade-off between ribosomes and translation factors (see response to minor point 1 below). To rigorously place our work in the context of previous literature, we have replaced mention of ribosome with “active ribosome” (as well as in equation 1 and Figure 1), which we define as those functionally engaged in the translation cycle. We also formally include the proteome fraction of inactive ribosomes in equations 2 and 3 leading to the optimality condition.
  
  2) The diffusion-limited regime considers only the free and idle reactants. For some translation factors, the free state only accounts for a small fraction of their total concentration. In this case, the diffusion-limited regime only explains a small fraction of the TFs. For example, most of EF-Ts may not be in its free state: in simulations with in vitro kinetics, free EF-Ts accounts for 6%-48% of its total concentration (Supplementary Data 3 in [21]). Can the authors use in vitro parameters (or other ways) to provide a rough estimate of the fraction of free TFs? Including this might allow quantitative statements to be made about some of the deviations seen in Fig. 4, as most of the TFs are underestimated.
  
  We thank the reviewer for the suggestion that deviations between the diffusion-limited prediction and the observed abundance might be quantitatively explained by the finite catalytic activity of the respective factors. However, doing so requires accurate values of kcat, which are often not available. In the Supplement of the initial submission, we provided an example of the in vitro kcat being incompatible with the protein synthesis rates in vivo, which we have now moved to the main text (reproduced below).
  
  Another experimental approach that can feasibly be used to infer the bound fraction of translation factors in live cells is fluorescence microscopy of tagged proteins. Indeed, by quantifying the diffusive states of a tagged EF-Tu protein, Volkov et al (1) could estimate that <10% of EF-Tu was in its bound state, which is consistent with the agreement between our diffusion-limited prediction and observed abundance for that factor.
  
  We now discuss these possibilities and the facts about EF-Ts in a paragraph in the Discussion (p. 13, line 471):
  
  "Our optimization model can also be solved analytically in the non-diffusion-limited regime (Table 2), with the finite catalytic rate leading to an additional contribution of the form ∝ l 𝜆*/kcat. Recent detailed modeling of the EF-Ts cycle (Hu et al., 2020) estimated that a minor fraction (6 to 48%) of its abundance was in the free form in the cell, consistent with the large deviation we observe for this factor from our diffusion only prediction. However, the numerical values for these solutions are in general difficult to obtain because measurements of catalytic rates are sparse and often inconsistent with estimates of kinetics in live cells. As an example, the catalytic rates for aaRSs (Jeske et al., 2019) measured in vitro is ≈3 s-1 (median across different aaRSs), which is well below the minimal value of 15 s-1 required to sustain translation flux at the measured translation elongation rate (Appendix 5), suggesting substantial deviation between in vitro and in vivo kinetics. Although technically demanding, the fraction of free vs. bound factors can in principle be determined through live cell microscopy of tagged factors based on the partitioning the diffusive states of enzymes. Using that approach, (Volkov et al., 2018) estimated that EF-Tu was in its bound state <10% of the time (consistent with the agreement between our diffusion-limited prediction and the observed value for this factor)."
  
  3) "A factor-independent time τ_ind (e.g., peptidyl transfer), which does not come into play in our optimization framework, was added to account for additional steps making up the full elongation cycle." - what happened to this time? I couldn't find it anywhere else in the paper. What value was chosen, and by what rationale?
  
  We thank the reviewer for pointing out a lack of clarity in our presentation. The factor-independent time τ_ind in fact did not appear in our optimization procedure at all (by virtue of obeying dτ_ind/d𝜙_TFi = 0 by definition), and was only included for generality to account for steps such as peptidyl transfer (extremely fast (2)). In line with the parsimony of our model, and to avoid any confusion, we have now removed this factor from our model and description altogether.
  
  4) Fig. 4: The agreement is very impressive, especially given the simplifying assumptions. However, there are some questions relating the choice of parameters.
  
  a) Were any parameters fitted? Which, how? What about τ_ind, for example (see above)?
  
  Our approach does not include any fitted parameter. We instead rely on biophysically measured quantities such as diffusion constants, protein sizes, tRNA abundances, cell doubling times (growth rates), and in vivo kinetic estimates. (In line with Major Comment #3 above, we have removed τ_ind for clarity.) We now include all quantities needed to predict the optimal translation factor abundances (using the formulas listed in section “Summary of optimal solutions”, Table 2) in Appendix 5-Tables 1-3, including new Appendix 5-Tables 2-3, reproduced below.
  
  b) The "predicted" value for ribosomes is calculated from observed data (in a way described on p. S34 that I found incomprehensible, and would likely look very similar regardless of the predicted values for the TFs). According to the section "Equipartition between TF and corresponding ribosomes", the corresponding ribosomes can be quantified in the authors' scheme, too, by the method used for deriving optimal TF concentrations in equation 5. Why didn't the authors directly use the sum of these estimations as the optimal ribosome concentration in Fig. 4? In the current state, it does not seem fair to include the ribosome with the other predictions.
  
  We agree that the nature of the prediction for ribosomes was different than for other translation factors in our original manuscript in a way that might have lacked clarity. We now exclude ribosomes from Fig. 4 to avoid any possible confusion.
  
  It is interesting to directly estimate ribosome abundance using the equipartition principle. This estimation is however limited by the fact that the equipartition principle only accounts for ribosomes that are waiting for factor-dependent binding steps. Substantial fractions of ribosomes may be engaged at factor-free steps (e.g., peptidyl transfer catalyzed by the ribosome itself) and factor-dependent catalytic steps after binding. Although the latter could be estimated using the observed tlF concentrations (by considering that the tlF in excess of the binding-limited predictions is sequestered in catalytic steps), the former is not estimated in our model. Furthermore, some other ribosomes may not be fully assembled yet or are inactive (3). Indeed, the predicted factor-dependent ribosome abundance using the equipartition principle with observed tlF abundances constitutes a fraction (40%) of the measured total ribosome abundance.
  
  c) Predictions are for a specific growth rate (doubling time 21min). Was this growth rate also averaged over the three organisms? What were the individual values? These points would need to be discussed in the main text.
  
  The reviewer is correct. In the initial submission, we used the average growth rate of E. coli (doubling time 21.5±0.4 min), B. subtilis (doubling time 21±1 min), and V. natriegens (doubling time 19±1 min). A note has been added in the main text (p. 11, line 448):
  
  "We take the growth rate 𝜆* to be the average of the fast-growing species considered, corresponding to a doubling time of 21±1 min (E. coli: 21.5±1 min, B. subtilis: 21±1 min, V. natriegens: 19±1 min)."
  
  In addition, we now include predictions for different growth rates and compared them with several bacterial species grown in a wide range of conditions (Figure 4-Figure supplement 1) (see response to Major Comment #1 and to reviewer 2’s third request). These predictions and data are now included in Supplementary Files 1-4.
  
  5) In the same vein, in a footnote (!) to Table S4: "#For the ternary complex, the total mass of tRNA+EF-Tu was converted to an equivalent amino acid length." - I can see that this is important to get reasonable results, but it constitutes a major deviation from the strategy proclaimed throughout the main text: that the predicted effects result from a competition for fractions of the limited proteome. That rationale has to be changed (and explained in the main text), or the predictions in Fig. 4 should be based on calculations using only the protein part of TCs (i.e., EF-Tu).
  
  We are sorry for the confusion. The procedure of converting tRNA size to protein size was only used to estimate diffusion coefficients for the ternary complex (described in Appendix 5 Table 2), and not for the competition within the proteome. For factors for which no direct experimental estimates exist for the in vivo diffusion coefficient, we used the relationship $D_A = (l_{TC}/l_A)^{1/3}\, D_{TC}$. The resulting estimated diffusion coefficients were then used to rescale the association rate inferred from in vivo measurements for the ternary complex (see response to point 6 below as well) to obtain association rates for other factors.
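
  As a minimal sketch of the cube-root size scaling described in this response, the helper below computes a factor's diffusion coefficient from that of the ternary complex. The function name and the numerical inputs are hypothetical and chosen only for illustration; the lengths are understood as equivalent amino acid lengths.

```python
def scaled_diffusion(d_tc, length_tc, length_a):
    """Diffusion coefficient of factor A from the ternary-complex value.

    Implements D_A = (l_TC / l_A)**(1/3) * D_TC, i.e. diffusion scaling
    inversely with the cube root of the (equivalent) chain length.
    """
    return (length_tc / length_a) ** (1.0 / 3.0) * d_tc

# Hypothetical inputs: a factor 8x shorter than the ternary complex
# diffuses twice as fast under this scaling.
d_small = scaled_diffusion(2.0, 1000.0, 125.0)
d_same = scaled_diffusion(2.0, 1000.0, 1000.0)
```

  The weak cube-root dependence means that even large size differences change the estimated diffusion coefficient only modestly.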
  
  6) S9: "we anchored our association rates to the estimated in vivo association rate for the ternary complex, 𝑘^𝑇𝐶 = 6.4 μM−1s−1 [13], and rescale the association rate by diffusion of related components" - in comparison, the diffusion limited k^TC is >100. If I understand this correctly, you simply rescale ALL on-rates by 100/6.4 = 15.6. If that is (qualitatively) correct, you would need to discuss this point (and the derivation of the scaling factor) explicitly in the main text.
  
  The reviewer is correct in his interpretation of our approach, and we are grateful for his remark as this led us to spot a mistake in our choice of parameter (capture radius R). Indeed, while the ternary complex has a largest physical dimension of about 10 nm (from structural data (4)), the appropriate capture radius is closer to 2 nm (size of the portion binding to the ribosome) (5). Correcting for the appropriate capture radius alone brings the estimate to 45 μM⁻¹s⁻¹, which is however still several-fold higher than the measured value of 6.4 μM⁻¹s⁻¹. Whereas a part of this could be due to systematic overestimation of the diffusion coefficient, a large portion of the discrepancy is assuredly due to the many simplifying assumptions underlying the Smoluchowski estimate, which serve to place an absolute upper bound on the reaction rate (perfectly/instantaneously absorbing spheres, and hence no notion of specific reaction position or molecular orientation).
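
  The dependence of the Smoluchowski upper bound on the capture radius can be sketched as follows, using the textbook form k = 4π D R N_A for perfectly absorbing spheres. The relative diffusion coefficient of 3 μm²/s is an illustrative assumption, not a value taken from the response; the function name is likewise hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole

def smoluchowski_rate_per_uM_s(diffusion_um2_s, capture_radius_nm):
    """Diffusion-limited association rate k = 4*pi*D*R*N_A in 1/(uM*s).

    D is the relative diffusion coefficient of the two reactants and R the
    capture radius; both are treated as perfectly absorbing spheres.
    """
    d_m2_s = diffusion_um2_s * 1e-12          # um^2/s -> m^2/s
    r_m = capture_radius_nm * 1e-9            # nm -> m
    k_m3_per_mol_s = 4.0 * math.pi * d_m2_s * r_m * AVOGADRO
    k_per_molar_s = k_m3_per_mol_s * 1e3      # 1 m^3 = 1000 L
    return k_per_molar_s * 1e-6               # 1/(M*s) -> 1/(uM*s)

# Assumed D = 3 um^2/s (illustrative); compare the two capture radii.
k_large = smoluchowski_rate_per_uM_s(3.0, 10.0)
k_small = smoluchowski_rate_per_uM_s(3.0, 2.0)
```

  Because the bound is linear in R, moving from a 10 nm to a 2 nm capture radius lowers the estimate five-fold, independent of the diffusion coefficient assumed.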
  
  The estimate for capture radius R has been corrected (p. 47, line 1605) and a new sentence has now been included in the main text (p. 11, line 441):
  
  "Importantly, the absolute values of the optimal concentrations can be anchored by the association rate constant between TC and the ribosome obtained from translation elongation kinetic measurements in vivo (Dai et al., 2016). The latter was found to be several-fold smaller than the simplest and absolute upper bound of a Smoluchowski estimate of perfectly absorbing spheres (section Estimation of optimal abundances), and we assume that the rescaling factor is the same for all reactions."
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.04.02.438287v1
www.biorxiv.org www.biorxiv.org

New submission 01/11/2022, 12:00:39

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Iyer et al. address the problem of how cells exposed to a graded but noisy morphogen concentration are able to infer their position reliably, in other words how the positional information of a realistic morphogen gradient is decoded through cell-autonomous ligand processing. The authors introduce a model of a ligand processing network involving multiple “branches” (receptor types) and “tiers” (compartments where ligand-bound receptors can be located). Receptor levels are allowed to vary with distance from the source independently of the morphogen concentration. All rates, except for the ligand binding and unbinding rates, are potentially under feedback control. The authors assume that the cells can infer their position from the output of the signalling network in an optimal way. The resulting parameter space is then explored to identify optimal “network architectures” and parameters, i.e. those that maximise the fidelity of the positional inference. The analysis shows how the presence of both specific and non-specific receptors, graded receptor expression and feedback loops can contribute to improving positional inference. These results are compared with known features of the Wnt signalling system in the Drosophila wing imaginal disc.
  
  The authors present an interesting study of how feedback control of the signalling network reading a morphogen gradient can influence the precision of the read-out. The main strength of this work is the attention to the development of the mathematical framework. While the family of network architectures introduced here is not completely generic, there is enough flexibility to explore various features of realistic signalling systems. It is exciting to find that some network topologies are particularly efficient at reducing the noise in the morphogen gradient. The comparison with the Wnt system in Drosophila is also promising.
  
  Major comments:
  
  1) The authors assume that the cell estimates its position through the maximum a posteriori estimate, Eq.(5), which is a well-defined mathematical object; it seems to us however that whether the cell is actually capable of performing this measurement is uncertain (it is an optimal measurement in some sense, but there is no guarantee that the cell is optimal in that respect). Notably, this entails evaluating p(theta), which is a probability distribution over the entire tissue, so this estimate cannot be done with purely local measurements. Can the authors comment on this and how the conclusions would change if a different position measurement was performed?
  
  This is indeed an important question. Our viewpoint is to ask: if the cells were to use a maximum a posteriori (MAP) estimate (Eq. 5) to decode their positions, what features of the channel architecture would lead to small errors in positional inference? Whether the maximum a posteriori estimate is employed by the cell, or some other estimate, is an important but difficult question to address. Our choice has been motivated by how this estimate has allowed the precise determination of developmental fates in the context of gap gene expression in the Drosophila embryo [1, 2, 3]. We had earlier computed the inference error with a different estimate i.e.
  
  which computes the mean squared deviations of the inferred positions from the true position for each x, taking into account the entire distribution p(x∗|x). While the qualitative results are the same, the inference errors showed spurious jitters from outliers in sampling the noisy morphogen input distribution. This consistency might suggest that our qualitative results are insensitive to the choice of the estimate.
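
  The two quantities contrasted in this response can be sketched numerically. The example below assumes an exponential mean readout with additive Gaussian noise of fixed width and a uniform prior on a position grid; these choices, all function names and all parameter values are illustrative assumptions and much simpler than the channel model of the manuscript. The mean squared deviation is estimated by sampling, presumably corresponding to an expression of the form ∫ dx∗ (x∗ − x)² p(x∗|x), which is not shown in this copy.

```python
import math
import random

def mean_readout(x, amplitude=1.0, decay_length=0.2):
    """Assumed mean channel output at position x (exponential profile)."""
    return amplitude * math.exp(-x / decay_length)

def map_estimate(theta, positions, sigma=0.02, log_prior=None):
    """Grid search for argmax_x [log p(theta|x) + log p(x)].

    The likelihood is Gaussian with fixed width sigma; p(theta) is never
    needed because it does not depend on x. A missing prior means uniform.
    """
    best_x, best_score = None, -math.inf
    for x in positions:
        score = -0.5 * ((theta - mean_readout(x)) / sigma) ** 2
        if log_prior is not None:
            score += log_prior(x)
        if score > best_score:
            best_x, best_score = x, score
    return best_x

def inference_error(x_true, positions, sigma=0.02, n_samples=200, seed=0):
    """Monte Carlo mean squared deviation of decoded from true position."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        theta = mean_readout(x_true) + rng.gauss(0.0, sigma)
        total += (map_estimate(theta, positions, sigma) - x_true) ** 2
    return total / n_samples

positions = [i / 100 for i in range(101)]
decoded_noise_free = map_estimate(mean_readout(0.3), positions)
mse_at_03 = inference_error(0.3, positions)
```

  Because the sampled error averages over the whole distribution of decoded positions, rare outlying readouts contribute quadratically, which is one way such an estimate can pick up the kind of jitter mentioned above.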
  
  Further, when evaluating the MAP estimate, the term p(θ) in the denominator serves as a normalisation factor to ensure p(x|θ) is a probability density. This is not strictly necessary for MAP estimation. Since p(θ) does not depend on x, the MAP estimate can be written as follows
  
  without the need for evaluating p(θ). In the case of a uniform prior, it would be equivalent to maximum likelihood estimate (MLE) i.e.
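
  The two displayed expressions referred to in this response are missing from this copy. As a sketch under the standard definitions, writing x for position and θ for the channel output as in the response, and using the labels MAP and MLE on x∗ purely for illustration, they would take the form:

```latex
p(x \mid \theta) = \frac{p(\theta \mid x)\, p(x)}{p(\theta)}
\quad\Longrightarrow\quad
x^{*}_{\mathrm{MAP}}(\theta) = \arg\max_{x}\; p(\theta \mid x)\, p(x),
\qquad
x^{*}_{\mathrm{MLE}}(\theta) = \arg\max_{x}\; p(\theta \mid x)
\quad \text{(uniform } p(x)\text{)}.
```

  The denominator p(θ) drops out of the maximisation because it is the same for every candidate position x.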
  
  2) One of the features of the signalling networks studied in the manuscript is the ability of the system to form a complex (termed a conjugated state, Q) made of two ligands L, one receptor and one non-signalling receptor. While there are clear examples of a single ligand binding to two signalling receptors (e.g. Bmps), are there also known situations where such a complex with two ligands, one receptor, and one non-signalling receptor can form? In the Wnt example (Fig. 10a), it is not clear what this complex would be. In general, it would be great to have a more extended discussion of how the model hypothesis for the signalling networks could relate to real systems.
  
  This is a good suggestion. We have now added a discussion on the various possible realisations of the “conjugate state” Q in Section 3.6. We have also explored the various states in the context of different signalling contexts such as Dpp, Hh, Fgf in the Discussion section.
  
  The conjugated state ‘Q’ represents a combination of the readings from the two branches i.e. receptor types. This could be realised through processes like ligand exchange or complex formation, both in a shared spatial location such as a compartment. As discussed in the original manuscript (Section 3.6 of the revised manuscript), the ligand Wg in the Wg signalling pathway is internalised through two separate endocytic pathways associated with the receptor types - signalling receptor Frizzled (via Clathrin-mediated endocytosis (CME)) and non-signalling receptor HSPGs (via the CLIC/GEEC pathway; CLIC: clathrin-independent carriers, GEEC: GPI-anchored protein-enriched early endosomal compartments). Both pathways meet in a common early endosomal compartment where the ligands may be exchanged between the two receptors [4]. In a previous work by Hemalatha et al [4], we had shown that there are more Wg-DFz2 interactions in the endosomal compartment (measured through FRET) than on the cell surface. Therefore, the non-signalling receptors directing Wg through the CLIC/GEEC pathway titrate the amount of Wg interaction with the signalling receptor, DFz2.
  
  As mentioned in the original manuscript (Section 3.3 and subsection 4.2 of the Discussion in the revised manuscript), apart from Wg signalling, non-signalling receptors such as the HSPGs have also been proposed to act as co-receptors for Dpp, Hh, FGF (reviewed in [5, 6]). Although some ligands bind to the core protein of HSPG, the majority of the ligands bind to the negatively charged HS chains [7, 8]. Here, the co-receptor HSPGs aid in capturing diffusible ligands and presenting them to signalling receptors (either on the cell surface or within endosomes).
  
  3) The authors consider feedback on reaction rates - it would seem natural to also consider feedback on the total number of receptors; notably, since there are known examples of receptors transcriptionally down-regulated by their ligands (e.g. Dpp/Tkv)? Also it is not clear in insets such as in Fig. 7b, if the concentration plotted corresponds to the concentration of receptors bound to ligands?
  
  As mentioned in the original manuscript (Section 2.2 of the revised manuscript), we have indeed considered control on reaction rates and receptors, although the control on the latter is done with the constraint of receptor profiles being monotonic. Further, while the control on reaction rates is considered via feedbacks explicitly, the control on receptors is done via an approach akin to the open-loop control used in control theory. In reality, cellular control on receptors will involve transcriptional up- or down-regulation of receptors and thus warrant a feedback control approach. However, the timescales involved in such a control are different from the binding-unbinding and signalling timescales.
  
  Therefore, in the current work, we take the morphogen profile to be given i.e. independent of receptor concentrations, and we ask for the receptor concentrations that would help reduce the inference errors.
  
  Our predictions of increasing signalling receptor and decreasing non-signalling receptors in a two-branch channel architecture are consistent with the known transcriptional up-regulation of Dally/Dlp and down-regulation of Fz by Wg signalling [9].
  
  In a future work, we will extend the control on receptors to include feedbacks explicitly. Furthermore, the explicit feedback control on receptors may need to be considered concomitantly with the effect of receptors on morphogen dynamics (i.e. morphogen sculpting by receptors) along with the possibility of spatial correlations in receptor concentrations through neighbouring cell-cell interactions.
  
  As mentioned in the original manuscript (Section 2.2 of the revised manuscript), the variables ψ and φ stand for the total (bound + unbound) surface receptor concentrations of the signalling and the non-signalling receptors respectively. Therefore, the insets showing receptor profiles such as in Fig. 6b, 7b, and Appendix H Fig.8b,e correspond to the total surface receptor concentrations.
  
  4) The authors are clear about the fact that they consider the morphogen gradient to be fixed independently of the reaction network; however, that seems like a very strong assumption; in the Dpp morphogen gradient for instance over expression of the Tkv receptor leads to gradient shortening. Can the authors comment on this?
  
  This point is related to the earlier question 3. As discussed in the Discussion of the original manuscript (subsection 4.3 of the revised manuscript), we focus on finding the optimal receptor concentration profiles and reaction networks that enable precision and robustness in positional information from a given noisy morphogen profile. The framework and the optimisation scheme within it will prescribe different receptor profiles and reaction networks for different monotonically behaving, noisy morphogen profiles. It is possible that cells may achieve the optimal receptor concentrations via feedback control on production of the receptors.
  
  Broadly, morphogen dynamics depends on cell surface receptors, which could participate in both the inference and the sculpting of the morphogen profile, and factors independent of them such as extracellular degradation, transport and production, etc. In our present work, we have taken the receptors involved in sculpting and inference as being independent.
  
  In a more general case, feedback control on receptors will change the receptor concentrations as well as the morphogen profile. We are currently working on realising such a feedback control on receptors within the same broader information theoretic framework proposed in the current work.
  
  5) Fig. 10f is showing an exciting result on the change in endocytic gradient CV in the WT and in DN mutant of Garz. Can the authors check that the Wg morphogen gradient is not changing in these two conditions? And can they also show the original gradient, and not only its CV?
  
  The reviewer raises a legitimate concern – could the observed changes in CV upon perturbation of endocytic machinery be attributed to a systematic change in the mean levels of the endocytosed Wg alone? In the original manuscript (Appendix O Fig.17b,c of the revised manuscript), we show the normalised profiles of endocytic Wg in control and myr-Garz-DN cases. Here, in Fig.1 below, we show a comparison between the mean Wg concentrations (measured as fluorescence intensity) in control wing discs and discs wherein CLIC/GEEC endocytic pathway is removed using UAS-myr-Garz-DN. For clarity, we show the discs with largest and smallest fluorescence intensities from the control and myr-Garz-DN discs. It is hard to conclude that the mean concentrations are significantly different in the two cases.
  
  Reviewer #2 (Public Review):
  
  The work of Iyer et al. uses a computational approach to investigate how cells using multiple tiers of processing and multiple parallel receptor types allow more accurate reading of position from a noisy signal. Authors find that combining signaling and non-signaling types of receptors together with additional feedback increases the accuracy of positional readout against extrinsic noise that is conveyed in the morphogen signal. Further, extending the number of layers of signal processing counteracts the intrinsic stochasticity of the signal reading and processing steps. The mathematical formulation of the model is general but comprehensive in the way it handles the difference between branches and tiers for the processing of channels with feedbacks. The results of the model are presented from simple one-branch and one-tier architecture to two-branch and two-tier architecture with feedbacks. Interestingly authors find that adding more tiers results in only very small improvements in the accuracy of positional readout. The model is tested against a perturbation experiment that impairs one of the signaling branches in the Drosophila wing disc, but the comparison is only qualitative as further experiment-oriented work is planned in a separate paper.
  
  Strengths
  
  There is a clear statement of objectives, model, and how the model is evaluated. In particular, the objective is to find what number of receptor types and which concentrations, for a given number of tiers and feedback types, result in the most accurate positional readout. The employed optimization procedure is capable of finding signalling architectures that result in one cell diameter positional precision for most of the tissue, with 3-4 cells at the tissue end that is most distant from the morphogen source. This demonstrates that employing additional complexity in signal processing results in a very accurate positional readout, which is comparable with estimates of positional precision obtained in other developmental systems (Petkova et al., Cell 2019, Zagorski et al., Science 2017).
  
  The optimal signalling architectures indicate that both signalling (specific) and non-signalling (nonspecific) receptors affect the precision of positional readout, but the contributions of each type of these receptors are qualitatively different. Even slight perturbation of signalling receptors drives the system out of optimum, resulting in a decrease in positional precision. In contrast, the non-signalling receptors could accommodate much larger perturbations. This observation could provide a biophysical explanation for how cross-talk between different morphogen species could be realized in a way that positional precision is kept at the optimum when morphogen signaling undergoes extrinsic and intrinsic perturbations.
  
  Last, the model formulation allows perturbations of signalling and feedbacks to be addressed specifically, which could be explored to validate model predictions experimentally in the Drosophila wing disc, but also in other developmental tissues. The authors present a proof-of-concept by obtaining consistent results between the variation of output profiles in two-tier two-branch architectures with the non-signaling branch removed and the intensity profiles of Wg in wing discs where the CLIC/GEEC endocytic pathway was perturbed.
  
  Weaknesses
  
  The list of model parameters is long, including more than 20 entries for two-tier two-branch architectures. This is expected, as the aim of the model is to describe the sophisticated signalling architecture mimicking the biological system. However, this also makes it very challenging or impossible to provide guiding principles or understanding of the system behaviour for the complete space of signalling architectures that optimize positional readout. Although the employed optimization procedure finds solutions that exhibit very high positional accuracy, there is only a very limited notion of how these solutions depend on variation of different parameters. The authors do not address whether these solutions correspond to broad global optima in the space of all solutions or were instead fine-tuned by the optimization procedure and are quite rare.
  
  It is unclear how contributions from the intrinsic noise affect the system behaviour compared to contributions from extrinsic noise. In principle, the two-branch one-tier architecture results in an already very accurate positional readout across the tissue. The adding of another tier seems to provide only a very weak improvement over a one-tier solution. It is possible that contributions from intrinsic noise for the investigated signalling architectures are only mildly affecting the system compared with contributions from extrinsic noise. Hence, it is difficult to assess whether the claim of reducing intrinsic noise by adding another tier is supported by the presented data, as the contributions from intrinsic noise could overall very weakly affect the positional readout.
  
  The optimal responses of the channel to extrinsic and intrinsic noise are very distinct. As noted correctly by the reviewer, an additional tier provides only a marginal improvement in inference error due to extrinsic noise (compare Fig.7 and Fig.8 in the revised manuscript). However, as shown in Fig.9c of the revised manuscript (same as in the original manuscript), adding an extra tier provides a substantial improvement in inference errors due to intrinsic noise.
  
  References
  
  [1] Gasper Tkacik, Julien O Dubuis, Mariela D Petkova, and Thomas Gregor. Positional information, positional error, and readout precision in morphogenesis: a mathematical framework. Genetics, 199:39–59, 2015.
  
  [2] Mariela D Petkova, Gasper Tkacik, William Bialek, Eric F Wieschaus, and Thomas Gregor. Optimal decoding of cellular identities in a genetic network. Cell, 176:844–855, 2019.
  
  [3] Julien O Dubuis, Gašper Tkačik, Eric F Wieschaus, Thomas Gregor, and William Bialek. Positional information, in bits. Proceedings of the National Academy of Sciences, 110:16301–16308, 2013.
  
  [4] Anupama Hemalatha, Chaitra Prabhakara, and Satyajit Mayor. Endocytosis of wingless via a dynamin-independent pathway is necessary for signaling in drosophila wing discs. Proceedings of the National Academy of Sciences, 113:E6993–E7002, 2016.
  
  [5] Xinhua Lin. Functions of heparan sulfate proteoglycans in cell signaling during development. Development, 131:6009–6021, 2004.
  
  [6] Stephane Sarrazin, William C Lamanna, and Jeffrey D Esko. Heparan sulfate proteoglycans. Cold Spring Harbor perspectives in biology, 3(7):a004952, 2011.
  
  [7] Catherine A Kirkpatrick, Sarah M Knox, William D Staatz, Bethany Fox, Daniel M Lercher, and Scott B Selleck. The function of a drosophila glypican does not depend entirely on heparan sulfate modification. Developmental biology, 300(2):570–582, 2006.
  
  [8] Mariana I Capurro, Ping Xu, Wen Shi, Fuchuan Li, Angela Jia, and Jorge Filmus. Glypican-3 inhibits hedgehog signaling during development by competing with patched for hedgehog binding. Developmental cell, 14(5):700–711, 2008.
  
  [9] Kenneth M Cadigan, Matthew P Fish, Eric J Rulifson, and Roel Nusse. Wingless repression of drosophila frizzled 2 expression shapes the wingless morphogen gradient in the wing. Cell, 93(5):767–777, 1998.
  

URL

biorxiv.org/content/10.1101/2022.03.30.486187v2
www.medrxiv.org

New submission 12/01/2023, 16:23:18

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Strength: The study is summarizing a large cohort of human samples of blood, nasal swabs and nasopharyngeal aspirates. This is very uncommon as most of the time studies focus on the blood and serum of patients. Within the study, 3 monocyte and 3 DC subsets have been followed in healthy and Influenza A virus-infected persons. The study also includes functional data on the responsiveness of Influenza A virus-infected DC and monocyte populations. The authors achieved their aims in that they were able to show that the tissue microenvironment is important to understand subset specific migration and activation behavior in Influenza A virus infection and in addition that it matters with which kind of agent a person is infected. Thus, this study also impacts a better understanding of vaccine design for respiratory viruses.
  
  We thank Reviewer 1 for highlighting what we believe to be the greatest strengths of our study. The key feature of this study was to generate a comprehensive description of monocytes and dendritic cells (DC) in the human nasopharynx during influenza A virus infection, and to provide a comparison with healthy and convalescent individuals. Further, we wished to emphasize the value of studying the nasopharynx during respiratory viral infections, particularly in light of the ongoing COVID-19 pandemic. We describe a non-invasive method to (longitudinally) sample this anatomical compartment that allows retrieval of intact immune cells as well as mucosal fluid for soluble marker analysis. We also believe that the addition of proteomic profiles in the different compartments (new Figure 7) further highlights the importance of the tissue microenvironment.
  
  Weakness: In the described study, the authors used a different nomenclature to introduce the DC subsets. This is confusing and the authors should stick to the nomenclature introduced by Guilliams et al., 2014 (doi.org/10.1038/nri3712) and commented in Ginhoux et al., 2022 (DOI: 10.1038/s41577-022-00675-7 ) or at least should introduce the alternative names (cDC1, cDC2, expression markers XCR1, CD172a/Sirpa). Further, Segura et al., 2013 (doi: 10.1084/jem.20121103) showed that all three DC subpopulations were able to perform cross-presentation when directly isolated. Overall, a more up-to-date introduction would be useful.
  
  Reviewer 1 commented on the DC nomenclature used in the manuscript. We agree that our manuscript would benefit from appropriately updating the DC nomenclature. We therefore revised the text, and now we refer to the subsets previously described as CD1c+ and CD141+ myeloid DCs (MDC) as cDC2 and cDC1 subsets, respectively. We have also modified the text in the Introduction of the revised manuscript to reflect the same and give a more up-to-date introduction of DC subsets (marked-up version lines 75-81).
  
  As the data of this was already obtained in 2016-2018 it is clear that the FACS panel was not developed to study DC3. If possible, the authors might be able to speculate about the role of this subset in their data set. Moreover, there were other studies on SARS-CoV-2 infection and DC subset analyses in blood (line 87, and line 489) e.g. Winheim et al., (DOI: 10.1371/journal.ppat.1009742 ), which the authors should introduce and discuss in regard to their own data.
  
  As reviewer 1 accurately pointed out, the flow cytometry panel used in this study was indeed not developed to study the DC3 subset. The data were obtained in 2016-2018, and lack the typical markers used to identify the DC3 subset, such as CD163, BTLA and CD5 (Cytlak et al, https://doi.org/10.1016/j.immuni.2020.07.003, Villani et al, https://doi.org/10.1126/science.aah4573). Due to the constraints of the panel, we would not be able to accurately identify DC3s. However, in an attempt to dig deeper into the data that are available, we re-analyzed the data to identify CD14+CD1c+ cells among the lineage–HLADR+CD16–CD14+ cells, here collectively called “mo-DC”. This population is likely a combination of monocytes upregulating CD1c and bona fide DC3 expressing CD14. Accordingly, the gating strategy was updated in Supplementary figure 1 (marked-up version lines 192-194), and a new data plot in Figure 2H (marked-up version lines 208-220) summarizes the changes observed in mo-DC numbers in IAV patients between blood and the nasopharynx. Parallel to the pattern seen in other DC subsets, mo-DC frequencies are reduced in blood, and we observed an increase (not significant) in the nasopharynx.
  
  As CD88 was not included in the original panel, it was not possible to discriminate between bona fide monocytes and DC3s. We performed a staining of PBMCs (buffy coat) with CD88 (FITC) added to the original flow panel used in the study, to assess if CD88 can be helpful for future studies (Reviewer figure 1). The staining showed that some cells in the mo-DC population are CD88 positive, indicating a bona fide monocyte origin, whereas some are negative, indicating that they are bona fide DC3 expressing CD14 (Bourdely et al, https://doi.org/10.1016/j.immuni.2020.06.002).
  
  Reviewer figure 1. Expression of CD88 in the “mo-DC” population. Cells from a buffy coat were stained with the flow cytometry panel used in the manuscript, with the addition of CD88 (FITC). Within the CD14+CD1c+ population, the “mo-DC” population, we identified both CD88+ and CD88- cells.
  
  Reviewer 1 also suggested citing Winheim et al (https://doi.org/10.1371/journal.ppat.1009742), and we thank them for their suggestion. We have now cited Winheim et al, and two additional reports (Kvedaraite et al, https://doi.org/10.1073/pnas.2018587118 and Affandi et al, https://doi.org/10.3389/fimmu.2021.697840) describing a depletion of DC3s (and other DC subsets) from circulation, and functional impairment of DCs following SARS-CoV-2 infection. Further, Winheim et al observed an increased frequency of a CD163+CD14+ subpopulation within the DC3s, which correlated with systemic inflammatory responses in SARS-CoV-2 infection. We speculate that perhaps in IAV infection too, DC3s may follow the trend of other DC subsets and be found in increased numbers in the nasopharynx (marked-up version lines 75-81 and 543-552).
  
  Taken together, although the data are very important and very interesting, my overall impression of the manuscript is that in the era of RNA seq and scRNA seq analyses the study lacks a bit of comprehensiveness.
  
  The final comment from reviewer 1 is well taken, in that our study does not include RNA-seq analyses. Again, we ask Reviewer 1 to take into consideration the challenging material we worked with in our study, in combination with the COVID-19 pandemic, which subsequently prevented recruitment of new influenza patients to the study. The cell numbers and viability in the nasopharyngeal aspirates limit what experimental approaches can be done simultaneously, and flow cytometry seemed to be the best approach for the study. However, we agree that future studies, both our own and those of others in the field, will greatly benefit from single cell analysis of nasopharyngeal immune cells, and from generating transcriptomic or epigenetic profiles of these cells. Unfortunately, it is a limitation that we are currently unable to overcome within the scope of this revision. Despite this weakness, we agree with Reviewer 1 that the methods we developed and the data we generated are important and interesting.
  
  Moreover, we have added proteomics data from both NPA and plasma from influenza and COVID-19 patients, using the SomaScan platform (new Figure 7) (marked-up version lines 472-511, 738-755 and 768-792). We also included a supplementary table listing enriched pathway data from gProfiler. Briefly, our data showed sizeable changes within the blood and nasopharyngeal proteome during respiratory virus infection (IAV or SARS-CoV-2), as compared to healthy controls. Importantly, we found several differentially expressed proteins unique to the nasopharynx that were not seen in blood, and pathway analysis highlighted “host immune responses” and “innate immunity” pathways, containing TNF, IL-6, ISG15, IL-18R, CCL7, CXCL10 (IP-10), CXCL11, GZMB, SEMA4A, S100A8, S100A9. These findings are in line with our flow cytometry data, and support our hypothesis that the immunological response to viral infection in the upper airways differs from that in matching plasma samples. One of the main messages in this manuscript is the importance of looking at the site of infection, and not only at systemic immune responses, to better understand respiratory viral infections in humans. We believe that the addition of the proteomics data serves to further highlight this point.
  
  Reviewer #2 (Public Review):
  
  This study aims to describe the distribution and functional status of monocytes and dendritic cells in the blood and nasopharyngeal aspirate (NPA) after respiratory viral infection in more than 50 patients affected by influenza A, B, RSV and SARS-CoV2. The authors use flow cytometry to define HLA-DR+ lineage negative cells, and within this gate, classical, intermediate and non-classical monocytes and CD1c+, CD141+, and CD123+ dendritic cells (DC). They show a large increase in classical monocytes in NPA and an increase in intermediate monocytes in blood and NPA, with more subtle changes in non-classical monocytes. Changes in intermediate monocytes were age-dependent and resolution was seen with convalescence. While blood monocytes tended to increase in blood and NPA, DC frequency was reduced in blood but also increased in NPA. There were signs of maturation in monocytes and DC in NPA compared with blood as judged by expression of HLA-DR and CD86. Cytokine levels in NPA were increased in infection in association with enrichment of cytokine-producing cells. Various patterns were observed in different viral infections suggesting some specificity of pathogen response. The work did not fully document the diversity of human myeloid cells that have arisen from single-cell transcriptomics over the last 5 years, notably the classification of monocytes which shows only two distinct subsets (intermediate cannot be distinguished from classical), distinct populations of DC1, DC2 and DC3 (DC2 and 3 both having CD1c, but different levels of monocyte antigens), and the lack of distinction provided by CD123 which also includes a precursor population of AXL+SIGLEC6+ myeloid cells in addition to plasmacytoid DC. Furthermore, some greater precision of the gating could have been achieved for the subsets presented. 
Specifically, CD34+ cells were not excluded from the HLA-DR+ lineage- gate, and the threshold of CD11c may have excluded some DC1 owing to the low expression of this antigen. Overall, the work shows that interesting results can be obtained by comparing myeloid populations of blood and NPA during viral infection and that lineage, viral and age-specific patterns are observed. However, the mechanistic insights for host defense provided by these observations remain relatively modest.
  
  We thank Reviewer 2 for their assessment of our manuscript and for summarizing our key findings in their public review. As reviewer 2 noted, our study describes changes in frequencies of monocytes and DCs during acute IAV infection, in blood and in the nasopharynx. Additionally, we also demonstrate pathogen-specific changes in both compartments. Reviewer 2 also highlighted a drawback of our study: that the approach did not fully capture the breadth of monocyte and DC diversity as it currently stands. Despite this, the findings we presented here laid the groundwork for continued research and led to significant progress, including mechanistic insights (Falck-Jones et al, https://doi.org/10.1172/JCI144734 and Cagigi et al, https://doi.org/10.1172/jci.insight.151463, Havervall et al. https://doi.org/10.1056/nejmc2209651 and Marking et al. Lancet Infectious Diseases in press), in understanding the role of myeloid cells in the human airways during viral infections.
  

URL

medrxiv.org/content/10.1101/2022.01.18.22269508v1
www.biorxiv.org

New submission 13/06/2022, 10:55:30

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In the article "Whole transcriptome-sequencing and network analysis of CD1c+ human dendritic cells identifies cytokine-secreting subsets linked to type I IFN-negative autoimmunity to the eye," Hiddingh, Pandit, Verhagen, et al., analyze peripheral antigen presenting cells from patients with active uveitis and control patients, and find several differentially expressed transcription factors and surface markers. In addition, they find a subset of antigen presenting cells that is decreased in frequency in patients with uveitis that in previous publications was shown to be increased in the eye of patients with active uveitis. The greatest strength of this paper is the ability to obtain such a large number of samples from active uveitis patients that are not currently on systemic therapy. While the validation experiments have methodologic flaws that decrease their usefulness, this study will still serve as a valuable resource in generating hypotheses about the pathogenesis of uveitis that can be tested in future projects.
  
  We thank the reviewer for the constructive comments and effort to review our work in detail.
  
  Since all CD36+CX3CR1+ cells are CD14+ (Figure 4D), the authors should comment on how CX3CR1 ended up being differentially regulated in a similar way despite this population being excluded from the 2nd bulk RNA-seq data set.
  
  We agree with the reviewer that the CD14 surface expression in relation to the black-gene module and CD36+CX3CR1+ DC3s requires more detailed analysis. As described in the results section, genes in this module are linked to both CD1c+ DCs and inflammatory CD14+ monocytes, which we cannot distinguish by bulk RNA-seq analysis. Therefore, we aimed to demonstrate that the black module is a bona fide CD1c+ DC gene signature not dependent on CD14 surface expression: we showed that there was no difference in CD14+ cell fractions in the samples for RNA-seq between patient and control samples (see Fig. 1F). We have now investigated this further with additional data and experiments. We now show in Figure 2 Supplement 2A that CD14, as expected, does not correlate with the black module. To confirm this experimentally, we purified CD14+CD1c+ and CD14- CD1c+ DCs from 6 donors and subjected these to qPCR analysis to evaluate the expression of key genes from the black module (see revised Figure 2A). As illustrated in revised Figure 2 panel B, the expression levels of genes, including CD36 and CX3CR1, are not significantly altered between CD14+/- CD1c+ DCs, which supports that the identified gene module is also not dependent on CD14 surface expression by CD1c+ DCs. To assess whether the expression of the black module was also independent of CD14 in inflammatory disease, we used RNA-seq data from FACS-sorted CD14+CD1c+ DCs and CD14-CD1c+ DCs from patients with SLE and Scleroderma (GSE13731) and confirmed that the expression of the black module genes is independent of CD14 surface expression (see revised Figure 2 panel C). Finally, we removed CD14+ cells from the analysis in the 2nd bulk RNA-seq experiment to prove that the black module could indeed be perceived as being associated with uveitis independent of CD14+ expression, which allowed attributing the black module to CD1c+ DCs by bulk RNA-seq analyses. 
Also, more detailed analyses by flow cytometry (Revised Figure 4) and scRNA-seq (Figure 6) confirm these findings. For example, we show that the CD36+ CX3CR1+ DC3s are in fact a subset of CD14+ CD1c+ DCs (Figure 2 – Supplement 2) and we show that eye-infiltrating CD1c+ DCs that harbor the black module gene signature show increased CD36 and CX3CR1, but not CD14 (Figure 6C). We have addressed all these experiments and data in the result section on pages 12-13, 16 and 17, and in the discussion section on page 19. We hope the reviewer agrees that this has now been sufficiently addressed.
  
  Line 153: "...substantiates this gene set as a core transcriptional feature of human autoimmune uveitis." When only 137 of the 1236 DEGs from the first module are repeated in a validation data set, it would be difficult to argue that this is the core transcriptional set that defines the population in any uveitis. Further concerns include that the validation data set is not the same population, but rather a subset not containing CD14.
  
  We agree with the reviewer and have changed this in the result section to “substantiates this gene set as a robust and bona fide transcriptional feature of CD1c+ DCs in human non-infectious uveitis” at page 13. We agree that, as expected, the removal of CD14+ cells impacted the sensitivity of our analysis, but this strategy was required to attribute the black module to CD1c+ DCs. Our data support that the black module gene signature is not restricted to CD14+ CD1c+ DCs by demonstrating that its dysregulation in non-infectious uveitis can even be perceived in CD14- CD1c+ DCs. We now show that the replication of only a fraction of genes of the black module is a consequence of the sensitivity to detect differentially expressed genes (Figure 2 – Supplement 1C), most likely due to the lower cell number after sorting out CD14+ cells. We have outlined this in greater detail in the result section on page 13. We hope the reviewer agrees this has now been adequately described.
  
  Line 220: Notch-dll experiments: with the experiments presented it is not possible to say that the changes are due to maintenance of CD1c+ DCs without further experiments outlining what NOTCH2 signaling changes throughout time. Is the population fully developed in the first 7 days of culture prior to adding NOTCH2 or ADAM10 inhibitors? Is there more apoptosis in this pathway? Less proliferation? It would be more accurate to say that there are fewer cDC2s after 14 days of culture without speculating the cause. In this experiment it is unclear why the gate of CD141/CD1c was chosen, as this appears to be in the middle of the population. In normal PBMCs CD141+ DCs would be CD1c negative; therefore why exclude the CD141hiCD1c+ and CD141loCD1c+ populations?
  
  We agree with the reviewer that in the current state the additional Notch-DLL experiments are inconclusive. Based on the comments from this reviewer, we believe the most appropriate experiments would be to show changes in the surface protein expression of CD36, CX3CR1 and other key surface markers of the black module upon inhibition of NOTCH2 or ADAM10. To this end, we repeated the experiments with human CD34+ HPC-derived DCs to measure cell subsets by flow cytometry using the same panel we used for the PBMCs. However, we experienced substantial autofluorescence of human CD34-HPC derived cultures (expected for the complex heterogeneous cellularity of these cultures and as previously reported for CD34+ cells (Donnenberg et al., Methods 2015)) that introduced significant artifacts and interfered with optimal identification of CD1c+ DCs and their subsets (see example below). Unfortunately, we have so far been unable to control for this. Since we agree with the reviewer that in the current form the supplemental figure does not significantly contribute to the manuscript, we removed the supplemental figure entirely from the manuscript. We hope the reviewer agrees that we already provide several complementary lines of evidence that link NOTCH-RUNX3 signaling to the black module (Figure 3A-D), including RNA-seq data from NOTCH2-DLL experiments, and that the current data are sufficient to support the main conclusions of the manuscript. We hope the reviewer agrees with this proposal.
  
  Author response figure 1: Manual gating example of human CD34-HPC derived DCs shows substantial autofluorescence.
  
  Line 256: The hypothesis that the loss of CD36+CX3CR1+ cells was due to migration to the eye doesn't make sense based on volume and number of cells. 0.1% of all PBMC is ~1×10^7 cells, and distributed throughout the eye would give about 1.3×10^6 cells/mL of eye volume. This would make the eye turbid, which is not consistent with birdshot chorioretinopathy and would be rare in HLA-B27 anterior uveitis and intermediate uveitis.
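  The arithmetic behind this estimate can be made explicit with a minimal sketch. It assumes the two quoted figures are 1×10^7 cells and 1.3×10^6 cells/mL (the exponents appear to have been flattened in the source text). The eye volume and the total PBMC count are not stated by the reviewer; they are only back-calculated here from the quoted numbers, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the reviewer's estimate.
# Inputs are the reviewer's quoted figures, read as powers of ten (assumption).
migrating_cells = 1e7       # "0.1% of all PBMC is ~1x10^7 cells"
fraction_of_pbmc = 0.001    # 0.1%
density_per_ml = 1.3e6      # "about 1.3x10^6 cells/mL of eye volume"

# Quantities implied by those figures (not stated in the review itself).
implied_total_pbmc = migrating_cells / fraction_of_pbmc
implied_eye_volume_ml = migrating_cells / density_per_ml

print(f"implied total PBMC: {implied_total_pbmc:.1e} cells")
print(f"implied eye volume: {implied_eye_volume_ml:.1f} mL")
```

  Under this reading, the quoted density corresponds to an assumed eye volume of a little under 8 mL, which shows that the two numbers in the comment are mutually consistent.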
  
  We agree with the reviewer and have changed this in the manuscript section to “We speculated that the decrease in blood CD36+CX3CR1+ CD1c+ DCs was in part the result of migration of these cells to peripheral tissues (lymph nodes) and that these cells may also infiltrate the eye during active uveitis.” on page 17.
  
  Line 267: Would have liked to see the gating of CX3CR1/CD36 cells be more consistent (there are overlapping CX3CR1+ and CX3CR1- populations in 5A, but in Figure 4 quadrants were used to define the populations when evaluating the numbers in uveitis and healthy controls). The populations in Figure 5 are more separated by CD36.
  
  We agree with the reviewer and have added a more detailed example of the gating strategy used to sort CD36/CX3CR1 subsets in Figure 5 – Supplement 1 including the expression of CX3CR1 and CD36 in the sorted populations.
  
  Line 269, IN VITRO stimulation: The experimental paradigm is set up to find a difference between cells but does not test any biologically relevant scenario. By sorting on a surface marker, then stimulating with the ligand for that receptor, the result better proves that CD36 is important in TLR2 signaling than it gives any information on how these dendritic cells might behave in uveitis.
  
  We agree with the reviewer that the connection between the cytokine expression of the CD1c+ subsets and non-infectious uveitis may benefit from additional experimental data. To this end, we profiled available eye fluid biopsies and paired plasma by Olink proteomics to measure 92 immune mediators from patients and controls from this study (and several additional samples, including aqueous humor from non-inflammatory cataract controls – see revised Figure 5 panel D). This analysis shows that cytokines produced by CD36+CX3CR1+ DCs such as TNF-alpha and IL-6 are specifically increased in eye tissue of patients, but not in blood. We hope the reviewer agrees that we have provided additional experimental data that links the functional differences in DC subsets to cytokines implicated in the pathogenesis of non-infectious uveitis.
  
  Reviewer #3 (Public Review):
  
  First, a note on nomenclature. The authors use the term 'auto-immune' uveitis to encapsulate three different conditions -- HLA-B27 anterior uveitis, idiopathic intermediate uveitis, and birdshot choroidopathy. While I would agree with this terminology for the third set, there is substantial controversy as to whether HLA-B27 is truly autoimmune or autoinflammatory. Indeed, one major hypothesis is that this condition is driven by changes in gut microbiome. Intermediate uveitis is even more problematic; a substantial number of cases of this condition will turn out to be associated with demyelinating disease, which has recently been linked to Epstein Barr virus disease. To my knowledge in none of these diseases has a definitive autoantigen been identified nor passive transfer via transfusion shown; I would suggest the authors abandon this terminology and simply refer to the conditions as they are called.
  
  We would like to thank the reviewer for the constructive suggestions. We agree and have changed the term “autoimmune uveitis” to “non-infectious uveitis” throughout the manuscript.
  
  Further, it would have been very desirable to compare the DC transcriptome for the other class of uveitic disease -- infectious -- for acute retinal necrosis or similar. As well it would have been very useful to compare profiles to other, related immune-mediated diseases such as ankylosing spondylitis.
  
  We agree with the reviewer that comparison of DC transcriptomes is useful for interpretation of biological mechanisms involved. This is precisely the reason we use (in Figure 3) comparison of our DC transcriptomic data to well-controlled transgenic models and DC culture systems. This revealed NOTCH2-RUNX3 signaling driving the uveitis-associated CD1c+ DC signature. We have now included transcriptomic data from CD1c+ DC subsets of type I IFN diseases SLE and Systemic Sclerosis in Figure 2. Although we agree that comparison to infectious uveitis would be interesting, bulk RNA-seq data from CD1c+ DCs are – to the best of our knowledge – unfortunately not available.
  
  Finally, it must be noted that looking for systemic signals in dendritic gene expression may be a bit of a needle in the haystack approach. Presumably, the function of the dendritic cells in uveitis is largely centered on those cells in the eye. It would have been highly desirable to examine the expression profile of intraocular DCs in at least a subset of patients who may have come to surgery (for instance, steroid implantation or vitrectomy).
  
  We agree with the reviewer that analysis of blood requires enormous efforts and controls to dissect disease-relevant changes in gene profiles of cDC2 subsets. We therefore designed a strategy that focusses on replication of gene modules, uses independent cohorts, and applies complementary immunophenotyping technologies to detect key changes in specific subsets of CD1c+ DCs in uveitis patients. To further extend these analyses, we have now also detailed our analysis of intraocular DCs using single-cell RNA-seq of eye fluid biopsies (aqueous humor) of HLA-B27 anterior uveitis (identical to our “AU” group of patients). As shown in revised Figure 6, we detected eye-infiltrating CD1c+ DCs and were able to cluster cells positive for the uveitis-associated black module (revised Figure 6B). As expected, “black-module+” CD1c+ DCs show higher expression of CD36 and CX3CR1 and lower expression of RUNX3, but no difference in CD14 (revised Figure 6C), closely corroborating our blood CD1c+ DC analyses. These DC3s were also found at higher frequency in the eye of patients with AU (Figure 6D). We hope the reviewer agrees we have substantially improved the analysis of intraocular DCs and that this has now been sufficiently addressed.
  
  It is also problematic that no effort has been made to assess the severity of uveitis. Flares of disease can range from extremely mild to debilitating. Similarly, intermediate uveitis and BSCR can range greatly in severity. Without normalizing for disease severity it is difficult to fully understand the range of transcriptional changes between cases.
  
  In our view, a key limitation in determination of uveitis severity for molecular analysis is the fact that objective biomarkers that assess disease severity across uveitis entities are lacking. Currently, disease severity is dependent on an array of clinical features (i.e., SUN criteria) which cannot be applied consistently to anterior, intermediate and posterior uveitis. For example, the severity of anterior uveitis is in part assessed by grading of inflammation in the anterior chamber, while the anterior chamber is (typically) not involved in Birdshot Uveitis (BU in this study). However, to allow the study of patients with high disease activity, we exclusively used systemic treatment-free patients who all had active uveitis at sampling at our academic institute, making the results highly relevant for understanding the pathophysiology of non-infectious uveitis. For this reviewer’s convenience, we have conducted additional analysis that includes key clinical parameters (anterior chamber cells, vitreous cells, and macular thickness for patients from cohort I). These data showed no clear clustering of patients based on any of the clinical parameters (revised Figure 1 -Supplement 2). We hope the reviewer agrees this has been addressed in sufficient detail.
  
  The use of principal component analysis for clustering may be underpowered; I would suggest the authors apply UMAP to determine if higher dimensional component analyses correlate with disease type.
  
  Upon request of the reviewer, we have conducted UMAP (with different tuning of hyperparameters) on the DEGs (cohort I, see image below). We believe that UMAP analysis did not provide additional insights or correlate with disease type. We hope the reviewer agrees.
  
  The false-discovery rate in large transcriptomic projects is challenging. While the authors are to be commended for employing a validation set, it would be useful to employ a Monte Carlo simulation in which groups are arbitrarily relabeled to determine the number of expected false discoveries within this data set (i.e. akin to Significance Analysis of Microarray techniques).
  
  We determined the adjusted P values via the DESeq2 package (for false-discovery rate of 5% and Benjamini-Hochberg Procedure). The results are shown in Supplemental File 1K-1M and analysis in Figure 1A.
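  For readers unfamiliar with the adjustment named in this response, the Benjamini-Hochberg procedure can be sketched in a few lines. This is a minimal illustration written for this note, not the DESeq2 implementation and not the authors' analysis code; the function name `benjamini_hochberg` and the example p-values are hypothetical.

  ```python
  def benjamini_hochberg(pvalues):
      """Return Benjamini-Hochberg adjusted p-values in the input order.

      Illustrative sketch only: no handling of missing values or
      independent filtering, both of which real pipelines may apply.
      """
      m = len(pvalues)
      # Indices of the p-values sorted from smallest to largest.
      order = sorted(range(m), key=lambda i: pvalues[i])
      adjusted = [0.0] * m
      running_min = 1.0
      # Walk from the largest p-value down, scaling each by m / rank and
      # taking a running minimum so adjusted values stay monotone.
      for rank in range(m, 0, -1):
          idx = order[rank - 1]
          candidate = pvalues[idx] * m / rank
          running_min = min(running_min, candidate)
          adjusted[idx] = running_min
      return adjusted


  if __name__ == "__main__":
      # Hypothetical p-values for four genes.
      example = [0.01, 0.04, 0.03, 0.005]
      padj = benjamini_hochberg(example)
      # Genes with an adjusted p-value below 0.05 pass a 5% FDR threshold.
      print([round(p, 4) for p in padj])
  ```

  The permutation approach suggested by the reviewer differs in kind: it would relabel the groups at random many times and count how many genes pass the threshold under each relabeling, giving an empirical estimate of the expected false discoveries.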
  
  I do not fully understand the significance of the mouse CD11c-Runx3delta mice. It appears these data were derived from previous datasets or from bone marrow stromal line cultures. Did the authors attempt to generate autoimmune uveitis (i.e. EAU) in these animals? Without this the relevance for uveitis is unclear.
  
  We did not attempt to induce experimental autoimmune uveitis in CD11c-Runx3delta mice. We used transcriptomic data from dendritic cells purified from this model to show that loss of RUNX3 induces a gene signature highly reminiscent of the gene module identified in non-infectious uveitis patients. Using enrichment analysis, we show that the transcriptome of patients is highly enriched for this signature which indicates that the decreased RUNX3 observed in patients underlies the upregulation of CD36, CX3CR1 and other surface genes. In other words, we used data from transgenic models to dissect which of the altered transcription factors were driving this gene module and we identified the RUNX3-NOTCH2 axis as an important contributor.
  
  scietyType:AuthorResponse
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.11.16.468816v1
www.biorxiv.org www.biorxiv.org

New submission 10/01/2023, 10:07:35

2
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  We thank the reviewers for their positive feedback and thoughtful suggestions that will improve our manuscript. Here we summarise our plan for immediate action. We will resubmit our manuscript once additional experiments have been performed to clarify all the major and minor concerns of the reviewers and the manuscript has been revised. At that point, we will respond to all the reviewers’ points and highlight the changes made in the text.
  
  Reviewer #1 (Public Review):
  
  The authors have tried to correlate changes in the cellular environment by means of altering temperature, the expression of key cellular factors involved in the viral replication cycle, and small molecules known to affect key viral protein-protein interactions with some physical properties of the liquid condensates of viral origin. The ideas and experiments are extremely interesting as they provide a framework to study viral replication and assembly from a thermodynamic point of view in live cells.
  
  The major strengths of this article are the extremely thoughtful and detailed experimental approach; although this data collection and analysis are most likely extremely time-consuming, the techniques used here are so simple that the main goal and idea of the article become elegant. A second major strength is that in order to understand some of the physicochemical properties of the viral liquid inclusion, they used stimuli that have been very well studied, and thus one can really focus on a relatively easy interpretation of most of the data presented here.
  
  There are three major weaknesses in this article. The way it is written, especially at the beginning, is extremely confusing. First, I would suggest authors should check and review extensively for improvements to the use of English. In particular, the abstract and introduction are extremely hard to understand. Second, in the abstract and introduction, the authors use terms such as "hardening", "perturbing the type/strength of interactions", "stabilization", and "material properties", for just citing some terms. It is clear that the authors do know exactly what they are referring to, but the definitions come so late in the text that it all becomes confusing. The second major weakness is that there is a lack of deep discussion of the physical meaning of some of the measured parameters like "C dense vs inclusion", and "nuclear density and supersaturation". There is a need to explain further the physical consequences of all the graphs. Most of them are discussed in a very superficial manner. The third major weakness is a lack of analysis of phase separations. Some of their data suggest phase transition and/or phase separation, thus, a more in-depth analysis is required. For example, could they calculate the change of entropy and enthalpy of some of these processes? Could they find some boundaries for these transitions between the "hard" (whatever that means) and the liquid?
  
  The authors have achieved almost all their goals, with the caveat of the third weakness I mentioned before. Their work presented in this article is of significant interest and can become extremely important if a more detailed analysis of the thermodynamics parameters is assessed and a better description of the physical phenomenon is provided.
  
  We thank reviewer 1 for the comments and, in particular, for being so positive regarding the strengths of our manuscript and for raising concerns that will surely improve the manuscript. At this point, we propose the following actions to address the concerns of Reviewer 1:
  
  1) We will extensively revise the use of English, particularly, in the abstract and introduction, defining key terms as they come along in the text to make the argument clearer.
  
  2) We acknowledge the importance of discussing our data in more detail and we propose the following. We will discuss the graphs and what they mean as exemplified in the paragraph below.
  
  Regarding Figure 3 - As the concentration of vRNPs increases, we observe an increase in supersaturation until 12hpi. This means that contrary to what is observed in a binary mixture, in which the Cdilute is constant (Klosin et al., 2020), the Cdilute in our system increases with concentration. It has been reported that Cdilute increases in a multi-component system with bulk concentration (Riback et al., 2020). Our findings have important implications for how we think about the condensates formed during influenza infection. As the 8 different genomic vRNPs have a similar overall structure, they could, in theory, behave as a binary system between units of vRNPs and Rab11a. However, a change in Cdilute with concentration shows that our system behaves as a multi-component system. This means that the differences in length, RNA sequence and valency that each vRNP has are key for the integrity of condensates.
  
  3) The reviewer calls our attention to the lack of analysis of phase separations. We think that phase separation (or percolation coupled to phase separation) governs the formation of influenza A virus condensates. However, we think we ought to exert caution at this point as the condensates we are working with are very complex and that the physics of our system in cells may not be sufficient to claim phase separation without an in vitro reconstitution system. In fact, IAV inclusions contain cellular membranes, different vRNPs and Rab11a. So far, we can only speculate that the liquid character of IAV inclusions may arise from a network of interacting vRNPs that bridge several cognate vRNP-Rab11 units on flexible membranes, similarly to what happens in phase separated vesicles in neurological synapses. However, the speculative model for our system, although being supported by correlative light and electron microscopy, currently lacks formal experimental validation.
  
  For this reason, we thought of developing the current work as an alternative to explore the importance of the liquid material properties of IAV inclusions. By finding an efficient method to alter the material properties of IAV inclusions, we provide proof of principle that it is possible to impose controlled phase transitions that reduce the dynamics of vRNPs in cells and negatively impact progeny virion production. Despite having discussed these issues in the limitations of the study, we will make our point clearer.
  
  We are currently establishing an in vitro reconstitution system to formally demonstrate, in an independent publication, that IAV inclusions are formed by phase separation. For this future work, we teamed up with Pablo Sartori, a theoretical physicist, to derive an in-depth analysis of the thermodynamics of the viral liquid condensates. Collectively, we think that cells have too many variables to derive meaningful physics parameters (such as entropy and enthalpy) as well as models, and that cell-based data need to be complemented by in vitro systems. For example, increasing the concentration inside a cell is not a simple endeavour as it relies on cellular pathways to deliver material to a specific place. At the same time, the 8 vRNPs, as mentioned above, have different size, valency and RNA sequence and can behave very differently in the formation of condensates and maintenance of their material properties. Ideally, they should be analysed individually or in selected combinations. For the future, we will combine data from in vitro reconstitution systems and cells to address this very important point raised by the reviewer.
  
  From the paper on the section Limitations of the study: “Understanding condensate biology in living cells is physiologically relevant but complex because the systems are heterotypic and away from equilibria. This is especially challenging for influenza A liquid inclusions that are formed by 8 different vRNP complexes, which although sharing the same structure, vary in length, valency, and RNA sequence. In addition, liquid inclusions result from an incompletely understood interactome where vRNPs engage in multiple and distinct intersegment interactions bridging cognate vRNP-Rab11 units on flexible membranes (Chou et al., 2013; Gavazzi et al., 2013; Haralampiev et al., 2020; Le Sage et al., 2020; Shafiuddin & Boon, 2019; Sugita, Sagara, Noda, & Kawaoka, 2013). At present, we lack an in vitro reconstitution system to understand the underlying mechanism governing demixing of vRNP-Rab11a-host membranes from the cytosol. This in vitro system would be useful to explore how the different segments independently modulate the material properties of inclusions, explore if condensates are sites of IAV genome assembly, determine thermodynamic values, thresholds accurately, perform rheological measurements for viscosity and elasticity and validate our findings”.
  
  Reviewer #2 (Public Review):
  
  During Influenza virus infection, newly synthesized viral ribonucleoproteins (vRNPs) form cytosolic condensates, postulated as viral genome assembly sites and having liquid properties. vRNP accumulation in liquid viral inclusions requires its association with the cellular protein Rab11a directly via the viral polymerase subunit PB2. Etibor et al. investigate and compare the contributions of entropy, concentration, and valency/strength/type of interactions, on the properties of the vRNP condensates. For this, they subjected infected cells to the following perturbations: temperature variation (4, 37, and 42 °C), the concentration of viral inclusion drivers (vRNPs and Rab11a), and the number or strength of interactions between vRNPs using nucleozin, a well-characterized vRNP sticker. Lowering the temperature (i.e. decreasing the entropic contribution) leads to a mild growth of condensates that does not significantly impact their stability. Altering the concentration of drivers of IAV inclusions impacts their size but not their material properties. The most spectacular effect on condensates was observed using nucleozin. The drug dramatically stabilizes vRNP inclusions, acting as a condensate hardener. Using a mouse model of influenza infection, the authors provide evidence that the activity of nucleozin is retained in vivo. Finally, using a mass spectrometry approach, they show that the drug affects vRNP solubility in a Rab11a-dependent manner without altering the host proteome profile.
  
  The data are compelling and support the idea that drugs that affect the material properties of viral condensates could constitute a new family of antiviral molecules as already described for the respiratory syncytial virus (Risso Ballester et al. Nature. 2021).
  
  Nevertheless, there are some limitations in the study. Several of them are mentioned in a dedicated paragraph at the end of a discussion. This includes the heterogeneity of the system (vRNP of different sizes, interactions between viral and cellular partners far from being understood), which is far from equilibrium, and the absence of minimal in vitro systems that would be useful to further characterize the thermodynamic and the material properties of the condensates.
  
  We thank reviewer 2 for highlighting specific details that need improving and raising such interesting questions to validate our findings. We will address all the minor comments of Reviewer 2. To address the comments of Reviewer 2, we propose the actions described in blue below each point raised that is written in italics.
  
  1) The concentrations are mostly evaluated using antibodies. This may be correct for Cdilute. However, measurement of Cdense should be viewed with caution as the antibodies may have some difficulty accessing the inner of the condensates (as already shown in other systems), and this access may depend on some condensate properties (which may evolve along the infection). This might induce artifactual trends in some graphs (as seen in panel 2c), which could, in turn, affect the calculation of some thermodynamic parameters.
  
  The concern of using antibodies to calculate Cdense is valid. We will address this concern by validating our results using a fluorescent tagged virus that has mNeon Green fused to the viral polymerase PA (PA-mNeonGreen PR8 virus). Like NP, PA is a component of vRNPs and labels viral inclusions, colocalising with Rab11 when vRNPs are in the cytosol without the need of using antibodies.
  
  This virus would be the best to evaluate inclusion thermodynamics, were it not an attenuated virus (Figure 1A below) with a delayed infection, as demonstrated by the reduced levels of viral proteins (Figure 1B below). Consistently, it shows differences in the accumulation of vRNPs in the cytosol, and viral inclusions form later in infection. After their emergence, inclusions behave as in the wild-type virus (PR8-WT), fusing and dividing (Figure 1C below) and displaying liquid properties. The differences in concentration may shift or alter thermodynamic parameters such as time of nucleation, nucleation density, inclusion maturation rate, Cdense, Cdilute. This is the reason why we performed the thermodynamics profiling using antibodies upon PR8-WT infection. To validate our results, taking into account possibly delayed kinetics and differences that may occur because of reduced vRNP accumulation in the cytosol, this virus will be useful, and we will therefore repeat the thermodynamic analyses using it.
  
  As a side note, vRNPs are composed of viral RNA coated with several molecules of NP and each vRNP also contains 1 copy of the trimeric RNA dependent RNA polymerase formed by PA, PB1 and PB2. It is well documented that in the cytosol the vast majority of PA (and other components of the polymerase) is in the form of vRNPs (Avilov, Moisy, Munier, et al., 2012; Avilov, Moisy, Naffakh, & Cusack, 2012; Bhagwat et al., 2020; Lakdawala et al., 2014), and thus we can use this virus to label vRNPs on condensates to corroborate our studies using antibodies.
  
  Figure 1 – The PA-mNeonGreen virus is attenuated in comparison to the WT virus. A. Cells (A549) were infected or mock-infected with PR8 WT or PA-mNeonGreen (PA-mNG) viruses, at a multiplicity of infection (MOI) of 3, for the indicated times. Viral production was determined by plaque assay and plotted as plaque forming units (PFU) per milliliter (mL) ± standard error of the mean (SEM). Data are a pool from 2 independent experiments. B. The levels of viral PA, NP and M2 proteins and actin in cell lysates at the indicated time points were determined by western blotting. C. Cells (A549) were transfected with a plasmid encoding mCherry-NP and co-infected with PA-mNeonGreen virus for 16h, at an MOI of 10. Cells were imaged under time-lapse conditions starting at 16 hpi. White boxes highlight vRNPs/viral inclusions in the cytoplasm in the individual frames. The dashed white and yellow lines mark the cell nucleus and the cell periphery, respectively. The yellow arrows indicate the fission/fusion events and movement of vRNPs/viral inclusions. Bar = 10 µm. Bar in insets = 2 µm.
  
  2) Although the authors have demonstrated that vRNP condensates exhibit several key characteristics of liquid condensates (they fuse and divide, they dissolve upon hypotonic shock or upon incubation with 1,6-hexanediol, FRAP experiments are consistent with a liquid nature), their aspect ratio (with a median above 1.4) is much higher than the aspect ratio observed for other cellular or viral liquid compartments. This is intriguing and might be discussed.
  
  IAV inclusions have been shown to interact with microtubules and the endoplasmic reticulum, which confer movement, and also to undergo fusion and fission events. We propose that these interactions and movement exert forces that deform inclusions, making them less spherical. To validate this assumption, we compared the aspect ratio of viral inclusions in the absence and presence of nocodazole (which abrogates microtubule-based movement). The data in figure 2 show that in the presence of nocodazole, the aspect ratio decreases from 1.42 ± 0.36 to 1.26 ± 0.17, supporting our assumption.
  
  Figure 2 – Treatment with nocodazole reduces the aspect ratio of influenza A virus inclusions. Cells (A549) were infected with PR8 WT and treated with nocodazole (10 µg/mL) for 2 h, after which the movement of influenza A virus inclusions was captured by live cell imaging. Viral inclusions were segmented, and the aspect ratio was measured in ImageJ, then analysed and plotted in R.
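  As a rough illustration of the quantity compared here, the aspect ratio of a segmented object can be approximated as the ratio of the major to the minor axis of an ellipse with the same second moments as the object's pixels. The sketch below is an assumption about how such a measurement could be reproduced outside ImageJ; it is not the authors' pipeline, and the function name `aspect_ratio` and the test shapes are hypothetical.

  ```python
  import math


  def aspect_ratio(pixels):
      """Approximate major/minor axis ratio of a segmented object.

      `pixels` is a list of (x, y) coordinates belonging to one object.
      The ratio is derived from the eigenvalues of the coordinate
      covariance matrix (a moment-based ellipse fit).
      """
      n = len(pixels)
      if n < 2:
          raise ValueError("need at least two pixels")
      mean_x = sum(x for x, _ in pixels) / n
      mean_y = sum(y for _, y in pixels) / n
      # Second central moments of the pixel coordinates.
      sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
      syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
      sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
      # Eigenvalues of the 2x2 covariance matrix in closed form.
      half_trace = (sxx + syy) / 2.0
      det = sxx * syy - sxy ** 2
      root = math.sqrt(max(half_trace ** 2 - det, 0.0))
      major_var = half_trace + root
      minor_var = half_trace - root
      if minor_var <= 0.0:
          raise ValueError("degenerate object: pixels lie on a line")
      # Axis lengths scale with the square root of the variances.
      return math.sqrt(major_var / minor_var)


  if __name__ == "__main__":
      # Hypothetical shapes: a 5x5 square and a 9x3 rectangle of pixels.
      square = [(x, y) for x in range(5) for y in range(5)]
      rectangle = [(x, y) for x in range(9) for y in range(3)]
      print(aspect_ratio(square), aspect_ratio(rectangle))
  ```

  A value of 1 corresponds to a circular or square outline, and larger values indicate more elongated objects, which is the direction of the change reported with and without nocodazole.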
  
  3) Similarly, the fusion event presented at the bottom of figure 3I is dubious. It might as well be an aggregation of condensates without fusion.
  
  We will change this, thank you for the suggestion.
  
  4) The authors could have more systematically performed FRAP/FLAPh experiments on cells expressing fluorescent versions of both NP and Rab11a to investigate the influence of condensate size, time after infection, or global concentrations of Rab11a in the cell (using the total fluorescence of overexpressed GFP-Rab11a as a proxy) on condensate properties.
  
  We will try our best to be able to comply with this suggestion as we think it is important.
  
  Reviewer #3 (Public Review):
  
  This study aims to define the factors that regulate the material properties of the viral inclusion bodies of influenza A virus (IAV). In a cellular model, it shows that the material properties were not affected by lowering the temperature nor by altering the concentration of the factors that drive their formation. Impressively, the study shows that IAV inclusions may be hardened by targeting vRNP interactions via the known pharmacological modulator (also an IAV antiviral), nucleozin, both in vitro and in vivo. The study employs current state-of-the-art methodology in both influenza virology and condensate biology, and the conclusions are well-supported by data and proper data analysis. This study is an important starting point for understanding how to pharmacologically modulate the material properties of IAV viral inclusion bodies.
  
  We thank this reviewer for all the positive comments. We will address the minor issues brought to our attention entirely, including changing the title of the manuscript, and we will investigate the formation and material properties of IAV inclusions in the presence and absence of nucleozin for the nucleozin escape mutant NP-Y289H.
  
  References
  
  Avilov, S. V., Moisy, D., Munier, S., Schraidt, O., Naffakh, N., & Cusack, S. (2012). Replication-competent influenza A virus that encodes a split-green fluorescent protein-tagged PB2 polymerase subunit allows live-cell imaging of the virus life cycle. J Virol, 86(3), 1433-1448. doi:10.1128/JVI.05820-11
  
  Avilov, S. V., Moisy, D., Naffakh, N., & Cusack, S. (2012). Influenza A virus progeny vRNP trafficking in live infected cells studied with the virus-encoded fluorescently tagged PB2 protein. Vaccine, 30(51), 7411-7417. doi:10.1016/j.vaccine.2012.09.077
  
  Bhagwat, A. R., Le Sage, V., Nturibi, E., Kulej, K., Jones, J., Guo, M., . . . Lakdawala, S. S. (2020). Quantitative live cell imaging reveals influenza virus manipulation of Rab11A transport through reduced dynein association. Nat Commun, 11(1), 23. doi:10.1038/s41467-019-13838-3
  
  Chou, Y. Y., Heaton, N. S., Gao, Q., Palese, P., Singer, R. H., & Lionnet, T. (2013). Colocalization of different influenza viral RNA segments in the cytoplasm before viral budding as shown by single-molecule sensitivity FISH analysis. PLoS Pathog, 9(5), e1003358. doi:10.1371/journal.ppat.1003358
  
  Gavazzi, C., Yver, M., Isel, C., Smyth, R. P., Rosa-Calatrava, M., Lina, B., . . . Marquet, R. (2013). A functional sequence-specific interaction between influenza A virus genomic RNA segments. Proc Natl Acad Sci U S A, 110(41), 16604-16609. doi:10.1073/pnas.1314419110
  
  Haralampiev, I., Prisner, S., Nitzan, M., Schade, M., Jolmes, F., Schreiber, M., . . . Herrmann, A. (2020). Selective flexible packaging pathways of the segmented genome of influenza A virus. Nat Commun, 11(1), 4355. doi:10.1038/s41467-020-18108-1
  
  Klosin, A., Oltsch, F., Harmon, T., Honigmann, A., Julicher, F., Hyman, A. A., & Zechner, C. (2020). Phase separation provides a mechanism to reduce noise in cells. Science, 367(6476), 464-468. doi:10.1126/science.aav6691
  
  Lakdawala, S. S., Wu, Y., Wawrzusin, P., Kabat, J., Broadbent, A. J., Lamirande, E. W., . . . Subbarao, K. (2014). Influenza a virus assembly intermediates fuse in the cytoplasm. PLoS Pathog, 10(3), e1003971. doi:10.1371/journal.ppat.1003971
  
  Le Sage, V., Kanarek, J. P., Snyder, D. J., Cooper, V. S., Lakdawala, S. S., & Lee, N. (2020). Mapping of Influenza Virus RNA-RNA Interactions Reveals a Flexible Network. Cell Rep, 31(13), 107823. doi:10.1016/j.celrep.2020.107823
  
  Riback, J. A., Zhu, L., Ferrolino, M. C., Tolbert, M., Mitrea, D. M., Sanders, D. W., . . . Brangwynne, C. P. (2020). Composition-dependent thermodynamics of intracellular phase separation. Nature, 581(7807), 209-214. doi:10.1038/s41586-020-2256-2
  
  Shafiuddin, M., & Boon, A. C. M. (2019). RNA Sequence Features Are at the Core of Influenza a Virus Genome Packaging. J Mol Biol. doi:10.1016/j.jmb.2019.03.018
  
  Sugita, Y., Sagara, H., Noda, T., & Kawaoka, Y. (2013). Configuration of viral ribonucleoprotein complexes within the influenza A virion. J Virol, 87(23), 12879-12884. doi:10.1128/JVI.02096-13
  
  AuthorResponse
2. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors have tried to correlate changes in the cellular environment by means of altering temperature, the expression of key cellular factors involved in the viral replication cycle, and small molecules known to affect key viral protein-protein interactions with some physical properties of the liquid condensates of viral origin. The ideas and experiments are extremely interesting as they provide a framework to study viral replication and assembly from a thermodynamic point of view in live cells.
  
  The major strengths of this article are the extremely thoughtful and detailed experimental approach; although this data collection and analysis are most likely extremely time-consuming, the techniques used here are so simple that the main goal and idea of the article become elegant. A second major strength is that in order to understand some of the physicochemical properties of the viral liquid inclusion, they used stimuli that have been very well studied, and thus one can really focus on a relatively easy interpretation of most of the data presented here.
  
  There are three major weaknesses in this article. The way it is written, especially at the beginning, is extremely confusing. First, I would suggest authors should check and review extensively for improvements to the use of English. In particular, the abstract and introduction are extremely hard to understand. Second, in the abstract and introduction, the authors use terms such as "hardening", "perturbing the type/strength of interactions", "stabilization", and "material properties", for just citing some terms. It is clear that the authors do know exactly what they are referring to, but the definitions come so late in the text that it all becomes confusing. The second major weakness is that there is a lack of deep discussion of the physical meaning of some of the measured parameters like "C dense vs inclusion", and "nuclear density and supersaturation". There is a need to explain further the physical consequences of all the graphs. Most of them are discussed in a very superficial manner. The third major weakness is a lack of analysis of phase separations. Some of their data suggest phase transition and/or phase separation, thus, a more in-depth analysis is required. For example, could they calculate the change of entropy and enthalpy of some of these processes? Could they find some boundaries for these transitions between the "hard" (whatever that means) and the liquid?
  
  The authors have achieved almost all their goals, with the caveat of the third weakness I mentioned before. Their work presented in this article is of significant interest and can become extremely important if a more detailed analysis of the thermodynamics parameters is assessed and a better description of the physical phenomenon is provided.
  
  We thank you for the comments and, in particular, for being so positive regarding the strengths of our manuscript and for raising concerns that will surely improve it. We have taken the following actions to address your concerns:
  
  1) Extensive revisions have been made to the use of English, particularly in the abstract and introduction. Key terms are defined as they are introduced in the text to enhance the clarity of the argument. This is a significant revision that is highlighted within the text, but it is too extensive to detail here.
  
  2) In the results section, we improved and extended the discussion of our graphs to the extent possible. However, we found that attempting to explain the graphs' meanings more thoroughly would detract from our manuscript's main focus: identifying thermodynamic changes that could potentially lead to alterations in material properties, specifically aspect ratio, size, and Gibbs free energy. As a result, we introduced the type of information we could obtain from our analyses in the introduction (Lines 112-125) and briefly commented on it in the ‘results’ section (Lines 304-306, sentences below).
  
  From introduction – lines 112-125:
  
  “In addition, other parameters like nucleation density determine how many viral condensates are formed per area of cytosol. Overall, the data will inform us if changing one parameter, e.g. the concentration, drives the system towards larger condensates with the same or more stable properties, or more abundant condensates that are forced to maintain the initial or a different size on account of available nucleation centres (Riback et al., 2020; Snead, 2022). It will also inform us if liquid viral inclusions behave like a binary or a multi-component system. In a binary mixture, Cdilute is constant (Klosin et al., 2020). However, in multi-component systems, Cdilute increases with bulk concentration (Riback et al., 2020). This type of information could have direct implications about the condensates formed during influenza infection. As the 8 different genomic vRNPs have a similar overall structure, they could, in theory, behave as a binary system between units of vRNPs and Rab11a. However, a change in Cdilute with concentration would mean that the system behaves as a multi-component system. This could raise the hypothesis that the differences in length, RNA sequence and valency that each vRNP has may be relevant for the integrity and behaviour of condensates.”
  
  From results lines 304-306:
  
  This indicates that the liquid inclusions behave as a multi-component system and allows us to speculate that the differences in length, RNA sequence and valency that each vRNP has may be key for the integrity and behaviour of condensates.
  
  3) The reviewer has drawn our attention to the absence of phase separation analysis in our study. We believe that the formation of influenza A virus condensates is governed by phase separation (or percolation coupled to phase separation). However, we must exercise caution at this point because the condensates we are studying are highly complex, and the physics of our cellular system may not be adequate to claim phase separation without being validated by an in vitro reconstitution system. IAV inclusions contain a variety of cellular membranes, different vRNPs, and Rab11a. While we have robust data to propose a model in which the liquid-like properties of IAV inclusions arise from a network of interacting vRNPs that bridge multiple cognate vRNP-Rab11 units on flexible membranes, similar to what occurs in phase-separated vesicles in neurological synapses, our model for this system still lacks formal experimental validation. As a note, the data supporting our model include: the demonstration of the liquid properties of our liquid inclusions (Alenquer et al. 2019, Nature Communications, 10, 1629); and impairment of recycling endocytic activity during IAV infection (Bhagwat et al. 2020, Nat Commun, 11, 23; Kawaguchi et al. 2012, J Virol, 86, 11086-95; Vale-Costa et al. 2016, J Cell Sci, 129, 1697-710). This leads to aggregated vesicles seen by correlative light and electron microscopy (Vale-Costa et al. 2016, J Cell Sci, 129, 1697-710) and by immunofluorescence and FISH (Amorim et al. 2011, J Virol 85, 4143-4156; Avilov et al. 2012, Vaccine 30, 7411-7417; Chou et al. 2013, PLoS Pathog 9, e1003358; Eisfeld et al. 2011, J Virol 85, 6117-6126; and Lakdawala et al. 2014, PLoS Pathog 10, e1003971).
  
  To be able to explore the significance of the liquid material properties of IAV inclusions, we used the strategy described in this current work. By developing an effective method to manipulate the material properties of IAV inclusions, we provide evidence that controlled phase transitions can be induced, resulting in decreased vRNP dynamics in cells and a negative impact on progeny virion production. This suggests that the liquid character of liquid inclusions is important for their function in IAV infection. We have improved our explanation addressing this concern in the limitations of our study (as outlined below in the box and in manuscript in lines 857-872).
  
  We are currently establishing an in vitro reconstitution system to formally demonstrate, in an independent publication, that IAV inclusions are formed by phase separation (or percolation coupled to phase separation). For this future work, we teamed up with Pablo Sartori, a theoretical physicist, to derive an in-depth analysis of the thermodynamics of the viral liquid condensates in the in vitro reconstituted system and compare it to results obtained in the cell. This will provide a means to establish comparisons. We think that cells have too many variables to derive meaningful physics parameters (such as entropy and enthalpy) and models, and that they need to be complemented by in vitro systems. For example, increasing the concentration inside a cell is not a simple endeavour as it relies on cellular pathways to deliver material to a specific place. At the same time, the 8 vRNPs, as mentioned above, have different size, valency and RNA sequence and can behave very differently in the formation of condensates and maintenance of their material properties. Ideally, they should be analysed individually or in selected combinations. For the future, we will combine data from in vitro reconstitution systems and cells to address this very important point raised by the reviewer.
  
  From the paper on the section ‘Limitations of the study’:
  
  “Understanding condensate biology in living cells is physiologically relevant but complex because the systems are heterotypic and away from equilibria. This is especially challenging for influenza A liquid inclusions that are formed by 8 different vRNP complexes, which, although sharing the same structure, vary in length, valency, and RNA sequence. In addition, liquid inclusions result from an incompletely understood interactome where vRNPs engage in multiple and distinct intersegment interactions bridging cognate vRNP-Rab11 units on flexible membranes (Chou et al., 2013, Gavazzi et al., 2013, Sugita et al., 2013, Shafiuddin and Boon, 2019, Haralampiev et al., 2020, Le Sage et al., 2020). At present, we lack an in vitro reconstitution system to understand the underlying mechanism governing demixing of vRNP-Rab11a-host membranes from the cytosol. This in vitro system would be useful to explore how the different segments independently modulate the material properties of inclusions, explore if condensates are sites of IAV genome assembly, determine thermodynamic values and thresholds accurately, perform rheological measurements for viscosity and elasticity and validate our findings. The results could be compared to those obtained in cell systems to derive thermodynamic principles happening in a complex system away from equilibrium. Using cells to map how liquid inclusions respond to different perturbations provides the answer to how the system adapts in vivo, but has limitations.”
  
  Reviewer #2 (Public Review):
  
  During influenza virus infection, newly synthesized viral ribonucleoproteins (vRNPs) form cytosolic condensates, postulated as viral genome assembly sites and having liquid properties. vRNP accumulation in liquid viral inclusions requires its association with the cellular protein Rab11a directly via the viral polymerase subunit PB2. Etibor et al. investigate and compare the contributions of entropy, concentration, and valency/strength/type of interactions, on the properties of the vRNP condensates. For this, they subjected infected cells to the following perturbations: temperature variation (4, 37, and 42 °C), the concentration of viral inclusion drivers (vRNPs and Rab11a), and the number or strength of interactions between vRNPs using nucleozin, a well-characterized vRNP sticker. Lowering the temperature (i.e. decreasing the entropic contribution) leads to a mild growth of condensates that does not significantly impact their stability. Altering the concentration of drivers of IAV inclusions impacts their size but not their material properties. The most spectacular effect on condensates was observed using nucleozin. The drug dramatically stabilizes vRNP inclusions, acting as a condensate hardener. Using a mouse model of influenza infection, the authors provide evidence that the activity of nucleozin is retained in vivo. Finally, using a mass spectrometry approach, they show that the drug affects vRNP solubility in a Rab11a-dependent manner without altering the host proteome profile.
  
  The data are compelling and support the idea that drugs that affect the material properties of viral condensates could constitute a new family of antiviral molecules, as already described for the respiratory syncytial virus (Risso Ballester et al. Nature. 2021).
  
  Nevertheless, there are some limitations in the study. Several of them are mentioned in a dedicated paragraph at the end of a discussion. This includes the heterogeneity of the system (vRNP of different sizes, interactions between viral and cellular partners far from being understood), which is far from equilibrium, and the absence of minimal in vitro systems that would be useful to further characterize the thermodynamic and the material properties of the condensates.
  
  There are other ones.
  
  We thank Reviewer 2 for highlighting specific details that need improving and for raising such interesting questions to validate our findings. We have addressed the comments of Reviewer 2 and performed the experiments described (in blue) below each point raised.
  
  1) The concentrations are mostly evaluated using antibodies. This may be correct for Cdilute. However, measurement of Cdense should be viewed with caution as the antibodies may have some difficulty accessing the inner of the condensates (as already shown in other systems), and this access may depend on some condensate properties (which may evolve along the infection). This might induce artifactual trends in some graphs (as seen in panel 2c), which could, in turn, affect the calculation of some thermodynamic parameters.
  
  The concern about using antibodies to calculate Cdense is valid, and we thought it was very important. We addressed this concern by performing the same analyses using a fluorescently tagged virus that has mNeonGreen fused to the viral polymerase PA (PA-mNeonGreen PR8 virus). Like NP, PA is a component of vRNPs and labels viral inclusions, colocalising with Rab11 when vRNPs are in the cytosol. However, each vRNP carries only one molecule of PA, whereas it carries 37-96 molecules of NP depending on the size of the vRNP. As predicted, we did observe changes in the Cdilute, Cdense and nucleation density. However, the measurements and values obtained for Gibbs free energy, size and aspect ratio when detecting viral inclusions with fluorescently tagged vRNPs or with antibody staining followed the same trend, allowing us to validate our conclusion that major changes in Gibbs free energy occur solely when there is a change in the valency/strength of interactions but not in temperature or concentration (Figure 1 below). Given the extent of these data, we show the results here but, in the manuscript, we will describe the limitations of using antibodies in our study within the section ‘Limitations of the study’ (lines 881-894). Given the importance of the question regarding the pros and cons of the different systems for analysing thermodynamic parameters, we have decided to systematically assess and explore these differences in detail in a future manuscript.
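  As an illustration of how a Gibbs free energy value can follow from Cdense and Cdilute, the sketch below assumes the common partition-coefficient convention ΔG = -RT ln(Cdense/Cdilute). The response does not state the formula used, so this convention, the function name and the example intensities are assumptions for illustration only, not the authors' pipeline.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def transfer_free_energy(c_dense, c_dilute, temperature_c):
    """Return an apparent free energy of transfer in J/mol.

    Assumed convention (not taken from the response):
        dG = -R * T * ln(K), with K = Cdense / Cdilute.
    Concentrations may be fluorescence intensities in arbitrary units,
    because only their ratio enters the calculation.
    """
    if c_dense <= 0 or c_dilute <= 0:
        raise ValueError("concentrations must be positive")
    partition_coefficient = c_dense / c_dilute
    temperature_k = temperature_c + 273.15
    return -R_GAS * temperature_k * math.log(partition_coefficient)


if __name__ == "__main__":
    # Hypothetical intensities: a stronger enrichment in the dense phase
    # gives a more negative (more stabilising) value.
    for c_dense, c_dilute in [(200.0, 100.0), (1000.0, 100.0)]:
        dg = transfer_free_energy(c_dense, c_dilute, 37.0)
        print(f"K = {c_dense / c_dilute:.1f}, dG = {dg:.0f} J/mol")
```

  Under this convention an antibody that under-reports Cdense lowers the partition coefficient and shifts ΔG towards zero, which is one way an accessibility artefact could propagate into the thermodynamic readout.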
  
  For more information: this reviewer may be asking why we did not use the PA-fluorescent virus in the first place to evaluate inclusion thermodynamics and avoid the accessibility problems that antibodies may have in reaching deep into large inclusions. Our answer is that no system is perfect. In the case of the PA-fluorescent virus, the caveats revolve around the fact that the virus is attenuated (Figure 1a below), exhibiting a delayed infection as demonstrated by reduced levels of viral proteins (Figure 1b below). Consistently, it shows differences in the accumulation of vRNPs in the cytosol: viral inclusions form later in infection, and the amount of vRNPs in the cytosol does not reach the levels observed with the PR8-WT virus. After their emergence, inclusions behave as in the wild-type virus (PR8-WT), fusing and dividing (Figure 1c below) and displaying liquid properties.
  
  As the overarching goal of this manuscript is to evaluate the best strategies to harden liquid IAV inclusions and given that one of the parameters we were testing is concentration, we reasoned that using PR8-WT virus for our analyses would be reasonable.
  
  In conclusion, both systems have caveats that are important to systematically assess, and these differences may shift or alter thermodynamic parameters such as nucleation density, inclusion maturation rate, Cdense and Cdilute, in particular by varying the total concentration. As a note, to validate all our results using the PA-mNeonGreen PR8 virus, we considered the delayed kinetics and applied our thermodynamic analyses up to 20 hpi rather than 16 hpi.
  
  However, because of the question raised by this reviewer on which is the best solution for mitigating errors induced by using antibodies, we re-checked all our data. Not only did we compare the data originating from the attenuated fluorescently tagged virus with our data, but we also compared images acquired from Z stacks (as used for concentration and for type/strength of interactions) with those acquired from 2D images. Our analysis revealed a very good match between antibody staining and the fluorescently tagged vRNP virus when images were acquired as Z stacks and analysed as Z projections. Therefore, we re-analysed all our temperature-related thermodynamic data using images acquired from Z stacks and entirely revised Figure 2. We believe that all these comparisons and analyses have greatly improved the manuscript and hence we thank all reviewers for their input.
  
  Figure 1 – The PA-mNeonGreen virus is attenuated in comparison to the WT virus, and the Gibbs free energy data obtained with it are consistent with analyses of images in which vRNPs were detected by antibody staining. A. Representation of the PA-mNeonGreen virus (PA-mNG; Abbreviations: NCR: non coding region). B. Cells (A549) were transfected with a plasmid encoding mCherry-NP and co-infected with PA-mNeonGreen virus for 16h, at an MOI of 10. Cells were imaged under time-lapse conditions starting at 16 hpi. White boxes highlight vRNPs/viral inclusions in the cytoplasm in the individual frames. The dashed white and yellow lines mark the cell nucleus and the cell periphery, respectively. The yellow arrows indicate the fission/fusion events and movement of vRNPs/viral inclusions. Bar = 10 µm. Bar in insets = 2 µm. C-D. Cells (A549) were infected or mock-infected with PR8 WT or PA-mNG viruses, at a multiplicity of infection (MOI) of 3, for the indicated times. C. Viral production was determined by plaque assay and plotted as plaque forming units (PFU) per milliliter (mL) ± standard error of the mean (SEM). Data are a pool from 2 independent experiments. D. The levels of viral PA, NP and M2 proteins and actin in cell lysates at the indicated time points were determined by western blotting. (E-G) Biophysical calculations in cells infected with the PA-mNeonGreen virus upon altering temperature (at 10 hpi), evaluating the concentration of vRNPs (over a time course) in conditions expressing native amounts of Rab11a or overexpressing low levels of Rab11a, and upon altering the type/strength of vRNP interactions by adding nucleozin at 10 hpi during the indicated time periods. All data (Ccytoplasm/Cnucleus, Cdense, Cdilute, area, aspect ratio and Gibbs free energy) are represented as boxplots. Above each boxplot, same letters indicate no significant difference between them, while different letters indicate a statistical significance at α = 0.05 using one-way ANOVA, followed by Tukey multiple comparisons of means for parametric analysis, or Kruskal-Wallis with Bonferroni correction for non-parametric analysis.
  
  2) Although the authors have demonstrated that vRNP condensates exhibit several key characteristics of liquid condensates (they fuse and divide, they dissolve upon hypotonic shock or upon incubation with 1,6-hexanediol, FRAP experiments are consistent with a liquid nature), their aspect ratio (with a median above 1.4) is much higher than the aspect ratio observed for other cellular or viral liquid compartments. This is intriguing and might be discussed.
  
  IAV inclusions have been shown to interact with microtubules and the endoplasmic reticulum, which confer movement, and to undergo fusion and fission events. We propose that these interactions and movements exert forces that deform inclusions, making them less spherical. To validate this assumption, we compared the aspect ratio of viral inclusions in the absence and presence of nocodazole (which abrogates microtubule-based movement). The data in Figure 2 show that in the presence of nocodazole, the aspect ratio decreases from 1.42 ± 0.36 to 1.26 ± 0.17, supporting our assumption.
  
  Figure 2 – Treatment with nocodazole reduces the aspect ratio of influenza A virus inclusions. Cells (A549) were infected with PR8 WT for 8 h and treated with nocodazole (10 µg/mL) for 2 h, after which the movement of influenza A virus inclusions was captured by live cell imaging. Viral inclusions were segmented, and the aspect ratio was measured in ImageJ, then analysed and plotted in R.
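  For readers unfamiliar with the aspect ratio readout, the sketch below shows one way such a value can be computed for a segmented object: the ratio of the major to the minor axis of the ellipse that has the same second moments as the object's pixels. This is a hedged approximation of an ellipse-based aspect ratio, not the ImageJ routine the authors ran; the function name and the toy pixel sets are hypothetical.

```python
import math


def aspect_ratio(pixels):
    """Major/minor axis ratio of the moment-matched ellipse.

    `pixels` is an iterable of (x, y) coordinates belonging to one
    segmented inclusion. A circle or square gives 1.0; elongated
    shapes give values above 1.
    """
    points = list(pixels)
    n = len(points)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Second central moments (covariance matrix entries).
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Eigenvalues of the 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2.0
    spread = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    major, minor = half_trace + spread, half_trace - spread
    if minor <= 0:
        raise ValueError("degenerate shape (all pixels on one line)")
    return math.sqrt(major / minor)


if __name__ == "__main__":
    # Toy objects: a 10x10 square and a 20x10 rectangle of pixels.
    square = [(x, y) for x in range(10) for y in range(10)]
    rectangle = [(x, y) for x in range(20) for y in range(10)]
    print(round(aspect_ratio(square), 2), round(aspect_ratio(rectangle), 2))
```

  A drop from about 1.4 to about 1.3, as reported for nocodazole, corresponds to objects becoming closer to round on this scale, where a perfectly round object scores 1.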
  
  3) Similarly, the fusion event presented at the bottom of figure 3I is dubious. It might as well be an aggregation of condensates without fusion.
  
  We have changed this (check Fig 5A and B in the manuscript), thank you for the suggestion.
  
  4) The authors could have more systematically performed FRAP/FLAPh experiments on cells expressing fluorescent versions of both NP and Rab11a to investigate the influence of condensate size, time after infection, or global concentrations of Rab11a in the cell (using the total fluorescence of overexpressed GFP-Rab11a as a proxy) on condensate properties.
  
  We have included a new figure, figure 5 with the suggested data.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.03.502602v2
www.biorxiv.org www.biorxiv.org

New submission 22/02/2023, 17:50:59

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  1) The main limitation of this study is that the results are primarily descriptive in nature, and thus, do not provide mechanistic insight into how Ryr1 disease mutations lead to the muscle-specific changes observed in the EDL, soleus and EOM proteomes.
  
  An intrinsic feature of the high-throughput proteomic analysis technology is the generation of lists of differentially expressed proteins (DEP) in different muscles from WT and mutated mice. Although the definition of mechanistic insights related to changes of dozens of proteins is very interesting, it is a difficult task to accomplish and goes beyond the goal of the high-throughput proteomic analysis presented here. Nevertheless, the analysis of DEPs may indeed provide arguments to speculate on the pathogenesis of the phenotype linked to recessive RyR1 mutations. In the unrevised manuscript, we pointed out that the fiber type I predominance observed in congenital myopathies linked to recessive Ryr1 mutations is consistent with the high expression level of heat shock proteins in slow twitch muscles. However, as suggested by Reviewer 3, we have removed "vague statements" concerning major insights into pathophysiological mechanisms from the text of the revised manuscript, since we are aware that the mechanistic information, if any, that we can extract from the data set cannot go beyond the intrinsic limitations of the high-throughput proteomic technology.
  
  b) Results comparing fast twitch (EDL) and slow twitch (soleus) muscles from WT mice confirmed several known differences between the two muscle types. Similar analyses between EOM/EDL and EOM/soleus muscles from WT mice were not conducted.
  
  We agree with the point raised by the Reviewer. In the revised manuscript we have changed Figure 2. The new Figure 2 shows the analysis of differentially expressed proteins in EDL, soleus and EOMs from WT mice. We have also added 2 new Tables (new Supplementary Tables 2 and 3) and have inserted our findings in the revised Results section (page 7, lines 157-176; pages 8 and 9).
  
  c) While a reactome pathway analysis for proteins changes observed in EDL is shown in Supplemental Figure 1, the authors do not fully discuss the nature of the proteins and corresponding pathways impacted in the other two muscle groups analyzed.
  
  We have now included in the revised manuscript a new Figure 2 which includes the Reactome pathway analysis comparing EDL with soleus, EDL with EOM and soleus with EOM (panels C, F and I, respectively). We have also inserted into the revised manuscript a brief description of the pathways showing the greatest changes in protein content (page 7, lines 156-175; pages 8 and 9). We agree that the data showing changes in protein content between the 3 muscle groups of the WT mice are important, also because they validate the results of the proteomic approach. Indeed, the present results confirm that many proteins, including MyHCIIb, calsequestrin 1, SERCA1 and parvalbumin, are more abundantly expressed in fast twitch EDL muscles compared to soleus. Similarly, our results confirm that EOMs are enriched in MyHC-EO as well as cardiac isoforms of ECC proteins. This point has been clarified in the revised version of the manuscript (page 8, lines 198-213; page 9, lines 214-228). Nevertheless, we would like to point out that the main focus of our study is to compare the changes in protein content induced by the presence of recessive RyR1 mutations.
  
  Reviewer #3 (Public Review):
  
  a) it would be useful to determine whether changes in protein levels correlated with changes in mRNA levels …
  
  We performed qPCR analysis of Stac3 and Cacna1s in EDL, soleus and EOM from WT mice (see Figure 1 below). The expression of transcripts encoding Cacna1s and Stac3 is approximately 9-fold higher in EDL compared to soleus. The fold change of Stac3 and Cacna1s transcripts in EDL muscles is higher than the differences we observed by mass spectrometry at the protein level between EDL and soleus. Indeed, we found that the content of the Stac3 protein in EDL is 3-fold higher than that in soleus. Although there is no apparent linear correlation between mRNA and protein levels, we believe that a few plausible conclusions can be drawn, namely: (i) the expression level of both transcripts and proteins is higher in EDL compared to EOM and soleus muscles, respectively, (ii) the expression levels of transcripts encoding Stac3 correlate with those encoding Cacna1s and confirm the proteomic data. In addition, the level of Stac3 transcript does not change between WT and dHT, confirming our proteomic data which show that Stac3 protein content in muscles from dHT is similar to that found in WT littermates. Altogether these results support the concept that the differences in Stac3 content between EDL and soleus occur at both the protein and transcript levels, namely a high Stac3 mRNA level correlates with higher protein content (EDL) and low mRNA levels correlate with low Stac3 protein content in soleus muscles (see Figure 1 below).
  
  Figure 2: qPCR of Cacna1s and Stac3 in muscles from WT mice. The expression levels of the transcripts encoding Cacna1s and Stac3 are the highest in EDL muscles and the lowest in soleus muscles (top panels). There are no significant changes in their relative expression levels in dHT vs WT. Each symbol represents the value from a single mouse. * p=0.028, Mann-Whitney test. qPCR was performed as described in Elbaz et al., 2019 (Hum Mol Genet 28, 2987-2999).
  
  … and whether or not the protein present was functional, and whether Stac3 was in fact stoichiometrically depleted in relation to Cacna1s.
  
  We thought about this point but think that there are no plausible arguments to believe that Stac3 is not functional, one simple reason being that our WT mice do not have a phenotype which would be associated with the absence of Stac3 (Reinholt et al., PLoS One 8, e62760 2013, Nelson et al. Proc. Natl. Acad. Sci. USA 110:11881 2013).
  
  b) In the abstract, the authors stated that skeletal muscle is responsible for voluntary movement. It is also responsible for non-voluntary. The abstract needs to be refocused on the mutation and on what we learn from this study. Please avoid vague statements like "we provide important insights to the pathophysiological mechanisms..." mainly when the study is descriptive and not mechanistic.
  
  The abstract of the revised manuscript has been rewritten. In particular, we removed statements referring to important “pathophysiological mechanistic insight”.
  
  c) The author should bring up the mutation name, location and phenotype early in the introduction.
  
  In the revised manuscript we provide the information requested by the Reviewer (page 2 lines 36-38 and page 4, lines 98-102).
  
  d) This reviewer also suggests that the authors refocus the introduction on the mutation location in the 3D RyR1 structure (available cryo-EM structure), if there is any nearby ligand binding site, protomers junction or any other known interacting protein partners. This will help the reader to understand how this mutation could be important for the channel's function
  
  The residue Ala4329 is present inside the TMx (Auxiliary transmembrane helices) domain which spans from residue 4322 to 4370 and interposes structurally (des Georges A et al. 2016 Cell 167,145-57; Chen W, et al. 2020 EMBO Rep. 21, e49891). Although the structural resolution of the region has been improved (des Georges et al, 2016), parts of the domain still remain with no defined atomic coordinates, especially the region encompassing a.a. E4253 – F4540. Because of such undefined atomic coordinates of the region E4253-F4540, we are not able to determine the real orientation and the disposition of the amino acids in this region, including the A4329 residue. As reference, structure PDB: 5TAL of des Georges et al, 2016 was analyzed with UCSF Chimera (production version 1.16) (Pettersen et al. J. Comput. Chem. 25: 1605-1612. doi: 10.1002/jcc.20084).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.09.26.509474v1
www.biorxiv.org www.biorxiv.org

New submission 11/07/2022, 11:59:29

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors reveal dual regulatory activity of the complex nuclear receptor element (cNRE; contains hexads A+B+C) in cardiac chambers and its evolutionary origin using computational and molecular approaches. Building upon a previous observation that hexads A and B act as ventricular repressor sequences, in this study the authors identify a novel hexad C sequence with preferential atrial expression. The authors also reveal that the cNRE emerged from an endogenous viral element using comparative genomic approaches. The strength of this study is in a combination of in silico evolutionary analyses with in vivo transgenic assays in both zebrafish and mouse models. Rapid, transient expression assays in zebrafish together with assays using stable, transgenic mice demonstrate dual functionality of cNRE depending on the chamber context. This is especially intriguing given that the cNRE is present only in Galliformes and has originated likely through viral infection. Interestingly, there seem to be some species-specific differences between zebrafish and mouse models in expression response to mutations within the cNRE. Taken together, these findings bear significant implications for our understanding of dual regulatory elements in the evolutionary context of organ formation.
  
  We thank Reviewer 1 for the thorough review and are very pleased with their favorable view of our manuscript. We also thank Reviewer 1 for suggestions and opportunities to further clarify some relevant issues.
  
  Reviewer #2 (Public Review):
  
  Nunes Santos et al. investigated the gene regulatory activity of the promoter of the quail myosin gene, SMyHC III, that is expressed specifically in the atria of the heart in quails. To do so, they computationally identified a novel 6-bp sequence within the promoter that is putatively bound by a nuclear receptor transcription factor, and hence is a putative regulatory sequence. They tested this sequence for regulatory activity using transgenic assays in zebrafish and mice, and subjected this sequence to mutagenesis to investigate whether gene regulatory effects are abrogated. They define this sequence, together with two additional known 6-bp regulatory sequences, as a novel regulatory sequence (denoted cNRE) necessary and sufficient for driving atrial-specific expression of SMyHC III. This cNRE sequence is shared across several galliform species but appears to be absent in other avian species. The authors find that there is sequence homology between the cNRE and several virus genomes, and they conclude that this regulatory sequence arose in the quail genome by viral integration.
  
  Strengths: The evolutionary origins of gene regulatory sequences and their impact on directing tissue-specific expression are of great interest to geneticists and evolutionary biologists. The authors of this paper attempt to bring this evolutionary perspective to the developmental biology question of how genes are differentially expressed in different chambers of the heart. The authors test for regulatory activity of the putative regulatory sequence they identified computationally in both zebrafish and mouse transgenic assays. The authors disrupt this sequence using deletions and mutagenesis, and introduce a tandem repeat of the sequence to a reporter gene to determine its consequences on chamber activity. These experiments demonstrate that the identified sequence has regulatory activity.
  
  We appreciate the thorough review of our manuscript and are very stimulated by the reviewer’s understanding of the contents we presented. We will take the liberty to comment after the reviewer’s considerations, in the hope to better answer the relevant points.
  
  Weaknesses: There are several decisions and assumptions that have been made by the authors, the reasons for which have not been articulated. Firstly, the rationale for the approach is not clear. The study is a follow-up to work previously performed by the authors which identified two 6-bp sequences important for controlling atrial-specific expression of the quail SMyHC III gene. This study appears to be motivated by the fact that these two sequences, bound by nuclear receptors, do not fully direct chamber-specific expression, and therefore this study aims to find additional regulatory sequences. It is assumed that any additional regulatory sequences should also be bound by nuclear receptors, and be 6-bp in length, and therefore the authors search for 6-bp sequences bound by nuclear receptors. It is not clear what the input sequence for this analysis was.
  
  Thank you for giving us the opportunity to clarify our rationale. Our approach is justified by the natural progression in the understanding of the mechanisms involved in preferential atrial expression by the SMyHC III promoter. The groundwork was solidly laid down by Wang and colleagues (see references below). They mapped potential atrial stimulators and ventricular repressors throughout the SMyHC III promoter using atrial and ventricular cultures, respectively. Wang and colleagues pinned down the relevant regulators step by step: first between -840 and -680 bp upstream of the transcription start site, then in the 72-bp fragment contained within that stretch, and finally in Hexads A and B inside the 72-bp sequence, which they identified as the ventricular repressor (see references below). In this manuscript, we contributed the identification of Hexad C (immediately downstream of Hexads A and B) as a potential nuclear receptor binding site and as a bona fide atrial activator. In summary, our work represents a logical conclusion of previous work by Wang and colleagues. We continued the process of narrowing down sequences previously proven to contain atrial activators (that were unknown before our present work) and ventricular repressors (that were already described).
  
  Why did we use nuclear receptors as models for the putative cardiac chamber regulators binding to the cNRE? This is because previous work by Wang et al., 1996, 1998, 2001 and by Bruneau et al., 2001 showed that the 5’ portion of the cNRE (Hexads A and B) is indeed a hub for the integration of signals conveyed by nuclear receptors. Originally, Wang et al., 1996 showed that the VDR response element is a ventricular repressor acting via the 5’ portion of the cNRE. In a subsequent manuscript, Wang et al., 1998 showed that both RAR and VDR bind the 5’ portion of the cNRE. Bruneau et al., 2001 showed, by crossing IRX4 knockout mice with SMyHC III-HAP mice (Xavier-Neto et al., 1999), that IRX4 plays the role of a repressor of SMyHC III-HAP expression. Finally, Wang et al., 2001 showed that IRX4 interacts with RXR bound to the 5’ portion of the cNRE to inhibit ventricular expression.
  
  Why was the 3’ Hexad included as a research subject? Very early on in our work it was noted that 3’ of the original VDR response element (Hexads A and B), described by Wang et al., 1996 and 1998 as a ventricular repressor, there was a sequence (Hexad C) with almost equal binding potential to nuclear receptors as Hexads A and B (as initially judged on the basis of comparisons with canonical nuclear receptor binding sequences, but later on confirmed by in silico profiling of nuclear receptor binding, see below). This discovery prompted us to design point mutants in the 3’ portion of the cNRE to investigate whether Hexad C contained relevant regulators of heart chamber expression. These analyses revealed a strong atrial activator in the mouse (the missing atrial activator from Wang et al., 1996, 1998, 2001).
  
  Wang, G. F., Nikovits, W., Schleinitz, M., and Stockdale, F. E. (1996). Atrial chamber-specific expression of the slow myosin heavy chain 3 gene in the embryonic heart. J. Biol. Chem. 271, 19836-19845.
  
  Wang, G. F., Nikovits, W. Jr., Schleinitz, M., and Stockdale, F. E. (1998). A positive GATA element and a negative vitamin D receptor-like element control atrial chamber-specific expression of a slow myosin heavy-chain gene during cardiac morphogenesis. Mol. Cell Biol. 18, 6023-6034.
  
  Xavier-Neto, J., Neville, C. M., Shapiro, M. D., Houghton, L., Wang, G. F., Nikovits, W. Jr, Stockdale, F. E., and Rosenthal, N. (1999). A retinoic acid-inducible transgenic marker of sino-atrial development in the mouse heart. Development 126, 2677-2687.
  
  Bruneau, B. G., Bao, Z. Z., Fatkin, D., Xavier-Neto, J., Georgakopoulos, D., Maguire, C. T., Berul, C. I., Kass, D. A., Kuroski-de Bold, M. L., de Bold, A. J., Conner, D. A., Rosenthal, N., Cepko, C. L., Seidman, C. E., and Seidman, J. G. (2001). Cardiomyopathy in Irx4-deficient mice is preceded by abnormal ventricular gene expression. Mol. Cell Biol. 21, 1730-1736.
  
  Wang, G. F., Nikovits, W. Jr., Bao, Z.Z., and Stockdale, F.E. (2001). Irx4 forms an inhibitory complex with the vitamin D and retinoic X receptors to regulate cardiac chamber-specific slow MyHC3 expression. J Biol Chem. 276, 28835-28841.
  
  The methods section mentions the cNRE sequence, but this is their newly defined regulatory sequence based on the newly identified 6-bp sequence. It is therefore unclear why Hexad C was identified to be of interest, and not the GATA binding site for example, and whether other sequences in the promoter might have stronger effects on driving atrial-specific expression.
  
  As for the existence of binding sites other than Hexads A, B, and C, we cannot formally exclude the possibility that there may be other relevant regulators of the SMyHC III gene. But we note that the sequences that we utilized were previously mapped, through a deletion-mutant promoter approach by Wang et al., 1996, as the most powerful atrial activator(s) and ventricular repressor(s). We addressed these concerns in a new section entitled “Limitations of our work”.
  
  Concerning GATA regulation, Wang et al., 1996, 1998 characterized a GATA-4 site that drives generalized (atrial and ventricular) cardiac expression in quail cultures. However, we were unable to identify any relevant changes in cardiac expression in mutant GATA SMyHC III-HAP transgenic mouse lines produced with the same mutated promoter sequences described by Wang et al., 1996, 1998.
  
  Finding Hexad C as an atrial activator was an experimental finding. We identified it as such because we had two important inputs. First, in 1997, we consulted with Ralff Ribeiro, a specialist on nuclear receptors, and he pointed out that downstream of the Hexad A + Hexad B VDRE/RARE (the ventricular repressor), there was a sequence with good potential for a nuclear receptor binding motif. This was exactly Hexad C. Then, we confirmed its potential for nuclear receptor binding by nuclear receptor profiling. After these two pieces of evidence, we thought that there was enough evidence to justify a mutant construct (Mut C). The experimental results we obtained in transgenic mice and zebrafish are consistent with the hypothesis that Hexad C does contain the long-sought atrial activator predicted by Wang et al., 1996 in atrial cultures. This seems to be the most important atrial activator (a seven-fold activator) predicted by a deletion approach to be located between -840 and -680 bp in Wang et al., 1996.
  
  Wang, G. F., Nikovits, W., Schleinitz, M., and Stockdale, F. E. (1996). Atrial chamber-specific expression of the slow myosin heavy chain 3 gene in the embryonic heart. J. Biol. Chem. 271, 19836-19845.
  
  Wang, G. F., Nikovits, W. Jr., Schleinitz, M., and Stockdale, F. E. (1998). A positive GATA element and a negative vitamin D receptor-like element control atrial chamber-specific expression of a slow myosin heavy-chain gene during cardiac morphogenesis. Mol. Cell Biol. 18, 6023-6034.
  
  Indeed, the zebrafish transgenic assays use the 32 bp cNRE, while in the mouse transgenic assays, a 72 bp region is used. This choice of sequence length is not justified.
  
  As stated above, our rationale was built as a continuation of the thorough work by Wang and colleagues in progressively narrowing down the location of relevant atrial stimulators and ventricular repressors. Throughout our work, we sought to obtain maximal coherence with previous studies (see references below) and to simultaneously probe cNRE function at an increased resolution. For that, we utilized previously described mutant SMyHC III promoter constructs (Wang et al., 1996) and introduced novel site-directed dinucleotide substitution mutants of individual Hexads in the SMyHC III promoter.
  
  Wang, G. F., Nikovits, W., Schleinitz, M., and Stockdale, F. E. (1996). Atrial chamber-specific expression of the slow myosin heavy chain 3 gene in the embryonic heart. J. Biol. Chem. 271, 19836-19845.
  
  Wang, G. F., Nikovits, W. Jr., Schleinitz, M., and Stockdale, F. E. (1998). A positive GATA element and a negative vitamin D receptor-like element control atrial chamber-specific expression of a slow myosin heavy-chain gene during cardiac morphogenesis. Mol. Cell Biol. 18, 6023-6034.
  
  Xavier-Neto, J., Neville, C. M., Shapiro, M. D., Houghton, L., Wang, G. F., Nikovits, W. Jr, Stockdale, F. E., and Rosenthal, N. (1999). A retinoic acid-inducible transgenic marker of sino-atrial development in the mouse heart. Development 126, 2677-2687.
  
  Bruneau, B. G., Bao, Z. Z., Fatkin, D., Xavier-Neto, J., Georgakopoulos, D., Maguire, C. T., Berul, C. I., Kass, D. A., Kuroski-de Bold, M. L., de Bold, A. J., Conner, D. A., Rosenthal, N., Cepko, C. L., Seidman, C. E., and Seidman, J. G. (2001). Cardiomyopathy in Irx4-deficient mice is preceded by abnormal ventricular gene expression. Mol. Cell Biol. 21, 1730-1736.
  
  Wang, G. F., Nikovits, W. Jr., Bao, Z.Z., and Stockdale, F.E. (2001). Irx4 forms an inhibitory complex with the vitamin D and retinoic X receptors to regulate cardiac chamber-specific slow MyHC3 expression. J Biol Chem. 276, 28835-28841.
  
  The decisions about which bases to mutate in the three hexads are also not clear. Why are the first two bases mutated in Hexad B and C and the whole region mutated in Hexad A? Is there a reason to believe these bases are particularly important?
  
  As for the reasons behind mutation of the first two bases in Hexad B and Hexad C, there were two:
  
  One reason is that these point mutations in Hexads B and C were planned after the publication of Wang et al., 1996, which defined the major role of Hexad A in ventricular repression. After this discovery, we decided that a higher level of resolution in our mutation approach would be a better way to search for additional regulators of SMyHC III expression, including the atrial regulator that was readily apparent from the results shown in Wang et al., 1996, but had not yet been described.
  
  The second reason is that the first two nucleotides (purines) in a nuclear-receptor binding hexad are critical for the interaction between target DNA and transcription factors of the nuclear receptor family. Substituting pyrimidines for purines in the first two positions of a hexad drastically reduces the affinity of a nuclear response element, and that is why we chose to use TT substitutions in our mutant constructs. Please refer to: Umesono et al., Cell, 1991 65: 1255-1266 for a review; Mader et al., J Biol Chem, 1993 268:591-600 for a mutation study; Rastinejad et al., EMBO J., 2000 19:1045-1054 for a crystallographic study (as well as additional references listed below).
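
  As a minimal illustration of the TT substitution described above, the sketch below replaces the first two bases of a 6-bp hexad with thymines. The Hexad B and Hexad C sequences are not given in this response, so the canonical nuclear receptor half-site AGGTCA is used purely as a stand-in; the function names are hypothetical and are not taken from the manuscript.

```python
PURINES = {"A", "G"}


def leading_purines(hexad: str) -> bool:
    """Return True if the first two bases of the hexad are purines (A or G)."""
    return all(base in PURINES for base in hexad.upper()[:2])


def tt_substitute(hexad: str) -> str:
    """Replace the first two bases of a 6-bp hexad with TT (pyrimidines)."""
    hexad = hexad.upper()
    if len(hexad) != 6 or set(hexad) - set("ACGT"):
        raise ValueError("expected a 6-bp DNA hexad made of A, C, G, T")
    return "TT" + hexad[2:]


# Stand-in sequence (assumption): the canonical half-site, not Hexad B or C.
consensus = "AGGTCA"
mutant = tt_substitute(consensus)
print(consensus, "->", mutant)
print("leading purines before/after:", leading_purines(consensus), leading_purines(mutant))
```

  The sketch only shows which positions change; it says nothing about binding affinity, which is what the cited mutation and crystallographic studies address.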
  
  Mader, S., Chen, J. Y., Chen, Z., White, J., Chambon, P., and Gronemeyer, H. (1993). The patterns of binding of RAR, RXR and TR homo- and heterodimers to direct repeats are dictated by the binding specificites of the DNA binding domains. EMBO J. 12, 5029-5041.
  
  Ribeiro, R. C., Apriletti, J. W., Yen, P.M., Chin, W. W., and Baxter, J. D. (1994). Heterodimerization and deoxyribonucleic acid-binding properties of a retinoid X receptor-related factor. Endocrinology 135, 2076-2085.
  
  Zhao, Q., Chasse, S. A., Devarakonda, S., Sierk, M. L., Ahvazi, B., and Rastinejad, F. (2000). Structural basis of RXR-DNA interactions. J. Mol. Biol. 296, 509-520.
  
  Shaffer, P. L. and Gewirth, D. T. (2002). Structural basis of VDR-DNA interactions on direct repeat response elements. EMBO J. 21, 2242-2252.
  
  The control mutant also has effects on the chamber distribution of GFP expression.
  
  We note that, in the mouse, MutS did not produce any major changes from the typical wild type phenotypes linked to SMyHC III-HAP transgenic hearts. We concluded, based on our data, that the spacing mutant worked reasonably well as a negative mutation control in mice. We agree that it would have been particularly elegant if a spacing mutant designed for the mouse context worked in the exact same way in the zebrafish. However, the fact that there are slight differences in behavior for the mutated “spacing” constructs in species separated by millions of years of independent evolution is not really surprising, given that the amino acid sequence of transcription factors can diverge and co-evolve with binding nucleotides and end up drifting quite substantially from an ancestral setup. As we reiterate below, we consider more fundamental the fact that the cNRE is actually able to bias cardiac expression towards a model of preferential atrial expression, even in the context of species separated by millions of years of independent evolution.
  
  Two claims in the paper have weak evidence. Firstly, the conclusion that the cNRE is necessary and sufficient for driving preferential expression in the atrium. Deleting the cNRE does reduce the amount of atrial reporter gene expression but there is not a "conversion" from atrial to ventricular expression as mentioned in line 205. Similarly, a fusion of 5 tandem repeats of the cNRE can induce expression of a ventricular gene in the atria (I'm assuming a single copy is insufficient), but does not abolish ventricular expression.
  
  We agree that our labelling of the cNRE is perhaps too strong, and we have toned it down accordingly to incorporate the much more equilibrated concept that the cNRE biases cardiac expression towards a model of preferential atrial expression.
  
  However, after the corrections suggested, we believe our assertion is now justified. We show that in the mouse, removal of the cNRE is followed by a major reduction of atrial expression coupled to the release of a low, but quite clear level of expression in the ventricles, when compared to the transgenic mouse harboring the wild type SMyHC III promoter. Note that, as expected, the relative power of the cNRE to establish preferential atrial expression is higher in the mouse (a mammal) than it is in the zebrafish (a teleost), which is biologically sound, as mammals and avians are closer, phylogenetically, than teleosts and avians. Yet, the direction of change of expression in atria and ventricles was exactly as expected, if a given motif responsible for preferential atrial expression was removed (the cNRE in our case), that is: marked reduction in atrial expression and small (albeit clearly evident) release of ventricular expression. We believe that these directional changes observed in species separated by millions of years of independent evolution constitute very good biological evidence for the role of the cNRE in driving preferential atrial expression.
  
  Concerning the 5x fusion of cNREs, we chose to produce this multimer for safety purposes only, because we did not want to risk performing incomplete experiments and having to repeat them. However, more to the point, we later compared the efficiency of one (1) versus five (5) cNRE copies in a cell culture context and the results were not different.
  
  Secondly, the authors claim that the cNRE regulatory sequence arose from viral integration into the genomes of galliform species. While this is an attractive mechanism for explaining novel regulatory sequences, the evidence for this is based purely on sequence homology to viral genomes. And this single observation is not robust as the significance of the sequence matches does not appear to be adjusted for sequence matches expected by chance. The "evolutionary pathway" leading to the direction of chamber-specific expression in the heart as highlighted in the abstract has therefore not been demonstrated.
  
  We agree with the reviewer. Because of space constraints, we decided to omit a substantial part of our work from the initial submission of the manuscript. We now include the relevant data in the revised version. We thus mapped the phylogenetic origins of the SMyHC III family of slow myosins and then established how and when the cNREs became topologically associated with the SMyHC III gene. To do that, we repeat-masked all available sequences from avian SMyHC III orthologs. As will become clear below, the cNRE is a rare sequence, rather than a low complexity repeat. Our search for cNREs outside of the quail context (Coturnix coturnix) followed two independent lines. First, we took a scaled, evolution-oriented approach. Initially, we looked for cNREs in species close to the quail (i.e., Galliformes) and then progressively farther, to include derived (i.e., Passeriformes) and basal avians (i.e., Paleognaths) as well as external groups such as crocodilians. While pursuing this line of investigation, it became clear that the cNRE was a rare form of repetitive element, which showed a conserved topological relationship with the SMyHC III gene (i.e., cNREs flanked the SMyHC III genes at 5’ and 3’ regions). Using this topological relationship as a character, we determined when it appeared during avian evolution and then set out to establish the likely origins of this rare repetitive motif. This search for the origins of the cNRE entailed comparisons to databases of repetitive genome elements, until the extreme telomeric nature of the SMyHC III gene became evident. This finding directed us to the fact that the hexad nature of the cNRE is reminiscent of the hexameric character of telomeric direct repeats. Because direct telomeric repeats are exactly featured in the genomes of avian DNA viruses that can infect the germline and integrate into the avian genome, we focused our search for the cNRE on the members of the subfamily Alphaherpesvirinae (Morissette & Flamand, 2010). In this search, we utilized the human herpes simplex virus 1 (HSV1) as a general model for herpes viruses, and a set of four (4) members of the Alphaherpesvirinae subfamily that specifically infect Galliformes (i.e., GaHV1, the virus responsible for avian infectious laryngotracheitis in chicken, GaHV2, the Marek’s disease virus, GaHV3, a non-pathogenic virus, and MeHV1, the non-pathogenic Meleagrid herpesvirus 1 capable of infecting chicken and wild turkey) (Waidner et al., 2009). The search for cNREs in Alphaherpesvirinae was successful. We found six (6) cNRE hits in HSV1, one (1) in GaHV1, and none in MeHV1, GaHV2, and GaHV3. Our evolution-directed approach thus led to the direct recognition that cNREs can be found in the genomes of a family of viruses that contain members that infect avians and integrate their double-stranded DNA into the host germline (Morissette & Flamand, 2010). Therefore, as a second independent approach, as pointed out by the reviewer, we set out to further extend this proof of concept by broadening our search to all known sequenced viruses and performing an unbiased, internally consistent, and quantitative analysis of cNRE presence in viral genomes, as already reported in the initial submission of this manuscript.
  
  Reviewer #3 (Public Review):
  
  Summary:
  
  In this manuscript Nunes Santos et al. use a combination of computational and experimental methods to identify and characterize a cis-regulatory element that mediates expression of the quail Slow Myosin Heavy Chain III (SMyHC III) gene in the heart (specifically in the atria). Previous studies had identified a cis-regulatory element that can drive expression of SMyHC III in the heart, but not specifically (solely) in the atria, suggesting additional regulatory elements are responsible for the specific expression of SMyHC III in the atria as opposed to other elements of the heart. To identify these elements Nunes Santos et al. first used a bioinformatic approach to identify potentially functional nuclear receptor binding sites ("Hexads") in the SMyHC III promoter; previous studies had already shown that two of these Hexads are important for SMyHC III promoter function. They identified a previously unknown third Hexad within the promoter, and propose that the combination of these three (called the complex Nuclear Receptor Element or cNRE) is necessary and sufficient for specific atrial expression of SMyHC III. Next, they use experimental methods to functionally characterize the cNRE, including showing that the quail SMyHC III promoter can drive green fluorescent protein (GFP) expression in the atrium of developing zebrafish embryos and that the cNRE is necessary to drive the expression of the human alkaline phosphatase reporter gene (HAP) in transgenic mouse atria. Additional experiments show that the cNRE is a portable regulatory element that can drive atrial expression and demonstrate the importance of the three Hexad parts. These data demonstrating that the cNRE mediates atrial-specific expression are well done and convincing. The authors also note the possibility that the cNRE might be derived from an endogenous viral element but further data are needed to support the hypothesis that the cNRE is of viral origin.
  
  Strengths:
  
  1) The experimental work demonstrating that the cNRE is a regulatory element that can mediate the atrial-specific expression of SMyHC III.
  
  We thank reviewer 3 for this thorough appreciation of our work and are pleased with the evaluation of our manuscript’s potential.
  
  Weaknesses:
  
  1) Justification for use of different regulatory elements in the zebrafish (32 bp cNRE) and the mouse transgenic assays (72 bp cNRE), and discussion of the impact of this difference on the results/interpretation.
  
  In general, throughout our work, we sought to obtain maximal coherence with previous studies (see references below) and to simultaneously probe cNRE function at an increased resolution. For that, we utilized previously described mutant SMyHC III promoter constructs (Wang et al., 1996, 1998) and introduced novel site-directed dinucleotide substitution mutants of individual Hexads in the SMyHC III promoter. Actually, the 72-bp construct is not a 72-bp construct. It is a 5’ deletion construct that removed 72 bp from the 840-bp wild type SMyHC III construct, transforming it into a 768-bp SMyHC III promoter construct. Any directional changes in cardiac expression observed with the 768-bp promoter, as compared to the wild type promoter, were interpreted as reflecting the loss of regulators present in this 5’ 72-bp region.
  
  Wang et al., 1996 and 1998 had already shown that Hexads A and B contained a functional VDRE/RARE, which acted as a ventricular repressor. Using the 768-bp SMyHC III promoter in mouse transgenic lines was thus a natural investigation step for us to evaluate whether regulation of the SMyHC III promoter in the mouse was similar to that in quail cardiac cultures. As shown in the manuscript, deletion of the 72 bp resulted in the release of a low level of expression in ventricles, consistent with the removal of a ventricular repressor (already described by Wang et al., 1996). It also showed a marked reduction in atrial transgene stimulation, suggesting the elimination of a very important atrial activator.
  
  In 1996, Wang and colleagues mapped an atrial activator to a sequence interval of 160 bp, between -840 and -680 bp (Wang et al., 1996). In our mouse transgenics, we reduced this interval to a mere 72 bp, between -840 and -768 bp. This was very useful information. Wang et al., 1998 showed that HF-1a, M-CAT, and E-box sites located between -840 and -808 bp did not influence atrial expression, so now we had a potential interval of only 40 bp between -808 and -768 bp. Further, our transgenic mice indicated that the GATA site located 3’ from Hexads A, B, and C (GATA site changed to a Sal I site at positions -749 to -743 bp) did not work as a general activator, as in the quail. Thus, the only good candidate for the atrial activator in mice inside the 40-bp fragment between -808 and -768 bp was the cNRE, with its three Hexads, A, B and the novel Hexad C. Because Hexads A plus B composed a functional VDRE/RARE that played a role in ventricular repression in the quail, we hypothesized that the atrial activator would be present in Hexad C. We then mutated the first two purines in Hexad C (the most important ones for nuclear receptor binding, please refer to Umesono et al., Cell, 1991 65: 1255-1266 for a review; Mader et al., J Biol Chem, 1993 268:591-600 for a mutation study; Rastinejad et al., EMBO J., 2000 19:1045-1054 for a crystallographic study as well as additional references listed below) and performed the experiments that demonstrated a profound reduction in atrial expression in the mouse context, revealing the long-sought atrial activator.
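
  The interval narrowing described above is simple arithmetic, and the short sketch below merely recomputes the three interval lengths from the promoter coordinates quoted in this response (positions relative to the transcription start site). The helper name and the step labels are ours, introduced only for illustration.

```python
def interval_length(start: int, end: int) -> int:
    """Length in bp of a promoter interval given by two positions relative to the TSS."""
    return abs(end - start)


# Coordinates as quoted in the response; labels are descriptive only.
steps = [
    ("deletion mapping (Wang et al., 1996)", -840, -680),
    ("5' deletion tested in mouse transgenics", -840, -768),
    ("after excluding the -840 to -808 sites (Wang et al., 1998)", -808, -768),
]

for label, start, end in steps:
    print(f"{label}: {start} to {end} bp spans {interval_length(start, end)} bp")
```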
  
  Mader, S., Chen, J. Y., Chen, Z., White, J., Chambon, P., and Gronemeyer, H. (1993). The patterns of binding of RAR, RXR and TR homo- and heterodimers to direct repeats are dictated by the binding specificites of the DNA binding domains. EMBO J. 12, 5029-5041.
  
  Ribeiro, R. C., Apriletti, J. W., Yen, P.M., Chin, W. W., and Baxter, J. D. (1994). Heterodimerization and deoxyribonucleic acid-binding properties of a retinoid X receptor-related factor. Endocrinology 135, 2076-2085.
  
  Wang, G. F., Nikovits, W., Schleinitz, M., and Stockdale, F. E. (1996). Atrial chamber-specific expression of the slow myosin heavy chain 3 gene in the embryonic heart. J. Biol. Chem. 271, 19836-19845.
  
  Wang, G. F., Nikovits, W. Jr., Schleinitz, M., and Stockdale, F. E. (1998). A positive GATA element and a negative vitamin D receptor-like element control atrial chamber-specific expression of a slow myosin heavy-chain gene during cardiac morphogenesis. Mol. Cell Biol. 18, 6023-6034.
  
  Zhao, Q., Chasse, S. A., Devarakonda, S., Sierk, M. L., Ahvazi, B., and Rastinejad, F. (2000). Structural basis of RXR-DNA interactions. J. Mol. Biol. 296, 509-520.
  
  Shaffer, P. L. and Gewirth, D. T. (2002). Structural basis of VDR-DNA interactions on direct repeat response elements. EMBO J. 21, 2242-2252.
  
  2) Is the cNRE really "necessary and sufficient"? I define necessary and sufficient in this context as a regulatory element that fully recapitulates the expression of the target gene, so if the cNRE was "necessary and sufficient" to direct the appropriate expression of SMyHC III it should be able to drive expression of a reporter gene solely in the atria. While deletion of the cNRE does reduce expression of the reporter gene in atria it is not completely lost nor converted from atrial to ventricular expression (as I understand the study design would suggest should be the effect), similarly fusion of 5 repeats of the cNRE induces expression of a ventricular gene in the atria but also does not convert expression from ventricle to atria. This doesn't seem to satisfy the requirements of a "necessary and sufficient" condition. Perhaps a discussion of why the expectations for "necessary and sufficient" are not met but are still consistent would be beneficial here.
  
  We agree with your reasoning. Our description of the cNRE was perhaps too strong, and we have toned it down accordingly in the revised manuscript to incorporate a much more equilibrated concept that the cNRE biases cardiac expression towards a model of preferential atrial expression. After these corrections, we believe our novel assertion is justified. We show that in the mouse, removal of the cNRE is followed by a major reduction of atrial expression coupled to the release of a low, but quite clear level of expression in the ventricles, when compared to the transgenic mouse harboring the wild type SMyHC III promoter. Note that, as expected, the relative power of the cNRE to establish preferential atrial expression is higher in the mouse (a mammal) than it is in the zebrafish (a teleost), which is biologically sound, as mammals and avians are closer, phylogenetically, than teleosts and avians. Yet, the direction of change of expression in atria and ventricles was exactly as expected, if a given motif responsible for preferential atrial expression was removed (the cNRE in our case), that is: marked reduction in atrial expression and small (albeit evident) release of ventricular expression. We believe that these directional changes observed in species separated by millions of years of independent evolution constitute very good biological evidence for the role of the cNRE in driving preferential atrial expression.
  
  3) The claim that the cNRE is derived from a viral integration is not supported by the data. Specifically, the cNRE has sequence similarity to some viral genomes, but this need not be because of homology and can also be because of chance or convergence. Indeed, the region of the chicken genome with the cNRE does have repetitive elements but these are simple sequence repeats, such as (CTCTATGGGG)n and (ACCCATAGAG)n, and a G-rich low complexity region, rather than viral elements. The same is true for the turkey genome. These data indicate that the cNRE is not derived from an endogenous virus but is a repetitive and low complexity region. These regions are expected to occur more frequently than larger and more complex regions, which would cause the BLAST E value to decrease and appear "significant", but this is entirely expected because short alignments can have high E values by chance. (Also note that E values do not indicate statistical significance; rather, they are the number of hits one can "expect" to see by chance when searching a specific database.)
  
  We do understand the criticism, but we would like to advance another concept, based on a series of results that we obtained using bioinformatics-oriented and evolution-oriented analyses. We performed a cNRE scan in the Gallus gallus genome (galGal5), using varying numbers of nucleotide mismatches. When we scanned the galGal5 genome for cNRE locations using matchPattern, allowing up to 8 mismatches, only thirty-one (31) and thirty-four (34) hits were found in the 5’ and 3’ strands, respectively. This indicates that a cNRE match is a rather uncommon finding in the Gallus gallus genome.
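
  To make the idea of a mismatch-tolerant scan concrete, the sketch below slides a query along a sequence and reports every window within a maximum Hamming distance, on either strand. It is not the matchPattern tool named above and makes no claim to reproduce its behaviour; the function names are hypothetical. Because the quail cNRE sequence is not given in this response, the two chromosome 19 hits listed with Figure 1 are used as stand-in query and target, embedded in an invented toy sequence.

```python
from typing import Iterator, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan(subject: str, pattern: str, max_mismatch: int) -> Iterator[Tuple[int, int, str]]:
    """Yield (1-based start, mismatches, strand) for each window within max_mismatch."""
    subject, pattern = subject.upper(), pattern.upper()
    queries = (("+", pattern), ("-", reverse_complement(pattern)))
    width = len(pattern)
    for i in range(len(subject) - width + 1):
        window = subject[i:i + width]
        for strand, query in queries:
            distance = hamming(window, query)
            if distance <= max_mismatch:
                yield i + 1, distance, strand


# Stand-in data (assumption): the two chr19 hits reported with Figure 1.
HIT_1 = "CAAGGACAAAGAGGGGACAAAGAGGCGGAGGT"
HIT_2 = "CAAGGACAAAGAGTGGACAAAGAGGCAGACGT"
toy_subject = "TTTT" + HIT_2 + "AAAA"  # invented flanks, for illustration only

for start, mismatches, strand in scan(toy_subject, HIT_1, max_mismatch=3):
    print(f"start={start} mismatches={mismatches} strand={strand}")
```

  A window-by-window comparison like this is far too slow for a whole genome; it is meant only to show what "up to N mismatches" means operationally.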
  
  A more systematic profiling of genome occurrence versus nucleotide mismatch indicated that a significant upward inflexion in the relationship between number of cNRE hits and divergence from the original cNRE version (Coturnix coturnix) is recorded only at 12 mismatches or greater. At 8 mismatches, the total number of cNREs on each DNA strand varied little among all avian species examined, remaining close to the average (31 +/- 2.2 cNREs for the 5’ strand, range 17-48; 34 +/- 3.3 for the 3’ strand, range 14-64). Consistent with the idea that the cNRE is a specific regulatory motif, rather than an abundant, low complexity sequence, there are only two cNRE occurrences in chromosome 19, which harbors AMHC1, the Gallus gallus ortholog of the Coturnix coturnix SMyHC III gene.
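
  One way to put hit counts at different mismatch thresholds in context is a naive chance baseline. The sketch below computes the expected number of windows within k mismatches of a fixed pattern, assuming independent bases with equal frequencies. That assumption does not hold for real genomes, so this is only a rough reference and not the analysis reported in the manuscript. The genome size is an assumed round figure, not a value taken from this response, and the function names are hypothetical.

```python
from math import comb


def p_window(length: int, max_mismatch: int) -> float:
    """P(a random window lies within max_mismatch of a fixed pattern), uniform i.i.d. bases."""
    favourable = sum(comb(length, k) * 3**k for k in range(max_mismatch + 1))
    return favourable / 4**length


def expected_hits(genome_size: int, length: int, max_mismatch: int) -> float:
    """Expected number of matching windows on one strand under the same naive model."""
    windows = genome_size - length + 1
    return windows * p_window(length, max_mismatch)


GENOME_SIZE = 1_200_000_000  # assumed round figure, not from the source
CNRE_LENGTH = 32             # length of the cNRE as stated in the reviews

for k in (0, 4, 8, 12):
    print(f"max mismatches {k:2d}: expected hits per strand = "
          f"{expected_hits(GENOME_SIZE, CNRE_LENGTH, k):.3g}")
```

  The expected count rises steeply with the number of mismatches allowed, which is why any comparison between observed and chance hits depends strongly on the chosen threshold and on the base composition model.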
  
  Figure 1: Number of cNRE hits to galGal5 according to maximum mismatches allowed: the cNRE is not an abundant low complexity sequence, but rather a rare repetitive sequence with a clear cutoff level of mismatches allowed. Consistent with this, there are only two (2) cNRE sequences in chromosome 19, the chromosome that contains the AMHC1 gene (the chicken ortholog of the quail SMyHC III gene).
  
  [1] chr19 [16510, 16541] * | 5’-CAAGGACAAAGAGGGGACAAAGAGGCGGAGGT-3’
  [2] chr19 [32821, 32852] * | 5’-CAAGGACAAAGAGTGGACAAAGAGGCAGACGT-3’
  
  In the evolutionary strategy, which we now include, we first mapped the phylogenetic origins of the SMyHC III family of slow myosins and then established how and when the cNREs became topologically associated with the SMyHC III gene. To do that, we repeat-masked all available sequences from avian SMyHC III orthologs. As will become clear below, the cNRE is a rare sequence, rather than a low complexity repeat. Our search for cNREs outside of the quail context (Coturnix coturnix) followed two independent lines. First, we took a scaled, evolution-oriented approach. Initially, we looked for cNREs in species close to the quail (i.e., Galliformes) and then progressively farther, to include derived (i.e., Passeriformes) and basal avians (i.e., Paleognaths) as well as external groups such as crocodilians. While pursuing this line of investigation, it became clear that the cNRE was a rare form of repetitive element, which showed a conserved topological relationship with the SMyHC III gene (i.e., cNREs flanked the SMyHC III genes at 5’ and 3’ regions). Using this topological relationship as a character, we determined when it appeared during avian evolution, and then set out to establish the likely origins of this rare repetitive motif. This search for the origins of the cNRE entailed comparisons to databases of repetitive genome elements, until the extreme telomeric nature of the SMyHC III gene became evident. This finding directed us to the fact that the hexad nature of the cNRE is reminiscent of the hexameric character of telomeric direct repeats. Because direct telomeric repeats are exactly featured in the genomes of avian DNA viruses that can infect the germline and integrate into the avian genome (Morissette & Flamand, 2010), we focused our search for the cNRE on the members of the subfamily Alphaherpesvirinae. In this search, we utilized the human herpes simplex virus 1 (HSV1) as a general model for herpes viruses and a set of four (4) members of the Alphaherpesvirinae subfamily that specifically infect Galliformes (i.e., GaHV1, the virus responsible for avian infectious laryngotracheitis in chickens, GaHV2, the Marek’s disease virus, GaHV3, a non-pathogenic virus, and MeHV1, the non-pathogenic Meleagrid herpesvirus 1 capable of infecting chicken and wild turkey) (Waidner et al., 2009). The search for cNREs in Alphaherpesvirinae was successful. We found six (6) cNRE hits in HSV1 and one (1) in GaHV1, but none in MeHV1, GaHV2, and GaHV3.
  
  Our evolution-directed approach thus led to the direct recognition that cNREs up to a cutoff mismatch value of 11 can be found in the genomes of a family of viruses that contain members that infect avians and integrate their double-stranded DNA into the host germline. Therefore, as a second independent approach, we set out to extend this proof of concept by broadening our search to all known sequenced viruses to perform an unbiased, internally consistent, and quantitative analysis of cNRE presence in viral genomes, as already reported in the initial submission of this manuscript.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.11.18.469087v1
www.biorxiv.org

A morphological transformation in respiratory syncytial virus leads to enhanced complement activation

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  In this study, Kuppan, Mitrovich, and Vahey investigated the impact of antibody specificity and virus morphology on complement activation by human respiratory syncytial virus (RSV). By quantifying the deposition of components of the complement system on RSV particles using high-resolution fluorescence microscopy, they found that antibodies that bind towards the apex of the RSV F protein in either the pre- or post-fusion conformation activated complement most efficiently. Additionally, complement deposition was biased towards globular RSV particles, which were frequently enriched in F in the post-fusion conformation compared to filamentous particles on which F exists predominantly in the pre-fusion conformation.
  
  Strengths:
  
  1) While many previous studies have examined the properties of antibodies that impact Fc-mediated effector functions, this study offers a conceptual advance in its demonstration that heterogeneity in virus particle morphology impacts complement activation. This novel finding will motivate further research on this topic both in the context of RSV and other viral infections.
  
  2) The use of site-specific labeling of viral proteins and high-resolution fluorescence microscopy represents a technical advance in monitoring interactions among different components of antiviral immune responses at the level of single virus particles.
  
  3) The paper is well written, data are clearly presented and support key claims of the paper with caveats appropriately acknowledged.
  
  We appreciate the reviewer’s supportive comments. In our revised manuscript, we have focused on improving clarity regarding the minor weaknesses noted below.
  
  Minor weaknesses:
  
  Working models and their implications could be clarified and extended. Specifically:
  
  1) The finding that globular particles enriched in F proteins in the post-fusion conformation (Fig 3F) are dominant targets of complement activation as measured by C3 deposition by not only post-F- but also pre-F-specific antibodies (Fig 4B, left) is interesting. This is despite the fact that, as expected, pre-F antibodies bind less efficiently to globular particles (Fig 4B, right). How do the authors reconcile these observations, given that C3 deposition seems to be IgG-concentration-dependent (Fig 2E)?
  
  The reviewer raises an excellent point: globular particles, which accumulate as the virus ages, contain more post-F and less pre-F than particles that have recently been shed from infected cells. These ‘aged’ particles nonetheless accumulate more C3 when incubated with pre-F mAbs than ‘younger’ particles, where the proportion of pre-F is higher. We attribute this to the lower surface curvature of globular particles: they accumulate more C3 in the presence of pre-F mAbs in spite of the reduced availability of pre-F epitopes. Figure 1C and 1F help to support this point. This data shows C3 deposition driven by different antibodies bound to particles enriched in either pre-F (Figure 1C) or post-F (Figure 1F). Importantly, for this experiment the conversion to post-F was driven in such a way that virion morphology is preserved (Figure 1E). In this case, we see a clear reduction in C3 deposition by pre-F mAbs on post-F particles (e.g. for CR9501, the percentage of C3-positive particles drops from 24% on pre-F virus to 6% on post-F-enriched virus). This demonstrates that, in the absence of other changes, conversion of pre-F to post-F reduces complement deposition by pre-F specific mAbs.
  
  Similarly, the reviewer correctly points out that reduced levels of antibody binding lead to lower levels of C3 deposition (Figure 2E); however, as in Figure 1, this data is collected from particles with the same morphologies. Thus, in the absence of additional factors, reduction in mAbs bound to pre-F leads to a reduction in C3 deposition driven by these mAbs. The fact that we observe the opposite trend when changes in particle morphology accompany changes in post-F abundance points to an important role for particle shape in activation of the classical pathway.
  
  2) Based on data in Figure 5-figure supplement 2, the authors argue that "large viruses are poised to evade complement activation when they emerge from cells as highly-curved filaments, but become substantially more susceptible as they age or their morphology is physically disrupted." Could the increase in C3 deposition be alternatively explained by a higher density of F proteins on larger particles instead of / in addition to a larger potential decrease in membrane curvature?
  
  We agree that the density of F on a virus – the number of F trimers per unit surface area - likely contributes to the efficiency of C3 deposition. In Figure 6 – figure supplement 2 (Figure 5 – figure supplement 2 in the original submission), we control for this potential effect by comparing viruses that have the same amount of F (as measured by fluorescence intensities of SrtA-labeled F) that are either in filamentous form or globular form (induced through osmotic swelling). The total amount of F per virus is preserved during swelling, and the membrane surface area will remain constant due to the limited ability of lipid bilayers to stretch [7]. As a result, the input material for these comparisons is the same in terms of F trimers per unit area, yet the C3:F ratio differs substantially. This leads us to conclude that the differences must be attributable to factors other than the density of F. Importantly, this does not mean that the amount of F per unit surface area does not matter for C3 deposition – only that this is not the effect we are observing here. We have added text (Line 299) to help clarify this point: “This effect is unlikely to arise due to changes in the abundance or density of F in the viral membrane, both of which will remain constant following swelling. Similarly, it does not appear to be purely related to size, as larger viral filaments show similar C3:F ratios as smaller viral filaments.”
  
  3) In the discussion, the authors acknowledge that the implications based on the findings are speculative. However, more clarity on the basis of these speculative models would be useful. For example, it is not clear how the findings directly inform the presented model of immunodominance hierarchies in infants.
  
  We agree that this was unclear in the original manuscript. We have rewritten paragraph 4 of the Discussion to clarify how our results may contribute to the changes in immunodominance that have been observed in RSV between infants and adults.
  
  Reviewer #2 (Public Review):
  
  This is an intriguing study that investigates the role of virus particle morphology on the ability of the first few components in the complement pathway to bind and opsonize RSV virions. The authors use primarily fluorescence microscopy with fluorescently tagged F proteins and fluorescently labeled antibodies and complement proteins (C3 and C4). They observed that antibodies against different epitopes exhibited different abilities to induce C3 binding, with a trend reflecting positioning of IgG Fc more distal to the viral membrane resulting in better complement "activation". They also compared the ability of C3 to deposit on virus produced from cells +/- CD55, which inhibits opsonization, and showed knockout led to greater C3 binding, indicating a role for this complement "defense protein" in RSV opsonization. They also examined kinetics of complement protein deposition (probed by C4 binding) to globular vs filamentous particles, observing that deposition occurred more rapidly to non-filaments.
  
  A better understanding of complement activation in response to viruses can lead to a more comprehensive understanding of the immune response to antigen both beneficial and detrimental, when dysfunctional, during infection as well as mechanisms of combating the viral infection. The study provides new mechanistic information for understanding the properties of an enveloped virus that can influence complement activation, at least in an in vitro setting. It remains to be determined whether these effects manifest in the considerably more complex setting of natural infection or even in the presence of a polyclonal antibody mixture.
  
  The studies are elegantly designed and carefully executed with reasonable checks for reproducibility and controls, which is important especially in a relatively complex and heterogeneous experimental system.
  
  We thank the reviewer for the insightful comments. We have revised the manuscript to help to clarify points of confusion and to address some of the technical points raised here.
  
  Specific points:
  
  1) "Complement activation" involves much more than C3 or C4 binding. Better to use more specific terminology relating to the observable (i.e. fluorescently labeled complement component binding)
  
  We agree with the reviewer. We have revised the manuscript throughout to make our language more accurate and precise.
  
  2) What is the rationalization for concentrations of antibodies used? What range was tested, and how dependent on antibody concentration were the observed complement deposition trends? How do they relate to physiological concentrations, and how would the presence of a more complex polyclonal response that is typically present (e.g. as the authors noted, the serum prior to antibody depletion already mediates complement activation) affect the complement activation trends? The neat, uniform display of Fc for monoclonals that were tested is likely to be quite garbled in more natural antibody response situations. This should be discussed.
  
  We have added discussion of antibody concentrations and possible differences between monoclonal and polyclonal responses to the revised manuscript. Below, we address the specific questions raised here by the reviewer.
  
  We chose to use antibody concentrations that are comparable to the concentrations of dominant clonotypes in post-vaccination serum [1]. Our goal in selecting relatively high antibody concentrations for our experiments was to focus on understanding the capacity of an antibody to drive complement deposition when it has reached maximum densities on RSV particles. This is discussed starting on Line 125 of Results, and in paragraph 2 of Discussion. Experiments testing a range of antibody concentrations would be valuable, but are likely to strongly reflect differences in the binding affinities of these antibodies, which have been characterized previously.
  
  Although we have not performed titrations for each of the antibodies tested due to the large number of conditions needed and the limited throughput of our experimental approach, the manuscript does present a dilution series for CR9501, the IgG1 mAb with the greatest potency in driving C3 deposition among those tested here. This data (shown in Figure 3E & F in the revised manuscript) shows that as the amount of antibody added in solution decreases over a 16-fold range, C3 deposition decreases as well. The decrease in C3 deposition is roughly commensurate with the reduction in antibody binding, reaching levels that are just above background at an antibody concentration of ~0.6μg/ml (1:800 dilution). We think it is likely that other activating antibodies would show similar trends, while antibodies that do not activate the classical pathway at saturating concentrations would be unlikely to do so across a range of lower concentrations.
  
  We agree with the reviewer that complement deposition driven by polyclonal antibodies is more complex than the monoclonal responses studied here. As discussed in paragraph 2 of our revised Discussion, one effect that polyclonal serum might have is to increase the density of Fcs on the virus by providing antibody mixtures that bind to multiple non-overlapping antigenic sites. We speculate that this would generally increase complement deposition, provided that sufficient antibodies are present that bind to productive antigenic sites (e.g. sites Ø, II, and V).
  
  Finally, we note that we observe a similar phenomenon where globular particles are preferentially opsonized with C3 in our experiments with polyclonal serum where IgG and IgM have not been depleted (Figure R1). The major limitation of this data – which is resolved by using monoclonal antibodies – is the difficulty of determining to what extent this bias arises due to the epitopes targeted by the polyclonal serum versus the intrinsic sensitivity of the virus particles.
  
  Figure R1: RSV opsonized with polyclonal human serum. A similar bias towards globular particles (white dashed circles) is observed as in experiments with monoclonal antibodies.
  
  3) Are there artifacts or caveats resulting from immobilization of virus particles on the coverslips?
  
  As pointed out by the reviewer, a few possible artifacts or caveats could arise due to the immobilization of viruses on coverslips. These include (1) spurious binding of C1 or other complement components to the immobilizing antibody (3D3); (2) reduced access to viral antigens as a result of immobilization; and (3) inhibition of antibody-induced viral aggregation. We are able to rule out issues associated with (1), because we do not see attachment of C1 or C3 to the coverslip (i.e. outside regions occupied by virus particles). This is consistent with the fact that the antibodies are immobilized on the surface via a C-terminal biotin attached to the heavy chain, which would limit access for C1 binding and prevent the formation of Fc hexamers.
  
  Immobilization on coverslips could reduce the accessibility of a portion of the virus for binding by antibodies and complement proteins. This could effectively shield a portion of the viral surface from assembly of an activating complex, which we estimate requires ~35nm of clearance above the targeted epitope on F [8]. Importantly, the fraction of the viral surface area that would be shielded would vary for filaments and spheres; to determine if this could influence our results, we calculated the expected magnitude of this effect (Figure R2). To do this, we modeled the virus as being tethered to the surface via a 25nm linkage. This accounts for the length of the biotinylated PEG (~5-15nm for PEG2K, depending on the degree of extension), streptavidin (~5nm), and the anti-G antibody (~10-15nm including the biotinylated C-terminal linker). Although limited structural information is available for RSV G, the ~100 residue, heavily glycosylated region between the viral membrane and the 3D3 epitope likely extends above the height of F (~12nm). Our model assumes that a shell of thickness d surrounding the virus is necessary for antibody-C1 complexes to fit without clashing with the surface (this shell is shaded in gray in the schematic from Figure R2). Tracing the angles at which this shell clashes with the coverslip allows us to calculate the fraction of total surface area that is inaccessible for activation of the classical pathway. The results are plotted on the right side of Figure R2. The relative surface area accessible to a 35nm activating antibody-C1 complex differs between a filament and a sphere of equivalent surface area by about 15%. We conclude that this difference is modest compared to the ~5-fold difference in deposition kinetics we observe between viral filaments and spheres (Figure 4), or the 3- to 10-fold difference in relative C3 deposition we observe on larger filamentous particles after conversion to spheres (Figure 6 – figure supplement 2C).
  
  Finally, by performing experiments on immobilized viruses, we eliminate the possibility of antibody-dependent particle aggregation. While this was necessary for us to get interpretable results, the formation of viral aggregates could affect the dynamics and extent of complement deposition. For example, activation of the classical pathway on one particle in an aggregate could spread to non-activating particles through a “bystander effect”, as has been reported in other contexts [9]. We are interested in this question and have begun preliminary experiments in this direction; however, we believe that a definitive answer is outside the scope of this current work. To alert readers to this consideration, we have added this to paragraph 2 of the revised Discussion (Line 359).
  
  Figure R2: Estimating the surface accessibility of RSV particles bound to coverslips. Definition of variables: af: radius of cylindrical RSV filament; as: radius of spherical RSV particle of equivalent surface area (see Figure 6 – figure supplement 2A); d: distance needed above the viral surface to accommodate IgG-C1 activating complexes; h: height of viral surface above the coverslip; L: length of the viral filament.
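  
  The accessibility estimate described for Figure R2 can be illustrated with a minimal geometric sketch. This is not the authors' code: it assumes a filament lying parallel to the coverslip with its end caps ignored, a sphere whose area equals the filament's side-wall area, and a complex that projects a distance d along the surface normal. The function names and the particle dimensions in the usage lines are hypothetical; only the 25nm tether and 35nm clearance come from the response.
  
```python
import math


def sphere_radius_equal_area(a_f, L):
    """Radius a_s of a sphere whose area equals the side-wall area of a
    cylinder of radius a_f and length L: 4*pi*a_s**2 == 2*pi*a_f*L."""
    return math.sqrt(a_f * L / 2.0)


def _clash_cosine(radius, h, d):
    # A point on the particle at angle theta from its lowest point carries a
    # shell point at height (h + radius) - (radius + d) * cos(theta).
    # The shell hits the coverslip when that height drops below zero,
    # i.e. when cos(theta) exceeds the value returned here.
    return (h + radius) / (radius + d)


def accessible_fraction_sphere(a_s, h, d):
    """Fraction of a sphere's area where a shell of thickness d clears the coverslip."""
    c = _clash_cosine(a_s, h, d)
    if c >= 1.0:  # tether is at least as long as the required clearance
        return 1.0
    blocked = (1.0 - c) / 2.0  # area fraction of the blocked spherical cap
    return 1.0 - blocked


def accessible_fraction_filament(a_f, h, d):
    """Same quantity for the side wall of a cylinder lying parallel to the coverslip."""
    c = _clash_cosine(a_f, h, d)
    if c >= 1.0:
        return 1.0
    theta_c = math.acos(c)
    blocked = theta_c / math.pi  # blocked arc 2*theta_c out of 2*pi
    return 1.0 - blocked


if __name__ == "__main__":
    h, d = 25.0, 35.0        # nm, tether length and clearance quoted in the response
    a_f, L = 50.0, 1000.0    # nm, hypothetical filament dimensions
    a_s = sphere_radius_equal_area(a_f, L)
    print("filament:", accessible_fraction_filament(a_f, h, d))
    print("sphere:  ", accessible_fraction_sphere(a_s, h, d))
```
  
  In this simplified picture the shielded region is a spherical cap for the globular particle and a strip along the underside of the filament, which is why the two shapes lose different fractions of their surface for the same h and d.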
  
  4) How is the "density of antigen" quantitated? What fraction of F or G is labeled? For fluorescence intensity measurements in general, how did the authors ensure their detection was in a linear sensitivity range for the detectors for the various fluorescent channels? Since quantitation of fluorescence intensities is important in this study, some discussion in methods would be valuable.
  
  We have performed this important additional characterization of our fluorescence system and our overall labeling and quantification strategy to address these concerns. The results of this characterization are now included in two new figure supplements in the revised manuscript (Figure 1 – figure supplements 2 & 3).
  
  5) The authors also show that the particle morphology, whether globular or filamentous, as well as relative size and resulting apparent curvature, correlate with ability of C3 to bind. Some link to the abundance of post-fusion F (post-F) is examined and discussed, but I found the back and forth discussion between morphology, C3 binding, and post-F abundance to be confusing and in need of clarification and streamlining. Is there a mechanistic link between morphology changes and post-F level increases? Are the two linked or coincidental (for example does pre-F interaction with matrix help stabilize that conformation, and if lost lead to spontaneous conversion to post-F?). Please clarify.
  
  Specifically, we have separated the discussion of pre-F versus post-F abundance and particle morphology into two different sections in Results, and we have rearranged Figures 4 and 5 (Figures 3 and 4 in the original submission) to improve clarity.
  
  Regarding the question of whether changes in morphology and the pre-F to post-F conversion are coincidental or mechanistically linked: the answer is not entirely clear, although we have collected new data that suggests a connection. We first want to note that the two effects are at least partly separable: brief treatment with a low osmolarity solution causes particle shape to change while preserving pre-F (Figure 6A & B), whereas treating with an osmotically balanced solution with low ionic strength converts pre-F to post-F without affecting virus shape (Figure 1E). However, we were motivated by the reviewer’s questions to look into this further. To determine if the change in viral shape may serve to destabilize the pre-F conformation over time, we compared the relative amounts of pre-F and post-F present in particles that were osmotically swollen to those that were not at 0h and at 24h. In these experiments, particles were swollen using a brief (~1 minute) exposure to low osmolarity conditions before returning them to PBS (Figure R3, left). As expected, we observe no immediate change in pre-F abundance following the brief osmotic shock (Figure R3, right: 0h time point), consistent with Figure 6B. After incubating the particles an additional 24h at 37 °C, the post-F-to-pre-F ratio is ~3.5-fold higher in osmotically-swollen particles than in those where filamentous morphology was initially preserved (Figure R3, right: 24h time point). This supports the reviewer’s suggestion that interactions with the matrix may help to stabilize F in the prefusion conformation, since the conversion to post-F is faster when this interaction is disrupted. Whether or not this has any relevance for RSV entry into cells remains to be determined; however, it is worth noting that we observed no clear loss or gain of infectivity in RSV particles following osmotic swelling (Figure 6 – figure supplement 1A). 
Since this result may be of interest to readers, we have included this new data in Figure 6 – figure supplement 1B, and it is discussed briefly in Results (Line 250).
  
  Figure R3: Determining stability of pre-F following matrix detachment. Left: Experimental design. Right: Comparison of pre-F stability on untreated particles (gray) and particles subjected to brief osmotic swelling (magenta). Distributions show the ratio of post-F (ADI-14353) to pre-F (5C4) intensities per particle, combined for four biological replicates, sampled at 0h (immediately after swelling) and after an additional incubation at 37 °C for 24h. Black points show median values for each individual replicate. P-values are determined from a two-sample t-test.
  
  6) Since their conclusion is that curvature of the virus surface is a major influence on the ability of complement proteins to bind, I feel that some effort at modeling this effect based upon known structures is warranted. One might also anticipate then that there would be some epitope-dependent effect as a result of changes in curvature that may lead to an exaggeration of the epitope-specific effects for more highly curved particles perhaps than those with lower curvature? Is this true?
  
  The reviewer raises two excellent points: that it may be possible to gain insight into the mechanisms through which curvature dictates C1 binding and other aspects of complement activation through structural modeling, and that such a model may help to identify specific epitope effects that could contribute to curvature dependence.
  
  We developed simulations based on the geometry of RSV, F, and hexameric IgG to try to better understand how curvature may influence initiation of the classical pathway. This model is described in the Methods section (Modeling IgG hexamers on curved surfaces), and the results are discussed in the final two paragraphs of the Results section. In addition, we have included a new figure (Figure 7) to summarize the model’s predictions. This model corroborates the curvature sensitivity of IgG hexamer formation and suggests a possible intuitive explanation for our findings: high curvature effectively increases the distance between epitopes that sit high above the viral membrane, decreasing the likelihood of hexamer formation (Figure 7D). Regarding epitope specific effects, this model suggests that the further the epitope is above the viral membrane, the greater the effect that decreasing curvature will have. However, we find that epitopes closer to the membrane (e.g. those bound by 101F or ADI-19425) are overall very inefficient at activating the classical pathway, potentially due to steric obstruction of the formation of IgG hexamers. Thus, there may be an inherent tradeoff between overcoming steric obstruction (by binding to epitopes distal to the membrane) and sensitivity to surface curvature.
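  
  The intuition that curvature spreads apart epitopes sitting high above the membrane can be illustrated with a short geometric sketch. It is not the simulation described in the Methods: it assumes two F trimers that project radially from a membrane of radius R, with anchor points separated by an arc length s, and it reports the straight-line distance between epitopes at height z. The function name and all numerical values are hypothetical.
  
```python
import math


def epitope_separation(s, R, z):
    """Straight-line distance between two epitopes at height z above a
    membrane of radius R, for anchor points an arc length s apart.
    Both proteins are assumed to point along the local surface normal."""
    angle = s / R  # angle subtended at the centre of curvature
    return 2.0 * (R + z) * math.sin(angle / 2.0)


if __name__ == "__main__":
    s = 10.0  # nm, hypothetical spacing of neighbouring trimers on the membrane
    for R in (50.0, 150.0, 500.0):      # nm, hypothetical radii of curvature
        for z in (4.0, 12.0):           # nm, hypothetical epitope heights
            dist = epitope_separation(s, R, z)
            print(f"R={R:6.1f} nm  z={z:4.1f} nm  separation={dist:6.2f} nm")
```
  
  For a fixed spacing on the membrane, the separation grows roughly as s(1 + z/R), so the effect is largest for membrane-distal epitopes on small-radius particles and vanishes as the surface flattens.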
  
  It is important to note that this model is reductionist and does not include detailed structural information. Additional factors may be important for considering epitope-specific effects. For example, antibodies that bind equatorially on F (e.g. ADI-19425, which binds to antigenic site III), show minimal complement deposition in our experiments. However, particles whose curvature approaches the diameter of hexameric IgG or IgM (~20nm) may display these epitopes in a manner that is more accessible. If the curvature necessary to observe such an effect falls outside of the biologically accessible range, it would not be observable in our experiments. Nonetheless, it is possible that a different set of antibodies may drive complement deposition on highly-curved nanoparticle vaccines that are in development [10]. We have added this important point to the second paragraph of the Discussion.
  
  7) Line 265: it would be useful to confirm the increase in C1 binding as a function of morphology as was done for antibody-angle of binding experiments.
  
  We believe that this data is shown in Figure 6B (Figure 5B in the original manuscript).
  
  Reviewer #3 (Public Review):
  
  Overall the manuscript is clearly written and the data are displayed well, with helpful diagrams in the figures to illustrate assays and RSV F epitopes. The engineering of the RSV strain to include a fluorescent reporter and tags on F and G that serve as substrates for fluorophore attachment is impressive and is a strength. The RSV literature is well cited and the interpretation of the results is consistent with structure/function data on RSV F and its interaction with antibodies. This reviewer is not an expert on the experiments performed in this manuscript, but they appear to be rigorously performed with appropriate controls. As such, the conclusions are justified by the data. One weakness is the extent to which the results regarding virion morphology are biologically relevant. Non-filamentous forms of the virion are generally obtained only in vitro as a result of virion purification or biochemical treatment. However, these results may be relevant for certain vaccine candidates, including the failed formalin-inactivated RSV vaccine that was evaluated in the late 1960s and caused vaccine-enhanced disease upon natural RSV infection.
  
  Thank you for these suggestions, which have helped us to better place our results regarding RSV morphology in the context of prior work. We agree with the reviewer that non-filamentous RSV particles are commonly obtained in vitro, and that this morphology does not reflect the structure of the virus as it is budding from infected cells. Our work has characterized the transition from filament to globular / amorphous form, with the finding that it can occur rapidly upon physical or chemical perturbations, as well as more gradually during natural aging: i.e. in the absence of handling or purification. We are also able to detect globular particles accumulating in cultured A549 cells, where no handling has occurred prior to observation (Figure 5 – figure supplement 1). While we do not currently know how well this reflects the tendency of RSV to undergo conversion from filament to sphere in vivo, we propose that it is plausible that such a transformation could occur. To distinguish between what we demonstrate and what we speculate, we write (Line 401): “Although more work is needed to understand the prevalence of globular particles during in vivo infection, our observations that these particles accumulate over time through the conversion of viral filaments – even under normal cell culture conditions - suggest that their presence in vivo is feasible, where the physical and chemical environment would be considerably harsher and more complex.”
  
  We agree with the reviewer that our results may have relevance towards understanding the failed formalin-inactivated vaccine trial. We have added this to paragraph 5 of the Discussion section.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.05.06.442421v1
www.biorxiv.org

Determinants shaping the nanoscale architecture of the mouse rod outer segment

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #2 (Public Review):
  
  The novelty of the current observation of two types of links is overstated, for example, in the abstract: "Our data reveal the existence of two molecular connectors/spacers which likely contribute to the nanometer scale precise stacking of the ROS disks" (Line 25). In fact, both of these links have been shown before (Usukura and Yamada, 1981; Roof and Heuser, 1982; Corless and Schneider, 1987; Corless et al., 1987; Kajimura et al., 2000). These previous studies deserve to be recognized. Of special note is the paper by Usukura and Yamada whose images of the disc rim connectors are by no means less convincing than shown in the current manuscript. On the other hand, the novelty and impact of the data related to peripherin appears to be understated, particularly in the abstract.
  
  We changed the abstract line 27 to: “Our data confirm the existence of two previously observed molecular connectors …”, cite the recommended references in the introduction (lines 54-55), the results (lines 131-132), and the discussion (lines 282/285). To highlight the previous reports, we rephrased the sentence in lines 132-133, “In agreement with these previous findings, we observed structures that connect membranes of two adjacent disks …”; the discussion is rephrased in lines 280-281, “Similar connectors have been observed previously ...” and “… and their statistical analysis confirmed the existence of two distinct connector species.”, and in lines 291-292, “Based on previous studies combined with our quantitative analysis, we put forward a hypothesis for the molecular identity of the disk rim connector which agrees in part with recent models”.
  
  Notably, ROM-1 has not been found in peripherin oligomers larger than octamers (e.g. Loewen and Molday, 2000 and subsequent studies by Naash and colleagues). This should be discussed in the context of the current model.
  
  We agree that this is an important aspect. We pick subvolumes along all disk rims, and on average we obtain the ordered scaffold as shown in the manuscript. We expect heterogeneity in the data because of the different degrees of oligomerization and the exclusion of ROM1 from higher oligomers. Our analysis required substantial classification to achieve convergence to a stable average, indeed indicating heterogeneity in the rim structure. However, we could not resolve additional structures to sufficient quality. It might be that this heterogeneity is what ultimately limits our achievable resolution. We added these thoughts in the discussion starting in lines 377-378, “PRPH2-Rom1 oligomers isolated from native sources exhibit varying degrees of polymerization (Loewen and Molday, 2000), and ROM1 is excluded from larger oligomers (Milstein et al., 2020). We could not resolve this heterogeneity as additional structures to sufficient quality by subvolume averaging, but in combination with the inherent flexibility of the disk rim, this heterogeneity might be the reason for the restricted resolution of our averages.”
  
  The following statement should be reconsidered given the established role of cysteine-150 in peripherin oligomerization: "We hypothesize that the necessary cysteine residues are located in the head domain of the tetramers (Figure 5B), ..." It has been firmly established that only one cysteine (C150) located in the intradiscal loop is not engaged in intramolecular interactions and is essential for peripherin oligomerization.
  
  Thank you for this advice. We agree and rephrased our discussion in lines 368-371, “The intermolecular disulfide bridges are exclusively formed by the PRPH2-C150 and ROM1-C153 cysteine residues, which are located in the luminal domain (Zulliger et al., 2018). We hypothesize that these disulfide bonds (Figure 5B) are responsible for the contacts across rows (Figure 3) ...”
  
  Line 340: "A model involving V-shaped tetramers for membrane curvature formation was proposed recently (Milstein et al., 2020), but it comprises two rows of tetramers which are linked in a head-to-head manner. Our analysis instead resolves three rows organized side-by-side in situ (Figure 5A)." I am confused by this statement: doesn't your model also show long rows connected head-to-head? The real difference is that Milstein and colleagues proposed four tetramers per rim whereas the current data reveal three.
  
  Thank you for pointing out this imprecise description. The model proposed by Milstein and the model in the old version of our manuscript both propose linkage between tetramers via their disk luminal domains. In our manuscript, we refer to the luminal domain as the head domain. However, to our understanding, the Milstein model suggests two rows of tetramers, where one tetramer in the first row is rotated 180° with respect to a tetramer in the second row (therefore head-to-head), while our data indicate that the V-shaped repeats, which we originally hypothesized to be tetramers, are only rotated ~63° with respect to one another and are therefore oriented side-by-side:
  
  Fig. 2: Comparison of models for the organization of the ROS disk rim as proposed in Milstein et al., 2020 (top panel) and in our work (lower panel).
  
  We have now rephrased lines 383-385, “Instead, our analysis in situ resolves three rows of repeats which are also linked by the luminal domain but are rather organized side-by-side (Figure 5A).”
  
  Line 347: "Our data indicate that the luminal domains of tetramers hold the disk rim scaffold together (Figure 3C), which is supported by the fact that most pathological mutations of PRPH2 affect its luminal domain (Boon et al., 2008; Goldberg et al., 2001). It is possible that these mutations impair the formation of tetramers, rows of tetramers, and their disulfide bond-stabilized oligomerization. These alterations could impede or completely prevent disk morphogenesis which, in turn, would disrupt the structural integrity of ROS, compromise the viability of the retina and ultimately lead to blindness." This is not an original idea, as many studies showed that disruptions in peripherin oligomerization lead to anatomical defects in disc formation and subsequent photoreceptor cell death.
  
  Thank you for pointing this out. Our data are indeed in good agreement with the results obtained by many groups and further expand on them. We rephrased the manuscript in several places to clarify this relationship: in the abstract lines 32-34, “Our Cryo-ET data provide novel quantitative and structural information on the molecular architecture in ROS and substantiate previous results on proposed mechanisms underlying pathologies of certain PRPH2 mutations leading to blindness.”; in the introduction lines 78-79, “… allowed us to obtain 3D molecular-resolution images of vitrified ROS in a close-to-native state providing further evidence for previously suggested mechanisms leading to ROS dysfunction”; and in the discussion lines 393-397, “In good agreement with previous work, it is possible that these mutations impair the formation of complexes, and their disulfide bond-stabilized oligomerization (Chang et al., 2002; Conley et al., 2019; Zulliger et al., 2018). Hence, these alterations could impede or completely prevent disk morphogenesis …”. Also, additional relevant publications are cited in line 395.
  
  In regards to the distance between disc rims and plasma membrane, the authors cite the data obtained with frogs (10 nm) but not a more relevant, previously reported measurement in mice (Gilliam et al, 2012). The value of 18 nm reported in that study is much closer to the currently reported value.
  
  We appreciate the reference to this excellent paper. We added it in lines 335-337, “This value was derived from amphibians (Roof and Heuser, 1982) and deviates considerably from recent results (18 nm, (Gilliam et al., 2012)) and from our current measurements in mice (~25 nm).” Our aim was to point out that a model for ROS organization that is often cited and is otherwise well-founded (Batra-Safferling et al., 2006) makes a wrong assumption about distance in the context of mammalian systems.
  
  The authors are (correctly) being very careful in assigning the molecular identity of disc interior connectors to PDE6. However, they are more confident in assigning the disc rim connectors to GARP2, which is reflected in the labeling of these links in figure 5. Their arguments are valid, but these links are not attached to peripherin (a protein considered to be the membrane binding partner for GARPs), which is not immediately consistent with this hypothesis. Perhaps it would be fair to re-label the corresponding links in figure 5 as "disc rim connectors".
  
  That is an excellent and fair suggestion. We changed Figure 5 accordingly.
  
  On a similar note, the disc rim connectors seem to be located where ABCA4 is presumed to be localized within the rim, which may not be just a coincidence. The authors already have tomograms obtained from ABCA4 knockout animals. Is it possible to analyze whether these links are preserved in these tomograms?
  
  We agree, this is an important question to address. Unfortunately, neither the biological preparation nor the tomograms of the ABCA4 knockout were as good in quality as for the WT. Still, we frequently see connectors at the disk rim, especially after denoising of the tomograms.
  
  Fig. 3: connectors at disk rims in WT (left) and ABCA4 knockout mice (right).
  
  Sometimes it appears that the connectors between adjacent disks are linked via intradisk densities, as was already observed in Corless et al., 1987. We thought that these densities could be ABCA4 and tried to find them with two approaches in our WT tomograms (data not shown). In the first approach, using a segmentation similar to what we did for the connectors between disks, we found an order of magnitude fewer intradisk connectors than (inter)disk rim connectors. In the second approach, we used the positions of segmented (inter)disk rim connectors and classified rotational averages which focused on the disk luminal space next to the contact point of a connector with the disk membrane. Again, fewer than 10% of the disk rim connector subvolumes were assigned to classes with an additional luminal density. Both experiments indicate that disk rim connectors sometimes occur with an additional luminal density. In total, we found fewer than 100 of these intradisk densities, an observation which seems to be preserved in WT and ABCA4 KO. Based on this small number of positions/locations, however, we cannot draw any conclusion. Therefore, we did not add this point to the manuscript.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.08.18.456753v1
www.biorxiv.org www.biorxiv.org

New submission 11/11/2022, 10:34:14

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Public Evaluation Summary:
  
  The authors re-analyzed a previously published dataset and identified patterns suggesting that increased bacterial biodiversity in the gut may create new niches that lead to gene loss in a focal species and promote the generation of more diversity. Two limitations are (i) that sequencing depth may not be sufficient to analyze strain-level diversity and (ii) that the evidence is exclusively based on correlations, and the observed patterns could also be explained by other eco-evolutionary processes. The claims should be supported by a more detailed analysis, and alternative hypotheses that the results do not fully exclude should be discussed. Understanding drivers of diversity in natural microbial communities is an important question that is of central interest to biomedically oriented microbiome scientists, microbial ecologists and evolutionary biologists.
  
  We agree that understanding the drivers of diversity in natural communities is an important and challenging question to address. We believe that our analysis of metagenomes from the gut microbiomes is complementary to controlled laboratory experiments and modeling studies. While these other studies are better able to establish causal relationships, we rely on correlations – a caveat which we make clear, and offer different mechanistic explanations for the patterns we observe.
  
  We also mention the caveat that we are only able to measure sub-species genetic diversity in relatively abundant species with high sequencing depth in metagenomes. These relatively abundant species include dozens of species in two metagenomic datasets, and we see no reason why they would not generalize to other members of the microbiome. Nonetheless, further work will be required to extend our results to rarer species.
  
  Our revised manuscript includes two major new analyses. First, we extend the analysis of within-species nucleotide diversity to non-synonymous sites, with generally similar results. This suggests that evolutionarily older, less selectively constrained synonymous mutations and more recent non-synonymous mutations that affect protein structure both track similarly with measures of community diversity – with some subtle differences described in the manuscript.
  
  Second, we extend our analysis of dense time series data from one individual stool donor and one deeply covered species (B. vulgatus) to four donors and 15 species. This allowed us to reinforce the pattern of gene loss in more diverse communities with greater statistical support. Our correlational results are broadly consistent with the predictions of DBD from modeling and experimental studies, and they open up new lines of inquiry for microbiome scientists, ecologists, and evolutionary biologists.
  
  Reviewer #1 (Public Review):
  
  This paper makes an important contribution to the current debate on whether the diversity of a microbial community has a positive or negative effect on its own diversity at a later time point. In my view, the main contribution is linking the diversity-begets-diversity patterns, already observed by the same authors and others, to genomic signatures of gene loss that would be expected from the Black Queen Hypothesis, establishing an eco-evolutionary link. In addition, they test this hypothesis at a more fine-grained scale (strain-level variation and SNP) and do so in human microbiome data, which adds relevance from the biomedical standpoint. The paper is a well-written and rigorous analysis using state-of-the-art methods, and the results suggest multiple new experiments and testable hypotheses (see below), which is a very valuable contribution.
  
  We thank the reviewer for their generous comments.
  
  That being said, I do have some concerns that I believe should be addressed. First of all, I am wondering whether gene loss could also occur because of environmental selection that is independent of other organisms or the diversity of the community. An alternative hypothesis to the Black Queen is that there might have been a migration of new species from outside and then loss of genes could have occurred because of the nature of the abiotic environment in the new host, without relationship to the community diversity. Telling the difference between these two hypotheses is hard and would require extensive additional experiments, which I don't think is necessary. But I do think the authors should acknowledge and discuss this alternative possibility and adjust the wording of their claims accordingly.
  
  We concur with the reviewer that the drivers of the correlation between community diversity and gene loss are unclear. Therefore, we have now added the following text to the Discussion:
  
  “Here we report that genome reduction in the gut is higher in more diverse gut communities. This could be due to de novo gene loss, preferential establishment of migrant strains encoding fewer genes, or a combination of the two. The mechanisms underlying this correlation remain unclear and could be due to biotic interactions – including metabolic cross-feeding as posited by some models (Estrela et al., 2022; San Roman and Wagner, 2021, 2018) but not others (Good and Rosenfeld, 2022) – or due to unknown abiotic drivers of both community diversity and gene loss.”
  
  Additionally, we have revised Figure 1 to show that strain invasions/replacements, in addition to evolutionary change, could be an important driver of changes in intra-species diversity in the microbiome.
  
  Another issue is that gene loss is happening in some of the most abundant species in the gut. Under Black Queen though, we would expect these species to be most likely "donors" in cross-feeding interactions. Authors should also discuss the implications, limitations, and possible alternative hypotheses of this result, which I think also stimulates future work and experiments.
  
  We thank the reviewer for raising this point. It is unclear to us whether the more abundant species would be donors in cross-feeding interactions. If we understand correctly, the reviewer is suggesting that more abundant donors will contribute more total biomass of shared metabolites to the community. This idea makes sense under the assumption that the abundant species are involved in cross-feeding interactions in the first place, which may or may not be the case. As our work heavily relies on a dataset that we previously analyzed (HMP), we wish to cite Figure S20 in Garud, Good et al. 2019 PLoS Biology in which we found there are comparable rates of gene changes across the ~30 most abundant species analyzed in the HMP. This suggests that among the most abundant species analyzed, there is no relationship between their abundance and gene change rate.
  
  That being said, we acknowledge that our study is limited to the relatively abundant focal species and state now in the Discussion: “Deeper or more targeted sequencing may permit us to determine whether the same patterns hold for rarer members of the microbiome.”
  
  Regarding Figure 5B, there are a couple of questions I believe the authors should clarify. First, how is it possible that many species have close to 0 pathways? Second, besides the overall negative correlation, the data show some very conspicuous regularities, e.g. many different "lines" of points with identical linear negative slope but different intercepts. My guess is that this is due to some constraints in the pathway detection methods, but I struggle to understand it. I think the authors should discuss these patterns in more detail.
  
  We sincerely thank the reviewer for raising this issue, as it prompted us to investigate more deeply the patterns observed at the pathway level. In short, we decided to remove this analysis from the paper because of a number of bioinformatics issues that we realized were contributing to the signal. However, in support of BQH-like mechanisms at play, we do find evidence for gene loss in more diverse communities across multiple species in both the HMP and Poyet datasets. Below we detail our investigation into Figure 5B and how we arrived at the conclusion that it should be removed:
  
  (1) Regarding data points in Figure 5B where many focal species have “zero pathways”, we first clarify how we compute pathway presence and richness. Pathway abundance data per species were downloaded from the HMP1-2 database, and these pathway abundances were computed using HUMAnN (HMP Unified Metabolic Analysis Network). According to HUMAnN documentation, pathway abundance is proportional to the number of complete copies of the pathway in the community; this means that if at least one component reaction in a certain pathway is missing coverage (for a sample-species pair), the pathway abundance may be zero (note that HUMAnN also employs “gap filling” to allow no more than one required reaction to have zero abundance). As such, it is likely that insufficient coverage, especially for low-abundance species, causes many pathways to report zero abundance in many species in many samples. Indeed, 556 of the 649 species considered had zero “present” pathways (i.e. having nonzero abundance) in at least 400 of the 469 samples (see figure below).
  
  (2) We thank the reviewer for pointing out the “conspicuous regularities” in Figure 5B, particularly the “parallel lines” of data points, which we discovered are an artifact of the flawed way in which we computed “community pathway richness [excluding the focal species].” Each diagonal line of points corresponds to different species in the same sample, and because community pathway richness is computed as the total number of pathways [across all species in the sample] minus the number of pathways in the focal species, the current Figure 5B is really plotting y against X-y for each sample (where X is a sample’s total community pathway richness, and y is the pathway richness of an individual species in that sample). This computation fails to account for the possibility that a pathway in an excluded focal species will still be present in the community due to redundancy, and indeed BQH tests for whether this redundancy is kept low in diverse communities due to mechanisms such as gene loss.
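  
  The “parallel lines” artifact described in point (2) can be reproduced with a minimal sketch. The numbers below are hypothetical and are not taken from the HMP data; the function name is ours. The sketch only shows that, when community richness is computed as X minus y, all species from one sample fall on a line with slope -1 and intercept X.

```python
import numpy as np

def naive_community_richness(total_richness, focal_richness):
    """Flawed definition from point (2): sample total minus focal species count."""
    return total_richness - focal_richness

# Hypothetical sample: total pathway richness X and the richness y of four species.
X = 300
y = np.array([10, 50, 120, 200])

# Horizontal axis of the original Figure 5B under the flawed definition.
x = naive_community_richness(X, y)

# Within one sample every point satisfies y = X - x, so a linear fit
# recovers slope -1 and intercept X regardless of the biology.
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)
```

  Different samples have different totals X, which would give lines with the same slope but different intercepts, consistent with the pattern the reviewer noticed.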
  
  We attempted to instead plot community pathway richness defined as the number of unique pathways covered by all species other than the focal species. This is equivalent to [number of unique pathways across all species in a sample] minus the [number of pathways that are ONLY present in the focal species and not any other species in the sample]. However, when we recomputed community pathway richness this way, it is rare that a pathway is present in only one species in a sample. Moreover, we find that with the exception of E. coli, focal species pathway richness tended to be very similar across the 469 samples, often reaching an upper limit of focal species pathway richness observed. (It is unclear to what extent lower pathway richnesses are due to low species abundance/low sample coverage versus gene loss). This new plot reveals even more regularities and is difficult to interpret with respect to BQH. (Note that points are colored by species; the cluster of black dots with outlying high focal pathway richness corresponds to the “unclassified” stratum which can be considered a group of many different species.)
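  
  As an illustration of the corrected definition in the paragraph above, the following sketch counts the unique pathways covered by all species other than the focal species, and checks the stated equivalence with [unique pathways in the sample] minus [pathways found only in the focal species]. The species and pathway labels are placeholders, and the function names are ours rather than part of any pipeline used in the study.

```python
def community_richness_excluding(pathways_by_species, focal):
    """Number of unique pathways present in any species other than `focal`."""
    others = set()
    for species, pathways in pathways_by_species.items():
        if species != focal:
            others |= set(pathways)
    return len(others)

def focal_only_pathways(pathways_by_species, focal):
    """Pathways present in `focal` and in no other species of the sample."""
    others = set().union(
        *(set(p) for s, p in pathways_by_species.items() if s != focal)
    )
    return set(pathways_by_species[focal]) - others

# Placeholder sample: pathway sets detected per species.
sample = {
    "species_A": {"pwy1", "pwy2", "pwy3"},
    "species_B": {"pwy2", "pwy3"},
    "species_C": {"pwy3", "pwy4"},
}

total_unique = len(set().union(*sample.values()))
for focal in sample:
    direct = community_richness_excluding(sample, focal)
    via_subtraction = total_unique - len(focal_only_pathways(sample, focal))
    print(focal, direct, via_subtraction)
```

  When most pathways are shared by at least two species, as the response reports, the focal-only set is usually empty and this quantity barely changes with the choice of focal species.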
  
  Overall, because community pathway richness (excluding a focal species) seems to primarily vary with sample rather than focal species in this dataset when using the most simple/strict definition of community pathway richness as described above, it is difficult to probe the Black Queen Hypothesis using a plot like Figure 5B. As pointed out by reviewers, lack of sequencing depth to analyze strain-level diversity and accurately quantify pathway abundance, irrespective of species abundance, seems to be a major barrier to this analysis. As such, we have decided to remove Figure 5B from the paper and rewrite some of our conclusions accordingly.
  
  Finally, I also have some conceptual concerns regarding the genomic analysis. Namely, genes can be used for biosynthesis of e.g. building blocks, but also for consumption of nutrients. Under the Black Queen Hypothesis, we would expect the adaptive loss of biosynthetic genes, as those nutrients become provided by the community. However, for catabolic genes or pathways, I would expect the opposite pattern, i.e. the gain of catabolic genes that would allow taking advantage of a more rich environment resulting from a more diverse community (or at least, the absence of pathway loss). These two opposing forces for catabolic and biosynthetic genes/pathways might obscure the trends if all genes are pooled together for the analysis. I believe this can be easily checked with the data the authors already have, and could allow the authors to discuss more in detail the functional implications of the trends they see and possibly even make a stronger case for their claims.
  
  We thank the reviewer for their suggestion. As explained above, we have removed the pathway analysis from the paper due to technical reasons. However, we did investigate catabolic and biosynthetic pathways separately as suggested by the reviewer as we describe below:
  
  We obtained subsets of biosynthetic pathways and catabolic pathways by searching for keywords (such as “degradation” for catabolic) in the MetaCyc pathway database. After excluding the “unclassified” species stratum, we observe a total of 279 biosynthetic and 167 catabolic pathways present in the HMP1-2 pathway abundance dataset. Using the corrected definition of community pathway richness excluding a focal species, for each pathway type—either biosynthetic or catabolic—we plotted focal species pathway richness against community pathway richness including all pathways regardless of type:
  
  We observe the same problem where, within a sample, community pathway richness excluding the focal species hardly varies no matter which focal species it is, due to nearly all of its detected pathways being present in at least one other species; this makes the plots difficult to interpret.
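  
  The keyword-based split into biosynthetic and catabolic pathways mentioned above could look roughly like the sketch below. Only the keyword “degradation” is named in the response; the keyword “biosynthesis”, the tie-breaking order, and the example pathway names are assumptions made for illustration.

```python
def classify_pathway(name,
                     catabolic_keywords=("degradation",),
                     biosynthetic_keywords=("biosynthesis",)):
    """Assign a pathway name to a coarse type by keyword search.

    Catabolic keywords are checked first, so a name containing both kinds
    of keyword is labeled catabolic. That ordering is an arbitrary choice.
    """
    lowered = name.lower()
    if any(k in lowered for k in catabolic_keywords):
        return "catabolic"
    if any(k in lowered for k in biosynthetic_keywords):
        return "biosynthetic"
    return "other"

# Illustrative pathway names, not taken from the HMP1-2 dataset.
examples = [
    "lactose degradation (illustrative)",
    "L-arginine biosynthesis (illustrative)",
    "glycolysis (illustrative)",
]
labels = {name: classify_pathway(name) for name in examples}
print(labels)
```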
  
  Reviewer #2 (Public Review):
  
  The authors re-analysed two previously published metagenomic datasets to test how diversity at the community level is associated with diversity at the strain level in the human gut microbiota. The overall idea was to test if the observed patterns would be in agreement with the "diversity begets diversity" (DBD) model, which states that more diversity creates more niches and thereby promotes further increase of diversity (here measured at the strain-level). The authors have previously shown evidence for DBD in microbiomes using a similar approach but focusing on 16S rRNA level diversity (which does not provide strain-level insights) and on microbiomes from diverse environments.
  
  One of the datasets analysed here is a subset of a cross-sectional cohort from the Human Microbiome Project. The other dataset comes from a single individual sampled longitudinally over 18 months. This second dataset allowed the authors to not only assess the links between different levels of diversity at single timepoints, but test if high diversity at a given timepoint is associated with increased strain-level diversity at future timepoints.
  
  Understanding eco-evolutionary dynamics of diversity in natural microbial communities is an important question that remains challenging to address. The paper is well-written and the detailed description of the methodological approaches and statistical analyses is exemplary. Most of the analyses carried out in this study seem to be technically sound.
  
  We thank the reviewer for their kind words, comments, and suggestions.
  
  The major limitation of this study comes with the fact that only correlations are presented, some of which are rather weak, contrast each other, or are based on a small number of data points. In addition, finding that diversity at a given taxonomic rank is associated with diversity within a given taxon is a pattern that can be explained by many different underlying processes, e.g. species-area relationships, nutrient (diet) diversity, stressor diversity, immigration rate, and niche creation by other microbes (i.e. DBD). Without experiments, it remains vague if DBD is the underlying process that acts in these communities based on the observed patterns.
  
  We thank the reviewer for their comments. First, regarding the issue of this being a correlative study, we now more clearly acknowledge that mechanistic studies (perhaps in experimental settings) are required to fully elucidate DBD and BQH dynamics. However, we note that our correlational study from natural communities is complementary to experimental and modeling studies, to test the extent to which their predictions hold in more complex, realistic settings. This is now mentioned throughout the manuscript, most explicitly at the end of the Introduction:
  
  “Although such analyses of natural diversity cannot fully control for unmeasured confounding environmental factors, they are an important complement to controlled experimental and theoretical studies which lack real-world complexity.”
  
  Second, to increase the number of data points analyzed in the Poyet study, we now include 15 species and four different hosts (new Figure 5). The association between community diversity and gene loss is now much more statistically robust, and consistent across the Poyet and HMP time series.
  
  Third, we acknowledge more clearly in the Discussion that other processes, including diet and other environmental factors can generate the DBD pattern. We also now stress more prominently the possibility that strain migration across hosts may be responsible for the patterns observed. For example, in Figure 1, we illustrate the possibility of strain migration generating the patterns we observe.
  
  Below we quote a paragraph that we have now added in the Discussion:
  
  "Second, we cannot establish causal relationships without controlled experiments. We are therefore careful to conclude that positive diversity slopes are consistent with the predictions of DBD, and negative slopes with EC, but unmeasured environmental drivers could be at play. For example, increased dietary diversity could simultaneously select for higher community diversity and also higher intra-species diversity. In our previous study, we found that positive diversity slopes persisted even after controlling for potential abiotic drivers such as pH and temperature (Madi et al., 2020), but a similar analysis was not possible here due to a lack of metadata. Neutral processes can account for several ecological patterns such as species-area relationships (Hubbell, 2001), and must be rejected in favor of niche-centric models like DBD or EC. Using neutral models without DBD or EC, we found generally flat or negative diversity slopes due to sampling processes alone and that positive slopes were hard to explain with a neutral model (Madi et al., 2020). These models were intended mainly for 16S rRNA gene sequence data, but we expect the general conclusions to extend to metagenomic data. Nevertheless, further modeling and experimental work will be required to fully exclude a neutral explanation for the diversity slopes we report in the human gut microbiome.”
  
  Finally, we now put more emphasis on the importance of migration (strain invasion) as a non-exclusive alternative to de novo mutation and gene gain/loss. This is mentioned in the Abstract and is also illustrated in the revised Figure 1.
  
  Another limitation is that the total number of reads (5 mio for the longitudinal dataset and 20 mio for the cross-sectional dataset) is low for assessing strain-level diversity in complex communities such as the human gut microbiota. This is probably the reason why the authors only looked at one species with sufficient coverage in the longitudinal dataset.
  
  Indeed, this is a caveat which means we can only consider sub-species diversity in relatively abundant species. Nevertheless, this allows us to study dozens of species in the HMP and 15 in the more frequent Poyet time series. As more deeply sequenced metagenomes become available, future studies will be able to access the rarer species to test whether the same patterns hold or not. This is now mentioned prominently as a caveat of our study in the second Discussion paragraph:
  
  “First, using metagenomic data from human microbiomes allowed us to study genetic diversity, but limited us to considering only relatively abundant species with genomes that were well-covered by short sequence reads. Deeper or more targeted sequencing may permit us to determine whether the same patterns hold for rarer members of the microbiome. However, it is notable that the majority of the dozens of species across the two datasets analyzed support DBD, suggesting that the phenomenon may generalize.”
  
  We also note that rarefaction was only applied to calculate community richness, not to estimate sub-species diversity. We apologize for this confusion, which is now clarified in the Methods as follows:
  
  “SNV and gene content variation within a focal species were ascertained only from the full dataset and not the rarefied dataset.”
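  
  To make the distinction concrete, here is a minimal sketch of rarefaction applied only to the community richness calculation. It assumes species-level read counts per sample and subsamples them without replacement to a common depth; the counts, depth, and function names are hypothetical, and the response does not state which software or depth was actually used.

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Subsample read counts without replacement down to `depth` reads."""
    counts = np.asarray(counts, dtype=np.int64)
    if depth > counts.sum():
        raise ValueError("depth exceeds the number of reads in the sample")
    # Drawing without replacement from the pooled reads is a
    # multivariate hypergeometric draw over species.
    return rng.multivariate_hypergeometric(counts, depth)

def richness(counts):
    """Number of species with at least one read."""
    return int(np.count_nonzero(counts))

# Hypothetical species-level read counts for one sample.
counts = np.array([50, 30, 15, 4, 1])
rng = np.random.default_rng(0)
rarefied = rarefy(counts, depth=20, rng=rng)

print(richness(counts), richness(rarefied))
```

  Within-species measures such as SNV and gene content variation would, per the quoted Methods sentence, be computed from the full counts rather than from `rarefied`.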
  
  Analyzing the effect of diversity at a given timepoint on strain-level diversity at a later timepoint adds an important new dimension to this study which was not assessed in the previous study about the DBD in microbiomes by some of the authors. However, only a single species was analysed in the longitudinal dataset and comparisons of diversity were only done between two consecutive timepoints. This dataset could be further exploited to provide more insights into the prevailing patterns of diversity.
  
  We thank the reviewer for raising this point. We now have considered all 15 species for which there was sufficient coverage from the Poyet dataset, which included four different stool donors. Additionally, in the HMP dataset, we analyze 54 species across 154 hosts, with both datasets showing the same correlation between community diversity and gene loss.
  
  Additionally, we followed the suggestion of the reviewer of examining additional time lags, and in Figure 5 we do observe a dependency on time. This is now described in the Results as follows:
  
  “Using the Poyet dataset, we asked whether community diversity in the gut microbiome at one time point could predict polymorphism change at a future time point by fitting GAMs with the change in polymorphism rate as a function of the interaction between community diversity at the first time point and the number of days between the two time points. Shannon diversity at the earlier time point was correlated with increases in polymorphism (consistent with DBD) up to ~150 days (~4.5 months) into the future (Figure S4), but this relationship became weaker and then inverted (consistent with EC) at longer time lags (Fig 5A, Table S8, GAM, P=0.023, Chi-square test). The diversity slope is approximately flat for time lags between four and six months, which could explain why no significant relationship was found in HMP, where samples were collected every ~6 months. No relationship was observed between community richness and changes in polymorphism (Table S8, GAM, P>0.05).”
  
  Finally, the evidence that gene loss follows an increase in diversity is weak, as very few genes were found to be lost between two consecutive timepoints, and the analysis is based on only a single species. Moreover, while positive correlations were found between overall community diversity and gene family diversity in single species, the opposite trend was observed when focusing on pathway diversity. A more detailed analysis (of e.g. the functions of the genes and pathways lost/gained) to explain these seemingly contrasting results and a more critical discussion of the limitations of this study would be desirable.
  
  We agree that our previous analysis of one species in one host provided weak support for gene loss following increases in diversity. As described in the response above, we have now expanded this analysis to 15 focal species and 4 independent hosts with extensive time series. We now analyze this larger dataset and report the more statistically robust results as follows:
  
  “We found that community Shannon diversity predicted future gene loss in a focal species, and this effect became stronger with longer time lags (Fig 5B, Table S9, GLMM, P=0.006, LRT for the effect of the interaction between the initial Shannon diversity and time lag on the number of genes lost). The model predicts that increasing Shannon diversity from its minimum to its maximum would result in the loss of 0.075 genes from a focal species after 250 days. In other words, about one of the 15 focal species considered would be expected to lose a gene in this time frame.
  
  Higher Shannon diversity was also associated with fewer gene gains, and this relationship also became stronger over time (Fig 5C, Table S9, GLMM, P=1.11e-09, LRT). We found a similar relationship between community species richness and gene gains, although the relationship was slightly positive at shorter time lags (Fig 5D, Table S9, GLMM, P=3.41e-04, LRT). No significant relationship was observed between richness and gene loss (Table S9, GLMM, P>0.05). Taken together with the HMP results (Fig 4), these longer time series reveal how the sign of the diversity slope can vary over time and how community diversity is generally predictive of reduced focal species gene content.”
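  
  The statement that "about one of the 15 focal species" would be expected to lose a gene follows from scaling the per-species prediction to the full set of focal species. The sketch below is only an illustration of that arithmetic: it assumes the predicted 0.075 genes lost applies equally to every focal species and, for the probability, that losses are independent and Poisson distributed. Neither assumption is taken from the fitted GLMM, and the function names are hypothetical.

```python
import math


def expected_losses(per_species_rate: float, n_species: int) -> float:
    """Expected number of genes lost, summed over all focal species."""
    return per_species_rate * n_species


def prob_at_least_one_loss(per_species_rate: float, n_species: int) -> float:
    """P(at least one loss), ASSUMING independent Poisson-distributed losses."""
    return 1.0 - math.exp(-expected_losses(per_species_rate, n_species))


if __name__ == "__main__":
    # 0.075 genes per focal species after 250 days, 15 focal species (from the text).
    print(expected_losses(0.075, 15))
    print(prob_at_least_one_loss(0.075, 15))
```

  Under these assumptions the expected total is 15 × 0.075 = 1.125 genes, which matches the "about one" reading in the quoted passage.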
  
  As described in detail in the response to Reviewer 1 above, we found that the HUMAnN2 pathway analyses previously described suffered from technical challenges and we deemed them inconclusive. We have therefore removed the pathway results from the manuscript.
  
  Reviewer #3 (Public Review):
  
  This work provides a series of tests of hypotheses, which are not mutually exclusive, on how genomic diversity is structured within human microbiomes and how community diversity may influence the evolution of a focal species.
  
  Strengths:
  
  The paper leverages existing metagenomic data to look at many focal species at the same time to test for the importance of broad eco-evolutionary hypotheses, which is a novelty in the field.
  
  Thank you for the succinct summary and recognition of the strengths of our work.
  
  Weaknesses:
  
  It is not very clear if the existing metagenomic data has sufficient power to test these models.
  
  It is not clear, neither in the introduction nor in the analysis what precise mechanisms are expected to lead to DBD.
  
  The conclusion that the data support DBD appears to depend on which statistic is used to measure community diversity. Also, a test to reject a neutral null model would have been welcome, either in the results or in the discussion.
  
  In our revised manuscript, we emphasize several caveats – including that we only have power to test these hypotheses in focal species with sufficient metagenomic coverage to measure sub-species diversity. We also describe more in the Introduction how the processes of competition and niche construction can lead to DBD. We also acknowledge that unmeasured abiotic drivers of both community diversity and sub-species diversity could also lead to the observed patterns. Throughout the manuscript, we attempt to describe the results and acknowledge multiple possible interpretations, including DBD and EC acting with different strengths on different species and time scales. Our previous manuscript assessing the evidence for DBD using 16S rRNA gene amplicon data from the Earth Microbiome Project (Madi et al., eLife 2020) assessed null models based on neutral ecological theory, and found it difficult to explain the observation of generally positive diversity slopes without invoking a non-neutral mechanism like DBD. While a new null model tailored to metagenomic data might provide additional nuance, we think developing one is beyond the scope of the manuscript – which is in the format of a short ‘Research Advance’ to expand on our previous eLife paper, and we expect that the general results of our previously reported null model provide a reasonable intuition for our new metagenomic analysis. This is now mentioned in the Discussion as follows:
  
  “In our previous study, we found that positive diversity slopes persisted even after controlling for potential abiotic drivers such as pH and temperature (Madi et al., 2020), but a similar analysis was not possible here due to a lack of metadata. Neutral processes can account for several ecological patterns such as species-area relationships (Hubbell, 2001), and must be rejected in favor of niche-centric models like DBD or EC. Using neutral models without DBD or EC, we found generally flat or negative diversity slopes due to sampling processes alone and that positive slopes were hard to explain with a neutral model (Madi et al., 2020). These models were intended mainly for 16S rRNA gene sequence data, but we expect the general conclusions to extend to metagenomic data. Nevertheless, further modeling and experimental work will be required to fully exclude a neutral explanation for the diversity slopes we report in the human gut microbiome.”
  
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.08.483496v1
www.medrxiv.org www.medrxiv.org

New submission 07/06/2022, 15:59:04

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Although a bunch of studies have been carried out to see whether calcium supplementation is a prerequisite for the promotion of bone health or prevention of bone diseases, this is the first trial to see its effect on the population whose age is reaching peak bone mass. Outcomes are clear and justified by sound methodology. Also, the message from this systematic review could directly influence the clinical decision on who might gain benefit from calcium supplementation.
  
  We are very grateful for your considerate comments and your recognition of our work in this study. Your suggestions really helped us to improve the clarity of this manuscript.
  
  Strengths of this study are:
  
  1) This is the first systematic review by meta-analysis to focus on people at the age before achieving peak bone mass (PBM) and at the age around the PBM. 2) Detailed subgroup and sensitivity analyses drew consistent and clear results.
  
  Thank you very much for your comments. We are very grateful for your recognition of our work in this study.
  
  Limitations of this study are:
  
  1) Substantial intertrial heterogeneity should be considered in terms of dose effect of calcium supplementation and differences between both sexes etc.
  
  Thank you very much for your kind comments. We performed subgroup analyses to explore whether different doses of calcium supplementation had different effects, and the results are shown in Tables 4a and 4b at the end of this Author Response. The intertrial heterogeneity in the subgroup with calcium doses greater than or equal to 1000 mg/day was significantly smaller than that in the subgroup with doses less than 1000 mg/day, suggesting that differences in calcium dose across trials may be a potential source of the substantial intertrial heterogeneity.
  
  Similarly, we also performed subgroup analyses by sex. Of all included trials, 23 focused on women only and 20 involved both men and women; however, these 20 trials did not report results for men and women separately. We therefore divided the included trials into two subgroups: trials with women only and trials with both men and women. The corresponding results are shown in Tables 5a and 5b at the end of this Author Response. The subgroup with both men and women appeared to be less heterogeneous than the subgroup with women only, suggesting that sex may be a possible source of the observed heterogeneity.
  
  In addition, we were also aware of the large heterogeneity between trials and explored the possible sources through several additional approaches. Firstly, instead of using fixed-effects models, we have chosen random-effects models to summarize the effect estimates. Secondly, we performed meta-regression analyses by age, population regions, calcium doses, baseline intake and sample sizes to explain the intertrial heterogeneity. The results of meta-regression are provided in Table 6 at the end of this Author Response. The results suggested that this heterogeneity could be explained partially by differences in regions of participants.
  
  We have updated the results and discussions about potential sources of heterogeneity in the revised manuscript, as follows:
  
  In general, the heterogeneity between trials was obvious in the analysis for BMD (P<.001, I2=86.28%) and slightly smaller for BMC (P<.001, I2=79.28%). The intertrial heterogeneity was significantly distinct across the sites measured. Subgroup analyses and meta-regression analyses suggested that this heterogeneity could be explained partially by differences in age, duration, calcium dosages, types of calcium supplement, supplementation with or without vitamin D, baseline calcium intake levels, sex and region of participants. (See Lines 293-298 on Page 20 in the Main Text)
  
  Several limitations need to be considered. First, there was substantial intertrial heterogeneity in the present analysis, which might be attributed to the differences in baseline calcium intake levels, regions, age, duration, calcium doses, types of calcium supplement, supplementation with or without vitamin D and sexes according to subgroup and meta-regression analyses. To take heterogeneity into account, we used random effect models to summarize the effect estimates, which could reduce the impact of heterogeneity on the results to some extent. (See Lines 394-399 on Page 24 in the Main Text)
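  
  For readers less familiar with the quantities discussed above, the following is a minimal sketch of how a random-effects pooled estimate and the I2 statistic can be computed from per-trial effect sizes and variances. It assumes the DerSimonian-Laird estimator of between-trial variance, which is a common default but is not confirmed by the text as the estimator the authors used; the function name and the example inputs are hypothetical.

```python
import math
from typing import List, Tuple


def random_effects_pool(
    effects: List[float], variances: List[float]
) -> Tuple[float, float, float, float]:
    """Return (pooled effect, standard error, tau^2, I^2 in percent)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two trials with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-trial variance, truncated at zero (assumed estimator).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variability attributed to between-trial heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each trial's own variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, i2


if __name__ == "__main__":
    # Hypothetical effect sizes and variances, for illustration only.
    print(random_effects_pool([0.2, 0.9, 0.5], [0.04, 0.05, 0.03]))
```

  Because the random-effects weights include the between-trial variance, heterogeneous trials are weighted more evenly and the pooled confidence interval widens, which is why this choice is usually described as more conservative than a fixed-effect model.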
  
  2) Rarity of RCTs focused on the 20-35-year age group.
  
  Thank you very much for raising this point. We have comprehensively searched databases for eligible studies and found only three RCTs (Islam et al; Barger-Lux et al; Winters-Stone et al) focused on the 20-35-year age group. We did notice this fact as well. Because of this, we intend to perform a randomised controlled trial to evaluate the effects of calcium supplementation in this age group. In fact, this trial has already been started and is currently ongoing (Registration number: ChiCTR2200057644, http://www.chictr.org.cn/showproj.aspx?proj=155587).
  
  In this open-label, randomized controlled trial, we will randomly assign (1:1) 116 subjects (age 18-22 years) to receive either calcium supplementation with milk (500 mL/day, containing about 500 mg/day of calcium) or no supplementation for 6 months. The primary outcomes are bone mineral density and bone mineral content at the lumbar spine, femoral neck and total hip. The secondary outcomes are clinical indicators related to bone health, such as serum osteocalcin, bone-specific alkaline phosphatase and urinary deoxypyridinoline. We will conduct the trial with great care and diligence and look forward to its results.
  
  Reviewer #2 (Public Review):
  
  This systematic review and meta-analysis titled 'The effect of calcium supplementation in people under 35 years old: A systematic review and meta-analysis of randomized controlled trials' provides good evidence for the importance of calcium supplementation at the age around the plateau of PBM. The statistical analyses were good overall and the manuscript was generally well written.
  
  We are very grateful for your considerate comments and for your recognition of our work in this study. Your suggestions really helped us to improve the clarity of this manuscript.
  
  One concern in this study is that the RCTs included were substantially heterogeneous in subjects, calcium types, duration, vitamin D supplements, etc. According to the inclusion criteria, RCTs comparing calcium or calcium plus vitamin D supplements with a placebo or no treatment were included in this study. However, no information about vitamin D supplementation was provided. Therefore, it seems unclear whether the effect of improving BMD or BMC is due to calcium alone or calcium plus vitamin D.
  
  We are extremely grateful for your great patience and for your kind suggestions. Following your suggestions, we have added the corresponding analyses regarding calcium supplementation with or without vitamin D supplementation. Among the included RCTs, 32 trials used calcium-only supplementation (without vitamin D supplementation) and 11 trials used calcium plus vitamin D supplementation. The detailed information is provided in Tables 1 and 2 at the end of this Author Response. We have added subgroup analyses by vitamin D supplementation as you suggested, and the corresponding results are provided in Tables 3a and 3b at the end of this Author Response.
  
  When we pooled the data from the two subgroups separately, we found that calcium supplementation with vitamin D had greater beneficial effects on both the femoral neck BMD (MD: 0.758, 95% CI: 0.350 to 1.166, P < 0.001 VS. MD: 0.477, 95% CI: 0.045 to 0.910, P = 0.031) and the femoral neck BMC (MD: 0.393, 95% CI: 0.067 to 0.719, P = 0.018 VS. MD: 0.269, 95% CI: -0.025 to 0.563, P = 0.073) than calcium supplementation without vitamin D. However, for both BMD and BMC at the other sites (including lumbar spine, total hip, and total body), the observed effects in the subgroup without vitamin D supplementation appeared to be slightly better than in the subgroup with vitamin D supplementation. Therefore, these results suggested that calcium supplementation alone could improve BMD or BMC, although additional vitamin D supplementation may be beneficial in improving BMD or BMC at the femoral neck.
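  
  One way to judge whether two subgroup estimates such as these differ by more than chance is a z-test on the difference between them. The sketch below is illustrative only and is not the authors' analysis: it assumes the two subgroups are independent and that each reported 95% CI is a symmetric normal-approximation interval, so that the standard error can be recovered from the interval width. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def se_from_ci(lower: float, upper: float) -> float:
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * Z_95)


def subgroup_difference(
    md1: float, lo1: float, hi1: float,
    md2: float, lo2: float, hi2: float,
) -> tuple:
    """Return (z, two-sided p) for the difference between two independent estimates."""
    se_diff = math.sqrt(se_from_ci(lo1, hi1) ** 2 + se_from_ci(lo2, hi2) ** 2)
    z = (md1 - md2) / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Femoral neck BMD: with vitamin D vs. without vitamin D (values quoted in the text).
    print(subgroup_difference(0.758, 0.350, 1.166, 0.477, 0.045, 0.910))
```

  The same call can be repeated for the femoral neck BMC estimates or any other pair of subgroup results reported with a mean difference and 95% CI.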
  
  We have added relevant parts in the main text of the revised manuscript. (See Lines 258-263 on Pages 12-13 and Lines 367-374 on Page 23 in the Main Text)
  
  As you mentioned, there is large intertrial heterogeneity in this study, so we chose the random-effects model, which is appropriate for obtaining more conservative results. In addition, we performed subgroup analyses by calcium dose, sex, age, duration, region, baseline calcium intake and type of calcium supplement in order to explore possible sources of heterogeneity.
  
  The results of subgroup analyses by dose of calcium supplementation are shown in Tables 4a and 4b at the end of this Author Response. For both BMD and BMC at the lumbar spine and whole body, the intertrial heterogeneity was significantly smaller in the subgroup with a calcium supplementation dose greater than or equal to 1000 mg/day than in the subgroup with a dose less than 1000 mg/day, suggesting that different doses of calcium supplementation may be a potential source of the heterogeneity.
  
  The results of subgroup analyses by sex are shown in Tables 5a and 5b at the end of this Author Response. The intertrial heterogeneity was significantly smaller in the subgroup with both men and women than in the subgroup with women only, also suggesting that sex could be a possible source of the heterogeneity.
  
  The results of subgroup analyses by age (pre-peak vs. peri-peak) are shown in Tables 7a and 7b at the end of this Author Response. The intertrial heterogeneity was significantly smaller in the peri-peak subgroup than in the pre-peak subgroup, also suggesting that age may be a potential source of the heterogeneity.
  
  The results of subgroup analyses by intervention duration (less than 18 months vs. 18 months or longer) are shown in Tables 8a and 8b at the end of this Author Response. For both BMD and BMC at the lumbar spine and total hip, the intertrial heterogeneity was smaller in the subgroup with an intervention period of less than 18 months than in the subgroup with an intervention period greater than or equal to 18 months, suggesting that intervention duration might be a potential source of the heterogeneity.
  
  Tables 9a and 9b at the end of this Author Response show the results of subgroup analyses by population region. The intertrial heterogeneity was significantly smaller in the Asian subgroup than in the Western subgroup, also suggesting that population region may be a source of the heterogeneity.
  
  Tables 10a and 10b at the end of this Author Response show the results of subgroup analyses by dietary calcium intake level at baseline. The intertrial heterogeneity was smaller in the subgroup with a dietary calcium intake greater than or equal to 714 mg/day than in the subgroup with a dietary calcium intake lower than 714 mg/day, also suggesting that dietary calcium intake at baseline could be a potential source of the heterogeneity.
  
  Tables 11a and 11b at the end of this Author Response show the results of subgroup analyses by type of calcium supplement. For both BMD and BMC at the lumbar spine, the intertrial heterogeneity was smaller in the subgroup with calcium supplementation than in the subgroup with dietary calcium, also suggesting that the type of calcium supplement might be a source of the heterogeneity.
  
  In conclusion, the observed heterogeneity might be due to differences in sex, age, region of subjects, dose, intervention duration, type of calcium supplementation, dietary calcium intake level at baseline, and supplementation with or without vitamin D. We have updated the discussion on heterogeneity in the revised manuscript. (See Lines 394-397 on Page 24 in the Main Text)
  
  Thanks again for your comments. We have tried to analyze and explain the large heterogeneity through a variety of approaches; however, some inadequacies may remain. Please tell us directly if further corrections are needed, and we will do our best to revise the discussion of heterogeneity.
  
  Reviewer #3 (Public Review):
  
  This paper will be welcome for clinicians and researchers related to the field. The authors, applying a well-structured meta-analysis, showed that calcium supplementation or calcium intake during 20-35 years is better than the <20 years. The clinical impact is directly associated with improving the bone mass of the femoral neck, and thus proposes a window of intervention for osteoporosis treatment. The manuscript is very well prepared and represents a thorough analysis of available randomized controlled clinical trials, but a few issues require additional consideration.
  
  We are very grateful for your considerate comments and for your recognition of our work in this study. Your comments are invaluable and have been very helpful in revising and improving our manuscript.
  
  After a careful read of the literature, it is important to highlight that the paper is a statistically robust study with a well-delineated meta-analysis of youth-adult subjects. However, I would like to better understand why the authors didn't use other datasets such as WHO Global Index Medicus (Index Medicus for Africa, the Eastern Mediterranean Region, South-East Asia, and Western Pacific, and Latin America and the Caribbean Literature on Health Sciences, Index Medicus), ClinicalTrials.gov, and the WHO ICTRP.
  
  Thank you so much for your thoughtful advice and your generosity in recommending these datasets to us. Based on your advice, we thoroughly searched these databases (the detailed search terms are provided in the Appendix at the end of this Author Response). We identified 23 potentially related studies and registered trials in these databases. Some of these studies had not been completed and were still recruiting subjects, and some were duplicates of the RCTs we had already included. After careful screening and review, no new trials were ultimately included in our meta-analysis. The detailed screening process and the reasons for exclusion are shown in Figure 1. These three additional global databases will provide us with more comprehensive information for our future studies. Thank you very much for your suggestions and guidance.
  
  Figure 1. Flow chart of search and selection
  
  References:
  1. ID: emr-156089 (https://pesquisa.bvsalud.org/gim/resource/en/emr-156089)
  2. ID: wpr-270003 (https://pesquisa.bvsalud.org/gim/resource/en/wpr-270003)
  3. ID: lil-243754 (https://pesquisa.bvsalud.org/gim/resource/en/lil-243754)
  4. ID: sea-23757 (https://pesquisa.bvsalud.org/gim/resource/en/sea-23757)
  5. ID: NCT00067925 (https://clinicaltrials.gov/ct2/show/NCT00067925?term=NCT00067925&draw=2&rank=1)
  6. ID: NCT00979511 (https://clinicaltrials.gov/ct2/show/NCT00979511?term=NCT00979511&draw=2&rank=1)
  7. ID: NCT00065247 (https://clinicaltrials.gov/ct2/show/NCT00065247?term=NCT00065247&draw=2&rank=1)
  8. Matkovic V, Landoll JD, Badenhop-Stevens NE, et al. Nutrition influences skeletal development from childhood to adulthood: a study of hip, spine, and forearm in adolescent females. J Nutr. 2004;134(3):701S-705S. doi:10.1093/jn/134.3.701S
  9. Barger-Lux MJ, Davies KM, Heaney RP. Calcium supplementation does not augment bone gain in young women consuming diets moderately low in calcium. J Nutr. 2005;135(10):2362-2366. doi:10.1093/jn/135.10.2362
  10. Cornes R, Sintes C, Peña A, et al. Daily Intake of a Functional Synbiotic Yogurt Increases Calcium Absorption in Young Adult Women. J Nutr. 2022;152(7):1647-1654. doi:10.1093/jn/nxac088
  11. ID: NCT00063011 (https://clinicaltrials.gov/ct2/show/NCT00063011?term=NCT00063011&draw=2&rank=1)
  12. ID: NCT00063024 (https://clinicaltrials.gov/ct2/show/NCT00063024?term=NCT00063024&draw=2&rank=1)
  13. ID: NCT01857154 (https://clinicaltrials.gov/ct2/show/NCT01857154?term=NCT01857154&draw=2&rank=1)
  14. ID: NCT00067600 (https://clinicaltrials.gov/ct2/show/NCT00067600?term=NCT00067600&draw=2&rank=1)
  15. ID: NCT00063037 (https://clinicaltrials.gov/ct2/show/NCT00063037?term=NCT00063037&draw=2&rank=1)
  16. ID: NCT00063050 (https://clinicaltrials.gov/ct2/show/NCT00063050?term=NCT00063050&draw=2&rank=1)
  17. ID: TCTR20190624002 (https://trialsearch.who.int/Trial2.aspx?TrialID=TCTR20190624002)
  18. ID: JPRN-UMIN000024182 (https://trialsearch.who.int/Trial2.aspx?TrialID=JPRN-UMIN000024182)
  19. ID: NCT02636348 (https://trialsearch.who.int/Trial2.aspx?TrialID=NCT02636348)
  20. ID: ACTRN12612000374864 (https://trialsearch.who.int/Trial2.aspx?TrialID=ACTRN12612000374864)
  21. ID: NCT01732328 (https://trialsearch.who.int/Trial2.aspx?TrialID=NCT01732328)
  22. ID: ISRCTN28836000 (https://trialsearch.who.int/Trial2.aspx?TrialID=ISRCTN28836000)
  23. ID: ISRCTN84437785 (https://trialsearch.who.int/Trial2.aspx?TrialID=ISRCTN84437785)
  
  We have also updated the literature search section and the flow chart in the main text of the revised manuscript, as follows:
  
  We applied search strategies to the following electronic bibliographic databases without language restrictions: PubMed, EMBASE, ProQuest, CENTRAL (Cochrane Central Register of Controlled Trials), WHO Global Index Medicus, ClinicalTrials.gov, WHO ICTRP, China National Knowledge Infrastructure and Wanfang Data in April 2021 and updated the search in July 2022 for eligible studies addressing the effect of calcium or calcium supplementation, milk or dairy products with BMD or BMC as endpoints. (See Lines 80-85 on Page 5 and Figure 1 in the Main Text)
  
  The manuscript compares two sources of participants (in line 233) evaluating the effect of improvements on the femoral neck being "obviously stronger in Western countries than in Asian countries". But, I didn't identify if the searches were conducted applying language restrictions. This is important because we can be considering the entire world or specific countries.
  
  We are extremely grateful for your great patience and for your kind suggestions. We did not apply any language restrictions during the search process, as documented in the protocol of PROSPERO (CRD42021251275, https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=251275). Following your suggestion, we have added a description of this in the revised manuscript. (See Lines 80-81 on Page 5 in the Main Text)
  
  During the search process, we identified five eligible articles from the Chinese databases, including China National Knowledge Infrastructure (CNKI, https://www.cnki.net) and WanFang Data (https://www.wanfangdata.com.cn). However, we confirmed that these five studies were duplicates of articles from PubMed (PMID: 15230999; PMID: 17627404; PMID: 18296324; PMID: 20044757; PMID: 20460227). For possibly relevant studies published in languages other than Chinese or English, the full text was downloaded, translated using the DeepL translation website (https://www.deepl.com/translator) and then carefully reviewed. Ultimately, all included studies that met the inclusion and exclusion criteria were published in English. In view of this, after a systematic and comprehensive search, especially with the addition of your suggested databases, we believe that our current study has incorporated the original research in this field worldwide, rather than only from specific countries or regions.
  
  To explore whether the effects of calcium supplementation differ across population regions, we performed subgroup analyses. Prior to the analysis, we hypothesized that the effect might be slightly better, or at least not worse, in populations with lower baseline dietary calcium intakes (lower baseline BMD/BMC levels) than in populations with higher baseline dietary calcium intakes (higher baseline BMD/BMC levels). However, the results showed that the improvement effects on BMD at the femoral neck and total body and BMC at the femoral neck and lumbar spine were obviously stronger in Western countries than in Asian countries. These findings run contrary to the common expectation that, under normal circumstances, the effects of calcium supplementation should be more obvious in people with lower calcium intakes than in those with higher calcium intakes. Therefore, this issue needs to be tested and confirmed in future trials.
  
  The manuscript does not describe which version was used with the RoB tool.
  
  Thank you for your suggestion. As you mentioned, we completed the description of RoB tool in the Methods section, as follows:
  
  The quality of the included RCTs was assessed independently by two reviewers (SYL, HNJ) based on the Revised Cochrane Risk-of-Bias Tool for Randomized Trials (RoB 2 tool, version 22 August 2019), and each item was graded as low risk, high risk or some concerns. (See Lines 101-103 on Page 6 in the Main Text)
  
  Figures and Supplementary: No critique.
  
  Thanks for your kind comments and for your recognition of our work in this study.
  
  Appendix 1
  
  Search strategy • WHO Global Index Medicus:
  
  (tw:(calcium)) OR (mj:(calcium)) OR (tw:(calcium carbonate)) OR (tw:(calcium citrate)) OR (tw:(calcium pills)) OR (tw:(calcium supplement)) OR (tw:(Ca2)) OR (tw:(dairy product)) OR (tw:(milk)) OR (tw:(yogurt)) OR (tw:(cheese)) OR (tw:(dietary supplement)) AND (tw:(bone density)) OR (tw:(bone mineral density)) OR (tw:(bone mineral content))
  
  • ClinicalTrials.gov
  
  (calcium) OR (calcium supplementation) OR (milk) OR (dairy product) OR (yogurt) OR (cheese) Applied Filters: Interventional (clinical trial); Child (birth–17); Adult (18–64)
  
  • WHO ICTRP
  
  (calcium) OR (milk) OR (dairy) OR (yogurt) OR (cheese) in the Intervention
  
Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2022.04.14.22273724v1
www.biorxiv.org www.biorxiv.org

New submission 23/09/2023, 13:07:18

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Zou et al. presented a comprehensive study where they generated single-cell RNA profiling of 138,982 cells from 13 samples of six patients including AK, squamous cell carcinoma in situ (SCCIS), cSCC, and their matched normal tissues, covering comprehensive clinical courses of cSCC. Using bioinformatics analysis, they identified keratinocytes, CAFs, immune cells, and their subpopulations. The authors further compared signatures within subpopulations of keratinocytes along with the clinical progression, especially basal cells, and identified many interesting genes. They also further validate some of the markers in an independent cohort using IHC, followed by some knockdown experiments using cSCC cell lines.
  
  The strength of this study is the unique data set they have created, providing the community with invaluable resources to study and validate their findings. However, a lot of analyses were not robust enough to support the claims and conclusions in the paper. More clarification and cross-comparison with polished data are needed to further strengthen the study and claims.
  
  1) Stemness markers were used. The authors used COL17A1, TP63, ITGB1, and ITGA3 to represent stemness markers. However, these were not common classic stemness markers used in cSCC. What is the source claiming these genes were stemness markers in cSCC? TP63 is a master regulator and early driver event in SCC, while COL17A1, ITGB1, and ITGA3 are all ECM genes. The authors need to use commonly well-known stem cell markers in cSCC, e.g., LGR5, to mark stem-like cells.
  
  Thanks for raising this good point. We may not have provided a clear description of the markers COL17A1, TP63, ITGB1, and ITGA3 in the previous texts. We would like to clarify that these genes were used as the markers of epidermal stem cells in normal skin samples rather than tumor stem cells in cSCC. To avoid any possible misunderstanding, we revised the main text accordingly and added the references [4-11].
  
  2) Cell proportion analysis. The authors used the mean proportions to compare different clinical groups for subpopulations of keratinocytes, e.g., Figure 2B and Figure 5B. This is not robust, as no statistics can be derived from this. For example, Fig 2A clearly shows a high level of heterogeneity in cellular composition among normal samples. One cannot say which group is higher or lower simply based on the mean without also considering the variance.
  
  We replotted the proportion analysis with statistics and presented the new graphs in Figure 2-figure supplement 1 for Figure 2B and Figure 5-figure supplement 1 for Figure 5B.
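  
  The response does not state which test was used for the replotted proportions. As a hedged illustration of the kind of per-sample comparison the reviewer asks for, the sketch below computes the proportion of a cell type in each sample and then runs an exact two-sided permutation test on the difference in group means, which remains usable with the small number of samples per clinical group in a study like this one. The function names and all example counts are hypothetical and are not data from the study.

```python
from itertools import combinations
from typing import List, Sequence


def sample_proportions(type_counts: Sequence[int], total_counts: Sequence[int]) -> List[float]:
    """Per-sample proportion of one cell type (cells of that type / all cells in the sample)."""
    return [c / t for c, t in zip(type_counts, total_counts)]


def exact_permutation_pvalue(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n, n_a = len(pooled), len(group_a)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / (n - n_a))

    hits = 0
    count = 0
    # Enumerate every way of relabelling n_a of the samples as "group A".
    for idx in combinations(range(n), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / (n - n_a))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
        count += 1
    return hits / count


if __name__ == "__main__":
    # Hypothetical counts: cells of one subpopulation and total cells per sample.
    normal = sample_proportions([120, 300, 210], [1000, 1500, 1400])
    tumour = sample_proportions([520, 610, 480], [1100, 1200, 900])
    print(exact_permutation_pvalue(normal, tumour))
```

  With three samples per group there are only 20 possible relabellings, so the smallest attainable two-sided p-value is 0.1, which illustrates why per-group variance and sample size matter more than the mean alone.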
  
  3) Basal tumour cells in SCCIS and SCC. To make the findings valid, the authors need to compare these cells/populations with the keratinocyte cell populations defined by Ji et al. Cell 2020. Do basal-SCCIS-tumour cells, and those in SCC samples, resemble any of the populations defined in Ji et al.? Ji et al. also had 10 matched normal samples, so the authors need to validate their findings from the SCC vs normal analysis using the Ji et al. dataset.
  
  Thanks for this valuable suggestion. We compared the basal tumor cells in our study with the cell populations defined in the Ji et al. Cell 2020 data using SingleCellNet [1]. The results showed that both the basal-SCCIS-tumor cells of SCCIS and the basal tumor cells of cSCC in our study closely resemble the Tumor_KC_Basal subcluster defined in Ji et al.’s paper (Figure 4-figure supplement 4, C and D). Tumor_KC_Basal highly expressed CCL2, CXCL14, FTH1 and MT2A, which is consistent with our findings in basal tumor cells.
  
  4) Copy number analysis. Authors used inferCNV to perform copy number analysis using scRNA-seq data and identified CNVs in subpopulations of keratinocytes in SCCIS and SCC. To ensure these CNVs were not artefacts, were some of the CNVs identified by inferCNV well-known copy number changes previously reported in cSCC?
  
  In the poorly-differentiated cSCC sample, we observed significant gains on chromosomes 7 and 9 and a deletion on chromosome 10, all of which were reported in a previous study, indicating the reliability of the CNV analysis results (Figure 5-figure supplement 2) [12].
  
  5) Pseudotime analysis, lines 308-313. Not sure the pseudotime analysis added much, as it is unclear whether two distinct subgroups were identified from this analysis. Suggest removing this to keep it neater.
  
  Thank you for this suggestion. We have deleted the result of pseudotime analysis.
  
  6) Selection of candidate genes for validation using IHC and cell line work. For example, lines 205-206, lines 352-356 and lines 437-441, authors selected several genes associated with AK and SCC to further validate using IHC and cell line knockdown work. What are the criteria for selecting those genes for validation? It is unclear to readers how these were selected. It reads like a fishing experiment, then followed by a knockdown. Clear rationale/criteria need to be elaborated.
  
  The first consideration of candidate gene selection is the fold change of expression. We have provided the statistical results of DEGs in Supplementary file 1b, 1h, 1j-1m. Then we selected top changed genes and conducted an extensive literature search on these genes. We prioritized genes that, although not directly associated with cSCC development, have a close relationship with related pathways, as determined through functional enrichment analysis. These genes were arranged for further verification experiments. We have added more details in main text and methods section.
  
  7) TME. Compared to keratinocytes populations, the investigation of TME cells was weak. (a) can authors produce UMAP files just for T cells, DC cells, and fibroblasts separately? Figure 7B is not easy to see those subclusters. (b) similar to what was done for keratinocytes, can authors find differentially expressed clusters and genes among the different clinical groups, associated with disease progression? (c) where are the myeloid cell populations, also B cells?
  
  Thank you for your suggestions. (a) We have added the UMAP files for T cells, DC cells and stromal cells separately in new Figure 7A. (b) We identified DEGs in TME cells among the different groups. Several key genes showed monotonically changing trends associated with disease progression. For example, with the increase of malignancy, FOS shows down-regulation while S100A8 and S100A9 monotonically increase in all three types of TME cells (Figure 7C). (c) We identified two types of myeloid cell populations, macrophages and monocyte-derived DCs (MoDCs). We didn’t find other myeloid cells, such as neutrophils. For B cells, there were only 28 B cells in the poorly-differentiated cSCC sample, which didn’t meet the threshold for further cell-cell communication analysis.
  
  8) Heat shock protein genes line 327-329. HSP signature was well-known to be induced via tissue dissociation and library prep during the scRNA experiment. How could the authors be sure these were not artefacts induced by the experiment? If authors regress their gene expression against HSP gene signatures, would this cluster still be identified?
  
  Thank you for this valuable suggestion. It is important to note that the Basal-SCCIS-tumor cluster was identified through CNV analysis, rather than the HSP signature. To address this concern and further validate this result, the “AddModuleScore” function in the Seurat package was used to regress gene expression against HSP gene signatures for the retrieved basal cells. Our results showed that the Basal-SCCIS-tumor population can still be identified after regression, even more clearly (Author response image 1).
  
  Author response image 1.
  
  The identity of Basal-SCCIS-tumor cluster considering regression against HSP signatures.
  
  9) Cell-cell communication analysis. The authors claimed that cell-to-cell interaction was significantly enhanced in poorly-differentiated cSCC, and multiple interaction pathways were significantly active. How was this kind of analysis carried out? How did the authors define significance? What statistical method was used? These were all unclear. Furthermore, it is difficult to judge the robustness of the cell-cell communication analysis. Were these findings also supported by another method, such as celltalker or CellPhoneDB?
  
  To determine the significance of the increased overall cell-to-cell interaction strength between two groups, we utilized CellChat to obtain the communication strength in different samples. We combined the communication strength based on cell type pairs, where missing values were set to 0. We performed a paired Wilcoxon test to determine whether the enhancement of cell-to-cell interaction between samples was significant.
  
  For the comparison of outgoing or incoming interaction strength of the same cell types between two groups, we first extracted the communication strength of each signal pathway contributing to outgoing or incoming strength, and then merged the strengths of signal pathways among samples, where the strength of non-shared pathways with missing value was determined to be 0. Subsequently, we performed a paired Wilcoxon test to define the significance.
  
  For multiple-group comparisons, the Kruskal-Wallis rank sum test was performed first. If the p-value was less than 0.1, the pairwise Wilcoxon test was used for subsequent pairwise comparisons. The comparison of individual signaling pathways between groups was performed in the same way. We defined p-value < 0.1 as the significance threshold. We have added the significance test method to the figure legends for Figure 7 and Figure 8, as well as detailed statistical data in new Supplementary file 1q-1u.
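
  The pairing step described above can be sketched as follows. This is a minimal illustration in Python, not the analysis code used in the study: the cell-type names and strength values are hypothetical, the helper names (`align_pairs`, `average_ranks`, `signed_rank_exact`) are introduced here for illustration only, and the p-value comes from exact enumeration of sign assignments, which is practical only for a small number of pairs.

```python
from itertools import product


def align_pairs(sample_a, sample_b):
    """Union the cell-type pairs of two samples; missing strengths become 0.0."""
    keys = sorted(set(sample_a) | set(sample_b))
    return keys, [sample_a.get(k, 0.0) for k in keys], [sample_b.get(k, 0.0) for k in keys]


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def signed_rank_exact(x, y):
    """Two-sided Wilcoxon signed-rank test by exact enumeration.

    Zero differences are dropped before ranking.
    Returns (sum of positive ranks, p-value).
    """
    diffs = [b - a for a, b in zip(x, y) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - total / 2)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - total / 2) >= observed - 1e-12:
            hits += 1
    return w_plus, hits / 2 ** len(ranks)


# Hypothetical communication strengths keyed by (sender, receiver).
normal = {("KC", "T"): 1.0, ("KC", "DC"): 2.0, ("Fib", "T"): 0.5}
tumor = {("KC", "T"): 4.0, ("KC", "DC"): 3.5, ("Fib", "T"): 2.5,
         ("Fib", "DC"): 2.25, ("T", "DC"): 0.75}

pairs, x, y = align_pairs(normal, tumor)
w_plus, p_value = signed_rank_exact(x, y)
print(len(pairs), w_plus, p_value)
```

  In this toy example all five pairs increase, so the exact two-sided p-value is 2/32 = 0.0625. That would pass the p < 0.1 threshold stated above but not a 0.05 cut-off, which shows how much the choice of threshold matters when the number of pairs is small.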
  
  As suggested, we also used the CellPhoneDB approach, based on the CellChatDB database, to verify our cell-cell communication results. Of the ligand-receptor interactions predicted by CellChat, 55-58% were also predicted by CellPhoneDB (Author response image 2). The enhancement of cell interaction through the MHC-II, Laminin and TNF signaling pathways in the poorly-differentiated cSCC sample compared to the normal sample was consistent in both CellChat and CellPhoneDB (Figure 8C and Figure 8-figure supplement 1B).
  
  Author response image 2.
  
  The overlap of the predicted ligand-receptor interactions between CellChat and CellPhoneDB.
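
  The overlap percentage reported in this response can be read as the share of one tool's predictions that the other tool also makes. The sketch below shows that calculation under stated assumptions: the ligand-receptor names are placeholders, not predictions from either tool, and `overlap_fraction` is a hypothetical helper.

```python
def overlap_fraction(reference, other):
    """Fraction of interactions in `reference` that also appear in `other`."""
    reference, other = set(reference), set(other)
    if not reference:
        raise ValueError("reference set is empty")
    return len(reference & other) / len(reference)


# Placeholder (ligand, receptor) pairs for illustration only.
tool_a_pairs = {("LIG1", "REC1"), ("LIG2", "REC2"), ("LIG3", "REC3"), ("LIG4", "REC4")}
tool_b_pairs = {("LIG1", "REC1"), ("LIG2", "REC2"), ("LIG5", "REC5")}

a_in_b = overlap_fraction(tool_a_pairs, tool_b_pairs)
b_in_a = overlap_fraction(tool_b_pairs, tool_a_pairs)
print(a_in_b, b_in_a)
```

  The measure is asymmetric: the two directions generally differ, so it helps to state which tool serves as the reference.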
  
  10) Statistics and significance. In general, the detail of statistics and significance was lacking throughout the paper. Authors need to specify what statistical tests were used, and the p-values. It is difficult to judge the correctness of the test, and robustness without seeing the stats.
  
  We have included all statistics and significance values in the figure legend and supplemental tables, and described the statistical tests in the methods section. In this revision, we have added the necessary details of statistics and significance in the main text and figures.
  
  11) Overall, this manuscript needs a lot of re-writing. A lot of discussion was also included in the results, making it really difficult to read overall. The authors should simplify the results sections, remove the discussion bits, and further highlight and streamline with the key results of this paper.
  
  Thanks a lot for this advice. We have revised the paper thoroughly and removed discussion from the results section to make the manuscript easier to read.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.12.22.521622v1
www.biorxiv.org www.biorxiv.org

Left Hemisphere Dominance for Bilateral Kinematic Encoding in the Human Brain

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  5. The reported data point to an important role of the premotor and parietal regions of the left as compared to the right hemisphere in the control of ipsilateral and contralateral limb movements. These are also the regions where the electrodes were primarily located in both subgroups of patients. I have 2 concerns in this respect. The first concern refers to the specific locus of these electrodes. For premotor cortex, the authors suggest PMd as well as PMv as potential sites for these bilateral representations. The other principal site refers to parietal cortex, but this covers a large territory. It would help if more specific subregions of the parietal cortex could be indicated, if possible. Do the focal regions where electrodes were positioned refer to the superior vs inferior parietal cortex (anterior or posterior), or the intra-parietal sulcus? Second, the manuscript's focus on the premotor-parietal complex emerges from the constraints imposed by accessible anatomical locations in the participants but does not preclude the existence of other cortical sites as well as subcortical regions and cerebellum for such bilateral representations. It is meaningful to clarify this and/or list this as a limitation of the current approach.
  
  On the first issue, we have updated the manuscript to specify the subregion within the parietal cortex in which we see stronger across-arm generalization - namely, the superior parietal cortex. On the second issue, we have added text in the Discussion that references subcortical areas shown to exhibit laterality differences in bimanual coordination, providing a more holistic picture of bimanual representations across the brain. In addition, we acknowledge that with our current patient population we are limited to regions with substantial electrode coverage, which does not include all areas of the brain.
  
  6. The evidence for bilateral encoding during unilateral movement opens perspectives for a better understanding of the control of bimanual movements, which are abundant during everyday life. In the discussion, the authors refer to some imaging studies on bimanual control in order to infer whether the obtained findings may be a consequence of left hemisphere specialization for bimanual movement control, leading to speculations about the information that is being processed for each of both limb movements. Another perspective to consider is the possibility that making a movement with one limb may require postural stabilization in the trunk and contralateral body side, including a contribution from the opposite limb that is supposedly resting on the start button. Have the authors considered whether this postural mechanism could (partly) account for this bilateral encoding mechanism, in particular because it appears more prominent during movement execution as compared to preparation? Furthermore, could the prominence of bilateral encoding during movement execution be triggered by inflow of sensory information about both limbs from the visual as well as the somatosensory systems?
  
  Thank you for these comments. We have added a paragraph to the Discussion to address the hypothesis that some component of ipsilateral encoding may be related to postural stabilization.
  
  In response to the final point in this comment, we agree that bilateral information during execution could be reflective of afferent inputs (somatosensory and/or visual). However, the encoding model shows that activity in premotor and parietal regions is well predicted based on kinematics during the task. While visual and somatosensory system information is likely integrated in these areas, the kinematic encoding would point to a more movement-based representation.
  
  Reviewer #2 (Public Review):
  
  Weaknesses: 1. Although the current human ECoG data set is valuable, there is still large variability in electrode coverage across the patients (I fully acknowledge the difficulty). This makes statistical assessment a bit tricky. The potential factors of interest in the current study would be Electrode (=Region), Subject, Hemisphere, and their interactions. The tricky part is that Electrode is nested within Subject, and Subject is nested within Hemisphere. Permutation-based ANOVA used for the current paper requires proper treatment of these nested factors when making permutations (Anderson and Braak, 2003). In this regard, sufficient details about how the authors treated each factor, for instance, in each pbANOVA, are not provided in the current version of the manuscript. Similarly, the scope of statistical generalizability, whether the inference is within-sample or population-level, for the claims (e.g., statement about the hemispheric or regional difference) needs to be clarified.
  
  We discuss at length the issue of electrode variability and have addressed this in the revised manuscript. Graphically, we have added a Supplemental Figure (S2). Statistically, we appreciate the point about the need for the analysis to address the nested structure of the data. We have redone all of the statistics, now using a permutation-based linear mixed effects model with a random effect of patient. This approach did not change any of the findings.
  
  As to the comment about hemispheric or regional differences, the data show that both are important factors. Our hemispheric effect is characterized by stronger ipsilateral encoding in the left hemisphere and subsequently better across-arm generalization (Figures 2-4). We then examine the spatial distribution of electrodes that generalized well or poorly and found clusters in both hemispheres of electrodes that generalize poorly. In contrast, only in the left hemisphere did we find clusters of electrodes that generalize well. These electrodes were localized to PMd, PMv and superior parietal cortex (Fig 5D). In summary, we argue that activity patterns in M1 are similar in the left and right hemispheres, but there is a marked asymmetry for activity patterns over premotor and parietal cortices.
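
  The nesting issue raised by the reviewer (electrodes within patients, patients within hemisphere) can be illustrated with a permutation scheme that shuffles hemisphere labels at the patient level rather than the electrode level. The sketch below is a deliberately simpler stand-in for the permutation-based linear mixed effects model mentioned above, not a reimplementation of it: the per-electrode scores, patient IDs, and the function name `patient_level_permutation` are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def patient_level_permutation(scores, hemisphere):
    """Exact permutation test on the left-minus-right difference of patient means.

    scores:     patient ID -> list of per-electrode values
    hemisphere: patient ID -> "L" or "R"
    Labels are permuted across patients, so electrodes from one patient
    always stay together, which respects the nested design.
    """
    patients = sorted(scores)
    patient_mean = {p: mean(scores[p]) for p in patients}
    n_left = sum(1 for p in patients if hemisphere[p] == "L")

    def stat(left_group):
        left = [patient_mean[p] for p in patients if p in left_group]
        right = [patient_mean[p] for p in patients if p not in left_group]
        return mean(left) - mean(right)

    observed = stat({p for p in patients if hemisphere[p] == "L"})
    null = [stat(set(group)) for group in combinations(patients, n_left)]
    p_value = sum(abs(s) >= abs(observed) - 1e-12 for s in null) / len(null)
    return observed, p_value


# Hypothetical ipsilateral encoding scores per electrode.
scores = {
    "P1": [0.30, 0.40], "P2": [0.50], "P3": [0.45, 0.35, 0.40],
    "P4": [0.10, 0.20], "P5": [0.05], "P6": [0.15, 0.25],
}
hemisphere = {"P1": "L", "P2": "L", "P3": "L", "P4": "R", "P5": "R", "P6": "R"}

observed, p_value = patient_level_permutation(scores, hemisphere)
print(round(observed, 4), p_value)
```

  With three patients per hemisphere there are only 20 distinct label assignments, so the smallest attainable two-sided p-value is 2/20 = 0.1 even when the groups are perfectly separated. Permuting at the electrode level would give a much smaller p-value, but it would treat electrodes from the same patient as independent observations.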
  
  Additional context that would help readers interpret or understand the significance of the work: The greater amount of shared movement representation in the left hemisphere may imply the greater reliance of the left arm on the left hemisphere. This may, in turn, lead to the greater influence of the ongoing right arm motion on the left arm movement control during bimanual coordination. Indeed, this point is addressed by the authors in the Discussion (page 15, lines 26-41). One critical piece of literature missing in this context is the work done by Yokoi, Hirashima, and Nozaki (2014). In experiments using the bimanual reaching task, they in fact found that learning by the left arm is influenced to a greater degree by the concurrent motion of the right arm than vice versa (Yokoi et al., J Neurosci, 2014). Together with Diedrichsen et al. (2013), this study will strengthen the authors' discussion and help readers interpret the present result of left hemisphere dominance in the context of more skillful bimanual action.
  
  The Yokoi paper is a very important paper in revealing hemispheric asymmetries during skilled bimanual movements. However, we think it is problematic to link the hemispheric asymmetries we observe to the behavioral effects reported in the Yokoi paper (namely, that the nondominant, left arm was more strongly influenced by the kinematics of the right arm). One could hypothesize that the left hemisphere, given its representation of both arms, could be controlling both arms in some sort of direct way (and thus the action of the right arm will have an influence on left arm movement given the engagement of the same neural regions for both movements). It is also possible that the left hemisphere is receiving information about the state of both the right and left arms, and this underlies the behavioral asymmetry reported in Yokoi.
  
  Reviewer #3 (Public Review):
  
  In the present work, Merrick et al. analyzed ECoG recordings from patients performing out-and-back reaching movements. The authors trained a linear model to map kinematic features (e.g., hand speed, target position) to high frequency ECoG activity (HFA) of each electrode. The two primary findings were: 1) encoding strength (as assessed by held-out R² values) of ipsilateral and contralateral movements was more bilateral in the left hemisphere than in the right and 2) across-arm generalization was stronger in the left hemisphere than in the right. As the authors point out in the Introduction, there are known 'asymmetries between the two hemispheres in terms of praxis', so it may not be surprising to find asymmetries in the kinematic encoding of the two hemispheres (i.e., the left hemisphere contributes 'more equally' to movements on either side of the body than the right hemisphere).
  
  There is one point that I feel must be addressed before the present conclusions can be reached and a second clarification that I feel will greatly improve the interpretability of the results.
  
  First, as is often the case when working with patients, the authors have no control over the recording sites. This led to some asymmetries in both the number of electrodes in each hemisphere (as the authors note in the Discussion) and (more importantly) in the location of the recording electrodes. Recording site within a hemisphere must be controlled for before any comparisons between the hemispheres can be made. For example, the authors note that 'the contralateral bias becomes weaker the further the electrodes are from putative motor cortex'. If there happen to be more electrodes placed further from M1 in the left hemisphere (as Supplementary Figure 1 seems to suggest), then we cannot know whether the results of Figures 2 and 3 are due to the left hemisphere having stronger bilateral encoding or simply more electrodes placed further from M1.
  
  The reviewer makes a very valid point and this comment has led to our inclusion of a new Supplementary Figure, S2, in which we quantify the percentage of electrodes in each subregion.
  
  Second, it would be useful if the authors provided a bit of clarification about what type of kinematic information the linear model is using to predict HFA. I believe the paragraph titled 'Target modulation and tuning similarity across arms' suggests that there is very little across-target variance in the HFA signal. Does this imply that the model is primarily ignoring the Phi and Theta (as well as their lagged counterparts) and is instead relying on the position and speed terms? How likely is it that the majority of the HFA activity around movement onset reflects a condition-invariant 'trigger signal' (Kaufman, et al., 2016). This trigger signal accounts for the largest portion of neural variance around movement onset (by far), and the weight of individual neurons in trigger signal dimensions tend to be positive, which means that this signal will be strongly reflected in population activity (as measured by ECoG). This interpretation does not detract from the present results in any way, but it may serve to clarify them.
  
  To address this comment, we have added a new figure (Fig 6) which shows the relative contribution of each kinematic feature as well as their average weights across time for both contralateral and ipsilateral movements. This figure also addresses the reviewer’s question about the contribution of the target position to the model. As can be seen, features that reflect timing/movement initiation (position, speed) make a larger contribution compared to the two features which capture directional tuning (theta, phi). As the reviewer suggested, this result is in line with Kaufman et al. (2016), which reported that a condition-invariant ‘trigger signal’ comprises the largest component of neural activity. We note that the target-dependent features theta and phi still make a substantial contribution to the model (relative contribution: contra = 32%, ipsi = 37%). Previously, we have tested the contribution of the theta and phi features by comparing two models, one that only used position and speed (Movement Model) and one that also included the two angular components phi and theta (Target Model). For a subset of electrodes, the held-out predictions were significantly better using the Target Model, a result we take as further evidence of electrode tuning within our dataset.
  
  The figure below shows an electrode located in M1 that is tuned to targets when the patient reached with their contralateral arm as an example. We believe that having an explicit depiction of how the four features contribute to the HFA predictions will help the reader evaluate the model. These points are now addressed in the text in the results section discussing Figure 6.
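
  The model comparison described in this response rests on held-out R², which can be computed as one minus the ratio of residual to total sum of squares. The sketch below shows that calculation for two sets of predictions on the same held-out data. All numbers are hypothetical, the variable names are introduced here for illustration, and taking the mean of the held-out values as the baseline is an assumption about convention rather than a detail confirmed by the source.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out HFA values and predictions from two models.
held_out_hfa = [1.0, 2.0, 3.0, 4.0]
movement_pred = [2.0, 2.0, 3.0, 3.0]  # position and speed features only
target_pred = [1.0, 2.0, 3.0, 3.0]    # position, speed, theta and phi

r2_movement = r_squared(held_out_hfa, movement_pred)
r2_target = r_squared(held_out_hfa, target_pred)
print(r2_movement, r2_target, r2_target - r2_movement)
```

  A positive difference for a single electrode is only descriptive. Deciding whether the richer model is significantly better would require a test across folds or electrodes, which this sketch does not attempt.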
  
  scietyType:AuthorResponse
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.05.01.442295v1
www.biorxiv.org www.biorxiv.org

Macrophages regulate gastrointestinal motility through complement component 1q

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  This manuscript by Pendse et al. aimed to identify the role of the complement component C1q in intestinal homeostasis, expecting to find a role in mucosal immunity. Instead, however, they discovered an unexpected role for C1qa in regulating gut motility. First, using RNA-Seq and qPCR of cell populations isolated either by mechanical separation or flow cytometry, the authors found that the genes encoding the subunits of C1q are expressed predominantly in a sub-epithelial population of cells in the gut that are CD11b+MHCII+F4/80high, presumably macrophages. They support this conclusion by analyzing mice in which intestinal macrophages are depleted with anti-CSF1R antibody treatment and show substantial loss of C1qa, b and c transcripts. Then, they generate Lyz2Cre-C1qaflx/flx mice to genetically deplete C1qa in macrophages and assess the consequences on the fecal microbiome, transcript levels of cytokines, macromolecular permeability of the epithelial barrier, and immune cell populations, finding no major effects. Furthermore, provoking intestinal injury with chemical colitis or infection (Citrobacter) did not reveal macrophage C1qa-dependent changes in body weight or pathogen burden.
  
  Then, they analyzed C1q expression by IHC of cross-sections of small and large intestine and found that C1q immunoreactivity is detectable adjacent to, but not colocalizing with, TUBB3+ nerve fibers and CD169+ cells in the submucosa. Interestingly, they find little C1q immunoreactivity in the muscularis externa. Nevertheless, they perform RNA-sequencing of LMMP preparations (longitudinal muscle with adherent myenteric plexus) and find a number of changes in gene ontology pathways associated with neuronal function. Finally, they perform GI motility testing on the conditional knockout mice and find that they have accelerated GI transit times manifesting with subtle changes in small intestinal transit and more profound changes in measures of colonic motility.
  
  Overall, the manuscript is very well-written and the observation that macrophages are the major source of C1q in the intestine is well supported by the data, derived from multiple approaches. The observations on C1q localization in tissue and the strength of the conclusions that can be drawn from their conditional genetic model of C1qa depletion, however, would benefit from more rigorous validation.
  
  1) Interpretation of the majority of the findings in the paper rest on the specificity of the Lyz2 Cre for macrophages. While the specificity of this Cre to macrophages and some dendritic cells has been characterized in the literature in circulating immune cells, it is not clear if this has been characterized at the tissue level in the gut. Evidence demonstrating the selectivity of Cre activity in the gut would strengthen the conclusions that can be drawn.
  
  As indicated by the reviewer, Cre expression driven by the Lyz2 promoter is restricted to macrophages and some myeloid cells in the circulation (Clausen et al., 1999). To better understand intestinal Lyz2 expression at a cellular level, we analyzed Lyz2 transcripts from a published single cell RNAseq analysis of intestinal cells (Xu et al., 2019; see Figure below). These data show that intestinal Lyz2 is also predominantly expressed in gut macrophages with limited expression in dendritic cells and neutrophils.
  
  Figure. Lyz2 expression from single cell RNAseq analysis of mouse intestinal cells. Data are from Xu et al., Immunity 51, 696-708 (2019). Analysis was done through the Single Cell Portal, a repository of scRNAseq data at the Broad Institute.
  
  Additionally, our study shows that intestinal C1q expression is restricted to macrophages (CD11b+MHCII+F4/80hi) and is absent from other gut myeloid cell lineages (Figure 1E-H). This conclusion is supported by our finding that macrophage depletion via anti-CSF1R treatment also depletes most intestinal C1q (Figure 2A-C). Importantly, we found that the C1qaΔMφ mice retain C1q expression in the central nervous system (Figure 2 – figure supplement 1). Thus, the C1qaΔMφ mice allow us to assess the function of macrophage C1q in the gut and uncouple the functions of macrophage C1q from those of C1q in the central nervous system.
  
  2) Infectious and inflammatory colitis models were used to suggest that C1qa depletion in Lyz2+ lineage cells does not alter gut mucosal inflammation or immune response. However, the phenotyping of the mice in these models was somewhat cursory. For example, in DSS only body weight was shown without other typical and informative read-outs including colon length, histological changes, and disease activity scoring. Similarly, in Citrobacter only fecal cfu were measured. Especially if GI motility is accelerated in the KO mice, pathogen burden may not reflect efficiency of immune-mediated clearance alone.
  
  We have added additional results which support our conclusion that C1qaΔMφ mice do not show a heightened sensitivity to acute chemically induced colitis. In Figure 3 – figure supplement 1 we now show a histological analysis of the small intestines of DSS-treated C1qafl/fl and C1qaΔMφ mice. This analysis shows that C1qaΔMφ mice have similar histopathology, colon lengths, and histopathology scores following DSS treatment. Likewise, our revised manuscript includes histological images of the colons of Citrobacter rodentium-infected C1qafl/fl and C1qaΔMφ mice showing similar pathology (Figure 3 – figure supplement 2).
  
  3) The evidence for C1q expression being restricted to nerve-associated macrophages in the submucosal plexus was insufficient. Localization was shown at low magnification on merged single-planar images taken from cross-sections. The data shown in Figure 4C is not of sufficient resolution to support the claims made - C1q immunoreactivity, for example, is very difficult to even see. Furthermore, nerve fibers closely approximate virtually every type of macrophage in the gut, from those in the lamina propria to those in the muscularis…. Finally, the resolution is too low to rule out C1q immunoreactivity in the muscularis externa.
  
  Similar points were raised by Reviewer 2. Our original manuscript claimed that C1q-expressing macrophages were mostly located near enteric neurons in the submucosal plexus but were largely absent from the myenteric plexus. However, as both Reviewers have pointed out, this conclusion was based solely on our immunofluorescence analysis of tissue cross-sections.
  
  To address this concern, we further characterized C1q+ macrophage localization by performing a flow cytometry analysis on macrophages isolated from the mucosa (encompassing both the lamina propria and submucosa) and the muscularis, finding similar levels of C1q expression in macrophages from both tissues (Figure 4 – figure supplement 1 in the revised manuscript). Although the mucosal macrophage fraction encompasses both lamina propria and submucosal macrophages, our immunofluorescence analysis (Figure 4 B and C) suggests that the mucosal C1q-expressing macrophages are mostly from the submucosal plexus. This observation is consistent with the immunofluorescence studies of CD169+ macrophages shown in Asano et al., which suggest that most CD169+ macrophages are located in or near the submucosal region, with fewer near the villus tips (Fig. 1e, Nat. Commun. 6, 7802).
  
  Most importantly, our flow cytometry analysis indicates that the muscularis/myenteric plexus harbors C1q-expressing macrophages. To further characterize C1q expression in the muscularis, we performed RNAscope analysis by confocal microscopy of the myenteric plexus from mouse small intestine and colon (Figure 4D). The results show numerous C1q-expressing macrophages positioned close to myenteric plexus neurons, thus supporting the flow cytometry analysis. We note that although the majority of C1q immunofluorescence in our tissue cross-sections was observed in the submucosal plexus, we did observe some C1q expression in the muscularis by immunofluorescence (Figure 4B and C). We have rewritten the Results section to take these new findings into account.
  
  Is the 5 µm average in the proximity analysis any different for other macrophage populations, to support the idea of a special relationship between C1q-expressing macrophages and neurons?
  
  We agree that the proximity analysis lacks context and have therefore removed it from the figure. The other data in the figure better support the idea that C1q+ macrophages are found predominantly in the submucosal and myenteric plexuses and that they are closely associated with neurons at these tissue sites.
  
  There are many vessels in the submucosa and many associated perivascular nerve fibers - could the proximity simply reflect that both cell types are near vessels containing C1q in circulation?
  
  Our revised manuscript includes RNAscope analysis showing C1q transcript expression by macrophages that are closely associated with enteric neurons (Figure 4D). These findings support the idea that the C1q close to enteric neurons is derived from macrophages rather than from the circulation.
  
  4) A major disconnect was between the observation that C1q expression is in the submucosa and the performance of RNA-seq studies on LMMP preparations. This makes it challenging to draw conclusions from the RNA-Seq data, and makes it particularly important to clarify the specificity of Lyz2-Cre activity.
  
  Our revised manuscript provides flow cytometry data (Figure 4 – figure supplement 1) and RNAscope analysis (Figure 4D) showing that C1q is expressed in macrophages localized to the myenteric plexus. This accords with the results of our RNAseq analysis, which indicates altered LMMP neuronal function in C1qa∆Mφ mice (Figure 6A and B). Since neurons in the myenteric plexus are known to govern gut motility, it also helps to explain our finding that gut motility is accelerated in C1qa∆Mφ mice.
  
  Finally, the pathways identified could reflect a loss of neurons or nerve fibers. No assessment of ENS health in terms of neuronal number or nerve fiber density is provided in either plexus.
  
  Reviewers 1 and 2 also raised this point. Our revised manuscript includes a comparison of the numbers of enteric neurons in C1qafl/fl and C1qaΔMφ mice. There were no marked differences in neuron numbers in C1qaΔMφ mice when compared to C1qafl/fl controls (Figure 5A and B). There were also similar numbers of inhibitory (nitrergic) and excitatory (cholinergic) neuronal subsets and a similar enteric glial network (Figure 5C-E). Thus, our data suggest that the altered gut motility in the C1qaΔMφ mice arises from altered neuronal function rather than from an overt loss of neurons or nerve fibers. This conclusion is further supported by increased neurogenic activity of peristalsis (Figure 6H and I), and the expression of the C1q receptor BAI1 on enteric neurons (Figure 6 – figure supplement 4).
  
  5) To my knowledge, there is limited evidence that the submucosal plexus has an effect on GI motility. A recent publication suggests that even when mice lack 90% of their submucosal neurons, they are well-appearing without overt deficits (PMID: 29666241). Submucosal neurons, however, are well known to be involved in the secretomotor reflex and fluid flux across the epithelium. Assessment of these ENS functions in the knockout mice would be important and valuable.
  
  Our revised manuscript provides new data showing C1q expression by muscularis macrophages in the myenteric plexus. We analyzed muscularis macrophages by flow cytometry and found that they express C1q (Figure 4 – figure supplement 1). These findings are further supported by RNAscope analysis of C1q expression in wholemounts of LMMP from small intestine and colon (Figure 4D and E). These results are thus consistent with the increased CMMC activity and accelerated gut motility in the C1qaΔMφ mice. As suggested by the reviewer, our finding of C1q+ macrophages in the submucosal plexus indicates that C1q may also have a role controlling the function of submucosal plexus neurons. We are further exploring this idea through extensive additional experimentation. Given the expanded scope of these studies, we are planning to include them in a follow-up manuscript.
  
  6) Immune function and GI motility can be highly sex-dependent - in all experiments mice of both sexes were reportedly used but it is not clear if sex effects were assessed.
  
  This is a great point, and as suggested by the reviewer we did indeed encounter differences between male and female mice in our preliminary assays of gut motility. We therefore conducted our quantitative comparisons of gut motility between C1qafl/fl and C1qaΔMφ mice in male mice and now clearly indicate this point in the Materials and Methods.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.01.27.478097v1
www.biorxiv.org www.biorxiv.org

Inhibition is the hallmark of CA3 intracellular dynamics around awake ripples

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This is a very interesting paper describing membrane potential dynamics of hippocampal principal cells during UP/DOWN transitions and sharp-wave ripples. Using whole-cell recordings in combination with linear LFP recordings in head-fixed awake mice, the authors show striking differences of membrane potential responses in principal cells from the dentate gyrus, CA3 and CA1 sectors. The authors propose that switches between a dominant inhibitory excitable state and a disinhibited non-excitable state control the intra-hippocampal dynamics during UP/DOWN transitions.
  
  Obtaining intracellular recordings in vivo is commendable. The authors provide valuable data and analysis. While data show clear trends and some of the conclusions are well supported, the authors may need to clarify the following potential confounds, which can actually impact their conclusions and interpretation:
  
  1- All the analysis is based on z-scored membrane potential responses but the mean resting membrane potential is never reported. For DG granule cells recorded in awake conditions, the membrane potential is usually hyperpolarized so that most of the effect may be due to reversed GABAa-mediated currents. Similarly, for those cells exhibiting the unexpected polarization during UP/DOWN states there may be drifts around reversal potentials explaining their behavior. Moreover, regional trends in passive and active membrane parameters and connectivity can actually explain part of the variability. A longitudinal comparison of state Vm and spikes in fig.5 suggests that some of the largest depolarized responses are not correlated with firing. Authors should evaluate this angle, ideally showing the distribution of membrane potential values across cells and regions and confronting this with the different membrane potential responses.
  
  We added Figure 1 - figure supplement 4, which now describes the mean resting membrane potential, input resistance, burst propensity, and spikes per burst for the recorded cells. These data are provided in Figure 1 - source data 1 together with a recording identifier that can be used to link each cell to all other figure panels and data files. We further added Figure 1 - figure supplement 1, which provides examples of morphological information for our recordings, Figure 1 - figure supplement 2 that shows examples of bursts from morphologically identified neurons, and Figure 1 - figure supplement 3 that shows the locations of recorded cells.
  
  In addition, we added Figure 5 - figure supplement 4 that includes the resting Vm and proximodistal location of cells in relation to their UP-DOWN modulation. We did not detect any significant trends with respect to brain state modulation. DG cells are more hyperpolarized compared to CA3 and CA1 cells and are closest to the reversal potential for GABAa (Figure 1 - figure supplement 4). The lack of any clear trends with respect to the resting Vm suggests that drifts around the GABAa reversal potential are unlikely to be a major factor driving variability in the observed UDS modulation.
  
  2- While there are some trends for each hippocampal region, there is also individual variability across cells during UP/DOWN transitions (fig.5) and near ripples (fig.6). What part of this variability can be explained by proximodistal and/or deep-superficial differences of cell location and identity? Can authors provide some morphological validation, even if in only a subset of cells? For CA3, proximodistal heterogeneity for intrinsic properties and entorhinal input responses is well documented in intracellular recordings both in vitro and in vivo. What is the location of the CA3 cells contributing to this study? For CA1 cells, deep-superficial trends of GABAergic perisomatic inhibition and connectivity with input pathways dominate firing responses. Regarding DG cells, are they all from the upper blade?
  
  We now provide morphological validation for a subset of cells (Figure 1 - figure supplement 1). Since we patch multiple cells in each experiment it is not possible to unequivocally determine their depth within the cell layer, although it is possible to confirm that they are granule cells or pyramidal cells in experiments where all labeled cells are principal neurons (Figure 1 - figure supplement 1). In addition, we added Figure 1 - figure supplement 3 that shows the proximodistal locations of recorded cells. With respect to the DG cells 20/22 are from the upper blade, with only two granule cells recorded in the lower blade (Figure 1 - figure supplement 3).
  
  We added Figure 5 - figure supplement 4 that includes the resting Vm and proximodistal location of each cell as a function of UP-DOWN modulation. We did not detect any significant trends with respect to UDS modulation.
  
  In addition, we added Figure 6 - figure supplement 1 that includes the resting Vm and proximodistal location of each cell as a function of ripple modulation. This figure shows that the most depolarized CA3 cells tend to hyperpolarize most during ripples, consistent with the fact that these cells are furthest away from the GABAa reversal potential and experience the highest driving force. No other significant trends were detected, although we would like to note that our recordings do not span the full proximodistal axis and may hence not be ideally suited to test the dependence of our results on proximodistal location.
  
  3- AC-coupled LFP recordings cannot provide unambiguous identification of the sign of phasic CSD signals, because fluctuations accompanying UP/DOWN states alter the baseline reference. This is actually the case, given changes of membrane potential accompanying UP/DOWN transitions. I recommend reading Brankack et al. 1993 doi: 10.1016/0006-8993(93)90043-m. The authors should acknowledge this limitation and discuss how it could influence their results. One potential solution to get rid of this effect is using principal/independent component analysis for blind source separation.
  
  We acknowledge the inherent limitations of AC-coupled recordings with regard to CSD analysis (Brankack et al., 1993). However, we do not believe these limitations affect our analysis or results for the reasons illustrated in Figure R1. Specifically, we do not attempt to measure the low frequency (< 1 Hz) CSD content directly. Instead, we extract the envelope of the rectified fast CSD transients. In the original submission we referred to this envelope signal as “DG CSD magnitude”, which may have been confusing. In the revised manuscript we use “DG CSD activity” instead to remove any suggestion that the low frequency CSD signal was directly measured. Notice that because of the rectification step the envelope signal is insensitive to the actual polarity of the fast transient CSD fluctuations. Using the envelope, we identify UP states as time periods when the rate and amplitude of EC input current transients, rather than the DC level, increase, in accordance with previous publications (Isomura et al., 2006). We further validated that the extracted UP/DOWN states reflect modulation of pupil diameter and ripple rate, quantities that are independently measured.
  
  Figure R1. Deriving slow envelope signal from AC coupled recordings. (A) In this example the true CSD signal contains both a slow component (8 Hz) and a fast component (80 Hz) that is amplitude modulated by the slow component. Such phase-amplitude coupling is well known between theta and gamma oscillations in the hippocampus. The true CSD shows a current sink with time-varying magnitude. (B) The power spectral density (PSD) estimate of the signal in (A) shows both the slow (8 Hz) and fast (three peaks near 80 Hz) components. (C) Assume LFP recordings are obtained with a high-pass filter that has eliminated the slow component. Consequently, the estimated CSD signal contains only fast fluctuations. Furthermore, instead of a time-varying current sink it shows quickly alternating sinks and sources (both negative and positive values). The slow component can be visualized as the amplitude envelope (interrupted red line) of the signal. (D) PSD estimate shows that the slow component is absent from the extracted CSD signal. (E) Rectifying the CSD estimate (black) and then filtering (red) approximately recovers the true slow component (red interrupted). This is how the DG CSD activity signal is obtained. (F) PSD estimate of the rectified and filtered CSD signal recovers the slow component (interrupted red vertical line).
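
  The rectify-then-filter step described in panels (C) to (E) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the 1 kHz sampling rate, the 25 ms boxcar filter, the modulation depth, and all function names are hypothetical choices; only the 8 Hz and 80 Hz components come from the caption.

```python
import math

def moving_average(x, width):
    """Centered boxcar low-pass filter; the window shrinks at the edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

FS = 1000                            # sampling rate in Hz (assumed)
t = [i / FS for i in range(2 * FS)]  # 2 s of samples

# Slow 8 Hz component, kept positive so it can act as an amplitude envelope.
slow = [1.0 + 0.5 * math.sin(2 * math.pi * 8 * ti) for ti in t]
# What an AC-coupled recording retains: 80 Hz transients modulated by `slow`.
ac_csd = [s * math.sin(2 * math.pi * 80 * ti) for s, ti in zip(slow, t)]

def envelope(signal, width=25):
    """Rectify, then low-pass with a 25-sample (25 ms at 1 kHz) boxcar."""
    return moving_average([abs(v) for v in signal], width)

env = envelope(ac_csd)
flipped_env = envelope([-v for v in ac_csd])  # same transients, inverted polarity

r = pearson(env, slow)
print(f"correlation between recovered envelope and slow component: {r:.3f}")
print("envelope unchanged by polarity flip:", env == flipped_env)
```

  The polarity check mirrors the point made in the response: because the transients are rectified before filtering, inverting the sign of the fast CSD fluctuations leaves the envelope unchanged.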
  
  Reviewer #2 (Public Review):
  
  In this manuscript "Inhibition is the hallmark of CA3 intracellular dynamics around awake ripples" the authors obtained Vm recordings from CA1, CA3 and DG neurons while also obtaining local field potentials across the CA1 and DG layers. This enabled them to identify periods of up and down state transitions, and to detect sharp-wave ripples (SWRs). Using these data, they then came to the conclusion that compared to CA1 and DG, the Vm of more CA3 neurons is hyperpolarized at the approximate time of SWRs.
  
  Unfortunately, for the following reasons, the current manuscript does not necessarily support this conclusion:
  
  Recordings are obtained in mice who are recently (same day) recovering from craniotomy surgery/anesthesia and have no training on head fixation. This means that the behavioral state is abnormal, and the animal may have residual anesthesia effects.
  
  The main surgery for implanting the head-fixation apparatus and marking the coordinates for multisite and pipette insertion was carried out at least two days before the experiment. On the day of the experiment animals were briefly lightly anesthetized (<1 hr, at <1% isoflurane at 1 L/min) for the sole purpose of resecting the dura at the two sites for multisite probe and pipette insertion. This procedure was carried out on the same day as the experiment in order to minimize the time the brain was exposed and optimize the quality of the recordings. Experiments began at least six hours after this short procedure. Furthermore, animals were given time to become familiar with the behavioral apparatus before recordings began and showed no signs of distress.
  
  Previous studies show that about 95% of isoflurane is eliminated within minutes by exhalation (Holaday et al., 1975). The further elimination of isoflurane proceeds with a fast phase with half-time of about 7-9 min and a slower phase with half-time of about 100-115 min (Chen et al., 1992), with the faster phase reflecting elimination from the brain (Litt et al., 1991). Given these considerations there should be negligible residual isoflurane from the short anesthesia six hours later when recordings are initiated.
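
  The arithmetic behind this argument can be made explicit. The sketch below is a rough upper bound, not a pharmacokinetic model: it assumes, pessimistically, that the whole 5% remaining after exhalation decays with the slow-phase half-time, since the split between the fast and slow phases is not given here. The function and variable names are illustrative.

```python
def remaining_fraction(elapsed_min, half_time_min):
    """Fraction left after first-order decay with the given half-time."""
    return 0.5 ** (elapsed_min / half_time_min)

ELAPSED_MIN = 6 * 60          # recordings start at least six hours later
LEFT_AFTER_EXHALATION = 0.05  # about 95% is exhaled within minutes

# Pessimistic assumption: the whole remainder follows the slow phase (100-115 min).
slow_bounds = {
    half_time: LEFT_AFTER_EXHALATION * remaining_fraction(ELAPSED_MIN, half_time)
    for half_time in (100, 115)
}
worst_case = max(slow_bounds.values())

# The fast phase (7-9 min), attributed to elimination from the brain, is gone by then.
fast_worst_case = remaining_fraction(ELAPSED_MIN, 9)

print(f"upper bound on residual fraction: {worst_case:.4f}")
print(f"fast-phase fraction remaining:    {fast_worst_case:.1e}")
```

  Under these assumptions the bound stays below 1% of the administered dose, in line with the claim of negligible residual isoflurane.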
  
  In order to further investigate whether the short and light anesthesia during the day of recordings has any effect on the results reported in the paper, we carried out additional experiments in which we performed the surgery, including dura removal, 3 days before the recording session. The animals were habituated under head-fixation on the spherical treadmill for two-hour periods on each of the two days following the surgery. On the third day after surgery, we carried out recordings without any surgical procedures or anesthesia. The durations of UP and DOWN states without same day anesthesia were similar to those obtained in our previous experiments (Figure 2 - figure supplement 4). The additional CA3 whole-cell recordings obtained in these new experiments have the same hyperpolarization features typical of our previous recordings. These additional experiments argue that the brief anesthesia on the day of recordings has no significant effect on the results.
  
  Most of the paper is dedicated to dynamics around up-down state transitions, not focused on ripples.
  
  We changed the title to “Up-Down states and ripples differentially modulate membrane potential dynamics across DG, CA3, and CA1 in awake mice” to reflect the analysis of both UP-DOWN state transitions and ripples. The two analyses are linked as the brain state modulation accounts for the slow Vm modulation around ripples.
  
  Vm should be examined raw first, then split into fast and slow -the cell lives with the raw Vm.
  
  The raw Vm can be obtained by adding the slow and fast Vm components. Hence the behavior of the Vm around ripples can be obtained by adding the panels of columns 1 and 3 in Figure 6. Decomposing into the slow and fast components illustrates how the slow modulation around ripples is due to brain state modulation of the slow component of the Vm (Figure 6).
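
  The additivity invoked here holds for any complementary decomposition in which the fast component is defined as the raw trace minus its low-passed version. A minimal sketch, assuming a moving-average low-pass as a stand-in for the filter actually used (which is not specified in this response) and a synthetic Vm trace with made-up amplitudes:

```python
import math

def moving_average(x, width):
    """Centered boxcar low-pass filter; the window shrinks at the edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

FS = 1000                            # sampling rate in Hz (assumed)
t = [i / FS for i in range(4 * FS)]  # 4 s of samples

# Synthetic Vm in mV: resting level + slow 0.5 Hz drift + fast 40 Hz fluctuation.
raw_vm = [
    -60.0
    + 3.0 * math.sin(2 * math.pi * 0.5 * ti)
    + 0.5 * math.sin(2 * math.pi * 40 * ti)
    for ti in t
]

slow_vm = moving_average(raw_vm, 201)               # low-passed component
fast_vm = [r - s for r, s in zip(raw_vm, slow_vm)]  # complementary residual

# Adding the two components back together recovers the raw trace.
reconstructed = [s + f for s, f in zip(slow_vm, fast_vm)]
max_error = max(abs(r - x) for r, x in zip(raw_vm, reconstructed))
print(f"largest reconstruction error: {max_error:.2e} mV")
```

  If the slow and fast components were instead obtained with two independent band-pass filters, the sum would only approximate the raw trace, so the exact identity depends on the decomposition being complementary.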
  
  While some (assumed) CA3 principal cells were hyperpolarized around the time of ripples, saying inhibition is the hallmark of CA3 dynamics around ripples is an exaggeration, especially because it does not seem mechanistically tied to anything else.
  
  While a small fraction of CA3 cells is excited around ripples, the majority is inhibited. We suggest that the inhibition of the majority of CA3 neurons can account for the sparse and selective activation of CA3 around ripples.
  
  The use of ripple onset time is questionable, since the detected onset of the ripple depends on the detector settings, amplifier signal-to-noise ratio, etc. The best and most widely used (including by a subset of these authors) metric is the ripple peak time.
  
  We added Figure 6 - figure supplement 2, which shows that the Vm modulation around peak ripple power is the same as the modulation around ripple start, except for a small time shift due to the fact that the ripple power peaks shortly after ripple start. Our focus on ripple onset facilitates characterizing the timing of pre-ripple activity, such as the Vm depolarization observed before ripple onset for DG and CA1 neurons.
  
  There is not enough raw data (or quality metrics) shown to judge the quality of the data, especially for the whole cell recordings. For instance what was the input resistance of the neurons? Was the access resistance constant?
  
  We added Figure 1 - figure supplement 4, which now describes the mean resting membrane potential, input resistance, burst propensity, and spikes per burst for the recorded cells. These data are provided in Figure 1 - source data 1 together with a recording identifier that can be used to link each cell to all other figure panels and data files. We further added Figure 1 - figure supplement 1, which provides examples of morphological information for our recordings, Figure 1 - figure supplement 2 that shows examples of bursts from morphologically identified neurons, and Figure 1 - figure supplement 3 that shows the locations of recorded cells.
  
  There is not enough explanation regarding why the reported results on the spiking of CA1 and CA3 neurons in SWRs is so different than previously published. In general, whole cell recording is not the most reliable way to record spike timing, and the presented whole cell data differ from previously published juxtacellular and extracellular recording methods, which better preserve physiological spiking activity.
  
  The CA1 neurons in this study depolarize and elevate their firing around ripples, consistent with previous intracellular and extracellular recordings. Our study reveals hyperpolarization of the majority of CA3 cells while only a small fraction is depolarized. This is consistent with the sparse activation of CA3 around ripples previously reported with extracellular studies. The overall firing rate change of CA3 neurons around ripples is a balance between the firing rate elevation of the small subset of activated cells and the net decrease in firing across the rest of the population. Since the baseline firing rate of CA3 pyramidal neurons in quiet wakefulness and sleep is low, the ripple-associated inhibition may not be readily observable in the spiking of individual CA3 neurons due to a “floor effect”. The overall rate of CA3 neurons we record increases before ripple onset, consistent with previous studies (Fig. 6D4). The subthreshold hyperpolarization of the majority of neurons provides novel insights into the mechanisms ensuring sparse and selective activation of the CA3 population around ripples.
  
  The number of neurons from each area is not reported.
  
  The number of cells was (indirectly) reported as the number of rows in Figs. 3-7. We now report the number of cells explicitly: 22 DG cells, 32 CA3 cells, and 32 CA1 cells.
  
  There is no verification of cell type so it is inappropriate to assume that all neurons are the principal neurons.
  
  We added Figure 1 - figure supplement 1, which shows morphological identification of recorded cells. We patch multiple cells in each experiment, but we can confirm the morphological identity of principal neurons when all stained cells have morphology of dentate granule cells or CA3/CA1 pyramidal neurons. The properties of morphologically identified cells in Figure 1 - figure supplement 1 are typical of all recorded cells (morphologically identified neurons from Figure 1 - figure supplement 1 are shown as diamonds in Figure 1 - figure supplement 4, while the rest are shown as dots). There were no significant differences between the two groups (p > 0.05 t-test; p > 0.05 Wilcoxon rank sum test).
  
  Are the fluctuations in the CA3 Vm generally smaller than for CA1 and DG because of physiology or technical reasons?
  
  The recordings were done in exactly the same way across areas, arguing against technical reasons for any differences observed across the hippocampal subfields.
  
  Reviewer #3 (Public Review):
  
  During slow wave sleep and quiet immobility, communication between the hippocampus and the neocortex is thought to be important for memory formation notably during periods of hippocampal synchronous activity called sharp-wave ripple events. The cellular mechanisms of sharp-wave ripple initiation in the hippocampus are still largely unknown, notably during awake immobility. In this paper, the authors addressed this question using patch-clamp recordings of principal cells in different hippocampal subfields (CA3, CA1 and the dentate gyrus) combined with extracellular recordings in awake head-fixed mice as well as computer modeling. Using the current source density (CSD) profile of local field potential (LFP) recordings in the molecular layer of the dentate gyrus as a proxy of UP/DOWN state activity in the entorhinal cortex they report the preferential occurrence of sharp-wave ripple (recorded in area CA1) during UP states with a higher probability toward the end of the UP state (unlike eye blinks which preferentially occur during DOWN states). Patch-clamp recordings reveal that a majority of dentate granule cells get depolarized during UP state while a majority of CA3 pyramidal cells get hyperpolarized and CA1 pyramidal cells show a more mixed behavior. Closer examination of Vm behavior around state transitions revealed that CA3 pyramidal cells are depolarized and spike at the DOWN/UP transition (with some cells depolarizing even earlier) and then progressively hyperpolarize during the course of the UP state while DGCs and CA1 pyramidal cells tend to depolarize and fire throughout the UP state. Interestingly, CA3 pyramidal cells also tend to be hyperpolarized during ripples (except for a minority of cells that get depolarized and could be instrumental in ripple generation), while DGCs and CA1 pyramidal cells tend to be depolarized and fire. The strong activation of dentate granule cells during ripples is particularly interesting and deserves further investigations. 
  The observation that the probability of ripple occurrence increases toward the end of the UP state, when CA3 pyramidal cells are maximally hyperpolarized, suggests that the inhibitory state of the CA3 hippocampal network could be permissive for ripple generation possibly by de-inactivation of voltage-gated channels thus increasing their excitability (i.e. ability to get excited). Altogether, these results confirm previous work on the impact of slow oscillations on the membrane potential of hippocampal neurons in vivo under anesthesia but also point to specificities possibly linked to the awake state. They also invite revisiting previous models derived from in vitro recordings attributing synchronous activity in CA3 to a global build-up of excitatory activity in the network by suggesting a role for Vm hyperpolarization in preserving the excitability of the CA3 network.
  
  1) In light of recent reports of heterogeneity within hippocampal cell types (and notably the description of a new CA3 pyramidal cell type instrumental for sharp-wave ripple generation) (Hunt et al., 2018), the small minority of CA3 pyramidal cells depolarized during ripples deserves more attention. These cells are indeed likely key in the generation of sharp-wave ripples. Several analyses could be performed in order to decipher whether they have specific intrinsic properties (baseline Vm, firing threshold, burst propensity), whether they are located in specific sub-areas of CA3 (a versus b, deep versus superficial) and whether they are distinctively modulated during UP/DOWN states.
  
  Following the reviewer’s suggestion we now analyze the properties and UDS modulation of the CA3 neurons that are depolarized around ripples (Figure 6 - figure supplement 3). These neurons have resting Vm, spike thresholds, and burst propensity comparable to those of the rest of the CA3 population (p > 0.05, t-test). These CA3 cells had lower firing probability in the DOWN state. The locations of the depolarized cells are distributed across CA3c,b and are not clustered compared to the rest of the cells (Figure R2).
  
  Figure R2. Proximodistal locations of CA3 cells that depolarize during ripples. Same as Figure 1 - figure supplement 3, but CA3 cells showing depolarization in their ripple-triggered average (RTA) response are marked with black dots. There was no significant difference in the proximodistal locations of these cells compared to the rest of the CA3 population (p > 0.05, t-test).
  
  The population of athorny cells described in Hunt et al. represents a small percentage of CA3 cells (10-20%) that are concentrated in the CA3a region, which we do not sample in our recordings. Hence, the depolarized cells are unlikely to correspond to the athorny cells reported in Hunt et al.
  
  2) The authors use CSD analysis in the DG as a proxy of synaptic inputs coming from the EC to define alternating periods of UP and DOWN states. I have a few questions concerning this procedure: 1- It is unclear if only periods when the animal was still/immobile were analyzed. 2- How coherent were these periods with slow oscillations recorded in the cortex (which are also recorded with the linear probe)?
  
  The analysis was restricted to periods of immobility, which comprise the majority of the recording time as the animals are not performing any task. Cortical LFPs exhibit high coherence for low frequencies (<1 Hz) with the rectified DG CSD signal (Figure R3), although the contribution of volume conduction to this effect cannot be ruled out.
  
  Figure R3. Coherence between DG CSD power and cortical LFP. (Top) population average magnitude squared coherence between DG CSD power (rectified CSD from the DG molecular layer) and cortical LFP across all recorded datasets. Notice the elevated coherence at low frequencies (< 1 Hz, vertical interrupted line) as well as the peak at theta (7-8 Hz). Volume conduction from other brain areas (i.e. the hippocampus) contributes to the cortical LFP and may be responsible for the coherence at theta, as well as at low frequencies. (Bottom) Each row in the pseudocolor image shows the coherence between DG CSD power and cortical LFP for a given dataset.
  
  3- How long did these periods last? Did they occur during classically described hippocampal states (LIA/SIA) or do they correspond to a different state (Wolansky et al., J Neurosci 2006).
  
  The distribution of UP and DOWN state durations is shown in Figure 2 - figure supplement 4.
  
  We also added Figure 2 - figure supplement 8 that shows the distribution of LIA and SIA transitions as a function of UDS phase. The LIA and SIA states were computed based on LFPs from CA1 stratum radiatum as described in (Hulse et al., 2017). The detected LIA→SIA transitions map very closely to UP→DOWN transitions. The SIA→LIA transitions are also concentrated around DOWN→UP transitions, but the distribution is broader compared to the LIA→SIA transitions. These observations are consistent with UP states broadly overlapping with LIA and DOWN states with SIA.
  
  3) To better characterize hippocampal CSD profiles around ripples and UP/Down states transitions, could you plot ripple and UDS transition-triggered average CSD profiles across hippocampal subfields?
  
  We added Figure 2 - figure supplement 7 that shows average CSD profiles around UP/DOWN state transitions and ripples.
  
  4) The duration of UP states appears longer than that reported in anesthetized animals. To ascertain this fact could the authors quantify and report mean UP and DOWN state durations? Shorter DOWN states would decrease the probability of detecting ripples. Could the authors correct for this bias in their analysis of ripple occurrence during UP and DOWN states?
  
  We report the medians and means of the distributions of UP and DOWN durations in Figure 2 - figure supplement 4. Ripples occur almost exclusively during the UP states, with almost no ripples occurring in DOWN states. Furthermore, the duration of UP and DOWN states is comparable suggesting that the duration of DOWN states does not bias the probability of ripple detection. We also added Figure 2 - figure supplement 2B, showing the rate (in Hz) of ripple occurrence as a function of UDS phase, which explicitly controls for UDS phase occupancy.
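
  Controlling for phase occupancy amounts to dividing the ripple count in each UDS phase bin by the time spent in that bin. The sketch below illustrates the idea on toy data; the function name, the phase convention (radians), the bin count, and the sampling rate are assumptions rather than details of the analysis behind Figure 2 - figure supplement 2B.

```python
import math

def ripple_rate_by_phase(ripple_phases, sample_phases, fs, n_bins):
    """Ripple rate (Hz) per phase bin: ripple count / time spent in that bin."""
    width = 2 * math.pi / n_bins

    def bin_of(phase):
        # Wrap into [0, 2*pi) and guard against rounding onto the top edge.
        return min(int((phase % (2 * math.pi)) / width), n_bins - 1)

    counts = [0] * n_bins
    samples = [0] * n_bins
    for p in ripple_phases:
        counts[bin_of(p)] += 1
    for p in sample_phases:
        samples[bin_of(p)] += 1
    # Occupancy in seconds is the number of samples in the bin divided by fs.
    return [c / (n / fs) if n else float("nan") for c, n in zip(counts, samples)]

# Toy data: the first phase bin is occupied three times longer than the second.
fs = 10                                  # phase samples per second (assumed)
sample_phases = [0.5] * 30 + [4.0] * 10  # 3 s in bin 0, 1 s in bin 1
ripple_phases = [0.5] * 6 + [4.0] * 2    # 6 ripples in bin 0, 2 in bin 1

rates = ripple_rate_by_phase(ripple_phases, sample_phases, fs, n_bins=2)
print(rates)
```

  In this toy case the raw counts differ threefold between bins, yet the occupancy-normalized rates are equal, which is the kind of bias the reviewer asked the authors to rule out.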
  
  The duration of UP and DOWN states in quiet wakefulness depends on the behavior of the animal, attentional state, and external stimuli and need not be the same as in anesthesia or sleep when the animal is not behaving and is less responsive to external stimuli. To provide validation that the extracted UP and DOWN states in quiet wakefulness indeed correspond to genuine brain states, we show that the pupil diameter and ripple rates, which are independently extracted, are strongly modulated around the extracted UP and DOWN states.
  
  5) The authors report a high coherence between the Vm of an example CA3 pyramidal cell and UP/DOWN states in DG. Was it a general property of a majority of CA3 pyramidal cells? The coherence values should be reported for all CA3 pyramidal cells.
  
  We added Figure 2 - figure supplement 1, which reports the coherence of all cells across the subfields with the rectified DG CSD. The coherence values are similar across cells and subfields. We also report correlations between the slow component of the Vm and DG CSD activity for all cells in Figure 3. Neurons in CA3 exhibit negative correlations in contrast to DG and CA1, with the absolute values of the correlations similar across the subfields.
  
  6) Was the high coherence between DG CSD magnitude and CA3 Vm specific to these slow oscillatory periods or a more general feature of the DG/CA3 functional coupling. For example, was it also observed during theta/movement periods?
  
  Figure 2 - figure supplement 1 reports the coherence of all cells across the subfields with rectified DG CSD over the entire recording duration. Mice do not perform any tasks during the recordings so periods of immobility and quiet wakefulness comprise the majority of the recording session and are the focus of our analysis. During some occasional theta periods there is increased coherence in the theta frequency band (Figure R4).
  
  7) Fig. 6 shows depolarization and increased firing in DGCs up to 150 ms prior to ripple onset. However, ripples sometimes occur in bursts with one ripple following others. Could such a phenomenon explain the firing prior to ripples (which would in fact correspond to firing during a previous ripple)? What is the behavior of firing rate and Vm of different cell types if analysis is restricted to isolated ripples? This analysis is notably important in CA3 where feedback inhibition following a first ripple could lead to hyperpolarization "during" the next ripple.
  
  We added a new figure (Figure 7 - figure supplement 2) that compares Vm aligned to the onset of isolated single ripples vs. ripple doublets. The pre-ripple depolarization in DG and CA1 is similar for isolated ripples and ripple doublets, arguing against the hypothesis that pre-ripple responses are a reflection of ripple bursts.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.04.20.440699v1
www.biorxiv.org www.biorxiv.org

New submission 08/10/2023, 17:01:47

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The manuscript by Royall et al. builds on previous work in the mouse that indicates that neural progenitor cells (NPCs) undergo asymmetric inheritance of centrosomes and provides evidence that a similar process occurs in human NPCs, which was previously unknown.
  
  The authors use hESC-derived forebrain organoids and develop a novel recombination tag-induced genetic tool to birthdate and track the segregation of centrosomes in NPCs over multiple divisions. The thoughtful experiments yield data that are concise and well-controlled, and the data support the asymmetric segregation of centrosomes in NPCs. These data indicate that at least apical NPCs in humans undergo asymmetric centrosome inheritance. The authors attempt to disrupt the process and present some data that there may be differences in cell fate, but this conclusion would be better supported by a more thorough assessment of the fate of these different NPCs (e.g. NPCs versus new neurons), which would also support the conclusion that the younger centriole is inherited by new neurons.
  
  We thank the reviewer for their supportive comments (“…thoughtful experiments yield data that are concise and well-controlled…”).
  
  Reviewer #2 (Public Review):
  
  Royall et al. examine the asymmetric inheritance of centrosomes during human brain development. In agreement with previous studies in mice, their data suggest that the older centrosome is inherited by the self-renewing daughter cell, whereas the younger centrosome is inherited by the differentiating daughter cell. The key importance of this study is to show that this phenomenon takes place during human brain development, which the authors achieved by utilizing forebrain organoids as a model system and applying the recombination-induced tag exchange (RITE) technology to birthdate and track the centrosomes.
  
  Overall, the study is well executed and brings new insights of general interest for cell and developmental biology with particular relevance to developmental neurobiology. The Discussion is excellent, it brings this study into the context of previous work and proposes very appealing suggestions on the evolutionary relevance and underlying mechanisms of the asymmetric inheritance of centrosomes. The main weakness of the study is that it tackles asymmetric inheritance only using fixed organoid samples. Although the authors developed a reasonable mode to assign the clonal relationships in their images, this study would be much stronger if the authors could apply time-lapse microscopy to show the asymmetric inheritance of centrosomes.
  
  We thank the reviewer for their constructive and supportive comments (“…the study is well executed and brings new insights of general interest for cell and developmental biology with particular relevance to developmental neurobiology….”). We understand the request for clonal data or dynamic analyses in organoids (e.g., using time-lapse microscopy). We also agree that such data would certainly strengthen our findings. However, as outlined above (please refer to point #1 of the editorial summary), this is unfortunately not currently feasible. We have therefore explicitly discussed this shortcoming in our revised manuscript and explained why future experiments (with advanced methodology) will be needed to address it.
  
  Reviewer #3 (Public Review):
  
  In this manuscript, the authors report that human cortical radial glia asymmetrically segregates newly produced or old centrosomes after mitosis, depending on the fate of the daughter cell, similar to what was previously demonstrated for mouse neocortical radial glia (Wang et al. 2009). To do this, the authors develop a novel centrosome labelling strategy in human ESCs that allows recombination-dependent switching of tagged fluorescent reporters from old to newly produced centrosome protein, centriolin. The authors then generate human cortical organoids from these hESCs to show that radial glia in the ventricular zone retains older centrosomes whereas differentiated cells, i.e. neurons, inherit the newly produced centrosome after mitosis. The authors then knock down a critical regulator of asymmetric centrosome inheritance called Ninein, which leads to a randomization of this process, similar to what was observed in mouse cortical radial glia.
  
  A major strength of the study is the combined use of the centrosome labelling strategy with human cortical organoids to address an important biological question in human tissue. This study is similarly presented as the one performed in mice (Wang et al. 2009) and the existence of the asymmetric inheritance mechanism of centrosomes in another species grants strength to the main claim proposed by the authors. It is a well-written, concise article, and the experiments are well-designed. The authors achieve the aims they set out in the beginning, and this is one of the perfect examples of the right use of human cortical organoids to study an important phenomenon. However, there are some key controls that would elevate the main conclusions considerably.
  
  We thank the reviewer for their overall support of our findings (“..authors achieve the aims they set out in the beginning, and this is one of the perfect examples of the right use of human cortical organoids to study an important phenomenon…”). We also understand the reviewer’s request for additional experiments/controls that “…would elevate the main conclusions considerably.”
  
  1) The lack of clonal resolution or timelapse imaging makes it hard to assess whether the inheritance of centrosomes occurs as the authors claim. The authors show that there is an increase in newly made non-ventricular centrosomes at a population level but without labelling clones and demonstrating that a new or old centrosome is inherited asymmetrically in a dividing radial glia would grant additional credence to the central conclusion of the paper. These experiments will put away any doubt about the existence of this mechanism in human radial glia, especially if it is demonstrated using timelapse imaging. Additionally, knowing the proportions of symmetric vs asymmetrically dividing cells generating old/new centrosomes will provide important insights pertinent to the conclusions of the paper. Alternatively, the authors could soften their conclusions, especially for Fig 2.
  
  We understand the reviewer’s request. As outlined above (please refer to point #1 of the editorial summary), we had previously tried to add data using single-cell time-lapse imaging. However, because of the small size, and therefore weakness, of the fluorescent signal, we failed despite extensive efforts. Following the reviewer’s suggestion, we have now explicitly discussed this shortcoming and softened our conclusions.
  
  2) Some critical controls are missing. In Fig. 1B, there is a green dot that does not colocalize with Pericentrin. This is worrying and providing rigorous quantifications of the number of green and tdTom dots with Pericentrin would be very helpful to validate the labelling strategy. Quantifications would put these doubts to rest. Additionally, an example pericentrin staining with the GFP/TdTom signal in figure 4 would also give confidence to the reader. For figure 4, having a control for the retroviral infection is important. Although the authors show a convincing phenotype, the effect might be underestimated due to the incomplete infection of all the analyzed cells.
  
  We have included more rigorous quantifications in our revised manuscript.
  
  For Figure 1: There are indeed some green speckles that might be misinterpreted as green centrosomes. However, the speckles are usually smaller, and we exclude them by applying a strict size requirement. To check whether the classifier might interpret any speckles as centrosomes, we manually checked 60 green “dots” that were annotated as centrosomes. In these images, all green spots detected as centrosomes co-localized with Pericentrin signal (images shown in Author response image 1).
  
  For Figure 4: Because we compare cells infected with a retrovirus expressing either scrambled or Ninein-targeting shRNA, both groups experienced a similar treatment. In addition, only cells infected with the virus express Cre-ERT2, so only the centrosomes of targeted cells were analyzed. Accordingly, we only compare cells expressing scrambled or Ninein-targeting shRNA; surrounding “wt” cells are not considered.
  
  Author response image 1.
  
  Pictures used to test the classifier. Each of the green “dots” recognized by the classifier as a Centriolin-NeonGreen-containing centrosome (green) co-localized with Pericentrin signal (white).
  
  3) It would be helpful if the authors expand on the presence of old centrosomes in apical radial glia vs outer radial glia. Currently, in figure 3, the authors only focus on Sox2+ cells but this could be complemented with the inclusion of markers for outer radial glia and whether older centrosomes are also inherited by oRGCs. This would have important implications on whether symmetric/asymmetric division influences the segregation of new/old centrosomes.
  
  That is an interesting question, and we agree that additional analyses stratified by ventricular RGCs vs. oRGCs would be informative. However, at the time points analysed there are only very few oRGCs present (if any) in human ESC-derived organoids (Qian et al., Cell, 2016). We have now added this point to our discussion as a direction for future experiments.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.09.20.508710v1
www.biorxiv.org

Coupling of pupil- and neuronal population dynamics reveals diverse influences of arousal on cortical processing

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  [...] Recently, pupil dilation was linked to cholinergic and noradrenergic neuromodulation as well as cortical state dynamics in animal research. This work adds substantially to this growing research field by revealing the temporal and spatial dynamics of pupil-linked changes in cortical state in a large sample of human participants.
  
  The analyses are thorough and well conducted, but some questions remain, especially concerning unbiased ways to account for the temporal lag between neural and pupil changes. Moreover, it should be stressed that the provided evidence is of indirect nature (i.e., resting state pupil dilation as proxy of neuromodulation, with multiple neuromodulatory systems influencing the measure), and the behavioral relevance of the findings cannot be shown in the current study.
  
  Thank you for your positive feedback and constructive suggestions. We are especially grateful for the numerous pointers to other work relevant to our study.
  
  Concerning the temporal lag: The authors uniformly shift pupil data (but not pupil derivative) in time for their source-space analyses (see above). However, the evidence for the chosen temporal lags (930 ms and 0 ms) is not that firm. For instance, in the cited study by Reimer and colleagues [1], cholinergic activation shows a temporal lag of ~0.5 s with regard to pupil dilation - and the authors would like to relate pupil time series primarily to acetylcholine. Moreover, Joshi and colleagues [2] demonstrated that locus coeruleus spikes precede changes in the first derivative of pupil dilation by about 300 ms (and not 0 ms). Finally, in a recent study recording intracranial EEG activity in humans [3], pupil dilation lagged behind neural events with a delay between ~0.5-1.7 s. Together, this questions the chosen temporal lags.
  
  More importantly, Figures 3 and S3 demonstrate variable lags for different frequency bands (also evident for the pupil derivative), which are disregarded in the current source-space analyses. This biases the subsequent analyses. For instance, Figure S3 B shows the strongest correlation effect (Z~5), a negative association between pupil and the alpha-beta band. However, this effect is not evident in the corresponding source analyses (Figure S5), presumably due to the chosen zero-time-lag (the negative association peaked at ~900 ms).
  
  As the conducted cross-correlations provided direct evidence for the lags for each frequency band, using these for subsequent analyses seems less biased.
  
  This is an important point and we gladly take the opportunity to clarify this in detail. In essence, choosing one particular lag over others was a decision we took to address the multi-dimensional issue of presenting our results (spectral, spatial and time dimensions) and fix one parameter for the spatial description (see e.g. Figure 4). It is worth pointing out first that our analyses were all based on spectral decompositions that necessarily have limited temporal resolutions. Therefore, any given lag represents the center of a band that we can reasonably attribute to a time range. In fact, Figure 3C shows how spread out the effects are. It also shows that the peaks (troughs) of low and high frequency ranges align with our chosen lag quite well, while effects in the mid-frequency range are not “optimally” captured.
  
  As picking lags based on maximum effects may be seen as double dipping, we note that we chose 0.93 sec a priori based on the existing literature, and most prominently based on the canonical impulse response of the pupil to arousing stimuli that is known to peak at that latency on average (Hoeks & Levelt, 1993; Wierda et al., 2012; also see Burlingham et al., 2021). This lag further agrees with the results of reference [3] cited by the reviewer, as it falls within that time range, and with Reimer et al.’s finding (cited as [1] above), as well as with Breton-Provencher et al. (2019), who report a lag of ~900 ms (see their Supplementary Figure S8) between noradrenergic LC activation and pupil dilation. Finally, note that it was not our aim to relate pupil dilations to either ACh or NE in particular, as we cannot make this distinction based on our data alone. Instead, we point out and discuss the similarities of our findings with time lags that have been reported for either neurotransmitter before.
  
  With respect to using different lags, changing the lag to 0 or 500 msec is unlikely to alter the reported effects qualitatively for low- and high-frequency ranges (see Figure 3C), as both the pupil time series and the fluctuations in power are dominated by very slow fluctuations (<< 1 Hz). As a consequence, shifting the signal by 500 msec has very little impact. For comparison, below we provide the reviewer with the results presented in Figure 4 but computed based on zero (Figure R1) and 500-msec (Figure R2) lags. While there are small quantitative differences, qualitatively the results remain mostly identical irrespective of the chosen lag.
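  The argument that a 500 msec shift barely matters for slowly fluctuating signals can be illustrated with a minimal synthetic sketch. Everything below is an assumption made for illustration (the 10 Hz sampling rate, the 0.1 Hz upper bound on the slow signal, the sum-of-sinusoids generator, and the helper names `pearson` and `slow_signal`); none of it is taken from the authors' data or pipeline. The sketch correlates a signal with a copy of itself shifted by 500 msec, once for a slow signal and once for a 1 Hz oscillation.

  ```python
  import math
  import random


  def pearson(x, y):
      """Pearson correlation of two equal-length sequences."""
      n = len(x)
      mx, my = sum(x) / n, sum(y) / n
      sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
      sxx = sum((a - mx) ** 2 for a in x)
      syy = sum((b - my) ** 2 for b in y)
      return sxy / math.sqrt(sxx * syy)


  def slow_signal(n_samples, fs, max_freq_hz, rng, n_components=20):
      """Sum of random-phase sinusoids below max_freq_hz (synthetic stand-in only)."""
      comps = [(rng.uniform(0.01, max_freq_hz), rng.uniform(0.0, 2.0 * math.pi))
               for _ in range(n_components)]
      return [sum(math.sin(2.0 * math.pi * f * t / fs + p) for f, p in comps)
              for t in range(n_samples)]


  FS = 10                 # assumed sampling rate in Hz (hypothetical)
  SHIFT = int(0.5 * FS)   # 500 msec expressed in samples
  rng = random.Random(0)  # fixed seed so the sketch is reproducible

  slow = slow_signal(6000, FS, 0.1, rng)                                 # fluctuations << 1 Hz
  fast = [math.sin(2.0 * math.pi * 1.0 * t / FS) for t in range(6000)]  # 1 Hz oscillation

  # Correlate each signal with a copy of itself shifted by 500 msec.
  r_slow = pearson(slow[:-SHIFT], slow[SHIFT:])
  r_fast = pearson(fast[:-SHIFT], fast[SHIFT:])
  print(f"slow signal: r = {r_slow:.3f}")
  print(f"1 Hz signal: r = {r_fast:.3f}")
  ```

  Under these assumptions the slow signal stays almost perfectly correlated with its shifted copy, whereas the 1 Hz oscillation is inverted by the same shift, so the sensitivity to lag choice depends on how much of the variance sits at very low frequencies.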
  
  Figure R1. Figure equivalent to main Figure 4, but without shifting the pupil.
  
  In sum, choosing one common lag a priori (as we did here) does not necessarily impose more of a bias on the presentation of the results than choosing them post-hoc based on the peaks in the cross-correlograms. However, we have taken this point as a motivation to revise the Results and Methods sections where applicable to strengthen the rationale behind our choice. Most importantly, we changed the first paragraph that mentions and justifies the shift as follows, because the original wording may have given the false impression that the cross-correlation results influenced lag choice:
  
  “Based on previous reports (Hoeks & Levelt, 1993; Joshi et al., 2016; Reimer et al., 2016), we shifted the pupil signal 930 ms forward (with respect to the MEG signal). We introduced this shift to compensate for the lag that had previously been observed between external manipulations of arousal (Hoeks & Levelt, 1993) as well as spontaneous noradrenergic activity (Reimer et al., 2016) and changes in pupil diameter. In our data, this shift also aligned with the lags for low- and high-frequency extrema in the cross-correlation analysis (Figure 3B).”
  
  Figure R2. Figure equivalent to main Figure 4, but with shifting the pupil with respect to the MEG by 500 ms.
  
  Related to this aspect: For some parts of the analyses, the pupil time series was shifted with regard to the MEG data (e.g., Figure 4). However, for subsequent analyses pupil and MEG data were analyzed in concurrent 2 s time windows (e.g., Figure 5 and 6), without a preceding shift in time. This complicates comparisons of the results across analyses and the reasoning behind this should be discussed.
  
  The signal has been shifted for all analyses that relate to pupil diameter (but not pupil derivative). We have added versions of the following statement in the respective Results and Methods section to clarify (example from Results section ‘Nonlinear relations between pupil-linked arousal and band-limited cortical activity’):
  
  “In keeping with previous analyses, we shifted the pupil time series forward by 930 msec, while applying no shift to the pupil derivative.”
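  What shifting a pupil trace by a fixed lag amounts to can be sketched on synthetic data. This is only an illustration under stated assumptions: the 100 Hz sampling rate, the sign convention (the pupil trace lags the neural signal, so the pupil sample at t + lag is paired with the neural sample at t), and the helper names `pearson` and `corr_at_lag` are all hypothetical. It also does not reflect how the 930 msec lag was chosen, which the response states was fixed a priori from the literature; the cross-correlation scan below merely shows that a known delay built into synthetic data is recoverable.

  ```python
  import math
  import random


  def pearson(x, y):
      """Pearson correlation of two equal-length sequences."""
      n = len(x)
      mx, my = sum(x) / n, sum(y) / n
      sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
      sxx = sum((a - mx) ** 2 for a in x)
      syy = sum((b - my) ** 2 for b in y)
      return sxy / math.sqrt(sxx * syy)


  def corr_at_lag(neural, pupil, lag):
      """Correlate neural[t] with pupil[t + lag] over the overlapping samples."""
      if lag == 0:
          return pearson(neural, pupil)
      return pearson(neural[:-lag], pupil[lag:])


  FS = 100                        # assumed sampling rate in Hz (hypothetical)
  TRUE_DELAY = round(0.93 * FS)   # 930 msec expressed in samples
  N = 2000                        # 20 s of synthetic data

  rng = random.Random(1)
  comps = [(rng.uniform(0.2, 1.0), rng.uniform(0.0, 2.0 * math.pi)) for _ in range(10)]
  base = [sum(math.sin(2.0 * math.pi * f * t / FS + p) for f, p in comps)
          for t in range(N + TRUE_DELAY)]

  neural = base[TRUE_DELAY:]   # synthetic "neural" signal
  pupil = base[:N]             # same signal delayed by TRUE_DELAY samples

  # Scan candidate lags from 0 to 1.5 s and keep the one with the highest correlation.
  best_lag = max(range(0, 151), key=lambda lag: corr_at_lag(neural, pupil, lag))

  # Apply the shift: drop the first best_lag pupil samples and trim the neural signal to match.
  pupil_shifted = pupil[best_lag:]
  neural_trimmed = neural[:len(pupil_shifted)]
  print(f"recovered lag: {best_lag / FS:.2f} s")
  ```

  Note that the shift shortens both series by the lag, which is one practical reason a single common lag is convenient when the same data feed several analyses.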
  
  The authors refer to simultaneous fMRI-pupil studies in their background section. However, throughout the manuscript, they do not mention recent work linking (task-related) changes in pupil dilation and neural oscillations (e.g., [4-6]) which does seem relevant here, too. This seems especially warranted, as these findings in part appear to disagree with the here-reported observations. For instance, these studies consistently show negative pupil-alpha associations (while the authors mostly show positive associations). Moreover, one of these studies tested for links between pupil dilation and aperiodic EEG activity but did not find a reliable association (again conflicting with the here-reported data). Discussing potential differences between studies could strengthen the manuscript.
  
  We have added a discussion of the suggested works to our Discussion section. We point out however that a recent study (Podvalny et al., https://doi.org/10.7554/eLife.68265) corroborates our finding while measuring resting-state pupil and MEG simultaneously in a situation very similar to ours. Also, we note that Whitmarsh et al. (2021) (reference [6]) is actually in line with our findings as we find a similar negative relationship between alpha-range activity in somatomotor cortices and pupil size.
  
  Please also take into account that results from studies of task- or event-related changes in pupil diameter (phasic responses) cannot be straightforwardly compared with the findings reported here (focusing on fluctuations in tonic pupil size), due to the inverse relationship between tonic (or baseline) and phasic pupil response (e.g. Knapen et al., 2016). This means that on trials with larger baseline pupil diameter, phasic pupil dilation will be smaller and vice versa. Hence, a negative relation between the evoked change in pupil diameter and alpha-band power can very well be consistent with the positive correlation between tonic pupil diameter and alpha-band activity that we report here for visual cortex.
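  The logic of this argument can be made concrete with a toy simulation. The coefficients, noise levels, and trial count below are arbitrary assumptions chosen for illustration and are not estimates from any dataset. The sketch assumes that alpha power increases with tonic pupil size and that the phasic response decreases with tonic pupil size, and then checks the sign of the resulting phasic-alpha correlation.

  ```python
  import math
  import random


  def pearson(x, y):
      """Pearson correlation of two equal-length sequences."""
      n = len(x)
      mx, my = sum(x) / n, sum(y) / n
      sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
      sxx = sum((a - mx) ** 2 for a in x)
      syy = sum((b - my) ** 2 for b in y)
      return sxy / math.sqrt(sxx * syy)


  rng = random.Random(42)  # fixed seed for reproducibility
  N_TRIALS = 2000          # hypothetical number of trials

  # Tonic (baseline) pupil size per trial, in arbitrary standardized units.
  tonic = [rng.gauss(0.0, 1.0) for _ in range(N_TRIALS)]

  # Assumption 1: alpha power rises with tonic pupil size (positive slope, arbitrary value).
  alpha = [0.6 * t + rng.gauss(0.0, 0.8) for t in tonic]

  # Assumption 2: phasic response shrinks as tonic size grows (negative slope, arbitrary value).
  phasic = [-0.7 * t + rng.gauss(0.0, 0.7) for t in tonic]

  r_tonic_alpha = pearson(tonic, alpha)
  r_phasic_alpha = pearson(phasic, alpha)
  print(f"tonic vs alpha:  r = {r_tonic_alpha:.2f}")
  print(f"phasic vs alpha: r = {r_phasic_alpha:.2f}")
  ```

  In this toy model the phasic-alpha correlation is negative solely because both variables depend on tonic pupil size with opposite signs, which is the sense in which the two sets of findings need not conflict.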
  
  In section ‘Arousal modulates cortical activity across space, time and frequencies’ we have added:
  
  “Seemingly contradicting the present findings, previous work on task-related EEG and MEG dynamics reported a negative relationship between pupil-linked arousal and alpha-range activity in occipito-parietal sensors during visual processing (Meindertsma et al., 2017) and fear conditioning (Dahl et al., 2020). Note however that results from task-related experiments, that focus on evoked changes in pupil diameter rather than fluctuations in tonic pupil size, cannot be directly compared with our findings. Similar to noradrenergic neurons in locus coeruleus (Aston-Jones & Cohen, 2005), phasic pupil responses exhibit an inverse relationship with tonic pupil size (Knapen et al., 2016). This means that on trials with larger baseline pupil diameter (e.g. during a pre-stimulus period), the evoked (phasic) pupil response will be smaller and vice versa. As a consequence, a negative correlation between alpha-band activity in the visual cortex and task-related phasic pupil responses does not preclude a positive correlation with tonic pupil size during baseline or rest as reported here. In line with this, Whitmarsh et al., 2021 found a negative relationship between alpha-activity and pupil size in the somatosensory cortex that agrees with our finding. Although using an event-related design to study attention to tactile stimuli, this relationship occurred in the baseline, i.e. before observing any task-related phasic effects on pupil-linked arousal or cortical activity.”
  
  In section ‘Arousal modulation of cortical excitation-inhibition ratio’ we have added: “The absence of this effect in visual cortices may explain why Kosciessa et al. (2021) found no relationship between pupil-linked arousal and spectral slope when investigating phasic pupil dilation in response to a stimulus during visual task performance. However, this behavioral context, associated with different arousal levels, likely also changes E/I in the visual cortex when compared with the resting state (Pfeffer et al., 2018).”
  
  Finally, in the Conclusion we added (note: ‘they’ = the present results): “Further, they largely agree with similar findings of a recent independent report (Podvalny et al., 2021).”
  
  Related to this aspect: The authors frequently relate their findings to recent work in rodents. For this it would be good to consider species differences when comparing frequency bands across rodents and primates (cf. [7,8]).
  
  Throughout our Results section we have mainly remained agnostic with respect to labeling frequency ranges when drawing between-species comparisons, and have only reverted to it as a justification for a dimension reduction for some of the presented analysis. Following your comment however, we have phrased the following section in the Discussion, section ‘Arousal modulates cortical activity across space, time and frequencies’, more carefully:
  
  “The low-frequency regime referred to in rodent work (2–10 Hz; e.g., McGinley et al., 2015) includes activity that shares characteristics with human alpha rhythms (3–6 Hz; Nestogel and McCormick, 2021; Senzai et al., 2019). The human equivalent however clearly separates from activity in lower frequency bands and, here, showed idiosyncratic relationships with pupil-linked arousal.”
  
  Figure 1 highlights direct neuromodulatory effects in the cortex. However, seminal [9-11] and more recent work [12,13] demonstrates that noradrenaline and acetylcholine also act in the thalamus which seems relevant concerning the interpretation of low frequency effects observed here. Moreover, neural oscillations also influence neuromodulatory activity, thus the one-headed arrows do not seem warranted (panel C) [3,14].
  
  This is a very good point. First, we would like to note that we have extended on acknowledging thalamic contributions to low-frequency (specifically alpha) effects in response to the Reviewer’s point 11 (‘Recommendations for authors’ section below). Also, we have added a reference to the role of potential top-down (reverse) influences to our Discussion, section ‘Arousal modulates cortical activity across space, time and frequencies’, as follows:
  
  “Further, we note that our analyses and interpretations focus on arousal-related neuromodulatory influences on cortical activity, whereas recent work also supports a reverse “top-down” route, at least for frontal cortex high-frequency activity on LC spiking activity (Totah et al., 2021).”
  
  Ultimately, however, we decided to leave the arrows in Figure 1C uni-directional to keep in line with the rationale of our research that stems mostly from rodent work, which also emphasises the indicated directionality. Also, reference [3] is highly interesting for us because it actually aligns with our data: The authors show that a spontaneous peak of high-frequency band activity (>70 Hz) in insular cortex precedes a pupil dilation peak (or plateau) in two of three participants by ~500msec (which mimics a pattern found for task-evoked activity; see their Figure 5b/c). We find a maximum in our cross-correlation between pupil size and high frequency band activity (>64 Hz) that indicates a similar lag (see our Figure 3B). Importantly, both results do not rule out a common source of neuromodulation for the effects. We have added the following to the end of the section ‘An arousal-triggered cascade of activity in the resting human brain’:
  
  “In fact, Kucyi & Parvizi (2020) found spontaneous peaks of high-frequency band activity (>70 Hz) in the insular cortex of three resting surgically implanted patients that preceded pupil dilation by ~500msec - a time range that is consistent with the lag of our cross-correlation between pupil size and high frequency (>64Hz) activity (see Figure 3B). Importantly, they showed that this sequence mimicked a similar but more pronounced pattern during task performance. Given the purported role of the insula (Menon & Uddin, 2015), this finding lends support to the idea that spontaneous covariations of pupil size and cortical activity signal arousal events related to intermittent 'monitoring sweeps' for behaviourally relevant information.”
  
  In their discussion, the authors propose a pupil-linked temporal cascade of cognitive processes and accompanying power changes. This argument could be strengthened by showing that earlier events in the cascade can predict subsequent ones (e.g., are the earlier low and high frequency effects predictive of the subsequent alpha-beta synchronization?).
  
  We added this cascade angle as one possible interpretation of the observed effects. We fully agree that this is an interesting question but would argue that it would ideally be tested in follow-up research specifically designed for that purpose. The suggested analysis would add a post-hoc aspect to our exploratory investigation in the absence of a suitable contrast, while also potentially side-tracking the main aim of the study. We have revised the language in this section and added the following changes (bold) to the last paragraph to emphasise the speculative aspect and to clarify what we think needs to be done to look into this further and with more explanatory power.
  
  “The three scenarios described here are not mutually exclusive and may explain one and the same phenomenon from different perspectives. Further, it remains possible that the sequence we observe comprises independent effects with specific timings. A pivotal manipulation to test these assumptions will be to contrast the observed sequence with other potential coupling patterns between pupil-linked arousal and cortical activity during different behavioural states.”
  
  scietyType:AuthorResponse
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.25.449734v1
www.biorxiv.org

New submission 23/09/2023, 13:12:27

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The study by Akter et al demonstrates that astrocyte-derived L-lactate plays a key role in schema memory formation and promotes mitochondrial biogenesis in the Anterior Cingulate Cortex (ACC).
  
  The main tool used by the authors is the DREADD technology, which allows receptors to be activated pharmacologically in a cell-specific manner. In the study, the authors used the DREADD technique to activate, in appropriately transfected astrocytes, a subtype of muscarinic receptor that is not normally present in cells. Because this receptor is coupled to a Gi-mediated signal transduction pathway that inhibits cAMP formation, the authors could demonstrate cell-specific (astrocyte-specific) decreases in cAMP levels that result in decreased L-lactate production by astrocytes.
  
  Behaviorally this pharmacological manipulation results in impairments of schema memory formation and retrieval in the ACC in flavor-place paired associate paradigms. Such impairments are prevented by co-administration of L-lactate.
  
  The authors also show that activation of Gi signaling, which decreases L-lactate release by astrocytes, impairs mitochondrial biogenesis in neurons in an L-lactate-reversible manner.
  
  By using MCT 2 inhibitors and an NMDAR antagonist the authors conclude that the molecular mechanisms underlying the observed effects are mediated by L-lactate entering neurons through MCT2 transporters and involve NMDAR.
  
  Overall, the article's conclusions are warranted by the experimental evidence, but some weak points could be addressed which would make the conclusions even stronger.
  
  The number of animals in some of the experiments is on the low side (4 to 6).
  
  In the revised manuscript, we have increased the animal numbers in two key experimental groups (hM4Di-CNO and Control groups) of behavioral experiments. Now the animal numbers in different groups are as follows:
  
  • 15 rats in hM4Di-CNO group
  
  o Further divided into two subgroups for probe tests (PT1-4) conducted during flavor-place paired associate training; 8 rats in the hM4Di-CNO (saline) and 7 rats in the hM4Di-CNO (CNO) subgroups receiving I.P. saline or I.P. CNO, respectively, before these PTs.
  
  • 8 rats in the Control group
  
  • 7 rats in the Rescue group (hM4Di-CNO+L-lactate)
  
  • 4 rats in the Control-CNO group. Animal number in this group was not increased as it was apparent from these 4 rats that CNO alone was not impairing the PA learning and memory retrieval in these rats (AAV8-GFAP-mCherry injected). Their result was very similar to the control group. Additionally, in a previous study (Liu et al., 2022), we showed that CNO administration in the rats injected with AAV8-GFAP-mCherry into the hippocampus does not show any impairments in schema.
  
  Also, in the newly added open field test experiments to investigate locomotor activity, as suggested by Reviewer #2, 8 rats were used in each group.
  
  The use of CIN to inhibit MCT2 is not optimal. Authors may want to decrease MCT2 expression by using antisense oligonucleotides.
  
  In the revised manuscript, we have conducted the experiment using MCT2 antisense oligodeoxynucleotide (ODN) as suggested.
  
  To test whether the L-lactate-induced neuronal mitochondrial biogenesis is dependent on MCT2, we bilaterally injected MCT2 antisense oligodeoxynucleotide (MCT2-ODN, n=8 rats, 2 nmol in 1 μl PBS per ACC) or scrambled ODN (SC-ODN, n=8 rats, 2 nmol in 1 μl PBS per ACC) into the ACC. After 11 hours, bilateral infusion of L-lactate (10 nmol, 1 μl) or ACSF (1 μl) was given into the ACC and the rats were kept in the PA event arena. After 60 mins (12 hours from MCT2-ODN or SC-ODN administration), the rats were sacrificed. As shown in Author response image 1B, SC-ODN+L-lactate group showed significantly increased relative mtDNA copy number compared to the SC-ODN+ACSF group (p<0.001, ANOVA followed by Tukey's multiple comparisons test). However, this effect was completely abolished in MCT2-ODN+L-lactate group, suggesting that MCT2 is required for the L-lactate-induced mitochondrial biogenesis in the ACC.
  
  We have integrated this new data and results in the revised manuscript.
  
  Author response image 1.
  
  Mitochondrial biogenesis by L-lactate is dependent on MCT2 and NMDAR. A. Experimental design to investigate whether MCT2 and NMDAR activity are required for L-lactate-induced mitochondrial biogenesis. B and C. mtDNA copy number abundance in the ACC of different rat groups relative to nDNA. Data shown as mean ± SD (n=4 rats in each group). ***p<0.001, ANOVA followed by Tukey's multiple comparisons test.
  
  The experiment using APV to block NMDAR only partially supports the conclusions. Indeed, blocking NMDAR will knock down any response that involves these receptors, whether L-lactate is necessary or not.
  
  In the current study we found that astrocytic Gi activation in the ACC reduced the L-lactate level in the ECF of the ACC, which was also associated with decreased PGC-1α/SIRT3/ATPB/mtDNA abundance, suggesting downregulation of the mitochondrial biogenesis pathway. We also found that exogenous administration of L-lactate into the ACC of astrocytic Gi-activated rats rescued this downregulation. In line with this, in a recently published study (Akter et al., 2023), we found upregulation of the mitochondrial biogenesis pathway in the hippocampal neurons of exogenous L-lactate-treated anesthetized rats. Another recent study has demonstrated that exercise-induced L-lactate release from skeletal muscle or I.P. injection of L-lactate can induce hippocampal PGC-1α (which is a master regulator of mitochondrial biogenesis) expression and mitochondrial biogenesis in mice (Park et al., 2021). Together, these results provide compelling evidence that L-lactate promotes mitochondrial biogenesis.
  
  L-lactate is known to promote expression of synaptic plasticity genes like Arc, c-Fos, and Zif268 in neurons (Yang et al., 2014). After entry into the neuronal cytoplasm, mainly through MCT2, it is converted into pyruvate by lactate dehydrogenase 1 (LDH1). This conversion also produces NADH, affecting the redox state of the neuron. NADH positively modulates the activity of NMDAR resulting in enhanced Ca2+ currents, the activation of intracellular signaling cascades, and the induction of the expression of plasticity-associated genes (Yang et al., 2014; Magistretti & Allaman, 2018). The study demonstrated that L-lactate–induced plasticity gene expression was abolished in the presence of NMDAR antagonists including D-APV (Yang et al., 2014). These results suggested that the MCT2 and NMDAR are key players in the regulation of L-lactate induced plasticity gene expression.
  
  In the current study, we investigated whether similar mechanisms might be involved in L-lactate-induced neuronal mitochondrial biogenesis. We now used MCT2 antisense oligodeoxynucleotide to decrease the expression of MCT2 (as mentioned in the previous response and Author response image 1B) and showed that MCT2 is necessary for L-lactate-induced mitochondrial biogenesis to manifest, indicating that L-lactate’s entry into the neuron is required. As mentioned before, after entry into neuron, L-lactate is converted into pyruvate by LDH, which also produce NADH, which in turn potentiates NMDAR activity. Therefore, we investigated whether NMDAR activity is required for L-lactate-induced mitochondrial biogenesis. We used D-APV to inhibit NMDAR (Author response image 1C) and found that L-lactate does not increase mtDNA copy number abundance if D-APV is given, suggesting that NMDAR activity is required for L-lactate to promote mitochondrial biogenesis.
  
  NMDAR serves diverse functions. Therefore, as mentioned by the reviewer, blocking NMDAR may knock down many such functions. While our current data suggest the involvement of MCT2 and NMDAR in the upregulation of mitochondrial biogenesis by L-lactate, we have not investigated other mechanisms and pathways modulating mitochondrial biogenesis that are either dependent on or independent of MCT2 and NMDAR activity. Further studies are needed to dissect and better understand this interesting observation. We have now clarified this in the discussion section of the manuscript.
  
  Is inhibition of glycogenolysis involved in the observed effects mediated by Gi signaling? Indeed, L-lactate is formed both by glycolysis and glycogenolysis. The authors could test whether the glycogen metabolism-inhibiting drug DAB would mimic the effects of Gi activation.
  
  In this study, we have shown that astrocytic Gi activation in the ACC leads to a decrease in cAMP and L-lactate. L-lactate is produced by glycogenolysis and glycolysis. cAMP in astrocytes acts as a trigger for L-lactate production (Choi et al., 2012; Horvat, Muhič, et al., 2021; Horvat, Zorec, et al., 2021; Zhou et al., 2021) by promoting glycogenolysis and glycolysis (Vardjan et al., 2018; Horvat, Muhič, et al., 2021; Horvat, Zorec, et al., 2021). Therefore, one plausible explanation for the reduced L-lactate level observed in our study is reduced L-lactate production in astrocytes due to decreased glycogen metabolism as a result of decreased cAMP. We have now mentioned this in the discussion.
  
  DAB is an inhibitor of glycogen phosphorylase that suppresses L-lactate production. It was shown to impair memory by decreasing L-lactate (Newman et al., 2011; Suzuki et al., 2011; Iqbal et al., 2023). As we found that the impairment in the schema memory and mitochondrial biogenesis was associated with decreased L-lactate level in the ACC and that the exogenous L-lactate administration can rescue the impairments, it is likely that DAB will mimic the effect of Gi activation in terms of schema memory and mitochondrial biogenesis. However, further study is needed to confirm this.
  
  Reviewer #2 (Public Review):
  
  The manuscript of Akter et al is an important study that investigates the role of astrocytic Gi signaling in the anterior cingulate cortex in the modulation of extracellular L-lactate level and consequently impairment in flavor-place associates (PA) learning. However, whereas some of the behavioral observations and signaling mechanism data are compelling, the conclusions about the effect on memory are inadequate as they rely on an experimental design that does not allow to differentiate acute or learning effect from the effect outlasting pharmacological treatments, i.e. effect on memory retention. With the addition of a few experiments, this paper would be of interest to the larger group of researchers interested in neuron-glia interactions during complex behavior.
  
  • Largely, I agree with the authors' conclusion that activating Gi signaling in astrocytes impairs PA learning, however, the effect on memory retrieval is not that obvious. All behavioral and molecular signaling effects described in this study are obtained with the continuous presence of CNO, therefore it is not possible to exclude the acute effect of Gi pathway activation in astrocytes. What will happen with memory on retrieval test when CNO is omitted selectively during early, middle, or late session blocks of PA learning?
  
  We have now added 8 more rats to the hM4Di-CNO group (i.e., the group with astrocytic Gi activation) to clarify the effect on memory retrieval. These rats underwent flavor-place paired associate (PA) training in the same way as the previously described rats (n=7) of this group; that is, they received CNO 30 minutes before and 30 minutes after the PA training sessions (S1-2, S4-8, S10-17). However, in contrast to the previous rats of this group, which received CNO before the PTs (PT1, PT2, PT3), we omitted CNO (and administered I.P. saline instead) selectively on these PTs, which were conducted at the early, middle, and late stages of PA training, as suggested by the reviewer. These newly added rats did not show memory retrieval in these PTs, suggesting that the rats were not learning the PAs from the PA training sessions. See Author response image 2C-E, where this subgroup is denoted as hM4Di-CNO (Saline).
  
  We then continued PA training sessions (S21 onwards, Author response image 2B) for these rats without CNO. They gradually learned the PAs. PTs (PT5, PT6, PT7; Author response image 2G-I) were done during this continuation phase of PA training, each once without CNO (i.e., with I.P. saline instead) and once with CNO. As seen in Author response image 2H and 2I, the rats retrieved the memory when PT6 and PT7 were done without CNO. However, when these PTs were done with CNO, they could not retrieve the memory. Together, these results suggest that ACC astrocytic Gi activation by CNO during a PT can impair memory retrieval in rats that have already learned the PAs.
  
  As shown in Author response image 2B, we replaced two original PAs with two new PAs (NPA 9 and 10) at S34. This was followed by PT8 (S35). As seen in Author response image 2J, these rats retrieved the NPA memory when the PT was done without CNO. However, they could not retrieve the NPA memory when the PT was done with CNO. This result suggests that ACC astrocytic Gi activation by CNO during a PT can impair NPA memory retrieval.
  
  In summary, these data show that astrocytic Gi activation in the ACC can impair PA memory retrieval. We have integrated this new data and results in the revised manuscript.
  
  Author response image 2.
  
  A. PI (mean ± SD) during the acquisition of the six original PAs (OPAs) (S1-2, 4-8, 10-17) and new PAs (NPAs) (S19) of the control (n=8), hM4Di-CNO (n=15), and rescue (hM4Di-CNO+L-lactate) (n=7) groups. From S6 onwards, the hM4Di-CNO group consistently showed lower PI compared to control. However, concurrent L-lactate administration into the ACC (rescue group) rescued this impairment. B. PI (mean ± SD) of the hM4Di-CNO group (n=8) from S21 onwards, showing a gradual increase in PI when CNO was withdrawn. C, D, and E. Non-rewarded PTs (PT1, PT2, and PT3 conducted on S3, S9, and S18, respectively) to test memory retrieval of OPAs for the control, hM4Di-CNO, and rescue groups. The percentage of digging time at the cued location relative to that at the non-cued locations is shown (mean ± SD). In both PT2 and PT3, the control group spent significantly more time digging the cued sand well, above the chance level, indicating that the rats learned the OPAs and could retrieve them. In contrast, the hM4Di-CNO group did not spend more time digging the cued sand well above the chance level, irrespective of CNO administration before the PTs. The rescue group showed results similar to the hM4Di-CNO group when CNO was given without L-lactate. On the other hand, they showed results similar to the control group when L-lactate was given concurrently with CNO, indicating that this group learned the OPAs and could retrieve them. p < 0.05, p < 0.01, p < 0.001, one-sample t-test comparing the proportion of digging time at the cued sand well with the chance level of 16.67%. F. Non-rewarded PT4 (S20), which was conducted after replacing two OPAs with two NPAs (NPA 7 & 8) in S19 for the control, hM4Di-CNO, and rescue groups. Results show that the control group spent significantly more time digging the new cued sand well, above the chance level, indicating that the rats learned the NPAs from S19 and could retrieve them in this PT. 
In contrast, the hM4Di-CNO group did not spend more time digging the new cued sand well above the chance level, irrespective of CNO administration before the PT. The rescue group showed results similar to the hM4Di-CNO group when CNO was given without L-lactate. On the other hand, they showed results similar to the control group when L-lactate was given concurrently with CNO, indicating that this group learned the NPAs from S19 and could retrieve them. p < 0.001, one-sample t-test comparing the proportion of digging time at the new cued sand well with the chance level of 16.67%. G, H, and I. Non-rewarded PTs (PT5, PT6, and PT7 conducted on S23, S27, and S33, respectively) to test memory retrieval of OPAs for the hM4Di-CNO group. In both PT6 and PT7, the rats spent significantly more time digging the cued sand well, above the chance level, when the tests were done without CNO, indicating that the rats learned the OPAs and could retrieve them. However, CNO prevented memory retrieval during these PTs. p < 0.001, one-sample t-test comparing the proportion of digging time at the cued sand well with the chance level of 16.67%. J. Non-rewarded PT8 (S35), which was conducted after replacing two OPAs with two NPAs (NPA 9 & 10) in S34 for the hM4Di-CNO group. Results show that the rats spent significantly more time digging the new cued sand well, above the chance level, when CNO was not given before the PT, indicating that the rats learned the NPAs from S34 and could retrieve them in this PT. However, when CNO was given before the PT, retrieval was impaired. *p < 0.001, one-sample t-test comparing the proportion of digging time at the new cued sand well with the chance level of 16.67%.
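
  The comparison against chance described in the legend above can be illustrated with a minimal sketch. It assumes that the 16.67% chance level corresponds to equal digging across six sand wells (100/6), and it uses made-up digging percentages rather than data from the study. The function name `one_sample_t` and the variable names are hypothetical, and only the t statistic is computed; a p-value would additionally require the t distribution (for example via `scipy.stats.ttest_1samp`).

```python
import math
import statistics

# Assumption: six sand wells, so equal digging at every well gives 100/6 %.
N_WELLS = 6
CHANCE_PCT = 100.0 / N_WELLS  # about 16.67 %


def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all observations are identical; t is undefined")
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical percentages of digging time at the cued well for 8 rats
# (illustrative numbers only, not taken from the study).
cued_pct = [30.0, 25.0, 35.0, 28.0, 32.0, 27.0, 31.0, 29.0]

t_stat, df = one_sample_t(cued_pct, CHANCE_PCT)
print(f"chance = {CHANCE_PCT:.2f} %, t({df}) = {t_stat:.2f}")
# A p-value could be obtained from the t distribution with df degrees of
# freedom, e.g. scipy.stats.ttest_1samp(cued_pct, CHANCE_PCT).
```

  A positive t statistic here means the group mean lies above chance; whether it is significant depends on the degrees of freedom and the chosen threshold.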
  
  • I found it truly exciting that the administration of exogenous L-lactate is capable to rescue CNO-induced PA learning impairment, when co-applied. Would it be possible that this treatment has a sensitivity to a particular stage of learning (acquisition, consolidation, or memory retrieval) when L-lactate administration would be the most efficacious?
  
  The hM4Di-CNO group, when PA training was continued without CNO (S21-S32) (Author response image 2B), was able to learn the six original PAs (OPAs). In PT7, done at S33 (Author response image 2I), this group of rats was able to retrieve the memory when the test was done without CNO but could not retrieve the memory when CNO was given. Similarly, the rescue group (hM4Di-CNO+L-lactate) (Author response image 2A), which received both CNO and L-lactate during PA training sessions (S1-S17), was able to learn the OPAs. At PT3, done at S18 (Author response image 2E), these rats were able to retrieve the memory when the test was done with CNO+L-lactate but not when the test was done with CNO only. Together, these results show that ACC astrocytic Gi activation with CNO impairs memory retrieval and that exogenous L-lactate can rescue the impairment. Therefore, we conclude that memory retrieval is sensitive to L-lactate.
  
  The PA learning is hippocampus-dependent. Over the course of repeated PA training, systems consolidation occurs in the ACC, after which the already learned PA memory (schema) becomes hippocampus-independent (Tse et al., 2007; Tse et al., 2011). A higher activation (indicated by expression of c-Fos) in the hippocampus relative to the ACC during the early period of schema development, and the reverse at the late stage was observed in our previous study (Liu et al., 2022). However, rapid assimilation of new PA into the ACC requires simultaneous activation/retrieval of previous schema from ACC and hippocampus dependent new PA learning (Tse et al., 2007; Tse et al., 2011). During new PA learning, increase of c-Fos neurons in both CA1 and ACC was detected (Liu et al., 2022).
  
  Our hM4Di-CNO group received CNO 30 mins before and after each PA training session in S1-S17 (Author response image 2A). Also, the Rescue group similarly received CNO+L-lactate before and after each PA training session in S1-S17. Therefore, while this study design allowed us to conclude that ACC astrocytic Gi activation impairs PA learning and that exogenous L-lactate can rescue the impairment, it does not allow clear differentiation of the effects of these treatments on memory acquisition and consolidation. Further studies are needed to investigate this.
  
  • The hypothesis that observed learning impairments could be associated with diminished mitochondrial biogenesis caused by decreased l-lactate in the result of astrocytic Gi-DREADDS stimulation is very appealing, but a few key pieces of evidence are missing. So far, the hypothesis is supported by experiments demonstrating reduced expression of several components of mitochondrial membrane ATP synthase and a decrease in relative mtDNA copy numbers in ACC of rats injected with Gi-DREADDs. L-lactate injections into ACC restored and even further increased the expression of the above-mentioned markers. Co-administration of NMDAR antagonist D-APV or MCT-2 (mostly neuronal) blocker 4-CIN with L-lactate, prevented L-lactate-induced increase in relative mtDNA copy. I am wondering how the interference with mitochondrial biogenesis is affecting neuronal physiology and if it would result in impaired PA learning or schema memory.
  
  The observation of diminished mitochondrial biogenesis in the astrocytic Gi-activated rats that showed impaired PA learning is exciting. However, our study does not provide experimental data on how mitochondrial biogenesis could be associated with impaired PA learning and schema memory. Results from several previous studies have linked mitochondrial biogenesis and its regulators, such as PGC-1α and SIRT3, to diverse neuronal and cognitive functions, as described in the discussion section of the manuscript. In the revised manuscript, we have added the following discussion of potential mechanisms:
  
  “In this study, we have demonstrated that ACC astrocytic Gi activation impairs PA learning and schema formation, PA memory retrieval, and NPA learning and retrieval by decreasing L-lactate level in the ACC. Although we have shown that these impairments are associated with diminished expression of proteins of mitochondrial biogenesis, the precise mechanisms of how astrocytic Gi activation affects neuronal functions and schema memory remain to be elucidated. We previously demonstrated that neuronal inhibition in either the hippocampus or the ACC impairs PA learning and schema formation (Hasan et al., 2019). In another recent study (Liu et al., 2022), we showed that astrocytic Gi activation in the CA1 impaired PA training-associated CA1-ACC projecting neuronal activation. Yao et al. recently showed that reduction of astrocytic lactate dehydrogenase A (an enzyme that reversibly catalyzes L-lactate production from pyruvate) in the dorsomedial prefrontal cortex reduces L-lactate levels and neuronal firing frequencies, promoting depressive-like behaviors in mice (Yao et al., 2023). These impairments could be rescued by L-lactate infusion. It is possible that the impairment in PA learning and schema observed in our study might have involved a similar functional consequence of reduced neuronal activity in the ACC neurons upon astrocytic Gi activation.
  
  Schema consolidation is associated with synaptic plasticity-related gene expression (such as Zif268, Arc) in the ACC (Tse et al., 2011). L-lactate, after entry into neurons, can be converted to pyruvate during which NADH is also produced, promoting synaptic plasticity-related gene expression by potentiating NMDA signaling in neurons (Yang et al., 2014; Margineanu et al., 2018). Furthermore, L-lactate acts as an energy substrate to fuel learning-induced de novo neuronal translation critical for long-term memory (Descalzi et al., 2019). On the other hand, mitochondria play a crucial role in fueling local translation during synaptic plasticity (Rangaraju et al., 2019). Therefore, it could be hypothesized that the rescue of astrocytic Gi activation-mediated impairment of schema by exogenous L-lactate could have been mediated by facilitating synaptic plasticity-related gene expression by directly fueling the protein translation, potentiating NMDA signaling, as well as increasing mitochondrial capacity for ATP production by promoting mitochondrial biogenesis. Furthermore, the potential involvement of HCAR1, a receptor for L-lactate that may regulate neuronal activity (Bozzo et al., 2013; Tang et al., 2014; Herrera-López & Galván, 2018; Abrantes et al., 2019), cannot be excluded. Future research could explore these potential mechanisms, examining the interactions among them, and determining their relative contributions to schema. Our previous study also showed that ACC myelination is necessary for PA learning and schema formation, and that repeated PA training is associated with oligodendrogenesis in the ACC (Hasan et al., 2019). Oligodendrocytes facilitate fast, synchronized, and energy-efficient transfer of information by wrapping axons in a myelin sheath. Furthermore, they supply axons with glycolysis products, such as L-lactate, to offer metabolic support (Fünfschilling et al., 2012; Lee et al., 2012). 
The association of oligodendrogenesis and myelination with schema memory may suggest an adaptive response of oligodendrocytes to enhance metabolic support and neuronal energy efficiency during PA learning. Given the impairments in PA learning observed in the ACC astrocytic Gi-activated rats in the current study, it is reasonable to conclude that the direct metabolic support to axons provided by oligodendrocytes is not sufficient to rescue the schema impairments caused by decreased L-lactate levels upon astrocytic Gi activation. On the other hand, L-lactate was shown to be important for oligodendrogenesis and myelination (Sánchez-Abarca et al., 2001; Rinholm et al., 2011; Ichihara et al., 2017). Therefore, it is tempting to speculate that a decrease in L-lactate level may also impede oligodendrogenesis and myelination, consequently preventing the enhanced axonal support provided by oligodendrocytes and myelin during schema learning. Recently, a study has demonstrated that upon demyelination, mitochondria move from the neuronal cell body to the demyelinated axon (Licht-Mayer et al., 2020). Enhancement of this axonal response of mitochondria to demyelination, by targeting mitochondrial biogenesis and mitochondrial transport from the cell body to axon, protects acutely demyelinated axons from degeneration. Given the connection between schema and increased myelination, it remains an open question whether L-lactate-induced mitochondrial biogenesis plays a beneficial role in schema through a similar mechanism. Nevertheless, our results contribute to the mounting evidence of the glial role in cognitive functions and underscore the new paradigm in which glial cells are considered as integral players in cognitive functions alongside neurons. Disruption of neurons, myelin, or astrocytes in the ACC can disrupt PA learning and schema memory.”
  
  Reviewer #3 (Public Review):
  
  Akter et al. investigated how the astroglial Gi signaling pathway in the rat anterior cingulate cortex (ACC) affects cognitive functions, in particular schema memory formation. Using a stereotactic approach they intracranially introduced AAV8 vectors carrying mCherry-tagged hM4Di DREADD (Designer Receptor Exclusively Activated by Designer Drugs) under astrocyte selective GFAP promoter (AAV8-GFAP-hM4Di-mCherry) into the ACC region of the rat brain. hM4Di DREADD is a genetically modified form of the human M4 muscarinic (hM4) receptor insensitive to endogenous acetylcholine but is activated by the inert clozapine metabolite clozapine-N-oxide (CNO), triggering the Gi signaling pathway. The authors confirmed that hM4Di DREADD is selectively expressed in astrocytes after the application of the AAV8 vector by analysing the mCherry signals and immunolabeling of astrocytes and neurons in the ACC region of the rat brain. They activated hM4Di DREADD (Gi signalling) in astrocytes by intraperitoneal administration of CNO and measured cognitive functions in animals after CNO administration. Activation of Gi signaling in astrocytes by CNO application decreased paired-associate (PA) learning, schema formation, and memory retrieval in tested animals. This was associated with a decrease in cAMP in astrocytes and L-lactate in extracellular fluid as measured by immunohistochemistry in situ and in awake rats by microdialysis, respectively. Administration of exogenous L-lactate rescued the astroglial Gi-mediated deficits in PA learning, memory retrieval, and schema formation, suggesting that activation of astroglial Gi signalling downregulates L-lactate production in astrocytes and its transport to neurons affecting memory formation. 
Authors also show that expression level of proteins involved in mitochondrial biogenesis, which is associated with cognitive functions, is decreased in neurons, when Gi signalling is activated in astrocytes, and rescued when exogenous L-lactate is applied, suggesting the implication of astrocyte-derived L-lactate in the maintenance of mitochondrial biogenesis in neurons. The latter depended on lactate MCT2 transporter activity and glutamate NMDA receptor activity.
  
  The paper is very well written and discussed. The conclusions of this paper are well supported by the data. Although this is a study that uses established and previously published methodologies, it provides new insights into L-lactate signalling in the brain, particularly in the ACC, and further confirms the role of astroglial L-lactate in learning and memory formation. It also raises new questions about the molecular mechanisms underlying astrocyte-derived L-lactate-mediated mitochondrial biogenesis in neurons and its contribution to schema memory formation.
  
  • The authors discuss astrocytic L-lactate signalling without considering the recently discovered L-lactate-sensitive Gs and Gi protein-coupled receptors in the brain, which are present in both astrocytes and neurons. The use of nonendogenous L-lactate receptor agonists (Compound 2, 3-chloro-5-hydroxybenzoic acid) would clarify the implication of L-lactate receptor signalling in schema memory formation.
  
  In the revised manuscript, we have included this point in the discussion section to mention the potential role of HCAR1 in schema memory as follows:
  
  “Schema consolidation is associated with synaptic plasticity-related gene expression (such as Zif268, Arc) in the ACC (Tse et al., 2011). L-lactate, after entry into neurons, can be converted to pyruvate during which NADH is also produced, promoting synaptic plasticity-related gene expression by potentiating NMDA signaling in neurons (Yang et al., 2014; Margineanu et al., 2018). Furthermore, L-lactate acts as an energy substrate to fuel learning-induced de novo neuronal translation critical for long-term memory (Descalzi et al., 2019). On the other hand, mitochondria play a crucial role in fueling local translation during synaptic plasticity (Rangaraju et al., 2019). Therefore, it could be hypothesized that the rescue of astrocytic Gi activation-mediated impairment of schema by exogenous L-lactate could have been mediated by facilitating synaptic plasticity-related gene expression by directly fueling the protein translation, potentiating NMDA signaling, as well as increasing mitochondrial capacity for ATP production by promoting mitochondrial biogenesis. Furthermore, the potential involvement of HCAR1, a receptor for L-lactate that may regulate neuronal activity (Bozzo et al., 2013; Tang et al., 2014; Herrera-López & Galván, 2018; Abrantes et al., 2019), cannot be excluded. Future research could explore these potential mechanisms, examining the interactions among them, and determining their relative contributions to schema.”
  
  • The use of control animals transduced with an "empty" AAV8 vector (AAV8-GFAP-mCherry) compared with animals transduced with AAV8-GFAP-hM4Di-mCherry throughout the study would strengthen the results of this study, since transfection itself, as well as overexpression of the mCherry protein, may affect cell function.
  
  We thank the reviewer for pointing this out. The schema experiment includes a control group (Control-CNO group) of rats injected with AAV8-GFAP-mCherry bilaterally into the ACC. As shown in Author response image 3, after habituation and pretraining, these rats were trained for PA learning in the same way as the other groups. They received I.P. CNO 30 minutes before and 30 minutes after each PA training session. Their PA learning, schema formation, memory retrieval, NPA learning and retrieval, and latency (time needed to commence digging at the correct well) were similar to those of the control group of rats. This result is consistent with our previous study, in which rats bilaterally injected with AAV8-GFAP-mCherry into CA1 of the hippocampus did not show impairments in PA learning and schema formation upon CNO treatment (Liu et al., 2022).
  
  Author response image 3.
  
  A. PI (mean ± SD) during the acquisition of the original six PAs (OPAs) (S1-2, 4-8, 10-17) and new PAs (NPAs) (S19) of the control (n=6) and control-CNO (n=4) groups. B. Non-rewarded PTs (PT1, PT2, and PT3 done on S3, S9, and S18, respectively) to test memory retrieval of OPAs for the control-CNO group. C. Non-rewarded PT4 (S20) which was done after replacing two OPAs with two NPAs (NPA 7 & 8) in S19 for the control-CNO group. D. Latency (in seconds) before commencing digging at the correct well for control and control-CNO groups. Data shown as mean ± SD.
  
  References
  
  Abrantes, H. d. C., Briquet, M., Schmuziger, C., Restivo, L., Puyal, J., Rosenberg, N., Rocher, A.-B., Offermanns, S., & Chatton, J.-Y. (2019). The Lactate Receptor HCAR1 Modulates Neuronal Network Activity through the Activation of Gα and Gβγ Subunits. The Journal of Neuroscience, 39(23), 4422-4433. https://doi.org/10.1523/jneurosci.2092-18.2019
  
  Akter, M., Ma, H., Hasan, M., Karim, A., Zhu, X., Zhang, L., & Li, Y. (2023). Exogenous L-lactate administration in rat hippocampus increases expression of key regulators of mitochondrial biogenesis and antioxidant defense [Original Research]. Frontiers in Molecular Neuroscience, 16. https://doi.org/10.3389/fnmol.2023.1117146
  
  Bozzo, L., Puyal, J., & Chatton, J.-Y. (2013). Lactate Modulates the Activity of Primary Cortical Neurons through a Receptor-Mediated Pathway. PLoS One, 8(8), e71721. https://doi.org/10.1371/journal.pone.0071721
  
  Choi, H. B., Gordon, G. R., Zhou, N., Tai, C., Rungta, R. L., Martinez, J., Milner, T. A., Ryu, J. K., McLarnon, J. G., Tresguerres, M., Levin, L. R., Buck, J., & MacVicar, B. A. (2012). Metabolic communication between astrocytes and neurons via bicarbonate-responsive soluble adenylyl cyclase. Neuron, 75(6), 1094-1104. https://doi.org/10.1016/j.neuron.2012.08.032
  
  Covelo, A., Eraso-Pichot, A., Fernández-Moncada, I., Serrat, R., & Marsicano, G. (2021). CB1R-dependent regulation of astrocyte physiology and astrocyte-neuron interactions. Neuropharmacology, 195, 108678. https://doi.org/10.1016/j.neuropharm.2021.108678
  
  Descalzi, G., Gao, V., Steinman, M. Q., Suzuki, A., & Alberini, C. M. (2019). Lactate from astrocytes fuels learning-induced mRNA translation in excitatory and inhibitory neurons. Communications Biology, 2(1), 247. https://doi.org/10.1038/s42003-019-0495-2
  
  Endo, F., Kasai, A., Soto, J. S., Yu, X., Qu, Z., Hashimoto, H., Gradinaru, V., Kawaguchi, R., & Khakh, B. S. (2022). Molecular basis of astrocyte diversity and morphology across the CNS in health and disease. Science, 378(6619), eadc9020. https://doi.org/10.1126/science.adc9020
  
  Fünfschilling, U., Supplie, L. M., Mahad, D., Boretius, S., Saab, A. S., Edgar, J., Brinkmann, B. G., Kassmann, C. M., Tzvetanova, I. D., Möbius, W., Diaz, F., Meijer, D., Suter, U., Hamprecht, B., Sereda, M. W., Moraes, C. T., Frahm, J., Goebbels, S., & Nave, K.-A. (2012). Glycolytic oligodendrocytes maintain myelin and long-term axonal integrity. Nature, 485(7399), 517-521. https://doi.org/10.1038/nature11007
  
  Harris, R. A., Lone, A., Lim, H., Martinez, F., Frame, A. K., Scholl, T. J., & Cumming, R. C. (2019). Aerobic Glycolysis Is Required for Spatial Memory Acquisition But Not Memory Retrieval in Mice. eNeuro, 6(1). https://doi.org/10.1523/ENEURO.0389-18.2019
  
  Hasan, M., Kanna, M. S., Jun, W., Ramkrishnan, A. S., Iqbal, Z., Lee, Y., & Li, Y. (2019). Schema-like learning and memory consolidation acting through myelination. FASEB J, 33(11), 11758-11775. https://doi.org/10.1096/fj.201900910R
  
  Herrera-López, G., & Galván, E. J. (2018). Modulation of hippocampal excitability via the hydroxycarboxylic acid receptor 1. Hippocampus, 28(8), 557-567. https://doi.org/10.1002/hipo.22958
  
  Horvat, A., Muhič, M., Smolič, T., Begić, E., Zorec, R., Kreft, M., & Vardjan, N. (2021). Ca2+ as the prime trigger of aerobic glycolysis in astrocytes. Cell Calcium, 95, 102368. https://doi.org/10.1016/j.ceca.2021.102368
  
  Horvat, A., Zorec, R., & Vardjan, N. (2021). Lactate as an Astroglial Signal Augmenting Aerobic Glycolysis and Lipid Metabolism [Review]. Frontiers in Physiology, 12. https://doi.org/10.3389/fphys.2021.735532
  
  Ichihara, Y., Doi, T., Ryu, Y., Nagao, M., Sawada, Y., & Ogata, T. (2017). Oligodendrocyte Progenitor Cells Directly Utilize Lactate for Promoting Cell Cycling and Differentiation. J Cell Physiol, 232(5), 986-995. https://doi.org/10.1002/jcp.25690
  
  Iqbal, Z., Liu, S., Lei, Z., Ramkrishnan, A. S., Akter, M., & Li, Y. (2023). Astrocyte L-Lactate Signaling in the ACC Regulates Visceral Pain Aversive Memory in Rats. Cells, 12(1), 26. https://www.mdpi.com/2073-4409/12/1/26
  
  Jourdain, P., Rothenfusser, K., Ben-Adiba, C., Allaman, I., Marquet, P., & Magistretti, P. J. (2018). Dual action of L-Lactate on the activity of NR2B-containing NMDA receptors: from potentiation to neuroprotection. Sci Rep, 8(1), 13472. https://doi.org/10.1038/s41598-018-31534-y
  
  Kofuji, P., & Araque, A. (2021). G-Protein-Coupled Receptors in Astrocyte-Neuron Communication. Neuroscience, 456, 71-84. https://doi.org/10.1016/j.neuroscience.2020.03.025
  
  Lee, Y., Morrison, B. M., Li, Y., Lengacher, S., Farah, M. H., Hoffman, P. N., Liu, Y., Tsingalia, A., Jin, L., Zhang, P. W., Pellerin, L., Magistretti, P. J., & Rothstein, J. D. (2012). Oligodendroglia metabolically support axons and contribute to neurodegeneration. Nature, 487(7408), 443-448. https://doi.org/10.1038/nature11314
  
  Licht-Mayer, S., Campbell, G. R., Canizares, M., Mehta, A. R., Gane, A. B., McGill, K., Ghosh, A., Fullerton, A., Menezes, N., Dean, J., Dunham, J., Al-Azki, S., Pryce, G., Zandee, S., Zhao, C., Kipp, M., Smith, K. J., Baker, D., Altmann, D., Anderton, S. M., Kap, Y. S., Laman, J. D., Hart, B. A. t., Rodriguez, M., Watzlawick, R., Schwab, J. M., Carter, R., Morton, N., Zagnoni, M., Franklin, R. J. M., Mitchell, R., Fleetwood-Walker, S., Lyons, D. A., Chandran, S., Lassmann, H., Trapp, B. D., & Mahad, D. J. (2020). Enhanced axonal response of mitochondria to demyelination offers neuroprotection: implications for multiple sclerosis. Acta Neuropathologica, 140(2), 143-167. https://doi.org/10.1007/s00401-020-02179-x
  
  Liu, S., Wong, H. Y., Xie, L., Iqbal, Z., Lei, Z., Fu, Z., Lam, Y. Y., Ramkrishnan, A. S., & Li, Y. (2022). Astrocytes in CA1 modulate schema establishment in the hippocampal-cortical neuron network. BMC Biol, 20(1), 250. https://doi.org/10.1186/s12915-022-01445-6
  
  Magistretti, P. J., & Allaman, I. (2018). Lactate in the brain: from metabolic end-product to signalling molecule. Nat Rev Neurosci, 19(4), 235-249. https://doi.org/10.1038/nrn.2018.19
  
  Margineanu, M. B., Mahmood, H., Fiumelli, H., & Magistretti, P. J. (2018). L-Lactate Regulates the Expression of Synaptic Plasticity and Neuroprotection Genes in Cortical Neurons: A Transcriptome Analysis. Front Mol Neurosci, 11, 375. https://doi.org/10.3389/fnmol.2018.00375
  
  Netzahualcoyotzi, C., & Pellerin, L. (2020). Neuronal and astroglial monocarboxylate transporters play key but distinct roles in hippocampus-dependent learning and memory formation. Progress in Neurobiology, 194, 101888. https://doi.org/10.1016/j.pneurobio.2020.101888
  
  Newman, L. A., Korol, D. L., & Gold, P. E. (2011). Lactate produced by glycogenolysis in astrocytes regulates memory processing. PLoS One, 6(12), e28427. https://doi.org/10.1371/journal.pone.0028427
  
  Park, J., Kim, J., & Mikami, T. (2021). Exercise-Induced Lactate Release Mediates Mitochondrial Biogenesis in the Hippocampus of Mice via Monocarboxylate Transporters. Front Physiol, 12, 736905. https://doi.org/10.3389/fphys.2021.736905
  
  Peterson, S. M., Pack, T. F., & Caron, M. G. (2015). Receptor, Ligand and Transducer Contributions to Dopamine D2 Receptor Functional Selectivity. PLoS One, 10(10), e0141637. https://doi.org/10.1371/journal.pone.0141637
  
  Rangaraju, V., Lauterbach, M., & Schuman, E. M. (2019). Spatially Stable Mitochondrial Compartments Fuel Local Translation during Plasticity. Cell, 176(1), 73-84.e15. https://doi.org/10.1016/j.cell.2018.12.013
  
  Rinholm, J. E., Hamilton, N. B., Kessaris, N., Richardson, W. D., Bergersen, L. H., & Attwell, D. (2011). Regulation of oligodendrocyte development and myelination by glucose and lactate. J Neurosci, 31(2), 538-548. https://doi.org/10.1523/JNEUROSCI.3516-10.2011
  
  Sánchez-Abarca, L. I., Tabernero, A., & Medina, J. M. (2001). Oligodendrocytes use lactate as a source of energy and as a precursor of lipids. Glia, 36(3), 321-329. https://doi.org/10.1002/glia.1119
  
  Suzuki, A., Stern, S. A., Bozdagi, O., Huntley, G. W., Walker, R. H., Magistretti, P. J., & Alberini, C. M. (2011). Astrocyte-neuron lactate transport is required for long-term memory formation. Cell, 144(5), 810-823.
  
  Tang, F., Lane, S., Korsak, A., Paton, J. F. R., Gourine, A. V., Kasparov, S., & Teschemacher, A. G. (2014). Lactate-mediated glia-neuronal signalling in the mammalian brain. Nature Communications, 5(1), 3284. https://doi.org/10.1038/ncomms4284
  
  Tauffenberger, A., Fiumelli, H., Almustafa, S., & Magistretti, P. J. (2019). Lactate and pyruvate promote oxidative stress resistance through hormetic ROS signaling. Cell Death Dis, 10(9), 653. https://doi.org/10.1038/s41419-019-1877-6
  
  Tse, D., Langston, R. F., Kakeyama, M., Bethus, I., Spooner, P. A., Wood, E. R., Witter, M. P., & Morris, R. G. (2007). Schemas and memory consolidation. Science, 316(5821), 76-82. https://doi.org/10.1126/science.1135935
  
  Tse, D., Takeuchi, T., Kakeyama, M., Kajii, Y., Okuno, H., Tohyama, C., Bito, H., & Morris, R. G. (2011). Schema-dependent gene activation and memory encoding in neocortex. Science, 333(6044), 891-895. https://doi.org/10.1126/science.1205274
  
  Vardjan, N., Chowdhury, H. H., Horvat, A., Velebit, J., Malnar, M., Muhič, M., Kreft, M., Krivec, Š. G., Bobnar, S. T., Miš, K., Pirkmajer, S., Offermanns, S., Henriksen, G., Storm-Mathisen, J., Bergersen, L. H., & Zorec, R. (2018). Enhancement of Astroglial Aerobic Glycolysis by Extracellular Lactate-Mediated Increase in cAMP [Original Research]. Frontiers in Molecular Neuroscience, 11. https://doi.org/10.3389/fnmol.2018.00148
  
  Vezzoli, E., Cali, C., De Roo, M., Ponzoni, L., Sogne, E., Gagnon, N., Francolini, M., Braida, D., Sala, M., Muller, D., Falqui, A., & Magistretti, P. J. (2020). Ultrastructural Evidence for a Role of Astrocytes and Glycogen-Derived Lactate in Learning-Dependent Synaptic Stabilization. Cereb Cortex, 30(4), 2114-2127. https://doi.org/10.1093/cercor/bhz226
  
  Wang, J., Tu, J., Cao, B., Mu, L., Yang, X., Cong, M., Ramkrishnan, A. S., Chan, R. H. M., Wang, L., & Li, Y. (2017). Astrocytic l-Lactate Signaling Facilitates Amygdala-Anterior Cingulate Cortex Synchrony and Decision Making in Rats. Cell Rep, 21(9), 2407-2418. https://doi.org/10.1016/j.celrep.2017.11.012
  
  Yang, J., Ruchti, E., Petit, J. M., Jourdain, P., Grenningloh, G., Allaman, I., & Magistretti, P. J. (2014). Lactate promotes plasticity gene expression by potentiating NMDA signaling in neurons. Proc Natl Acad Sci U S A, 111(33), 12228-12233. https://doi.org/10.1073/pnas.1322912111
  
  Yao, S., Xu, M.-D., Wang, Y., Zhao, S.-T., Wang, J., Chen, G.-F., Chen, W.-B., Liu, J., Huang, G.-B., Sun, W.-J., Zhang, Y.-Y., Hou, H.-L., Li, L., & Sun, X.-D. (2023). Astrocytic lactate dehydrogenase A regulates neuronal excitability and depressive-like behaviors through lactate homeostasis in mice. Nature Communications, 14(1), 729. https://doi.org/10.1038/s41467-023-36209-5
  
  Yu, X., Zhang, R., Wei, C., Gao, Y., Yu, Y., Wang, L., Jiang, J., Zhang, X., Li, J., & Chen, X. (2021). MCT2 overexpression promotes recovery of cognitive function by increasing mitochondrial biogenesis in a rat model of stroke. Anim Cells Syst (Seoul), 25(2), 93-101. https://doi.org/10.1080/19768354.2021.1915379
  
  Zhou, Z., Okamoto, K., Onodera, J., Hiragi, T., Andoh, M., Ikawa, M., Tanaka, K. F., Ikegaya, Y., & Koyama, R. (2021). Astrocytic cAMP modulates memory via synaptic plasticity. Proc Natl Acad Sci U S A, 118(3), e2016584118. https://doi.org/10.1073/pnas.2016584118
  
  Zhu, J., Hu, Z., Han, X., Wang, D., Jiang, Q., Ding, J., Xiao, M., Wang, C., Lu, M., & Hu, G. (2018). Dopamine D2 receptor restricts astrocytic NLRP3 inflammasome activation via enhancing the interaction of β-arrestin2 and NLRP3. Cell Death Differ, 25(11), 2037-2049. https://doi.org/10.1038/s41418-018-0127-2
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.01.08.523156v1
www.biorxiv.org www.biorxiv.org

Oligomerization of the Human Adenosine A2A Receptor is Driven by the Intrinsically Disordered C-terminus

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  The physical principles underlying oligomerization of GPCRs are not well understood. Here, authors focused on oligomerization of A2AR. They found that oligomerization of A2AR is mediated by the intrinsically disordered, extramembraneous C-terminal tail. Using experiment and MD simulation, they mapped the regions that are responsible for oligomerization and dissected the driving forces in oligomerization.
  
  This is a nice piece of work that applies fundamental physical principles to the understanding of an important biological problem. It is a significant finding that oligomerization of A2AR is mediated by multiple weak interactions that are "tunable" by environmental factors. It is also interesting that solute-induced, solvent-mediated "depletion interactions" can be a key driving force in membrane protein-protein interactions.
  
  Although this work is potentially a significant contribution to the fields of GPCRs and molecular biophysics of membrane proteins in general, there are several concerns that would need to be addressed to strengthen the conclusions.
  
  1) How reasonably would the results obtained in the micellar environment be translated into the phenomenon in the cell membranes?
  
  1a) Here authors measured oligomerization of A2AR in detergent micelles, not in the bilayer or cellular context. Although the cell membranes would provide another layer of complexity, the hydrophobic properties and electrostatics of the negatively charged membrane surface may cooperate or compete with the interactions mediated by the C-terminal tail, especially if the oligomerization is mediated by multiple weak interactions.
  
  The translatability of properties of membrane proteins in detergent micelles to the cellular context is a valid concern. However, this shortcoming applies to all biophysical studies of membrane proteins in non-native environments. Even for membrane proteins reconstituted in liposomes, the question arises whether an artificial lipid composition that differs from that of the human plasma membrane would alter protein properties, especially as surface charges and cholesterol content can impact membrane protein dynamics, association, and stability. In that sense, this question cannot be answered satisfactorily, especially for GPCRs, which are notoriously difficult to isolate. However, we can offer some perspectives. The propensity for membrane proteins to associate and oligomerize, if anything, is greater in lipid bilayers than in detergent micelles, while detergent micelles can effectively solubilize membrane protein monomers (Popot and Engelman, Biochem 1990, 29 (17), 4031–4037). Hence, the findings that A2AR readily oligomerizes in detergent micelles and that the degree of oligomerization can be systematically tuned by the C-terminal length of A2AR in the same micellar system suggest that inter-A2AR interactions are modulating receptor oligomerization; we speculate that A2AR oligomers will be present or be enhanced in the lipid bilayer environment. In fact, in the cellular context, it has been shown that A2AR assembles into homodimers at the cell surface in transfected HEK293 cells (Canals et al, J Neurochem 2004, 88, 726–734) and into higher-order oligomers at the plasma membrane in Cath.A differentiated neuronal cells (Vidi et al, FEBS Lett 2008, 582, 3985–3990). Furthermore, C-terminally truncated A2AR has been demonstrated to show no protein aggregation or clustering on the cell surface, a process otherwise observed in the WT form (Burgueno et al, J Biol Chem 2003, 278 (39), 37545–37552). These results provide the research community with a valid starting point to discover factors that control oligomerization of A2AR in the cellular context.
  
  1b) Related to the point above (1a), I wonder if MD simulation could provide an insight into the role of the lipid bilayer in the inter- or intra-molecular interactions involving the tail. Although the neutral POPC bilayer was employed in the simulation, the tail-membrane interaction may affect oligomerization since the tail is intrinsically disordered and possesses a significant portion of nonpolar residues (Fig. S4).
  
  The reviewer brings up a valid point about the ability of MD simulations to provide insights into the role of membrane-protein interactions. In response to the reviewer, we performed additional analysis focusing on the interactions of the C-terminus with the lipid bilayer. Overall, as the C-terminus is extended, there is a decrease in its interaction with the cytoplasmic leaflet of the membrane (left figure below). More specifically, we find that the C-terminal segment associated with helix 8 (residues 291 to 314) interacts tightly with the membrane, while the rest of the C-terminus (an intrinsically disordered segment) interacts more weakly with the membrane, regardless of truncation (right figure below). As the C-terminus is extended, the inherent conformational flexibility leads to a decrease in the interactions between the protein and the bilayer. We also observe that shorter stretches of the disordered segment do have the ability to interact more closely with the membrane. While these portions include charged residues that can participate in formation of the dimer interface, no general trends are observed. We therefore cannot draw any conclusions regarding the role of C-terminal-membrane interactions in the dimerization of A2AR. What we do know is that the MD simulations presented here should be considered a model study that reveals that the charged and disordered C-terminus of A2AR can account for oligomerization via multiple weak inter-protomer contacts.
  
  MD simulations showing (left) average distance of all C-terminal residues and (right) average per-residue distance from the cytoplasmic membrane of the lipid bilayer.
  
  2) Ensuring that the oligomer distributions are thermodynamic products.
  
  Since authors interpret the SEC results on the basis of thermodynamic concepts (driving forces, depletion interactions, etc.), it would be important to verify that the distribution of different oligomeric states is the outcome of the thermodynamic control. There is a possibility that the distribution is the outcome of the "kinetic trapping" during detergent solubilization.
  
  This is an important question. As we have shown in the manuscript, the A2AR dimer level was found to be reduced in the presence of TCEP (Figure 2B), suggesting that disulfide linkages have a role in facilitating A2AR oligomerization. However, the disulfide cross-linking reaction cannot be the sole driving force of A2AR oligomerization because (1) a significant population of A2AR dimer remained resistant to TCEP (Figure 2B), (2) A2AR oligomer levels decreased progressively with the shortening of the C-terminus (Figure 3), and (3) A2AR oligomerization is driven by depletion interactions enhanced with increasing ionic strength (Figure 5).
  
  To answer whether A2AR oligomer is a thermodynamic or kinetic product, we tested the stability and reversibility of the A2AR monomer and dimer/oligomer population. We used SEC to separate these populations of both the A2AR-WT and A2AR-Q372ΔC variants, then performed a second round of SEC to observe their repopulation, if any. The results are summarized in the figure below, which we will include in the revised manuscript as Figure 5-figure supplement 1.
  
  We find that the SEC-separated monomers repopulate measurably into dimer/oligomer, with the total oligomer level after redistribution comparable with that of the initial samples for both A2AR-WT (initial: 2.87; redistributed: 1.60) and A2AR-Q372ΔC (initial: 1.49; redistributed: 1.40) (Figure 5-figure supplement 1A). This observation indicates that the A2AR oligomer is a thermodynamic product with a lower free energy than that of the monomer. This is consistent with the results we have shown in the manuscript that the oligomer levels of A2AR-WT are consistent (1.34–2.87; Table S1) and that A2AR oligomerization can be modulated by ionic strength via depletion interactions (Figure 5).
  
  Figure S5. The dimer/oligomerization of A2AR is a thermodynamic process where the dimer and HMW oligomer once formed are kinetically trapped. (A) SEC chromatograms of the consecutive rounds of SEC performed on A2AR-WT and Q372ΔC. The first rounds of SEC are to separate the dimer/oligomer population and the monomer population, while the second rounds of SEC are performed on these SEC-separated populations to assess their stability and reversibility. The total oligomer level is expressed relative to the monomeric population in arbitrary units. (B) Energy diagram depicting A2AR oligomerization progress. The monomer needs to overcome an activation barrier (EA), driven by depletion interactions, to form the dimer/oligomer. Once formed, the dimer/oligomer populations are kinetically trapped by disulfide linkages.
  
  Interestingly, the SEC-separated dimer/oligomer populations do not repopulate to form monomers (Figure 5-figure supplement 1). This observation is, again, consistent with a published study of ours on A2AR dimers (Schonenbach et al, FEBS Lett 2016, 590, 3295–3306). This observation furthermore indicates that once the oligomers are formed, some are kinetically trapped and thus cannot redistribute into monomers.
  
  We believe that disulfide linkages are likely candidates that kinetically stabilize A2AR oligomers, as demonstrated by their redistribution into monomers only in the presence of a reducing agent (Figure 2B). Taken together, we suggest that A2AR oligomerization is a thermodynamic process (Figure 5-figure supplement 1B), with the monomer overcoming the activation energy (EA) by depletion interactions to repopulate into dimer/oligomer with a slightly lower free energy (given that we see a distribution between the two). Once formed, the redistributed dimer/oligomer populations can be kinetically stabilized by disulfide linkages.
  
  3) The claim that the C-terminal tail is engaged in "cooperative" interactions is too qualitative (p. 11 line 274, p.12 line 279 and p.18 line 426).
  
  This claim seems derived from Fig. 3b and Figs. 4b-c. However, the gradual decrease in the dimer level and the number of interactions may indicate that different parts in the C-terminal tail contribute to dimerization additively rather than cooperatively. The large decrease in the number of interactions may stem from the large decrease in the length (395 to 354). Probably, a more quantitative measure would be the number of interactions (H-bonds/salt bridges) normalized to the tail length upon successive truncation. Even in that case, the polar/charged residues would not be uniformly distributed along the primary sequence, making the quantitative argument of cooperativity challenging.
  
  The request to clarify our basis to refer to a cooperative interaction is well taken. Figure 4B and 4C show that the truncation of one part of the C-terminus (segment 335–394) leads to a reduction in contacts of a different part (segment 291–334) of A2AR. Therefore, we conclude that the binding interactions that occur in segment 291–334 are altered by the interactions exerted by the segment 335–394. This characteristic is consistent with allosteric interactions. We believe that characterizing these interactions as “cooperative” is possible but is not fully justified in this work. We also agree with the comment that quantifying the role and segments involved in contacts would be challenging. The manuscript has been amended to use the term “allosteric” in place of “cooperative”.
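
  The length normalization suggested by the reviewer can be made concrete with a short Python sketch. It is an illustration only, not the analysis used in the manuscript: the function name, the convention that the C-terminus begins at residue 291, and the contact counts are assumptions introduced here, and the counts are placeholder values rather than data from Figure 4.

```python
def contacts_per_tail_residue(n_contacts, last_residue, tail_start=291):
    """Return the number of contacts per C-terminal residue.

    n_contacts   -- H-bonds/salt bridges counted for one construct
    last_residue -- final residue of the (truncated) construct, e.g. 354 or 394
    tail_start   -- first residue counted as C-terminus (assumed here: 291)
    """
    tail_length = last_residue - tail_start + 1
    if tail_length <= 0:
        raise ValueError("construct ends before the C-terminus begins")
    return n_contacts / tail_length


# Placeholder contact counts keyed by last residue (hypothetical values).
example_counts = {334: 4, 354: 8, 394: 26}
normalized = {
    end: contacts_per_tail_residue(n, end) for end, n in example_counts.items()
}
```

  A roughly constant normalized value across truncations would be in line with additive contributions, whereas a value that rises with tail length would be more in line with coupling between segments. As the reviewer notes, the uneven distribution of polar and charged residues along the sequence limits how far such a reading can be pushed.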
  
  4) On the compactness and conformation of the C-terminal tail:
  
  Although the C-terminal tail is known as "intrinsically disordered", the results seem to indicate that its conformation is rather compact (or collapsed) with a number of intra- and intermolecular polar interactions (Fig. 4) and buried nonpolar residues (Fig. 6), which are subject to depletion interactions (Fig. 5). This raises the question of whether the tail is indeed "intrinsically disordered" as is known. Recent folding studies on IDPs (Riback et al. Science 2017, 358, 238-; Best, Curr Opin Struct Biol 2020, 60, 27-) suggest that IDPs are partially expanded or expanded rather than collapsed.
  
  We agree that our results seem to suggest that the conformation of the C-terminus could be partially compact. However, by stating that the C-terminus on average is an intrinsically disordered region (IDR), we do not exclude the possibility of partially structured regions, or greater compactness than that of an excluded volume polymer. IDR or IDP should refer to all proteins or protein regions that do not adopt a unique structure. By that standard, we know that the C-terminus of A2AR falls into that category according to our experiments and MD simulation, as well as the literature. In isolation, the majority of the C-terminus is indeed an IDR, as has been demonstrated not only by simulations but also by experimental data. In fact, the C-terminus exhibits partial alpha-helical structure, and transiently populates beta-sheet conformations, depending on its state and buffer conditions (Piirainen et al, Biophys J 2015, 108 (4), 903–917). The literature studies suggest that A2AR’s C-terminus may adopt a greater level of compactness when interactions are formed between the C-terminus and the rest of the A2AR oligomer.
  
  Reviewer #2 (Public Review):
  
  The authors expressed A2A receptor as wild type and modified with truncations/mutations at the C-terminus. The receptor was solubilized in detergent solution, purified via a C-terminal deca-His tag and the fraction of ligand binding-competent receptor separated by an affinity column. Receptor oligomerization was studied by size exclusion chromatography on the purified receptor solubilized in a DDM/CHAPS/CHS detergent solution. It was observed that truncation greatly reduces the tendency of A2A to form dimers and oligomers. Mechanistic insights into interactions that facilitate oligomerization were obtained by molecular simulations and the study of aggregation behavior of peptide sequences representing the C-terminus of A2A. It is concluded that a multitude of interactions, including disulfide linkages, hydrogen bonds, and electrostatic and depletion interactions, contribute to aggregation of the receptor.
  
  The general conclusions appear to be correct and the paper is well written. This is a study of protein association in detergent solution. It is conceivable that observations are relevant for A2A receptors in cell membranes as well. However, extrapolation of mechanisms observed on receptor in detergent micelles to receptor in membranes should proceed with caution. In particular, the spatial arrangement of oligomerized receptor molecules in micelles may differ from arrangement in lipid bilayers. The lipid matrix may have a profound influence on oligomerization.
  
  The ultimate question to answer is how oligomerization alters receptor function. This will have to be addressed in a future study.
  
  We could not agree more. We address the concern regarding the translatability of properties of membrane proteins in detergent micelles to the cellular context in our response to Reviewer 1. In short, we believe the general propensity for A2AR to form dimers/oligomers and the role of the C-terminus will hold in the cellular context. However, even if it does not, given that biophysical structure-function studies of GPCRs are conducted in detergent micelles and other artificial environments, it is critical to understand the role of the C-terminus in the oligomerization of reconstituted A2AR in detergent micelles. How oligomerization alters receptor function is a question that is always on our mind and should be the focus of future studies. Indeed, it has been demonstrated that truncation of the A2AR C-terminus significantly reduces receptor association with Gαs and cAMP production in cellular assays (Koretz et al, Biophys J 2021, https://doi.org/10.1016/j.bpj.2021.02.032). The results presented in this manuscript, which have demonstrated the impact of C-terminal truncation on A2AR oligomerization, will offer critical understanding for such study of the functional consequences of A2AR oligomerization.
  
  Reviewer #3 (Public Review):
  
  The work of Nguyen et al. demonstrates the relevant role of the C-terminus of A2AR for its homo-oligomerization. A previous work (Schonenbach et al. 2016) found that a point mutation of C394 in the C-terminus (C394S) reduces homo-oligomerization. Following this direction, more mutants were generated, the C-terminus was also truncated at different levels, and, using size-exclusion chromatography (SEC), the oligomerization levels of A2AR variants were assessed. Overall, these experiments support the role of the C-terminus in the oligomerization process. MD studies were performed and the non-covalent interactions were monitored. To 'identify the types of non-covalent interaction(s)', A2AR variants were also analysed modulating the ionic strength from 0.15 to 0.95 M. The C-terminus peptides were investigated to assess their interaction in absence of the TM domain.
  
  The SEC results on the A2AR variants strongly support the main conclusion of the paper, but some passages and methodologies are less convincing. The different results obtained for dimerization and oligomerization are insufficiently discussed. The MD simulations are performed on models that are not accurately described - structural information currently available may compromise the quality of the model and the validity of the results (i.e., applying MD simulations to low-resolution models may not be appropriate for the goal of this analysis; moreover, the formation of disulfide bonds cannot be simulated, but this can affect the conformation and consequently the interactions to be monitored). Although the C-terminus is suggested as 'a driving factor for the oligomerization', the TM domain is indeed involved in the process, and whether and how it will be affected by modulating the solvent ionic strength should be discussed.
  
  We thank the reviewer for the overall positive assessment and critical input. We respond to the comments as follows.
  
  The qualitative trend for dimerization is consistent with that for oligomerization, as demonstrated in Figs. 2A, 3B, and 5. For example, a reduction in both dimerization and oligomerization was observed upon C394X mutations (Figure 2A), as well as upon systematic truncations (Figure 3B), while very similar trends were seen for the change in the dimer and oligomer levels of all four constructs upon variation of ionic strength (Figure 5).
  
  We agree that the experimental observations and MD simulations only incompletely describe the state of the A2AR dimer/oligomer. For example, we discover the impact of ERR:AAA mutations of the C-terminus (Figure 3C) on oligomer formation, but do not know whether this segment interacts with the TM domain or C-terminus of the neighboring A2AR. MD simulations suggest that the inter-protomer interface certainly involves inter-C-termini contact. We also mention that the A2AR oligomeric interfaces could be asymmetric, suggesting that the C-terminus can interact with other parts of the receptor, including the TM domain. However, we do not have evidence that the TM domains directly interact with each other to stabilize A2AR oligomers, and thus cannot discuss the effect of the solvent ionic strength on how the TM domain contributes to A2AR oligomerization. We minimize such discussion in our manuscript because we have incomplete insights. What we can say is that the multiple weak inter-protomer interactions that contribute to the dimer and oligomer interface formation prominently involve the C-terminus. Ultimately, the structure of the A2AR dimer/oligomer needs to be solved to answer the reviewer’s question fully.
  
  With respect to the validity of our model, we restricted ourselves to using the best-available X-ray crystal structure for A2AR. Since this structure (PDB 5G53) does not include the entire C-terminus, we resorted to using homology modeling software (i.e., MODELLER) to predict the structures of the C-terminus. In our model, the first segment of the C-terminus, consisting of residues 291 to 314, was modeled as a helical segment parallel to the cytoplasmic membrane surface, while the rest of the C-terminus was modeled as intrinsically disordered. MODELLER is much more accurate in structural predictions for segments of fewer than 20 residues. This limitation necessitated that we run an equilibrium MD simulation for 2 µs to obtain a well-equilibrated structure that possesses a more viable starting conformation. We have included this detailed description of our model in lines 641–650. To validate our models of all potential variants of A2AR, we calculated the RMSD and RMSF for each truncated variant. Our results clearly show that the transmembrane helical bundle is very stable, as expected, and that the C-terminus is more flexible (see figure below). This flexibility is somewhat consistent for lengths up to 359 residues, with a more noticeable increase in flexibility for the 394-residue variant of A2AR.
  
  Root mean square fluctuation (RMSF) from sample trajectories of truncated variants modeled from the crystal structure of the adenosine A2AR bound to an engineered G protein (PDB ID 5G53), and the root mean square deviation (RMSD) of the C-terminus of each variant starting from residue 291.
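
  For readers less familiar with the two metrics, the following NumPy sketch shows how RMSD and RMSF are conventionally computed from a trajectory. It assumes that the frames have already been superimposed on a reference and are stored as an array of shape (frames, atoms, 3). The array layout and function names are assumptions made here for illustration, and this is not the authors' analysis script.

```python
import numpy as np


def rmsd_per_frame(traj, ref):
    """RMSD of every frame relative to a reference structure.

    traj -- array of shape (n_frames, n_atoms, 3), pre-aligned to `ref`
    ref  -- array of shape (n_atoms, 3)
    Returns an array of shape (n_frames,).
    """
    sq_disp = ((traj - ref) ** 2).sum(axis=2)   # squared displacement per atom
    return np.sqrt(sq_disp.mean(axis=1))        # average over atoms


def rmsf_per_atom(traj):
    """RMSF of every atom about its time-averaged position.

    traj -- array of shape (n_frames, n_atoms, 3), pre-aligned
    Returns an array of shape (n_atoms,).
    """
    mean_pos = traj.mean(axis=0)                     # average structure
    sq_disp = ((traj - mean_pos) ** 2).sum(axis=2)   # (n_frames, n_atoms)
    return np.sqrt(sq_disp.mean(axis=0))             # average over frames


# Toy trajectory: two frames, one atom that moves 2 units along x.
toy = np.array([[[0.0, 0.0, 0.0]], [[2.0, 0.0, 0.0]]])
toy_rmsd = rmsd_per_frame(toy, toy[0])
toy_rmsf = rmsf_per_atom(toy)
```

  RMSD tracks how far the whole selection drifts from the starting model over time, while RMSF reports which residues fluctuate most, which is why the two are shown together when comparing the helical bundle with the C-terminus.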
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.12.21.423144v1
www.biorxiv.org www.biorxiv.org

m6A modifications regulate intestinal immunity and rotavirus infection

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  Wang et al. investigated the role of RNA m6A modification in intestinal epithelial cells (IECs) in the context of rotavirus infection. The authors found that mice that specifically lack METTL3 in IECs show resistance to rotavirus infection. They attributed this effect to increased IFN and ISG expression, presumably via IRF7 upregulation. Further genetic IRF7 ablation in IECs led to sensitivity to rotavirus infection. They also found that ALKBH5 is suppressed by a rotaviral protein, although the knockout of ALKBH5 in IECs did not influence viral infection.
  
  Overall, although the resistance of IEC-specific METTL3-deficient mice upon rotavirus infection via the control of IRF7 is a novel and interesting finding, the proposed model is not fully supported by the findings here. Especially, the following points need to be addressed:
  
  We are grateful to the reviewer for the complimentary summary of our research. We also appreciate the valuable experiments suggested by the reviewer to improve our manuscript. We have added additional important controls and mechanistic data to further support our conclusions.
  
  1) The m6A dot blot used in Figure 1 is not a good measurement system of total m6A modification levels, because the antibody used here also detects another RNA modification, m6Am (PMID: 31676230). Therefore, it is unclear if the increase of m6A dot blot intensity is due to the increase of m6A in RNAs mediated by METTL3 in IECs. The authors should investigate the m6A levels in IECs, not BMDMs, under METTL3 deficiency. Ideally, this analysis should be done using mass spectrometry.
  
  We thank the reviewer for raising a critical point. We have tried several methods to avoid the potential non-specific detection by the antibody we previously used (Synaptic Systems, #202003), which was reported to detect m6Am as well.
  
  1. We have included dot blot data for m6A modification in Mettl3^ΔIEC and WT IECs during RV infection, obtained with another m6A antibody (Anti-N6-methyladenosine (m6A), Sigma-Aldrich, Cat. No. ABE572-I) (see below and also Fig. 1d, 1e).
  
  2. We have included mass spectrometry data for m6A modification in IECs during development (see below and also Fig. 1c) or RV infection (see below and also Fig. s3a).
  
  These data suggest that m6A modifications in IECs are indeed regulated during development and RV infection. We have included the descriptions in the text.
  
  Figure 1. Rotavirus infection increases global m6A modifications, and Mettl3 deficiency in intestinal epithelial cells results in increased resistance to rotavirus infection. (c) MS analysis of m6A level in ileum tissue from mice of different ages (mean ± SEM). Statistical significance was determined by Student’s t-test (*P < 0.05, NS., not significant). (d) WT and Mettl3^ΔIEC mice were infected by rotavirus EW strain at 8 days post birth. m6A dot blot analysis of total RNA in ileum IEC at 2 dpi. Methylene blue (MB) staining was the loading control. (e) Quantitative analysis of (d) (mean ± SEM). Statistical significance was determined by Student’s t-test (*P < 0.05, ***P < 0.001, NS., not significant). The quantitative m6A signals were normalized to quantitative MB staining signals.
  
  Figure s3. MS analysis of total m6A level in mouse ileum. (a) WT and Mettl3^ΔIEC mice were infected by rotavirus EW strain at 8 days post birth. MS analysis of m6A level in ileum tissue from mice at 2 dpi (mean ± SEM). Statistical significance was determined by Student’s t-test (**P < 0.005).
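
  The quantification described in the Figure 1 caption, with m6A signals normalized to methylene blue signals and groups compared by Student’s t-test, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the function names and the intensity values are hypothetical, the equal-variance (pooled) form of the test is assumed, and only the t statistic and degrees of freedom are computed, not a P value.

```python
from math import sqrt
from statistics import mean, variance


def normalize_to_loading(m6a_signal, mb_signal):
    """Divide each m6A dot intensity by its methylene blue loading control."""
    if len(m6a_signal) != len(mb_signal):
        raise ValueError("each m6A dot needs a matching MB measurement")
    return [s / l for s, l in zip(m6a_signal, mb_signal)]


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical dot intensities (arbitrary units), three replicates per group.
wt = normalize_to_loading([2.0, 4.0, 6.0], [2.0, 2.0, 2.0])   # -> 1.0, 2.0, 3.0
ko = normalize_to_loading([1.0, 1.5, 2.0], [0.5, 0.5, 0.5])   # -> 2.0, 3.0, 4.0
t_stat, dof = student_t(wt, ko)
```

  Normalizing before testing means that differences in the amount of RNA spotted on the membrane do not register as differences in m6A level.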
  
  2) The authors show that Alkbh5 expression is increased when the mice grow up to 3 weeks old. However, the Alkbh5 protein expression changes are missing.
  
  We thank the reviewer for raising this point. We have included the protein expression of ALKBH5 in the intestine during development (see below and Fig. s1). The ALKBH5 protein levels increase in the intestine with age (Fig. s1a, s1b), which is consistent with the changes in ALKBH5 mRNA levels during development (Fig. 1d).
  
  Figure s1. ALKBH5 regulates total m6A level in intestine. (a) Immunoblotting with antibodies targeting ALKBH5 and TUBULIN in ileum tissues from mice of different ages. (b) Quantitative analysis of (a) (mean ± SEM). Statistical significance was determined by Student’s t-test (*P < 0.05, NS., not significant).
  
  3) The authors claim that the decline in m6A from 2 to 2 weeks post birth is caused by increased Alkbh5 (Line 110). However, it is not clear whether the subtle increase in Alkbh5 mRNA leads to the change in global m6A levels. The authors can use ALKBH5-deficient mouse cells to confirm this point.
  
  We thank the reviewer for raising this important point. We have included ALKBH5 over-expression and knock-down data from the mouse IEC cell line MODE-K to test whether regulation of Alkbh5 mRNA in IECs leads to changes in global m6A levels.
  
  Over-expression of ALKBH5 in MODE-K cells largely reduced the global m6A level (see below and Fig. s1d). CRISPR-mediated knock down of ALKBH5 in MODE-K cells augmented the global m6A level, while knock down of another m6A eraser, FTO, in MODE-K cells did not affect the global m6A level (see below and Fig. s10b).
  
  Figure s1. ALKBH5 regulates total m6A level in intestine. (d) Immunoblotting with antibodies targeting ALKBH5 and TUBULIN in MODE-K cells transfected with pSIN-EV or pSIN-mAlkbh5-3xFlag for 24h. m6A dot blot analysis of total RNA in indicated samples. Methylene blue (MB) staining was the loading control.
  
  Figure s10. Alkbh5 is the dominant m6A eraser in intestine. (b) m6A dot blot analysis of total RNA in different MODE-K cells. Methylene blue (MB) staining was the loading control.
  
  4) The authors should describe the overall phenotype of IEC-specific METTL3-deficient mice at the steady state. It is important to clarify if the augmented expression of ISG upon METTL3 deficiency is dependent on rotavirus infection. Also, the authors should describe any detectable abnormalities or changes without stimulation.
  
  We collaborated with another group and found a defect in intestinal stem cells in IEC-specific METTL3-deficient mice. However, because RV normally infects IECs in the villi but not in the crypt, and stem cells are not the major producers of IFN/ISGs (Sue E. Crawford et al. Nature reviews disease primers, 2017), the defect in intestinal stem cells is unlikely to affect the RV infection phenotype. As this is a separate story that is under review, we prefer not to include this part of the data in our manuscript. Moreover, we have crossed Irf7^−/− mice to Mettl3^ΔIEC mice and verified that Irf7-mediated induction of ISGs is critical for the anti-viral phenotype in Mettl3^ΔIEC mice.
  
  Our bulk RNA-seq data in IECs showed augmented expression of ISGs upon METTL3 deficiency in steady state (Fig. 2a). We also found augmented ISG expression in the intestine of METTL3-deficient mice in steady state or early RV infection (2d) by qPCR. However, because the RV loads in METTL3-deficient mice during the late infection stage are significantly lower than in WT mice, inducible ISG expression is consequently lower in the intestine of METTL3-deficient mice than in WT mice at day 4 post infection (Fig. 3f).
  
  5) The finding that IRF7 is targeted by METTL3 is not convincing. First, the authors performed MeRIP-seq and -qPCR experiments only using RNAs from wild-type IECs, not from METTL3-deficient cells. It is necessary to show that the modification levels on IRF7 mRNA are indeed reduced upon METTL3 deficiency. Second, it is unclear if MeRIP-seq is properly performed or not, because there is no quality checking figure shown. For instance, the authors can generate metagene plots or gene logos of m6A modified sites to see if there is any consistency with previous reports. Third, in Figure 2h, the authors should show that the change in luciferase activity between wild-type and mutant Irf7-3'UTR reporters is dependent on METTL3 activity by performing METTL3 knockdown or knockout. Also, the authors should describe how they mutagenized the sequences for clarification. Fourth, in Figures 2F and 3C, they showed that IRF7 is upregulated in METTL3-deficient IECs, while in Figure 3F, IRF7 is conversely downregulated in METTL3-deficient IECs. These results are apparently contradictory.
  
  We appreciate the valuable suggestion provided by the reviewer to improve our manuscript.
  
  We have performed RIP-qPCR in Mettl3 knock-down and WT MODE-K cells to verify the m6A modification on IRF7 mRNA. The modification levels on IRF7 mRNA are indeed reduced upon METTL3 deficiency (see below and Fig. s5c, s5d). We have added the description of the experiment to the manuscript.
  
  Figure s5. Characterization of m6A modifications on Irf7 mRNA. (c) m6A-RIP-qPCR confirms Irf7 as an m6A-modified gene in IECs. Fragmented RNA of sgEV and sgMettl3 MODE-K cells was incubated with an anti-m6A antibody (Sigma Aldrich ABE572-I). The eluted RNA and input were processed as described in the ‘RT-qPCR’ section, and the data were normalized to the input samples (n=3, mean ± SEM). Statistical significance was determined by Student’s t-test (*P < 0.05, **P < 0.005, NS., not significant). Tlr3 and Rps14 were measured with m6A site-specific qPCR primers as positive and negative controls, respectively; Irf7 was measured with qPCR primers specific to predicted m6A sites. (d) Knock down efficiency of METTL3 in MODE-K cells.
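
  The caption above states that RIP-qPCR data were normalized to the input samples. One common way to do this is the "percent input" calculation, sketched below. This is an illustration only: the function name, the Ct values, the 10% input fraction, and the assumption of perfect (2-fold per cycle) amplification are ours and are not taken from the manuscript.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Return IP signal as a percentage of total starting RNA.

    ct_ip          -- Ct of the immunoprecipitated (eluted) sample
    ct_input       -- Ct of the saved input aliquot
    input_fraction -- fraction of starting material kept as input
                      (hypothetical, e.g. 0.1 for a 10% input)
    Assumes 100% PCR efficiency, i.e. signal doubles each cycle.
    """
    # Shift the input Ct to what it would be for 100% of the material.
    ct_input_adjusted = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (ct_input_adjusted - ct_ip)


# Hypothetical Ct values, for illustration only.
example = percent_input(ct_ip=27.0, ct_input=25.0, input_fraction=0.1)
```

  With these made-up numbers the IP sample amplifies two cycles later than a 10% input, so it corresponds to a quarter of that input, or 2.5% of the starting material.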
  
  We have generated metagene plots as suggested. As shown in Figure s5b, the m6A peak is enriched near the stop codon and 3’UTR region, which is consistent with previous studies (Xuan et al. 2018; Dominissini et al., 2012; Yang et al., 2019). We have added the description to the manuscript.
  
  Figure s5. Characterization of m6A modifications on Irf7 mRNA. (b) Metagene plots of m6A modified sites.
  
  We have performed the luciferase assay in WT and METTL3 knockdown 293T cells and found that the increased luciferase activity of the mutant Irf7-3'UTR reporter is dependent on METTL3 activity (see below and Fig. 2h, s5e). We have added the description of the experiment to the manuscript.
  
  Figure 2. Mettl3 deficiency in intestinal epithelial cells results in decreased m6A deposition on Irf7, and increased interferon responses. (h) Relative luciferase activity of sgEV and sgMettl3 HEK293T cells transfected with pmirGLO-Irf7-3’UTR (Irf7-WT) or pmirGLO-Irf7-3’UTR containing mutated m6A modification sites (Irf7-MUT). The firefly luciferase activity was normalized to Renilla luciferase activity (n=3, mean ± SEM). Statistical significance was determined by Student’s t-tests between genotypes (*P < 0.05, NS., not significant).
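
  The quantification described in this caption (firefly signal normalized to Renilla, summarized as mean ± SEM, compared by Student’s t-test) can be sketched as follows. This is a minimal illustration with hypothetical function names; it computes only the pooled-variance t statistic, not a P value, and is not the authors’ analysis code.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(firefly, renilla):
    """Divide each firefly reading by its paired Renilla reading."""
    return [f / r for f, r in zip(firefly, renilla)]


def sem(values):
    """Standard error of the mean (sample SD over sqrt(n))."""
    return stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Two-sample Student's t statistic, assuming equal variances."""
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    # Pooled variance weights each group by its degrees of freedom.
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1.0 / na + 1.0 / nb))
```

  The t statistic would then be compared against a t distribution with na + nb − 2 degrees of freedom (4 for two groups of n=3) to obtain the significance level.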
  
  Figure s5. Characterization of m6A modifications on Irf7 mRNA. (e) Knock down efficiency of METTL3 in 293t cells used for luciferase assay.
  
  IRF7 is an ISG. The expression of IRF7 is controlled by both PAMP (such as virus component)-induced transcription and post-transcriptional regulation such as m6A modification-mediated mRNA decay. In steady state or at the early stage (2d) of rotavirus infection, there is no virus or the viral loads are comparable in Mettl3^△IEC and WT mice; thus, IRF7 expression is mainly regulated by m6A and is higher in IECs from Mettl3^△IEC mice than in those from WT mice. However, because the RV loads in Mettl3^△IEC mice during the late infection stage are significantly lower than in WT mice, IRF7 expression at that stage is mainly regulated by viral PAMPs, and inducible IRF7 expression is consequently lower in the intestine of Mettl3^△IEC mice than in WT mice at day 4 post infection (Fig. 3f).
  
  6) It is unclear if the augmented expression of IRF7 per se upregulates IFN and ISG expression. Since IRF7 exerts its transcriptional activity upon phosphorylation, the authors should examine IRF7 phosphorylation and total protein levels in METTL3-deficient IECs. Also, it is interesting to see if the phosphorylation of TBK1 is augmented or not.
  
  We have provided the phosphorylation and total protein levels of IRF7 and TBK1 in MODE-K cells treated with poly I:C. Both total IRF7 and phosphorylated IRF7 are upregulated in Mettl3-knock down cells compared with control cells (see below and Fig. s5f). However, both total TBK1 and phosphorylated TBK1 remain unchanged (Fig. s5f), suggesting that the augmented ISGs are unlikely to be due to activation of the upstream IFN signal.
  
  Figure s5. Characterization of m6A modifications on Irf7 mRNA. (f) Western blot analysis of sgEV and sgMettl3 MODE-K cells transfected by lipo3000 with 2 μg/ml poly I:C at indicated hours post transfection. At least three replicate experiments were performed.
  
  7) In Figure 3, the authors utilized METTL3 and IRF7 deficient mice to show the contribution of METTL3-mediated IRF7 regulation in rotavirus infection. However, if IRF7 is totally abrogated, IFN production should be greatly impaired as shown in Figure 3A. Thus, it is not surprising to see that the IFN response is diminished. The authors can use heterozygous IRF7 deficient mice instead to check if upregulation of IRF7 under METTL3 deficiency is critical to control rotavirus infection.
  
  We thank the reviewer for pointing out an important issue. However, we checked the IRF7 expression levels in IECs from Irf7^+/+ , Irf7^+/- and Irf7^-/- mice and found that there is no difference between IRF7 levels in IECs from Irf7^+/- mice and that in IECs from Irf7^+/+ mice. Thus, it is not feasible to use heterozygous IRF7 deficient mice to test the idea (Supporting Figure 1).
  
  Supporting Figure 1. WT and Irf7 heterozygous mice show the same IRF7 expression level in IECs. (a) IECs from 2-week-old Irf7^+/+, Irf7^+/-, Irf7^-/- mice were isolated. Western blot analysis shows IRF7 expression level in different mice. (b) Quantitative analysis of (a) (mean ± SEM). Statistical significance was determined by Student’s t-test (***P < 0.001, NS., not significant).
  
  8) Given no effect of ALKBH5 knockout on rotavirus infection as shown in Figure 4, it is questionable if ALKBH5 has a profound role in the regulation of m6A in IECs. The authors should determine if m6A modification levels are increased in IECs under ALKBH5 deficiency.
  
  We performed the m6A dot blot assay to detect m6A modification levels in ALKBH5-knock down MODE-K cells and did find an increase in m6A modification level under ALKBH5 deficiency (see above and Fig s10). The lack of an effect of ALKBH5 knockout on rotavirus infection (Fig.4c, 4d and 4e) initially puzzled us as well, until we found that RV infection down-regulated ALKBH5 expression in the intestine of WT mice (Fig.4a).
  
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.09.17.460776v1
www.biorxiv.org www.biorxiv.org

Mechanical bistability enabled by ectodermal compression facilitates Drosophila mesoderm invagination

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This work raises the question of how in plane forces generated at the apical surface of an epithelial cell sheet cause out of plane motion, an important morphogenetic motif. To address this question, a new optogenetic dominant negative rho1 tool, based on the cry2-CIBN system, is presented. The authors use this tool to analyze the well studied biophysical process of ventral furrow formation, and dissect the spatiotemporal requirement of rho1 signaling to modulate myosin accumulation. They separate the effect on morphogenesis into an early phase that becomes significantly slowed down by myosin inhibition, and a late phase where the kinetics is comparable to wild type despite treatment. For interpretation of the data, an older model of cell mechanics treating tissue as a purely elastic material is presented. It fails to reproduce the observations. As a modification, in analogy to buckling of a thin beam under load, a compressive stress exerted by the adjacent ectoderm is introduced. Further analysis of cell behaviors in response to various laser mediated tissue manipulations is presented as support of the proposed mechanism.
  
  Overall, the manuscript addresses an important aspect of morphogenesis. In particular the use of optogenetic tools promises new insights that might be more challenging to achieve with traditional mutant analysis. However, reservations remain with respect to (1) rigor of the analysis, and (2) interpretation and quality of the data in support of the proposed mechanism; this applies in particular to presentation of biophysical observations, including experiment and simulations.
  
  The manuscript adds valuable quantitative data, in particular the findings described in Fig 2ab. However, insufficient analyses are performed to fully support the claims of the manuscript by the data presented.
  
  (I) The manuscript proposes an elasticity based model of tissue mechanics, but provides no experimental evidence in support of this assumption. Many rheology studies performed in a wide range of specimens (including the Drosophila embryo) found a separation of time scales that shows elasticity is a good approximation of tissue mechanics only for time scales short compared to the process studied here.
  
  We agree with the reviewer that an elasticity-based model of tissue mechanics is a simplification of the actual tissue properties in real embryos. To provide justification for this simplification, in the revised manuscript, we have cited a previous biophysical study measuring tissue viscoelasticity in early Drosophila embryos (Doubrovinski et al., 2017). Using a magnetic tweezers-based approach, Doubrovinski et al. show that the lower bound of the decay time of the elastic response is four minutes (the lower limit on the timescales where tissue behaves elastically). In addition, when history dependence of the response is considered, the decay time increases to nine minutes, which is close to the duration of ventral furrow formation (~ 15 – 20 minutes). Therefore, we consider elasticity to be a reasonable approximation of tissue mechanics during ventral furrow formation. The elasticity assumption has been widely used in previously published modeling work to simulate ventral furrow formation (Allena et al., 2010; Conte et al., 2009; Gracia et al., 2019; Heer et al., 2017; Hocevar Brezavšček et al., 2012; Muñoz et al., 2007; Rauzi et al., 2015). The modeling framework used in our current study, which was initially described in Polyakov et al. 2014, successfully predicts the intermediate and final furrow morphologies with a minimal set of active and passive forces without prescribing individual cell shape changes. It is therefore advantageous to use this model to explore the main novel aspect of the folding mechanics underlying ventral furrow formation. We show that the model can recapitulate the binary tissue response to acute myosin inhibition. In addition, it accurately predicts the intermediate furrow morphology at the transitional state and several other morphological properties associated with myosin inhibition. We therefore believe that this minimalistic model captures the central aspect of the physical mechanism underlying mesoderm bistability observed in the experiments.
  
  (II) The manuscript uses a method of micro-dissection to soften cells, but does not provide a clear definition of the concept of softening, provides no rationale for how the method functions, and does not provide independent validation. The described treatment might affect cells in many alternative ways to the offered interpretation. This data is the central experimental evidence given in support of the proposed ectoderm compression mechanism, and therefore it is essential to provide a precise physical explanation of the method, and validation of measurements that bolster the conclusion.
  
  We apologize for not explaining the meaning of “softening” clearly in our original manuscript and the rationale for using laser ablation to detect compression. By “softening”, we meant to describe the mechanical status of the cell when the subcellular structures that normally support the mechanical integrity (e.g., cortical actin) are disrupted. We reason that when such a change in mechanical properties happens in a specific region of a tissue that is under compression, the cells in this region should have an impaired ability to resist compression from outside of the region and thereby cause the region to shrink.
  
  Laser ablation has been widely used to measure tensile stresses in cells and tissues by disruption of cells or subcellular structures. The method we used is adapted from previously described protocols, where a femtosecond near infrared laser is used to disrupt subcellular structures for detection of tissue tension (Rauzi et al., 2015; Rauzi et al., 2008). It has been shown that when laser intensity is properly controlled, the treatment can leave the plasma membrane intact but disrupt subcellular structures associated with the plasma membrane, such as adherens junctions and the cortical actomyosin networks (Rauzi et al., 2015; Rauzi et al., 2008). Using a femtosecond near infrared laser, we were able to ablate embryonic tissues that are under tension and observe tissue recoil after laser ablation, suggesting that our approach has disrupted the cortical cytoskeleton in the laser treated region (e.g., Figure 3 and Authors’ Response Figure 1). In these experiments, the lack of damage to the plasma membrane is indicated by the ready recovery of the plasma membrane signal after laser treatment, as well as the lack of bright burn marks on the tissue.
  
  As we noted before, we reasoned that if the tissue is under compression, the same laser treatment that generates recoil in tissues under tension should result in tissue shrinking within the laser-treated region. The data presented in our original manuscript demonstrate that tissue shrinking is not a non-specific response to our laser treatment – we did not observe such a response when we treated the tissue during cellularization or within the first five minutes of gastrulation, although identical experimental conditions were used (Original Figure 4). We have also obtained additional evidence that supports the use of tissue shrinking as a readout of tissue compression. We tested our laser ablation approach in Stage 8 – 9 embryos at regions where cells are actively dividing/proliferating, which would be expected to generate compressive stresses in the tissue. When we performed laser ablation in this region, we observed shrinking of the treated region, which was distinct from the tensile tissue response (Authors’ Response Figure 1). While this preliminary evidence is encouraging, we agree with the reviewer that further independent validations are needed given that the methods for detecting tissue compression have not been well established in the field. Following the editor’s suggestion, we have removed this experiment from the current manuscript and focus on the characterization of the optogenetic tool and the binary tissue response after acute actomyosin inhibition.
  
  Authors’ Response Figure 1: Laser ablation in regions of tissues with active cell proliferation (a) or undergoing apical constriction (b). The movement of tissues is indicated by overlaying membrane signals (Ecadherin-GFP) at T = 0 sec and at T = 10 sec. T = 0 in the “After ablation” panels marks the time immediately after ablation. (a) Stage 8 – 9 embryos. Multiple cells are in the process of cell division, as indicated by mitotic rounding (yellow arrowheads) or the appearance of cleavage furrows (red arrowheads). Immediately after laser ablation, the surrounding cells moved towards the ablated region (cyan arrows). (b) An embryo undergoing ventral furrow formation. Ablation within the constriction domain results in recoil of the surrounding cells away from the ablated region (cyan arrows).
  
  (III) Mechanical isolation of the mesoderm is a very exciting approach to test the possible involvement of adjacent tissues in folding. Indeed, the authors report a delay of ventral furrow formation. However, there is no evidence provided that (a) the mesoderm is mechanically uncoupled, and (b) that the treatment did not have undesired side effects. For example, a similar procedure (so-called cauterization, see Rauzi 2015) has been used to immobilize cells in the Drosophila embryo. Such an effect could account for the observed delay in furrow formation.
  
  We agree with the reviewer that “mechanical uncoupling” is merely a prediction based on our observation but has not been directly demonstrated. On the other hand, since the purpose of this experiment is to ask whether the presence of the lateral ectoderm is important for the mesoderm to transition between apical constriction and invagination (and our result shows yes), whether the approach we used mechanically uncoupled mesoderm and the ectoderm is no longer an immediately relevant question. We apologize for the imprecise use of the term “mechanically uncoupling” in our original manuscript and we thank the reviewer for pointing this out.
  
  As for the reviewer’s point (b), we have several pieces of evidence indicating that our approach did not cause anchoring of the tissue to the vitelline membrane. The major difference between the approach we used and that used by Rauzi et al. 2015 is the location of the tissue where the laser treatment was imposed. In order to anchor the tissue to the vitelline membrane, Rauzi et al. target the laser to the apical side of the tissue, adjacent to the vitelline membrane. The resulting cauterization of the tissue caused anchoring of the tissue to the vitelline membrane, presumably by fusion of the tissue with the vitelline membrane. In our approach, we used a similar type of laser (femtosecond near infrared laser) to perform tissue disruption, but instead of targeting the apical side of the tissue, we targeted the basal region of the invaginating cleavage furrows during cellularization, with the goal of blocking cell formation. While the laser intensity we used is high enough to cause cauterization of the tissue, as indicated by the appearance of bright autofluorescence in the laser treated region, these “burn marks” are not located at the apical side of the cells (Authors’ Response Figure 2a). The lack of “burn marks” on the vitelline membrane in our experiment is in sharp contrast to the result shown in Rauzi et al. 2015 (see Authors’ Response Figure 2b for an example from Rauzi et al. in comparison to our own data in 2a). Because of the difference in the location of cauterization, we do not expect that the tissue would be fused with the vitelline membrane after our treatment. This is further suggested by the observation that the burn marks can move before the onset of gastrulation, which again indicates that the tissue is not anchored to the vitelline membrane (Authors’ Response Figure 2c).
  
  That being said, we acknowledge that we do not fully understand the impact of the laser treatment on the embryo (e.g., what causes the reduced rate of apical constriction), and more control experiments are required in order to fully describe the tissue response we observed. As suggested by the editor, we decided to remove the ectoderm-ablation experiment from the revised manuscript and focus on the characterization of the optogenetic tool and the binary tissue response after acute actomyosin inhibition.
  
  Authors’ Response Figure 2: Laser disruption of cell formation in the lateral ectodermal region. (a) Cross-section and en face views showing the basal location of the “burn marks” after laser disruption in the lateral ectodermal region. No burn marks are observed at the level of the vitelline membrane. Blue and red curves in the cross-section views indicate the vitelline membrane and the position where the projections were made for the en face views. Magenta arrows: burn marks. (b) Figure 5a from Rauzi et al., 2015, clear bright burn marks can be seen from the apical surface view. (c) Overlay of the signal at T = -10 min and 0 min (onset of gastrulation) showing the movement of burn marks before gastrulation (yellow arrows).
  
  (IV) Some panels show two distinct molecules tagged with the same or spectrally overlapping fluorophores, that unfortunately localize in similar spatial patterns. This encumbers data validation.
  
  We agree with the reviewer that having two distinct proteins tagged with the same fluorophore is not ideal for understanding the behavior of the tagged proteins. However, it usually does not affect the evaluation of the cell or tissue morphology, as long as the cell membrane is explicitly labeled. For example, in our original Figure 2 (new Figure 4), although GFP is tagged on both CIBN and Sqh, and mCherry is tagged on both CRY2-Rho1DN and Sqh, the cell and tissue morphology is clearly discernable by these markers, which allowed us to evaluate the progression of ventral furrow formation. In the cases where there was a need to evaluate the behavior of a particular molecule (e.g. Sqh), we always repeated the experiments in a way such that the molecule of interest is tagged with a distinct fluorophore that does not spectrally overlap with other fluorophores – this often requires the use of a plasma membrane-anchored CIBN that is not fluorescently tagged (e.g. Figure 1, Figure 4 – figure supplement 3).
  
  (V) The physical model is a central part for data interpretation. In its current form it is very challenging to follow. It is also critical the system be studied with proper cell aspect ratio, as the elasticity of thin sheets has a well established non-linear thickness dependence.
  
  These are valid critiques of our thin layer physical model (original Figure 5). The original purpose of this model is not to recapitulate the actual furrow morphology or cell shape change observed in the actual embryo, but rather to test the possibility of recapitulating the acceleration in tissue flow during the folding process by combining local constriction and global compression in a spherical (circular in 2D) elastic shell. Developing a dynamic vertex model that contains a realistic cell aspect ratio comparable to the actual cells in the embryo while displaying realistic cellular dynamics during the folding process is nontrivial and needs substantial further development of the model. Since the manuscript is now focused on the bistable characteristics of the mesoderm during gastrulation rather than tissue dynamics during the folding process, we decided to leave the dynamic vertex model out of the revised manuscript, as suggested by the editor.
  
  Reviewer #2 (Public Review):
  
  Guo and colleagues aim to unravel the mechanisms driving the fast process of mesoderm invagination in the Drosophila early developing embryo. While cell apical constriction is known to drive ventral furrowing (1st phase), it is still not clear if apical constriction is necessary/sufficient to drive mesoderm internalization (2nd phase) and whether other mechanisms cooperate during this process. By using 1ph optogenetics, the authors cannot test specifically the role of apical constriction but can systematically affect the overall actomyosin network in ventral cells in a time specific fashion (1-minute resolution). In this way, they come to the conclusion that actomyosin contractility is necessary for the 1st phase but not for the 2nd phase of mesoderm invagination. Interestingly, they conclude that the system is bistable. In the second part of this study, the authors test the role of the coupling between mesoderm and ectoderm by using 2D computational modelling and infrared pulsed laser dissection. They propose that the ectoderm can generate compressive forces on the mesoderm facilitating mesoderm internalization (2nd phase).
  
  This project is of interest since it tackles a key morphogenetic process that is necessary for the development of the embryo. The conclusion of 'bistability' resulting from the RhoDN optogenetic experiments (1st part of this study) is well supported and quite interesting. The IR laser experiments used to tackle the coupling between ectoderm and mesoderm (2nd part of the study) are key to support the main conclusions; nevertheless, their experimental design and results are puzzling. It is not clear what the authors are actually doing to the tissues. The experiments performed in the 2nd part of this study need to be revisited and conclusions eventually softened.
  
  Major comments:
  
  1) The 920 nm laser ablation of ectoderm cells is a key experiment in this study to support the ectoderm compression hypothesis. Nevertheless, this experiment is puzzling: the rationale of the experimental design, the effect of the laser on cells and the interpretation of the results are unclear.
  
  The rationale for the laser ablation experiment designed to test tissue compression is analogous to the widely used laser ablation approach for detecting tissue tension (Rauzi et al., 2015; Rauzi et al., 2008). In typical experiments where laser ablation was used to measure tensile stresses in cells and tissues, ablation of cells or subcellular structures that are under tension results in recoil of surrounding cell/tissue structures. We reasoned that if the tissue is under compression, similar laser treatment should result in shrinking of the laser-treated region, as the cells in the laser-treated region are expected to have an impaired ability to resist compressive stresses from outside of the region.
  
  In our experiment, we used the reduction of the width of the laser treated region within the first 10 sec after laser treatment as the measure for tissue shrinking, which we considered as an indication for the presence of compressive stresses. This tissue response, albeit mild, is not a non-specific tissue response to our laser treatment – we did not observe tissue shrinking when we treated the tissue during cellularization or within the first five minutes of gastrulation, although identical experimental conditions were used. The rate and magnitude of tissue shrinking after laser treatment are determined by multiple factors, including the level of compressive stresses, the difference in cell rigidity before and after laser treatment, and the overall viscosity of the tissue. We acknowledge that the knowledge on these factors is largely lacking, and therefore additional independent validations of our approach are needed to further strengthen our conclusion on the presence of tissue compression. Following the editor’s suggestion, we decided to remove the laser ablation experiment from the current manuscript and focus on the characterization of the optogenetic tool and the binary tissue response after acute actomyosin inhibition.
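
  The readout described above (change in width of the laser-treated region over the first 10 sec) can be written as a simple fractional change, sketched below. The function names and the 2% tolerance are hypothetical choices made for illustration; they are not values from the manuscript, and the labels only restate the interpretation given in the response (shrinking as a sign of compression, recoil as a sign of tension).

```python
def fractional_width_change(width_before, width_after):
    """Relative change in width of the treated region.

    Negative values mean the region narrowed (shrinking);
    positive values mean it widened (recoil).
    """
    return (width_after - width_before) / width_before


def classify_response(change, tolerance=0.02):
    """Label the response; `tolerance` is an arbitrary noise floor."""
    if change < -tolerance:
        return "shrinking"
    if change > tolerance:
        return "recoil"
    return "no clear change"


# Hypothetical widths in micrometres at T = 0 sec and T = 10 sec.
label = classify_response(fractional_width_change(10.0, 9.0))
```

  A real analysis would also need a measurement-noise estimate from untreated tissue to set the tolerance, which is one of the validations the response acknowledges is still missing.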
  
  2) The authors propose to use again 920 nm laser ablation but this time to "physically separate" the two ectoderms from the ventral tissue. This is again a key experiment, but it raises some concerns:
  
  a. "Physical separation" would need to be demonstrated (e.g., EM after laser ablation). From Fig. 6b it is clear that IR laser ablation results in prominent auto-fluorescent zones. This has been already reported in previous work (De Medeiros G. et al. Scientific Reports 2020) showing that high power and sustained IR fs laser targeting produces auto-fluorescence and highly electron-dense structures in the early developing Drosophila embryo. This process is referred to as laser cauterization and does not induce separation between tissues. These structures eventually displace together with the lateral tissue (also shown in Fig.6 b).
  
  b. This strong laser "treatment", that should be ectoderm specific, results in perturbation of other non-ectoderm related processes (e.g., mesoderm apical constriction as shown by the authors). This can support the idea that many other processes are affected and that in general this laser heating "treatment" has global effects. These results might invalidate the conclusion proposed by the authors.
  
  These are both valid critiques. As for the reviewer’s point “a”, we agree with the reviewer that a “physical separation” of the mesoderm from the ectoderm has not been rigorously demonstrated in our original manuscript. As detailed in our response to reviewer #1 comment #3, since the purpose of this experiment is to ask whether the presence of the lateral ectoderm is important for the mesoderm to transition between apical constriction and invagination (and our result shows yes), whether the approach we used physically separated the mesoderm and the ectoderm is no longer an immediately relevant question. We apologize for the vague use of “physical separation” in our original manuscript and we thank the reviewer for pointing this out.
  
  To address the reviewer’s point “b” and to ask whether the laser treatment used in our experiment has a global effect, we performed a control experiment where we treated the yolk region of the embryo with the identical approach. Despite the appearance of burn marks in the treated yolk region, mesoderm invagination proceeded largely normally under this condition, with a mild reduction in the rate of furrow invagination (Authors’ Response Figure 3). Therefore, the prominent delay in the transitional state we observed after disruption of lateral ectoderm (Original Figure 6) is not likely caused by non-specific laser heating effect. In addition, in both the yolk-ablation and the ectoderm-ablation experiments, cellularization occurred normally outside of the laser-treated regions, in further support of the lack of strong non-specific effect from our laser treatment. That being said, we acknowledge that we do not fully understand the impact of the laser treatment on the embryo (e.g., what causes the reduced rate of apical constriction), and more control experiments are required in order to fully describe the tissue response we observed. As suggested by the editor, we decided to remove the ectoderm-ablation experiment from the revised manuscript and focus on the characterization of the optogenetic tool and the binary tissue response after acute actomyosin inhibition.
  
  Authors’ Response Figure 3. Laser treatment in the yolk region of the embryo. (a) Cartoon depicting the position of laser treatment. Similar laser condition was used as described in the original Figure 6. Laser ablation was performed during cellularization and the treated embryo was imaged during gastrulation. (b) An example control embryo without laser treatment. (d-e) Two examples showing ventral furrow formation after laser treatment in the yolk region. Only a mild delay in furrow invagination was observed. Red arrowheads indicate the invagination front. Scale bar: 25μm.
  
  Reviewer #3 (Public Review):
  
  The authors address how contractile forces near the apical surface of a cell sheet drive out-of-plane bending of the sheet. To determine whether actomyosin contractility is required throughout the folding process and to identify potential actomyosin independent contributions for invagination, they develop an optogenetic-mediated inhibition of myosin and show that myosin contractility is critical to prevent tissue relaxation during the early stage of folding but is dispensable for the deepening of the invagination. Their results support the idea that the mesoderm is mechanically bistable during gastrulation. They propose that this mechanical bistability arises from an in-plane compression from the surrounding ectoderm and that mesoderm invagination is achieved through the combination of apical constriction and tissue compression. Regarding the global message of the manuscript, I have two main criticisms. The authors consider their work as the first to prove that there is an additional mechanism to apical constriction leading to invagination. This is not true. First, the fact that the ectoderm could exert a compressive force on the invaginating mesoderm is not new and has been not only proposed, but tested previously (Rauzi and Leptin, 2015). Second, several recent publications demonstrated that on top of apical constriction, lateral forces were also required for the invagination and the authors ignore these data (Gracia et al, 2019; John et al, 2021).
  
  We thank the reviewer for this important comment. In the original Introduction, we have mentioned several previous studies that suggest the presence of additional mechanisms to apical constriction during ventral furrow formation. We stated: “The observation that the maximal rate of apical constriction and the maximal rate of tissue invagination occur at distinct times suggests that apical constriction does not directly cause tissue invagination (Polyakov et al., 2014; Rauzi et al., 2015). A number of computational models also predict that mesoderm invagination requires additional mechanical input, such as “pushing” forces from the surrounding ectodermal tissues, but experimental evidence for this additional mechanical input remains sparse (Munoz et al., 2007; Conte et al., 2009; Allena et al., 2010; Brodland et al., 2010).”
  
  To address the reviewer’s comment, in the revised manuscript, we expanded this paragraph to further elaborate the previous contributions: “However, accumulating evidence suggests that apical constriction does not directly drive invagination during the shortening phase. First, it has been observed that the maximal rate of apical constriction (or cell lengthening) and the maximal rate of tissue invagination occur at distinct times (Polyakov et al., 2014; Rauzi et al., 2015). Second, it has been previously proposed, and more recently experimentally demonstrated, that myosin accumulated at the lateral membranes of constricting cells (‘lateral myosin’) facilitates furrow invagination by exerting tension along the apical-basal axis of the cell (Brodland et al., 2010; Conte et al., 2012; Gracia et al., 2019; John and Rauzi, 2021). Finally, a number of computational models predict that mesoderm invagination requires additional mechanical input from outside of the mesoderm, such as “pushing” forces from the surrounding ectodermal tissue (Munoz et al., 2007; Conte et al., 2009; Allena et al., 2010; Brodland et al., 2010). These models are in line with the finding that blocking the movement of the lateral ectoderm by laser cauterization inhibits mesoderm invagination (Rauzi et al., 2015). A similar disruption of ventral furrow formation can also be achieved by increasing actomyosin contractility in the lateral ectoderm (Perez-Mockus et al., 2017). While these pioneer studies highlight the importance of cross-tissue coordination during mesoderm invagination, the actual mechanical mechanism that drives the folding of the mesodermal epithelium and the potential role of the surrounding ectodermal tissue remain to be elucidated.”
  
  One of the motivations for us to develop experimental approaches to detect compression in the ectoderm (original Figure 4) and to disrupt the ectoderm (original Figure 6) is the lack of direct evidence demonstrating the mechanical contribution of the ectoderm to mesoderm invagination. Several studies have shown that manipulations of the ectodermal tissue can impair ventral furrow formation. One study shows that preventing the movement of the lateral ectoderm, by anchoring ectodermal cell apices to the vitelline membrane, blocks ventral furrow invagination (Rauzi et al., 2015). Another study shows that upregulation of apical myosin contractility in the lateral ectodermal tissues can inhibit or even reverse the furrow invagination process (Perez-Mockus et al., 2017). These results indicate that an increase in the resistance to mesoderm movement can impair mesoderm invagination. However, this would be expected even if the ectoderm does not provide active mechanical input to facilitate mesoderm invagination. Therefore, these experiments, while very informative, did not provide direct evidence for a role of ectodermal compression in mesoderm invagination.
  
  Another motivation for us to examine potential mechanisms outside of the mesoderm is the observation that ventral furrow invagination continues even when both apical myosin and lateral myosin are disrupted after Ttrans (Late Group embryos). This result indicates that factors other than apical or lateral myosin must be responsible for the invagination of the furrow in Late Group embryos. In the revised manuscript, we used a modeling approach to demonstrate that lateral myosin and ectodermal compression may function in parallel to promote the invagination of the ventral furrow (Figure 7). In the revised Discussion, we propose that “ventral furrow formation is mediated through a joint action of multiple mechanical inputs. Apical constriction drives initial indentation of ventral furrow, which primes the tissue for folding, whereas the subsequent rapid folding of the furrow is promoted by bistable characteristic of the mesoderm and by lateral myosin contractions in the constricting cells.”
  
  They generated an optogenetic tool, "Opto-Rho1DN", to inhibit Rho1 through light-dependent plasma membrane recruitment of a dominant negative form of Rho1 (Rho1DN). The specificity of local inactivation of Myosin was tested on apical myosin before and during invagination. They observed a strong reduction of Myosin II recruitment and a phenotype that mimics Rok inhibition. They found that acute loss of myosin contractility during most of the lengthening phase results in immediate relaxation of the constricted tissue, but similar treatment near or after the lengthening-shortening transition does not impede invagination. They conclude that the second part of furrow invagination is not due to myosin activities at the apical or lateral cortices of the mesodermal cells and that actomyosin contractility is required in the early but not the late phase of furrow formation. This part regarding the temporal requirement of Myosin during invagination brings novelty to the field since it has never been tested before.
  
  We thank the reviewer for the comment on the novelty of our work.
  
  They observe that ectodermal cells shorten their apico-basal axis prior to Ttrans, and that compression from the ectoderm is independent of ventral furrow formation since it still occurs even if invagination is inhibited.
  
  They further develop two types of simulations to test theoretically the importance of compressive stress in the invagination process. The theoretical part would need to be further developed and discussed. They would need to integrate all the different components that have been shown to be essential for the invagination (not only apical constriction) and the dynamic aspect of the vertex model has to be clearly explained.
  
  We thank the reviewer for the suggestions on the modeling parts. In the energy-based vertex model (the Polyakov model, original Figure 3), two previously identified mechanisms, apical constriction and basal relaxation, have been implemented in the model to drive lengthening-shortening cell shape change and furrow invagination. Following the reviewer’s suggestions, we have modified the Polyakov model to include additional mechanisms that have been shown to facilitate ventral furrow invagination. In particular, we focused our analysis on the role of lateral myosin in the constricting cells on furrow invagination (Figure 7). Please refer to our response to the combined comments for details (in the section “ Additional modeling analysis to test the known mechanisms for mesoderm invagination”).
  
  As for the dynamic vertex model presented in our original manuscript (original Figure 5), as detailed in our response to Reviewer #1’s comment #5, since the revised manuscript is focused on the bistable characteristics of the mesoderm during gastrulation rather than tissue dynamics during the folding process, we decided to leave this part out of our revised manuscript as suggested by the editor.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.18.435928v1
www.biorxiv.org www.biorxiv.org

New submission 03/12/2023, 15:35:32

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This thorough study expands our understanding of BMP signaling, a conserved developmental pathway involved in processes as diverse as body patterning and neurogenesis. The authors applied multiple state-of-the-art strategies to the anthozoan Nematostella vectensis in order to first identify the direct BMP signaling targets - bound by the activated pSMAD1/5 protein - and then dissect the role of a novel pSMAD1/5 gradient modulator, zswim4-6. The list of target genes features multiple developmental regulators, many of which are bilaterally expressed, and which are notably shared between Drosophila and Xenopus. The analysis identified in particular zswim4-6 as a novel nuclear modulator of the BMP pathway, conserved also in vertebrates. A combination of both loss-of-function (injection of antisense morpholino oligonucleotide, CRISPR/Cas9 knockout, expression of dominant negative) and gain-of-function assays, and of transcriptome sequencing, identified that zswim4-6 acts as a transcriptional repressor of BMP signaling. Functional manipulation of zswim5 in zebrafish shows a conserved role in modulating BMP signaling in a vertebrate.
  
  The particular strength of the study lies in the careful and thorough analysis performed. This is solid developmental work, where one clear biological question is progressively dissected, with the most appropriate tools. The functional results are further validated by alternative approaches. Data is clearly presented and methods are detailed. I have a couple of comments.
  
  1) I was intrigued - as were the authors - by the fact that the ChIP-Seq did not identify any known BMP ligand bound by pSMAD1/5. Are these genes found in the published ChIP-Seq data of the other species used for the comparative analysis? One hypothesis could be that there is a change in the regulatory interactions and that the initial set-up of the gradient requires indeed a feedback loop, which is then turned off at later gastrula. In this case, immunoprecipitation at early gastrula, prior to the set-up of the pSMAD1/5 gradient, could reveal a different scenario. Alternatively, the regulation could be indirect, for example, through RGM, an additional regulator of BMP signaling expressed on the side of lower BMP activity, which is among the targets of the ChIP-Seq. This aspect could be discussed. Additionally, even if this is perhaps outside the scope of this study, I think it would be informative to further assess the effect of ZSWIM manipulation on RGM (and vice versa).
  
  Indeed, BMP genes are direct BMP signaling targets in Drosophila (dpp) (Deignan et al., 2016, https://doi.org/10.1371/journal.pgen.1006164) and frog (bmp2, bmp4, bmp5, bmp7) (Stevens et al., 2021, https://doi.org/10.1242/dev.145789). Of all these ligands, only the dorsally expressed Xenopus bmp2 is repressed by BMP signaling, while another dorsally expressed Xenopus BMP gene admp is not among the direct targets. All other BMP genes listed here are expressed in the pMad/pSMAD1/5/8-positive domain and are activated by BMP signaling.
  
  In Nematostella, we do not find BMP genes among the ChIP-Seq targets, but this is not that surprising considering the dynamics of the bmp2/4, bmp5-8 and chordin expression, as well as the location of the pSMAD1/5-positive cells. In late gastrulae/early planulae, Chordin appears to be shuttling BMP2/4 and BMP5-8 away from their production source and over to the gdf5-like side of the directive axis (Genikhovich et al., 2015; Leclere and Rentsch, 2014). By 4 dpf, chordin expression stops, and BMP2/4 and BMP5-8 start to be both expressed AND signal in the mesenteries. If bmp2/4 and bmp5-8 expression were directly suppressed by pSMAD1/5 (as is the case for chordin or rgm expression), this mesenterial expression would not be possible. Therefore, in our opinion, it is most likely that at late gastrula and early planula the regulation of bmp2/4 and bmp5-8 expression by BMP signaling is indirect. We do not have an explanation for why gdf5-like (another BMP gene expressed on the “high pSMAD1/5” side) is not retrieved as a direct BMP target in our ChIP data. Since we do not understand well enough how BMP gene expression is regulated, we do not discuss this at length in the manuscript.
  
  As the Reviewer suggested, we analyzed the effect of ZSWIM4-6 KD on the expression of rgm. As expected, since rgm is expressed on the “low BMP side”, its expression was strongly expanded (Figure 6 - Figure Supplement 4).
  
  2) I do not fully understand the rationale behind the choice of performing the comparative assays in zebrafish: as the conservation was initially identified in Xenopus, I would have expected the experiment to be performed in frog. Furthermore, reading the phylogeny (Figure 4A), it is not obvious to me why ZSWIM5 was chosen for the assay (over the other paralog ZSWIM6). Could the Authors comment on this experiment further?
  
  The comparison was done in zebrafish because we were planning to generate zswim5 mutants, whose analysis is currently in progress. ZSWIM6 is not expressed at the developmental stages we were interested in, while ZSWIM5 was, based on available zebrafish expression data (White et al., 2017).
  
  Reviewer #2 (Public Review):
  
  The authors provide a nice resource of putative direct BMP target genes in Nematostella vectensis by performing ChIP-seq with an anti-pSmad1/5 antibody, while also performing bulk RNA-seq with BMP2/4 or GDF5 knockdown embryos. Genes that exhibit pSmad1/5 binding and have changes in transcription levels after BMP signaling loss were further annotated to identify those with conserved BMP response elements (BREs). Further characterization of one of the direct BMP target genes (zswim4-6) was performed by examining how expression changed following BMP receptor or ligand loss of function, as well as how loss or gain of function of zswim4-6 affected development and BMP signaling. The authors concluded that zswim4-6 modulates BMP signaling activity and likely acts as a pSMAD1/5 dependent co-repressor. However, the mechanism by which zswim4-6 affects the BMP gradient or interacts with pSMAD1/5 to repress target genes is not clear. The authors test the activity of a zswim4-6 homologue in zebrafish (zswim5) by over-expressing mRNA and find that pSMAD1/5/9 labeling is reduced and that embryos have a phenotype suggesting loss of BMP signaling, and conclude that zswim4-6 is a conserved regulator of BMP signaling. This conclusion needs further support to confirm BMP loss of function phenotypes in zswim5 over-expression embryos.
  
  Major comments
  
  1) The BMP direct target comparison was performed between Nematostella, Drosophila, and Xenopus, but not with existing data from zebrafish (Greenfeld 2021, Plos Biol). Given the functional analysis with zebrafish later in the paper it would be nice to see if there are conserved direct target genes in zebrafish, and in particular, whether zswim5 (or other zswim genes) is a direct target. Since conservation of zswim4-6 as a direct BMP target between Nematostella and Xenopus seemed to be part of the rationale for further functional analysis, it would also be nice to know if this is a conserved target in zebrafish.
  
  Thank you for the suggestion. In the paper by Greenfeld et al., 2021, zebrafish zswim5 was downregulated approximately 2.4x in the bmp7 mutant at 6 hpf, while zswim6 was barely expressed and not affected at this stage. We added this information to the text of the manuscript. Expression of several other zebrafish zswim genes was also affected in the bmp7 mutant, but these genes do not appear relevant for our study since their corresponding orthologs are not identified as pSMAD1/5 ChIP-Seq targets in Nematostella. Notably, zebrafish zswim5 is not clearly differentially expressed in BMP or Chd overexpression conditions (see Supplementary file 1 in Rogers et al. 2020). Importantly, in the paper, we wanted to compare ChIP-Seq data with ChIP-Seq data; unfortunately, no ChIP-Seq data for pSMAD1/5/8 is currently available for zebrafish, thus precluding comparisons.
  
  Related to this, in the discussion it is mentioned that zswim4/6 is also a direct BMP target in mouse hair follicle cells, but it wasn't obvious from looking at the supplemental data in that paper where this was drawn from.
  
  Please see Supplementary Table 1, second Excel sheet labeled “Mx ChIP_Seq” in Genander et al., 2014, https://doi.org/10.1016/j.stem.2014.09.009. Zswim4 has a single pSMAD1 peak associated with it, Zswim6 has two.
  
  2) The loss of zswim4-6 function via MO injection results in changes to pSmad1/5 staining, including a reduction in intensity in the endoderm and gain of intensity in the ectoderm, while over-expression results in a loss of intensity in the ectoderm and no apparent change in the endoderm. While this is interesting, it is not clear how zswim4-6 is functioning to modify BMP signaling, and how this might explain differential effects in ectoderm vs. endoderm. Is the assumption that the mechanism involves repression of chordin? If so, one could test the double knockdown of zswim4-6 and chordin and look for the rescue of pSmad1/5 levels or morphological phenotype.
  
  We do not think that the mechanism of the ZSWIM4-6 action is via repression of Chordin. As loss of chordin leads to the loss of pSMAD1/5 in Nematostella (Genikhovich et al., 2015), the proposed experiment is, unfortunately, not feasible to test this hypothesis. Currently, we see two distinct effects of the modulation of zswim4-6 expression. First, it affects the pSMAD1/5 gradient, possibly by destabilizing nuclear SMAD1/5, as has been proposed by Wang et al., 2022 for the vertebrate Zswim4. This is in line with our results shown in Fig. 6C-F’ and Fig. 6-Figure supplement 3. In our opinion, the reaction of the genes expressed on the “high BMP” side of the directive axis to the overexpression or KD of ZSWIM4-6 (Fig. 6I-K’, 6N-P’) can be explained by these changes in the pSMAD1/5 signaling intensity. Second, zswim4-6 appears to promote pSMAD1/5-mediated gene repression. This is in line with the reaction of the genes expressed on the “low BMP” side of the directive axis (Fig. 6G-H’, 6L-M’, Fig. 6-Figure Supplement 4). These genes are repressed by BMP signaling, but they expand their expression upon zswim4-6 KD in spite of the increased pSMAD1/5. Our ChIP experiment (Fig. 6Q) supports this view.
  
  3) Several experiments are done to determine how zswim4-6 expression responds to the loss of function of different BMP ligands and receptors, with the conclusion being that zswim4-6 is a BMP2/4 target but not a GDF5 target, with a lot of the discussion dedicated to this as well. However, the authors show a binary response to the loss of BMP2/4 function, where zswim4-6 is expressed normally until pSmad1/5 levels drop low enough, at which point expression is lost. Since the authors also show that GDF5 morphants do not have as strong a reduction in pSmad1/5 levels compared to BMP2/4 morphants, perhaps GDF5 plays a positive but redundant role in zswim4-6 expression. To test this possibility the authors could inject suboptimal doses of BMP2/4 MO with GDF5 MO and look for synergy in the loss of zswim4-6 expression.
  
  Thanks for this great suggestion! We performed this experiment (Fig. 5H’’-L) and indeed, a suboptimal dose of BMP2/4MO + GDF5lMO results in a complete radialization of the embryo and abolishes zswim4-6 expression, similar to the effect of a high dose of BMP2/4. This result suggests that, rather than reflecting a ligand-specific signaling function, GDF5-like signaling alone still provides sufficiently high pSmad1/5 levels to activate zswim4-6 expression to apparent wildtype levels, demonstrating the sensitivity of this gene to even very low amounts of BMP signaling.
  
  4) The zswim4-6 morphant embryos show increased expression of zswim4-6 mRNA, which is said to indicate that zswim4-6 negatively regulates its own expression. However in zebrafish translation blocking MOs can sometimes stabilize target transcripts, causing an artifact that can be mistakenly assumed to be increased transcription (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7162184/). Some additional controls here would be warranted for making this conclusion.
  
  Thanks for raising this important experimental consideration. To date, we do not have any evidence for MO-mediated transcript stabilization in Nematostella, and we have not found such data in the literature on models other than zebrafish. mRNA stabilization by the MO also seemed unlikely because we were unable to KD zswim4-6 using several independent shRNAs - an effect we frequently observe with genes whose activity negatively regulates their own expression. However, to test the possibility that zswim4-6MO binding stabilizes zswim4-6 mRNA, we injected mRNA containing the zswim4-6MO recognition sequence followed by the mCherry coding sequence (zswim4-6MO-mCherry) with either zswim4-6MO or control MO. We could clearly detect mCherry fluorescence at 1 dpf if control MO was co-injected with the mRNA, but not if zswim4-6MO was co-injected with the mRNA. At 2 dpf (the stage at which we showed upregulation of zswim4-6 upon zswim4-6MO injection in Fig. 6I-I’), zswim4-6MO-mCherry mRNA was undetectable by in situ hybridization with our standard FITC-labeled mCherry probe independent of whether zswim4-6MO-mCherry mRNA was co-injected with the control MO or ZSWIM4-6MO, while hybridization with the FITC-labeled FoxA probe worked perfectly.
  
  Author response image 1.
  
  We are currently offering two alternative hypotheses for the observed increase in zswim4-6 levels in the paper rather than stating explicitly that ZSWIM4-6 negatively regulates its own expression: “The KD of zswim4-6 translation resulted in a strong upregulation of zswim4-6 transcription, especially in the ectoderm, suggesting that ZSWIM4-6 might either act as its own transcriptional repressor or that zswim4-6 transcription reacts to the increased ectodermal pSMAD1/5 (Fig. 6I-I’).” Given the sensitivity of zswim4-6 to even the weakest pSMAD1/5 signal (zswim4-6 is expressed upon GDF5-like KD, which drastically reduces pSMAD1/5 signaling intensity; see Fig. 1 and 2 in Genikhovich et al., 2015, http://doi.org/10.1016/j.celrep.2015.02.035 and Fig. 6-Figure supplement 3 of this paper), the latter option (that it reacts to the increased ectodermal pSMAD1/5) is, in our opinion, clearly the more probable one.
  
  5) Zswim4-6 is proposed to be a co-repressor of pSmad1/5 targets based on the occupancy of zswim4-6 at the chordin BRE (which is normally repressed by BMP signaling) and lack of occupancy at the gremlin BRE (normally activated by BMP signaling). This is a promising preliminary result but is based only on the analysis of two genes. Since the authors identified BREs in other direct target genes, examining more genes would better support the model.
  
  We suggest that ZSWIM4-6 may be a co-repressor of pSMAD1/5 targets because it is a nuclear protein (Fig. 4G), whose knockdown results in the expansion of the ectodermal expression of several genes repressed by pSMAD1/5 in spite of the expansion of pSMAD1/5 itself (Fig. 6G-H’, 6L-M’, Fig. 6-Figure Supplement 4). Our limited ChIP analysis supports this idea by showing that ZSWIM4-6 is bound to the pSMAD1/5 site of chordin (repressed by pSMAD1/5) but not of gremlin (activated by pSMAD1/5). We agree that adding the analysis of more targets in order to challenge our hypothesis would be good. However, given technical limitations (having to inject many thousands of eggs with the EF1a::ZSWIM4-6-GFP plasmid in order to get enough nuclei to extract sufficient immunoprecipitated chromatin for qPCR on 3 genes (chordin, gremlin, GAPDH) for each biological replicate), it is currently unfortunately not feasible to test more genes. It will be of great interest for follow-up studies to generate a knock-in line with tagged zswim4-6 to analyze target binding on a genome-wide scale. We stress in the discussion that currently the power of our conclusion is low.
  
  6) The rationale for further examination of zswim4-6 function in Nematostella was based in part on it being a conserved direct BMP target in Nematostella and Xenopus. The analysis of zebrafish zswim5 function however does not examine whether zswim5 is a BMP target gene (direct or indirect). BMP inhibition followed by an in situ hybridization for zswim5 would establish whether its expression is activated downstream of BMP.
  
  In the paper by Greenfeld et al., 2021, zebrafish zswim5 was downregulated approximately 2.4x in the bmp7 mutant at 6 hpf. However, this gene was not among the 57 genes, which were considered to be direct BMP targets because their expression was affected by bmp7 mRNA injection into cycloheximide-treated bmp7 mutants (Greenfeld et al., 2021). We added this information to the text of the manuscript.
  
  7) Although there is a reduction in pSmad1/5/9 staining in zebrafish injected with zswim5 mRNA, it is difficult to tell whether the resulting morphological phenotypes closely resemble zebrafish with BMP pathway mutations (such as bmp2b). More analysis is warranted here to determine whether stereotypical BMP loss of function phenotypes are observed, such as dorsalization of the mesoderm and loss of ventral tail fin.
  
  We agree, and we have toned down all zebrafish arguments. Analyses of zswim5 mutants are currently ongoing.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.06.03.494682v1
www.biorxiv.org www.biorxiv.org

New submission 22/09/2022, 08:59:12

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  1) Validation of reagents: The authors generated a pY1230 Afadin antibody claiming that (page 6) "this new antibody is specific to tyrosine phosphorylated Afadin, and that pY1230 is targeted for dephosphorylation by PTPRK, in a D2-domain dependent manner". The WB in Fig 1B shows a lot of background, two main bands are visible which both diminish in intensity in ICT WT pervanadate-treated MCF10A cell lysates. The claim that the developed peptide antibody is selective for pY1230 in Afadin would need to be substantiated, for instance by pull down studies analysed by pY-MS to substantiate a claim of antibody specificity for this site. However, for the current study it would be sufficient to demonstrate that pY1230 is indeed the dephosphorylated site. I suggest therefore including a site directed mutant (Y1230F) that would confirm dephosphorylation at this site and the ability of the antibody recognizing the phosphorylation state at this position.
  
  We would like this antibody to be a useful and freely accessible tool in the field and have taken on board the request for additional validation. To this end we have significantly expanded Supplementary Figure 2 (now Figure 1 - figure supplement 2) and included a dedicated section of the results as follows:
  
  1. We have now included information about all of the Afadin antibodies used in this study, since Afadin(BD) appears to be sensitive to phosphorylation (Figure 1 - figure supplement 2A).
  2. We have demonstrated that the Afadin pY1230 antibody detects an upregulated band in PTPRK KO MCF10A cells, consistent with our previous tyrosine phosphoproteomics (Figure 1 - figure supplement 2B). This indicates that the antibody can be used to detect endogenous Afadin phosphorylation.
  3. We have included two new knockdown experiments demonstrating the recognition of Afadin by our antibody (Figure 1 - figure supplement 2C). There appear to be two Afadin isoforms recognised in HEK293T cells by both the BD and pY1230 antibody, consistent with previous reports (Umeda et al. MBoC, 2015). We have highlighted these in the figure.
  4. We have performed mutagenesis to demonstrate the specificity of the antibody. We tagged Afadin with a fluorescent protein tag, reasoning that it would cause a shift in molecular weight that could be resolved by SDS PAGE, as is the case. We noted that the phosphopeptide used spans an additional tyrosine, Y1226, which has been detected as phosphorylated (although to a much lower extent than Y1230) on Phosphosite plus. The data clearly show that Afadin cannot be phosphorylated when Y1230 is mutated to a phenylalanine (compared to CIP control), indicating that this is the predominant site recognised by the antibody. In addition, the endogenous pervanadate-stimulated signal is completely abolished by CIP treatment (Figure 1 - figure supplement 2D).
  5. We have included densitometric quantification of the dephosphorylation assay shown in Figure 1B, which was part of a time course and shows preferential dephosphorylation by the PTPRK ICD compared to the PTPRK D1. The signal stops declining with time, which could indicate antibody background, or an inaccessible pool of Afadin-pY1230 (Figure 1 - figure supplement 2E).
  6. To further demonstrate that this site is modulated by PTPRK in post-confluent cells, we have used doxycycline (dox)-inducible cell lines generated in Fearnley et al, 2019. Upon treatment with 500 ng/ml Dox for 48 hours PTPRK is induced to lower levels than wildtype; however, normalized quantification of the Afadin pY1230 against the Afadin (CST) signal clearly indicates downregulation by PTPRK WT, but not the catalytically inactive mutant (Figure 1 - figure supplement 2F and 2G).
  
  Together these data strengthen our assertion that this antibody recognises endogenously phosphorylated Afadin at site Y1230, which is modulated in vitro and in cells by PTPRK phosphatase activity. For clarity, we have highlighted and annotated the relevant bands in figures. We have also indicated which total Afadin antibody was used in each experiment.
  
  2) The authors claim that a short, 63-residue predicted coiled coil (CC) region is both necessary and sufficient for binding to the PTPRK-ICD. The region is predicted to have alpha-helical structure and, as a consequence, a helical structure has been used in the docking model. Considering that the authors recombinantly expressed this region in bacteria, it would be experimentally simple to confirm the alpha-helical structure of the segment by CD or NMR spectroscopy.
  
  To clarify, the helical structure in the docking model was independently predicted by several sequence and structural analysis programmes, including AlphaFold2, RoseTTAFold and NetSurfP, and is annotated in UniProt as a coiled coil. We did not stipulate prior to the AF2 prediction that it was helical. Isolated short peptides frequently adopt helical structure; therefore, prediction of a helix within the context of the full Afadin sequence is, in our opinion, stronger evidence than CD of an isolated fragment.
  
  3) Only two mutants have been introduced into PTPRK-ICD to map the Afadin interaction site. One of the mutations changes a possibly structurally important residue (glycine) into a histidine. Even though this residue is present in PTPRM, this does not exclude the possibility that the D2 domain no longer folds into a functional state. Also, the second mutation represents a large change in chemical properties, and the other 2 predicted residues have not been investigated.
  
  The residues that were selected for mutation are all localised to the protein surface and therefore are unlikely to be involved in stable folding of PTPRK. In support of the correct folding of the mutated PTPRK, we include in Figure 1 below SEC elution traces for wild-type and mutant D2 showing that they elute as single symmetric peaks at the same elution volume as the WT protein. This is consistent with them having a similar shape and size, and not being aggregated or unfolded.
  
  Figure 1. PTPRK-D2 wild-type and mutant preparative SEC elution profiles. A280nm has been normalised to help illustrate that the different proteins elute at the same volume. The main peak from these samples was used for binding assays in the main paper.
  
  Furthermore, the yield for the double mutant was very high (4 mg of pure protein from a 2 L culture, see A280 value in graph below), whereas poorly folded proteins tend to have significantly reduced yields. This protein was also very stable over time whereas unfolded proteins tend to degrade during or following purification.
  
  Figure 2. Analytical SEC elution profile for the PTPRK-D2 DM construct showing the very high yield consistent with a well-folded, stable protein.
  
  Finally, we have carried out thermal melt curves of the WT and mutant PTPRK D2 domains showing that they all possess melting temperatures between 39.3°C and 41.7°C, supporting that they are all equivalently folded. We include these data as an additional Supplementary Figure (Figure 4 - figure supplement 3) in the paper.
  
  4) The interface on the Afadin substrate has not been investigated apart from deleting the entire CC or a central charge cluster. Based on the docking model the authors must have identified key positions of this interaction that could be mutated to confirm the proposed interaction site.
  
  We have now made and tested several additional mutations within both the Afadin-CC and PTPRK-D2 domains to further validate the AF2 predicted model of the complex.
  
  For Afadin-CC we introduced several single and double mutations along the helix including residues predicted to be in the interface and residues distal from the interface. These mutations and the pulldown with PTPRK are described in the text and are included as additional panels to a modified Figure 3. All mutations have the expected effect on the interaction based on the predicted complex structure. To help illustrate the positions of these mutations we have also included a figure of the interface with the residues highlighted.
  
  For the PTPRK-D2 we have also introduced two new mutations, one buried in the interface (F1225A) and one on the edge of the interface encompassing a loop that is different in PTPRM (labelled the M-loop). GST-Afadin WT protein was bound to GSH beads and tested for its ability to pull down WT and mutated PTPRK. These new mutations (illustrated in the new Figure 4 – figure supplement 2) further support the model prediction. F1225A almost completely abolishes binding as predicted, while the M-loop mutant retains binding. These mutations and their effects are now described in the main text and the pull-down data, including controls and retesting of the original DM mutant, are included as panel H in a newly modified Figure 4 focussed solely on the PTPRK interface.
  
  5) A minor point is that ITC experiments have not been run long enough to determine the baseline of interaction heats. In addition, as large and polar proteins were used in this experiment, a blank titration would be required to rule out that dilution heats affect the determined affinities.
  
  All control experiments including buffer into buffer, Afadin into buffer and buffer into PTPRK were carried out at the same time as the main binding experiment and are shown below overlaid with the binding curve. These demonstrate the very small dilution heats consistent with excellent buffer matching of the samples.
  
  We were able to obtain excellent fits to the titration curves by fitting 1:1 binding with a calculated linear baseline (see Figure 2B,D). Very similar results were obtained by fitting to the sum (‘composite’) of fitted linear baselines obtained for the three control experiments for each titration.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.27.489754v1
www.biorxiv.org www.biorxiv.org

A functional genetic toolbox for human tissue-derived organoids

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #2 (Public Review):
  
  There is now a considerable body of knowledge about the genetic and cellular mechanisms driving the growth, morphogenesis and differentiation of organs in experimental organisms such as mouse and zebrafish. However, much less is known about the corresponding processes in developing human organ systems. One powerful strategy to achieve this important goal is to use organoids derived from self-renewing, bona fide progenitor cells present in the fetal organ. The Rawlins' lab has pioneered the long-term culture of organoids derived from multipotent epithelial progenitors located in the distal tips of the early human lung. They have shown that clonal cell "lines" can be derived from the organoids and that they are capable of not only long-term self-renewal but also limited differentiation in vitro or after grafting under the kidney capsule of mice. Here, they now report a strategy to efficiently test the function of genes in the embryonic human lung, regardless of whether the genes are actively transcribed in the progenitor cells. The strengths of the paper are that the authors describe a number of different protocols (work-flows), based on CRISPR/Cas9 and homology directed repair, for making fluorescent reporter alleles (suitable for cell selection) and for inducible over-expression or knockout of specific genes. The so-called "Easytag" protocols and results are carefully described, with controls. The work will be of significant interest to scientists using organoids as models of many human organ systems, not just the lung. The weaknesses are that the authors do not show that their lines can undergo differentiation after genetic manipulation, and therefore do not provide proof of principle that they can determine the function in human lung development of genes known to control mouse lung epithelial differentiation. It would also be of general interest to know whether their methods based on homologous recombination are more accurate (fewer incorrect targeting events or off target effects) than methods recently described for organoid gene targeting using non homologous repair.
  
  We thank Reviewer #2 for capturing the key advances of our toolbox for understanding gene function using a tissue organoid system and the constructive suggestions for the manuscript.
  
  We agree with the Reviewer that it would strengthen the current manuscript if we could differentiate the genetically targeted organoids. Therefore, as a proof of concept, we have successfully differentiated the SOX9 reporter organoids into the alveolar lineage (New figure: Figure 2-figure supplement 1g, shown above). We have also tested the dual SMAD inhibition approach recently reported for basal cell differentiation (Miller et al., 2020). However, this has led to massive cell death even in WT organoids (data not shown). We reason that this might be because our organoids are ~8 pcw, whereas in the literature ~12 pcw organoids were used. We believe that efficient airway differentiation will take a long time to optimise for our organoids and is therefore beyond the scope of this manuscript.
  
  In regard to the Easytag workflow in comparison with the recent CRISPR-HOT method using non-homologous end joining (Artegiani et al., 2020), we consider our approach as a complement to the CRISPR-HOT approach. This can be reflected in the following points: (1) The Organoid Easytag workflow allows precise N-terminal tagging of endogenous genes, exemplified by N-terminal tagging of ACTB. This is not possible using CRISPR-HOT as large pieces of plasmid DNA would disrupt the targeted gene; (2) The Organoid Easytag workflow is based on HDR and the efficient insertion sites for exogenous genes are within a ~30-bp window of the gRNA cleavage sites (Kwart et al., 2017), which gives more flexibility for choosing gRNAs compared with CRISPR-HOT tagging; (3) The Organoid Easytag workflow gives researchers more control of where and how the targeted sites can be modified, and offers a minimal change to the targeted genomic region, whereas CRISPR-HOT introduces large pieces of backbone plasmids, which potentially increases the risk of gene dysregulation. However, HDR requires cells to be at the G2/M phase of the cell cycle, therefore heavily relying on fast cycling cells to gain the most efficient targeting. CRISPR-HOT has the great advantage of not depending on a specific cell cycle stage and therefore being more efficient in slow cycling cells. With this said, we do believe that the efficiency would very much rely on the context, including the cell type used and locus targeted, as a recent report suggested targeting efficiency is influenced also by genomic context (Schep et al., 2021).
  
  In summary, when N-terminal tagging, minimal changes and precise control of targeting are desired, Organoid Easytag is more favourable, whereas when targeting slowly cycling cells, CRISPR-HOT has its strength. Therefore, we consider these two methods as complementary approaches that will both be of benefit to organoid-based research. We have summarised this comparison in a simple table (New table: Figure 2-figure supplement 5f).
  
  Figure 2-figure supplement 5(f). A comparison of Organoid Easytag and CRISPR-HOT methods (Artegiani et al., 2020).
  
  Reviewer #3 (Public Review):
  
  Sun et al have assembled, modified, and applied a series of existing gene editing tools to tissue-derived human fetal lung organoids in a workflow they have termed "Organoid Easytag". Using approaches that have previously been applied in iPSCs and other cell models in some cases including organoids, the authors demonstrate: 1) endogenous loci can be targeted with fluorochromes to generate reporter lines; 2) the same approach can be applied to genes not expressed at baseline in combination with an excisable, constitutively active promoter to simplify identification of targeted clones; 3) that a gene of interest could be knocked-out by replacing the coding sequence with a fluorescent reporter; 4) that knockdown or overexpression can be achieved via inducible CRISPR interference (CRISPRi) or activation (CRISPRa). In the case of CRISPRi, the authors alter existing technology to lessen unwanted leaky expression of dCas9-KRAB. While these tools have previously been applied in other models, their assembly and demonstrated application to tissue-derived organoids here could facilitate their use in tissue-derived organoids by other groups.
  
  Limitations of the study include:
  
  1) demonstrated application of these technologies to only a limited set of gene targets;
  
  2) a lack of detail demonstrating the efficiency and/or kinetics of the approaches demonstrated.
  
  While access to human fetal lung organoids is likely not available to many or most researchers, it is probable that the principles applied here could carry over to other organoid models.
  
  We thank the Reviewer for accurately summarising the details of our manuscript and positive comments on its potential to facilitate tissue-derived organoid related research. We are very grateful for the Reviewer’s detailed and constructive comments to help strengthen our manuscript.
  
  In regard to the limitations pointed out by Reviewer #3, we have systematically tested the kinetics of the inducible CRISPRi knockdown effect and its reversibility using CD71 and SOX2 (New figure: Figure 3-figure supplement 2). At the same time, we have generated SOX9 reporter human foetal intestinal organoids using the Easytag workflow to further demonstrate that it can be applied to another organoid system. As suggested by Reviewer #3, we also attempted to implement the inducible CRISPRi system in HBECs. However, due to their sensitivity to lentiviral transduction, infected HBECs died shortly after transduction with gRNA lentivirus. We believe that further optimisation of the DNA delivery approach (perhaps nucleofection and PiggyBac-based vectors) is required to implement the inducible CRISPRi/CRISPRa systems in HBECs.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.05.04.076067v2
www.biorxiv.org www.biorxiv.org

Full assembly of HIV-1 particles requires assistance of the membrane curvature factor IRSp53

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  After infection, new HIV-particles assemble at the host cell plasma membrane in a process that requires the viral protein Gag. Here, Inamdar et al. showed that a component of the host cell, the membrane curvature-inducing protein IRSp53, contributes to efficiently promote the formation of viral particles in synergy with the viral Gag protein.
  
  In cells depleted of IRSp53, the formation of HIV-1 Gag viral-like particles (VLPs) was compromised. The authors showed in compelling electron micrographs that the formation of VLPs was arrested at about the halfway stage of particle budding. Biochemical data (co-IPs and analysis of VLPs and HIV particle content), super-resolution nanoscopy (single molecule localization microscopy) data, and in vitro biophysics measurements (in GUVs) all seem to indicate a functional connection between Gag and the I-BAR-domain-containing protein IRSp53. The combination of the different techniques and approaches is a clear strength of this manuscript. However, in my opinion, the interpretation of some of the experimental data is somewhat limited by the lack of some appropriate controls (which are lacking for different reasons, as the authors state in some parts of the text). These are:
  
  1) Specificity of the IRSp53 siRNA. Although the authors showed that the siRNA used can deplete the expression of the protein (both endogenous and ectopic), they did not present any rescue experiments of the phenotypes (or corroboration with different siRNA oligos).
  
  We have tried several different commercial and home-designed siRNAs targeting IRSp53 from different companies (supplied as single siRNAs or as mixes of multiple siRNAs) and have summarized the results in Figure R1 (see below). Only two siRNAs were effective in silencing the IRSp53 gene: one from Invitrogen, effective on both endogenous IRSp53 and ectopic IRSp53-GFP, and one from Dharmacon, effective only on ectopic IRSp53-GFP, as revealed by Western blot (Fig. R1A). Furthermore, the specificity of the siRNA was challenged by testing siRNA IRSp53 on human IRSp53-GFP and on mouse I-BAR-GFP in transfected HEK293T cells, visualized by fluorescence microscopy. The results in Figure R1B show that only siIRSp53 silences human IRSp53-GFP, and that it does not silence mouse I-BAR-GFP. siIRTKS and siCtrl do not silence either of these genes. Overall, these results confirm the specificity of the IRSp53 siRNA-mediated knockdowns.
  
  Figure R1: Specificity of siRNA-mediated knockdowns: (A) Western blots of HEK293T cell lysates probed with anti-IRSp53 antibody (and the house-keeping gene GAPDH) showing the effect of a series of different IRSp53 siRNAs (and control siRNAs, CTRL, from Invitrogen, Dharmacon or Sigma) on endogenous and ectopic IRSp53 genes in human HEK293T cells, and their efficacy in specifically downregulating IRSp53. (B) siRNA IRSp53 from Invitrogen was tested for its specificity in silencing human IRSp53-GFP protein, but not mouse I-BAR-GFP, expressed in transfected HEK293T cells, compared with siRNA control and siRNA IRTKS, as revealed by fluorescence imaging (GFP).
  
  To further answer the reviewers’ comments, we also performed one rescue experiment of the phenotype, as shown in Figure R2 below. We observed that, upon co-transfection of pGag+pIRSp53-GFP+siRNA IRSp53 (lane 2), about 50% of the ectopic IRSp53-GFP was silenced (since this construct is not siRNA resistant), leaving 50% of this ectopic protein expressed in the cells. In this context, Gag-VLP release is ~50% (lane 2), similar to the condition pGag+siCTRL (lane 3). When we compare this to pGag+siIRSp53 (lane 4), in which release is reduced 2-3 fold (data from Figure 1b of the manuscript), the remaining IRSp53-GFP in lane 2 seems to rescue the defect caused by silencing of the endogenous IRSp53. In the condition pGag+pIRSp53-GFP+siCTRL, VLP-Gag release was slightly reduced. This is an atypical rescue experiment since we do not have an IRSp53-GFP that is resistant to the siRNA IRSp53 used in this study (Figure R1B), but it suggests that if IRSp53-GFP is overexpressed in the presence of Gag and the siRNA IRSp53, VLP-Gag release is at a normal 50% level, in contrast to the condition without IRSp53-GFP (compare lane 2 with lane 4). Unfortunately, because of limited time, and because the siRNA IRSp53 was out of stock and its supply delayed, we could only provide one experiment. We thus decided to show it in answer to the reviewers but not as part of a figure in the final manuscript.
  
  Figure R2: Rescue of siRNA IRSp53 knock-down with overexpression of IRSp53-GFP: 293T cells were transfected with pGag, pIRSp53 and siRNA control (siCTRL, lane 1) or siRNA IRSp53 (lane 2); cell lysates and VLPs were loaded on SDS-PAGE gels and immunoblots were revealed with anti-GFP (for IRSp53-GFP) and anti-CAp24 (for HIV-1 Gag). The graph on the left shows the percentage of IRSp53-GFP expression upon siRNA IRSp53 cell treatment (lane 2) compared with the siRNA CTRL (lane 1). The graph on the right shows the resulting gel quantification of the % of Gag-VLP release upon siRNA IRSp53 cell treatment (lane 2) compared with the siRNA CTRL (lane 1) in the presence of IRSp53-GFP over-expression, or without it (lanes 3 and 4, as in Figure 1b). N=1 rescue experiment.
  
  2) In the co-IPs (IRSp53 IP + Gag co-IP) there is no assessment of the IRSp53 IP efficiency in the different conditions. The authors argued that IgG signal masking precluded them from doing that.
  
  See the new Figure 2. In the new Figure 2b, we have assessed the efficiency of the IRSp53-GFP IP and Gag co-IP by adding a complete experiment showing that an anti-GFP antibody pulls down IRSp53-GFP very efficiently (lanes 2 and 3) and co-immunoprecipitates Gag efficiently (lane 3), according to the input and remaining flowthrough. By using IRSp53-GFP and an anti-GFP antibody, we could bypass the IgG signal that masks endogenous IRSp53 when the IP is performed with the IRSp53 antibody.
  
  3) The authors observed an increase in the membrane-bound pool of IRSp53 when Gag is present (Fig. 2c). It is not clear whether this is specific for IRSp53 or other IBAR proteins can also be more membrane-bound as a result of Gag expression.
  
  See the new Figure 2. In the new Figure 2d, we have re-loaded all the gel fractions on new SDS-PAGE gels and probed the corresponding immunoblots for Gag, IRSp53, IRTKS, Tsg101 and the cellular markers Lamp2 (for membrane fractions) and ribosomal S6 protein (for cytosolic fractions). Quantification of the IRSp53 versus IRTKS bands in control HEK293T cells and in Gag-expressing cells shows that only IRSp53, and not IRTKS, increases at the cell membranes upon Gag expression.
  
  Reviewer #3:
  
  Inamdar et al. used biochemical and microscopy assays to investigate the role of I-BAR domain host proteins on HIV-1 assembly and release from HEK 293T and Jurkat cells. They show that siRNA knockdown of IRSp53, but not a similar I-BAR domain protein IRTKS, inhibits HIV-1 particle release from 293T cells after transfection of the HIV-1 provirus or HIV-1 Gag in cells. The authors then show that HIV-1 Gag associates with IRSp53 in the host cell membrane and cytoplasm, using biochemical assays and super resolution microscopy. In addition, IRSp53 is incorporated into HIV-1 particles along with other previously identified host proteins. Then using in vitro-derived membrane vesicles ("giant unilamellar vesicles" or GUVs), the authors indicate that HIV-1 Gag can associate with IRSp53, particularly on highly curved structures.
  
  The conclusions are largely supported by the data, with the virology and biochemical results being particularly strong, but the mechanistic studies in GUVs appear somewhat preliminary and are not entirely clear. The GUV experiments would benefit from better quantification of measurements and manipulation to simulate actual cellular scenarios. In addition, while it is appreciated that the HEK 293T cell line is convenient for biochemical and imaging studies, these cells are not biologically relevant HIV-1 target cells. While the authors present examples of reproducibility of their results in a CD4+ T cell line, these data are buried in the supplemental figures, whilst it would have been better to highlight them and perhaps include primary CD4+ T cells.
  
  1) Immortalized cell lines do not always recapitulate primary cells. It is unclear what the role of IRSp53 is in the membrane curvature of CD4+ T cells and whether expression levels and localization are consistent with Jurkat T cells.
  
  Please see our general response to the Editors, which is:
  
  We have published, in Thomas et al., JVI 2015, that IRSp53 is involved in HIV-1 particle release in primary T cells (PBMC-derived T cells), using siRNA, so it is highly probable that the same holds across cell types: transfected HEK293T cells, transfected or infected Jurkat T cells, and infected primary T cells. However, we have not performed extensive super-resolution microscopy on infected primary T cells because this would be a very time-consuming study. We are currently setting up conditions with an infectious HIV-1 virus carrying the photoactivatable protein mEOS2, in order to infect primary T cells and pursue this research using a relevant infectious system and super-resolution microscopy, but this is not ready for the current manuscript as it would require months of extra work and experiments.
  
  We agree with Reviewer #3 that the localization of Gag differs between Jurkat T cells and primary CD4 T cells at the cellular level (in primary T cells HIV-1 Gag is more polarized at uropods, as reported in the literature; see for example Bedi et al., from the Ono lab). At the nanoscopic level of the budding sites, however, it is likely to be similar, although this needs to be checked in future studies.
  
  2) Description of some of the microscopy measurements could be improved. In lines 204-206 of the text and Figure S5, it is unclear how the localization precision was determined to be approximately 16 nm for PALM-STORM.
  
  These lines have been changed in the main text as they were not necessary for understanding how we determine the size of the VLP clusters. However, we have now detailed in Figure S5 how we measure localisation precision.
  
  The following text has been added to the legend of Figure S5:
  
  “Distribution of localisation precisions for PALM (in green) or STORM (in red) as given by Thunderstorm analysis in Fiji : Localisation precision distribution exhibit maxima at 16 nm and a mean±sd value of 20±5 nm for PALM, and a maxima of 26 nm, corresponding to a mean±sd value of 27±10 nm for STORM. The localization precision is obtained by eq 17 of (Thompson et al., 2002).”
  
  We have also added the reference to the original paper (Thompson et al., 2002, Biophysical Journal).
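  
  For readers without the reference to hand, the expression usually quoted from Thompson et al. (2002) for the mean-squared localisation error of a single emitter is reproduced below. This is a sketch from memory of that paper rather than from the manuscript under review, and the symbol names are assumptions: $s$ is the standard deviation of the point-spread function, $a$ the pixel size, $N$ the number of collected photons and $b$ the background noise per pixel.

```latex
\[
  \left\langle (\Delta x)^2 \right\rangle
  = \frac{s^2 + a^2/12}{N} + \frac{8 \pi s^4 b^2}{a^2 N^2}
\]
```

  The first term is the photon-counting (shot-noise) contribution corrected for pixelation, and the second is the background contribution; the localisation precision reported per detection is the square root of this quantity.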
  
  In Figure 4b, it is understood from the text (lines 252-256) that the red bars denote the Manders' coefficient for colocalization of the GFP-tagged proteins with Gag-mCherry (presumably the average of multiple experiments with standard deviations or errors of the mean, although this is not stated in the figure legend), but it is unclear what the green bars are showing.
  
  Yes, the red bars denote the Manders' coefficient for colocalization of Gag-mCherry with the GFP-tagged proteins, and the green bars denote the coefficient for colocalization of the GFP-tagged proteins with Gag-mCherry. For more than 300 green and red vesicles, these show that all the Gag-VLPs are indeed green in the case of IRSp53-GFP (red bar), but that not all the IRSp53-GFP “green” vesicles are positive for Gag. This indicates that transfected HEK cells produce Gag/IRSp53 VLPs but also IRSp53-GFP vesicles. We thank the reviewer for pointing this out. We have added the explanation to the main text (page 12, lines 272-282) and to the legend of Figure 4b.
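  
  The asymmetry between the two Manders' coefficients described in this response can be illustrated with a minimal sketch. The intensity values, the function name `manders` and the zero threshold are all hypothetical, chosen only for illustration; the manuscript's actual thresholding and whether the coefficients were computed per pixel or per vesicle are not specified here.

```python
def manders(channel, other, threshold=0.0):
    """Fraction of `channel` intensity located where `other` exceeds `threshold`."""
    total = sum(channel)
    if total == 0:
        raise ValueError("channel contains no signal")
    overlapping = sum(c for c, o in zip(channel, other) if o > threshold)
    return overlapping / total


# Hypothetical intensities for four vesicles (not data from the manuscript):
# every red (Gag) vesicle is also green, but two green vesicles carry no Gag.
red = [5.0, 3.0, 0.0, 0.0]
green = [4.0, 2.0, 6.0, 1.0]

m_red = manders(red, green)    # share of the red signal that overlaps green
m_green = manders(green, red)  # share of the green signal that overlaps red
print(f"M(red in green) = {m_red:.2f}, M(green in red) = {m_green:.2f}")
```

  In this toy case the red coefficient is 1 while the green coefficient is well below 1, which is the pattern expected if all Gag particles contain IRSp53-GFP but some IRSp53-GFP vesicles contain no Gag.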
  
  Also, the histograms for IRSp53 and IRTKS colocalized with Gag look similar in Figure S10, suggesting that they are not different in Jurkat cells, but this is not addressed.
  
  Yes. We have now addressed this particular point in the global response to the reviewers. Figures 3 and 4 were reorganised: the new Figure 3 shows the HEK and Jurkat cell results together, and the new Figure 4 shows the simulation results. Overall, the PALM/STORM microscopy results on Gag/IRSp53 colocalization are very similar in both cell types.
  
  3) GUVs are first referenced on page 7 after description of Figure 2, the significance of which is confusing to the reader. However, the actual experimental data are described on pages 12-13 and Figures 5 and S11. A better description of these structures would be warranted for an audience that is unfamiliar with them. In addition, the biologic concentrations of I-BAR proteins at cell membranes are not provided and it is unclear what conditions used in Figures 5 and S11 represent a "normal CD4+ T cell" situation. It appears that the advantage of this in vitro system is that different factors can be provided or removed to simulate different cellular scenarios. For example, relatively low IRSp53 concentrations may simulate the siRNA knockdown experiments in Figure 1, which could recapitulate the result that fewer viral particles are released from the membrane. In addition, the authors state that HIV-1 Gag preferentially colocalizes with IRSp53 at the tips of the GUV tubular structures (Figure 5b,c), but this is not actually shown or quantified. Similar quantification as shown in Figure 1e could be performed to strengthen this argument.
  
  We thank the reviewer for pointing this out. We now describe all the GUV results in section 5.
  
  Regarding the biological concentrations of I-BAR proteins in cells, to the best of our knowledge these have not been measured. We therefore could not relate the concentrations used in the GUV experiments to those in cells.
  
  We could not perform quantification as in Figure 1e because the majority of the tubes in GUVs were moving too rapidly, preventing us from acquiring images with higher spatial resolution (see Fig. S11, and Movies 2 and 3). However, we would like to point out that the Gag signals appeared punctate inside GUVs (see Fig. S11, and Movies 2 and 3), which is very different from the I-BAR signals, which clearly run along the tubes (see Fig. S10c). Moreover, for all of the tubes that were not moving too fast (17 tubes), Gag signals are exclusively located at the tips of the tubes (see new Fig. 6d). Also, the sorting maps shown in Fig. 6c and Fig. S10d indicate the relative accumulation of Gag at the tips of the tubes. To make it clearer that the Gag signals were located at the tips of the tubes, in the current manuscript we have added the new Fig. S11 and Movies 1, 2 and 3, and included zoom-in images in Fig. 6b, 6c and a new Fig. 6d. We have also included the quantification results (17 tubes) in the manuscript.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.02.10.430663v1
www.biorxiv.org www.biorxiv.org

Parallel pathways for rapid odor processing in lateral entorhinal cortex: Rate and temporal coding by layer 2 subcircuits

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  The lateral entorhinal cortex (LEC) receives direct inputs from the olfactory bulb (OB), but its odor response properties have not been well characterized despite a recent increase in interest in the role of LEC in olfactory behaviors. In this study, Bitzenhofer and colleagues provide unprecedented details of odor response properties of layer 2 cells in LEC. The authors first show that LEC neurons respond to odors with a rapid burst of activity time-locked to inhalation onset, similarly to the piriform cortex (PCx), but distinct from the OB. Firing rates of LEC ensembles conveyed information about odor identity, whereas the timing of spikes conveyed odor intensity. The authors then examined the difference between two major cell types in LEC layer 2 - fan cells and pyramidal neurons - and found that, on average, fan cells responded earlier than pyramidal neurons, and that pyramidal neurons, but not fan cells, changed their peak timing in response to changes in concentration, providing a basis for temporal coding of odor concentrations. Additionally, the authors show that inactivation of LEC impairs odor discrimination based on either identity or intensity, and demonstrate different cellular properties of fan cells and pyramidal neurons. Finally, the authors also examined the odor response properties of hippocampal CA1 neurons, and showed that odor identity can be decoded from firing rate responses, while decoding of odor concentration depended on spike timing.
  
  The authors performed a large number of experiments and provide an impressive set of data regarding odor response properties of LEC layer 2 neurons in a cell type specific manner. The results reported are very interesting, and will be a point of reference for future studies on odor coding and processing in the LEC. The manuscript is clearly written, and data are well analyzed and presented clearly. I have only relatively minor concerns or suggestions.
  
  The authors infer the time at which "mice could discriminate odors" from the time at which d-prime becomes significantly different between baseline and odor stimulation conditions (line 111 and line 121). However, the statistical test applied to these data does not guarantee that an observer can accurately discriminate odors. For example, a small p-value can be obtained even when discrimination accuracy is only slightly above chance if there are many trials. The statement such as "mice could discriminate two odors by as early as 225 ms after inhalation onset" (line 111) can be misleading because this might sound as if mice can accurately discriminate odors at this timepoint, while this is not necessarily the case (as indicated by the d-prime value).
  
  We have added plots of performance accuracy over time under control conditions (LED off) to Figure 2-supplement 1. These plots of fraction of correct responses (binned every 50 ms) show that mice (n = 6) are making choices significantly different from chance within 200 ms of odor inhalation. We changed the wording in the Results to now say: “Moreover, by analyzing lick timing, we determined that the discriminability measure d’ became significantly different under control conditions as early as 225 ms after inhalation onset and performance accuracy increased within 200 ms of inhalation (Fig. 2b, Figure 2-supplement 1).”
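
  The reviewer's distinction between a statistically significant effect and accurate discrimination can be illustrated with a minimal sketch. The hit rates, false-alarm rates and trial counts below are hypothetical and are not taken from the manuscript; the p-value uses a normal approximation to the binomial, which is an assumption of this sketch rather than the authors' test.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """d' = z(hit rate) - z(false-alarm rate), z being the inverse standard normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


def p_value_above_chance(n_correct: int, n_trials: int) -> float:
    """One-sided p-value for accuracy > 0.5 (normal approximation to the binomial)."""
    mean = 0.5 * n_trials
    sd = (0.25 * n_trials) ** 0.5
    # Continuity correction: P(X >= n_correct) ~ P(Z >= n_correct - 0.5)
    return 1.0 - NormalDist(mean, sd).cdf(n_correct - 0.5)


# Hypothetical observer who is barely above chance (52% hits, 48% false alarms)
weak = d_prime(0.52, 0.48)
# Hypothetical observer who discriminates well (95% hits, 5% false alarms)
strong = d_prime(0.95, 0.05)
# With many trials, even 52% correct is highly significant
p_weak_many_trials = p_value_above_chance(5200, 10000)

print(f"weak d' = {weak:.2f}, strong d' = {strong:.2f}, p = {p_weak_many_trials:.1e}")
```

  In this toy case the weak observer has a d' close to 0.1 yet a very small p-value, which is why reporting accuracy over time alongside d' is informative.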
  
  Optogenetic identification can be a little tricky when identifying excitatory neurons as in this study. Please discuss the rationale for, or difficulty in, distinguishing cells that are activated directly by light from those activated indirectly (i.e. synaptically). Do the results hold if the authors use only those cells whose identification they are more confident about?
  
  We only used the cells that were confidently identified using a combination of two criteria. First, tagged cells had to show a significant increase in firing (p_Rate<0.01) during the 5 ms LED illumination period versus 100 randomly selected time windows before LED stimulation. Cells also had to respond with a fixed latency to reduce the chance of including cells recruited by polysynaptic excitation. Second, we used the stimulus-associated spike latency test (SALT) as detailed in Kvitsiani et al., 2013. To be judged as tagged, units had to show significantly less spike jitter during the 5 ms LED illumination than during 100 randomly selected time windows before LED stimulation (p_SALT<0.01). Only those cells with BOTH p_Rate<0.01 and p_SALT<0.01 were considered as tagged (the two methods agreed for most cells). Moreover, slice work testing synaptic connections between LEC layer 2 cells found extremely low levels of connectivity between fan and pyramidal cells (Nilssen et al., J. Neuroscience, 2018). This makes it unlikely that LED-induced firing of fan or pyramidal cells would recruit indirectly (synaptically) excited cells.
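
  The two-criterion rule described above can be sketched as follows. This is only an illustration under stated assumptions: the response does not specify the statistic behind p_Rate, so the empirical comparison against baseline windows below is a stand-in, the function names are hypothetical, SALT itself is not implemented (its p-value is passed in), and the spike train is synthetic.

```python
import random


def mean_count(spike_times, starts, width):
    """Mean number of spikes falling in [start, start + width) across all starts."""
    total = sum(1 for s in starts for t in spike_times if s <= t < s + width)
    return total / len(starts)


def empirical_rate_p(spike_times, led_onsets, offsets, width=0.005):
    """Assumed stand-in for p_Rate: share of pre-LED baseline windows whose
    mean spike count is at least as large as in the 5 ms LED window."""
    led = mean_count(spike_times, led_onsets, width)
    exceed = sum(
        mean_count(spike_times, [on + off for on in led_onsets], width) >= led
        for off in offsets
    )
    return (1 + exceed) / (1 + len(offsets))


def is_tagged(p_rate, p_salt, alpha=0.01):
    """A unit counts as tagged only if BOTH criteria pass."""
    return p_rate < alpha and p_salt < alpha


# Synthetic example: 50 LED pulses, one spike 2 ms after each pulse,
# plus a few baseline spikes (all times in seconds).
led_onsets = [float(k) for k in range(1, 51)]
spikes = [on + 0.002 for on in led_onsets] + [on - 0.5 for on in led_onsets[::10]]
rng = random.Random(0)
offsets = [rng.uniform(-0.9, -0.01) for _ in range(100)]  # 100 windows before LED

p_rate = empirical_rate_p(spikes, led_onsets, offsets)
print(p_rate, is_tagged(p_rate, p_salt=0.001))
```

  With 100 baseline windows the smallest attainable empirical p-value in this sketch is 1/101, which sits just below the 0.01 threshold.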
  
  The authors sort odor response profiles by peak timing, and indicate that odor responses peak at different times that tile the respiration cycle. However, this analysis does not indicate the reliability of peak timing. Sorting random activity by "peak timing" could generate a similar figure. One way to show the reliability or significance of peaks is to cross-validate. For instance, one can use half of the trials to sort, and plot the rest of the trials. If the peak timing is reliable, the original pattern will be replicated by the other half, and those neurons that are not reliable will lose their peaks. Please use such a method so that we can evaluate the reliability of peaks.
  
  We analyzed the data as suggested by this reviewer, as shown below (Author response image 1). Plotting only the odd trials sorted by the odd trials in the dataset (top) looked identical to the data from all trials used in Figure 1g. More importantly, plotting only the even trials sorted by the odd trials (bottom), though noisier due to trial-by-trial variation, showed the same general structure of tiling throughout the respiration cycle for OB cells.
  
  Author response image 1
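
  The split-half check described above (sort by odd trials, plot even trials) can be sketched as below. The data layout (one list of binned firing rates per trial, per cell) and the function names are assumptions made for illustration; this is not the authors' analysis code.

```python
def mean_psth(trials):
    """Average binned response across trials (each trial is a list of rates)."""
    n = len(trials)
    return [sum(trial[b] for trial in trials) / n for b in range(len(trials[0]))]


def peak_bin(psth):
    """Index of the bin with the highest mean rate."""
    return max(range(len(psth)), key=psth.__getitem__)


def cross_validated_sort(cells):
    """Order cells by peak time estimated from odd trials only, and return
    the even-trial averages in that order for plotting."""
    odd = [mean_psth(c[0::2]) for c in cells]   # trials 1, 3, 5, ...
    even = [mean_psth(c[1::2]) for c in cells]  # trials 2, 4, 6, ...
    order = sorted(range(len(cells)), key=lambda i: peak_bin(odd[i]))
    return order, [even[i] for i in order]


# Toy data: three cells with reliable peaks in bins 3, 1 and 2 (four trials each)
def toy_cell(peak, n_bins=5, n_trials=4):
    return [[5.0 if b == peak else 1.0 for b in range(n_bins)] for _ in range(n_trials)]


cells = [toy_cell(3), toy_cell(1), toy_cell(2)]
order, held_out = cross_validated_sort(cells)
print(order, [peak_bin(p) for p in held_out])
```

  If peak times are reliable, the held-out averages keep the diagonal structure; for cells without a reliable peak, the held-out peak falls in an unrelated bin.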
  
  Reviewer #2 (Public Review):
  
  In this study, Bitzenhofer et al recorded odor-evoked activity in the LEC and examined the coding of odor identity and intensity using extracellular recordings in head-fixed mice, and used the standard suite of quantitative tools to interpret these data (decoding analyses, dimensionality reduction, etc). In addition, they performed behavioral experiments to show the necessity of LEC in odor identity and intensity discrimination, and deploy some elegant and straightforward 'circuit-busting' slice physiology experiments to characterize this circuit. Importantly, they performed some of their experiments in Ntng1-cre and Calb-cre mice, which allowed them to differentiate between the two major classes of LEC principal neurons, fan cells and pyramidal cells, respectively. Many of their results are contrasted with what has previously been observed in the piriform cortex (PCx), where odor coding has been studied much more extensively.
  
  Their major conclusions are:
  
  Cells in the LEC respond rapidly to odor stimuli. Within the first 300 ms after inhalation, odor identity is encoded by the ensemble of active neurons, while odor intensity (more specifically, responses to different concentrations) is encoded by the timing of the LEC response; specifically, the synchrony of the response. These coding strategies have been described in the PCx by Bolding & Franks. Bolding also found two populations of responses to different concentrations: one population of responses was rapid and barely changed with concentration and the second population of responses had onset latencies that decreased with increasing concentration. Roland et al also found two populations of responses using calcium imaging in anesthetized mice: one population of responses was concentration-dependent and another population was 'concentration-invariant'. However, neither Bolding nor Roland was able to determine whether these populations of responses emerged from distinct populations of cells. Here, the authors elegantly register these two response types in LEC to different cell types: fan cells respond early and stably, and pyramidal cell response latencies decrease with concentration. This is a novel and important finding. They also showed that, unlike PCx or LEC where concentration primarily affects timing rather than rate/number, odor concentration in CA1 is only reflected in the timing of responses.
  
  Using optogenetic suppression of LEC in a 2AFC task, the authors purport to show that LEC is required for both the discrimination of odor identity and odor intensity. If true, this is an important result, but see below.
  
  In slice experiments, the authors characterize the differential connectivity of fan and pyramidal cells to direct olfactory bulb input, input from PCx, and inhibitory inputs from SOM and PV cells. This work is elegant, novel, and important, although it is a little out of place in this manuscript. As such, their findings are irrelevant/orthogonal to the rest of the results in this study. But fine.
  
  The simultaneous recordings from three different stations along the olfactory pathway are impressive.
  
  Major concern
  
  My major concern with this manuscript regards the behavioral experiments. The authors show that blue light over the LEC in GAD2-Cre/Ai32 mice completely abolishes (i.e. to chance) the mouse's ability to perform a 2AFC task discriminating between either two different odorants or one odorant at different concentrations. Their interpretation is that LEC is required for rapid odor-driven behavior. The sensory component of the task is so easy, and the effect is so striking that I find this result surprising and almost too good to be true. The authors do control for a blue-light distraction effect by repeating the experiments in mice that don't express ChR2, but do not control for the effect of rapidly shutting down a large part of the sensory/limbic system. If they did this experiment in the bulb I would be impressed with how clean the result was but not conceptually surprised by the outcome. I think a different negative control is needed here to convince me that the LEC is necessary for this simple sensory discrimination task. For example, the authors could activate all the interneurons (i.e. use this protocol) in another part of the brain, ideally in the olfactory pathway not immediately upstream of the LEC, and show that the behavior is not affected.
  
  This reviewer suggests a negative control experiment for the effects we observe on behavior when optogenetically silencing LEC. However, we disagree that it would be informative to silence other olfactory pathways in search of those that do not affect behavior. Our strong effects on behavior are also in complete agreement with recent findings that muscimol inactivation of LEC abolishes discrimination of learned odor associations (Extended Data Figure 8, Lee et al., Nature, 2021).
  
  More specifically, both the presentation and the interpretation of the data are confusing. First, there is a lack of detail about the behavioral task. I was not sure exactly when the light comes on and goes off, when the cue was presented, and when the reward was presented. In the manuscript they say (line 108) "…used to suppress activity during odor delivery on a random subset…". There is nothing more about this in the figure legend or Methods. The only clue to this is the dotted line in the 'LED On' example at the bottom of Fig. 2a. The authors also say that (line 660) "Trials were initiated with a 50 ms tone." When exactly was the tone presented? In the absence of any other information, I assume it was presented at odor onset. When was the reward presented? Lines 106-7 say "Mice were free to report their choice (left or right lick) at any time within 2 s of odor onset." Presumably this means the reward was presented to one of the ports for 2 seconds, starting at odor onset.
  
  The LED is applied during odor delivery, the 50 ms tone immediately precedes odor delivery, and water reward is dispensed after the first lick at the correct lick port during the choice period. The choice period begins with the odor onset and odor delivery is terminated by the first lick at either the correct or incorrect port. If there is no lick at either port, odor delivery lasts 1s and is followed by an extended choice period (terminated by correct or incorrect lick) lasting 1s. To clarify the behavior protocol, we have included a schematic of the trial structure in Figure 2-supplement 1.
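
  The trial structure described in this response can be summarized in a small sketch. The function name, return fields and the "miss" label for trials without a lick are assumptions made for illustration; only the timing rules (odor ends at the first lick, otherwise after 1 s, followed by a 1 s extended choice period, with water given after a first lick at the correct port) come from the text above.

```python
def score_trial(first_lick_time, lick_side, rewarded_side,
                odor_max=1.0, extended_choice=1.0):
    """Classify one trial. Times are in seconds from odor onset;
    first_lick_time is None when the mouse does not lick at all."""
    choice_end = odor_max + extended_choice
    if first_lick_time is None or first_lick_time > choice_end:
        # No lick within the choice period: odor ran for its full duration
        return {"outcome": "miss", "odor_duration": odor_max, "reward": False}
    # Odor delivery is terminated by the first lick, or after odor_max seconds
    odor_duration = min(first_lick_time, odor_max)
    correct = lick_side == rewarded_side
    return {
        "outcome": "correct" if correct else "error",
        "odor_duration": odor_duration,
        "reward": correct,  # water is dispensed after a first lick at the correct port
    }


print(score_trial(0.4, "left", "left"))
print(score_trial(1.5, "right", "left"))
print(score_trial(None, None, "left"))
```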
  
  These details matter because the authors want to claim that "LEC is essential for rapid odor-driven behavior." The data presented in support of this claim are (1) that mice perform this task at chance levels in LED On trials, presumably based on which port the mouse licked first (this is the 'essential' part), and (2) that in control in LED Off trials, d' becomes statistically different from baseline after ~200 ms (this is the 'rapid' part).
  
  To further support the argument that LEC is required for rapid odor-driven behavior, we now show a plot of % correct responses over time from first odor inhalation.
  
  On first reading, these suggested that shutting off LEC makes odor discrimination worse and/or slower. However, the supplementary data clarifies several things. First, the mice never Miss (Fig.2S.2a & c), meaning that they always lick. Second, in LED Off trials (F2S2 & e), the mice make few mistakes, and these only occur immediately after inhalation, presumably meaning the mice occasionally guess, possibly in response to the auditory cue. Thus, the mean time to lick is much shorter for Error trials than Correct trials. To state the obvious, the mice often wait >300 ms before they lick, and when they do wait, they never make mistakes. Now, in the LED On trials, the mice almost always lick within the first 300 ms and perform at chance levels, with the distribution of lick times for Correct and Error trials almost overlapping. In fact, although the authors claim LEC is required for rapid odor discrimination, the mean time to lick on Correct trials appears to decrease in LED On trials. This makes me think that the mice are making ballistic guesses in response to the tone in LED On cases, which doesn't necessarily implicate a dependence on LEC for odor discrimination.
  
  We do not believe that mice are making ballistic guesses in response to the tone for LED on trials. First, although a 50 ms tone immediately precedes odor delivery, all data in Figure 2-supplement 1 shows lick times aligned to the first inhalation of odor. Thus, time 0 ms is not the tone or subsequent odor onset but rather a variable time point coinciding with the first odor inhalation (the delay from odor onset to first inhalation is ~300 ms, the average respiration interval under our conditions). In fact, we excluded trials if mice made premature licks between the time of odor onset and first odor inhalation. We re-analyzed these trials to test the reviewer’s idea that mice were more likely to make fast ballistic guesses when the LEC was silenced. However, we saw no evidence that mice made more premature licks in trials with LED on (Author response image 2).
  
  Author response image 2
  
  The authors' interpretation of their data would be more solid if, for example, there were a delay between the auditory cue and odor delivery and/or if the reward was only available with some delay after the odor offset. Here, however, it seems just as likely as not that the mice are making ballistic guesses in response to the tone in LED On cases, which doesn't necessarily involve dependence on LEC for odor discrimination. Here, the divergence of d' from baseline in the control (i.e. LED Off) condition seems mostly because mice take longer to correctly discriminate under control conditions. While this is not formally contradictory to "LEC is essential for rapid odor-driven behavior", it is nevertheless a bit contrived and misleading. An interesting (thought) experiment is what would happen if the authors presented a tone but no odor. I would guess that the mice would continue licking randomly in Light On trials.
  
  While a delay between odor delivery and reward would have been useful for some aspects of interpreting the behavior, we would have lost the ability to examine the role of LEC in response timing. To address this reviewer’s concern, we have added a section to the Discussion mentioning caveats related to the interpretation of experiments using acute optogenetic silencing to understand behavior.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.08.19.456942v1
www.biorxiv.org www.biorxiv.org

New submission 25/06/2022, 10:19:25

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  We thank the reviewer for carefully reading the manuscript and for the insightful criticisms and comments. In the following we address them point by point.
  
  The community assembly process is modelled in a very specific way, and the manuscript would benefit from an expanded ecological motivation of the processes that are being mimicked, and thereby explain more clearly what taxonomic level of organization is being considered.
  
  We follow the more recent trait-based approach that shifts the focus from species (and the many traits by which they differ from one another) to groups of species that share the same values of selected functional traits. Since the general context is ecosystem response to drier climates, we choose the functional traits to include a response trait associated with stress tolerance and an effect trait associated with biomass production. We further assume a tradeoff between the two traits which is well supported by earlier studies (see e.g. Angert et al. 2009, https://doi.org/10.1073/pnas.0904512106). So, indeed, the choice we make in characterizing the community is quite specific, but it is highly relevant to the ecological context considered of dryland plant communities where plants compete primarily for water and light. The taxonomic level we consider is species except that we group them in a manner that is more transparent to questions of ecosystem function, ignoring differences between species that are not significant to these questions.
  
  We expanded considerably the text in the section “Modeling spatial assembly of dryland plant communities” to clarify the ecological motivation of the processes we model.
  
  In addition, it would be useful if the authors could provide further clarification as to what extent the community diversity dynamics can be separated from total biomass dynamics of patterned water-limited ecosystems given the current approach. These points are explained in further detail below.
  
  The model describes the dynamics of all functional groups, which provides the biomass distribution 𝐵 = 𝐵(𝜒) in trait space (in the case of patterned states we first integrate over space). That distribution contains information about various community-level properties, including functional diversity (richness, evenness) as figure 3 in the revised manuscript illustrates, and total biomass, which is the area below the distribution curve. The two types of dynamics are tightly connected and cannot be separated, but in principle the approach can be used to study the relationships between diversity and total biomass by calculating biomass distributions along the rainfall gradient and extracting the two properties from the distributions.
  
  We added in the section “Modeling spatial assembly of dryland plant communities” the information that the biomass distribution also contains information about the total biomass.
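
  As a rough illustration of how community-level properties can be read off a biomass distribution 𝐵(𝜒), the sketch below computes total biomass as the area under the curve, the most abundant trait 𝜒𝑚𝑎𝑥, and two simple diversity measures. The manuscript's own definitions of richness and evenness are not given here, so the threshold count and Pielou's evenness used below are assumptions, and the example distribution is made up.

```python
import math


def total_biomass(chi, B):
    """Area under B(chi), computed with the trapezoidal rule."""
    return sum(0.5 * (B[i] + B[i + 1]) * (chi[i + 1] - chi[i])
               for i in range(len(chi) - 1))


def chi_max(chi, B):
    """Trait value of the most abundant functional group."""
    return chi[max(range(len(B)), key=B.__getitem__)]


def richness(B, threshold):
    """Number of functional groups whose biomass exceeds a threshold (assumed definition)."""
    return sum(b > threshold for b in B)


def evenness(B):
    """Pielou's evenness: Shannon entropy of relative biomass over its maximum (assumed definition)."""
    total = sum(B)
    p = [b / total for b in B if b > 0]
    if len(p) < 2:
        return 0.0
    shannon = -sum(pi * math.log(pi) for pi in p)
    return shannon / math.log(len(p))


# Made-up distribution over five trait classes
chi = [0.0, 0.25, 0.5, 0.75, 1.0]
B = [0.0, 1.0, 2.0, 1.0, 0.0]
print(total_biomass(chi, B), chi_max(chi, B), richness(B, 0.5), round(evenness(B), 3))
```

  Repeating such a calculation for distributions obtained at different precipitation values would give diversity and total biomass along the rainfall gradient, as the response suggests.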
  
  First, it was not entirely clear to this reviewer how the reaction parts of the model equations determine the optimal trait value χ, and how this value varies as a function of precipitation.
  
  The ‘optimal’ trait value 𝜒𝑚𝑎𝑥 is determined by the interspecific interactions that the model captures, which divide into ‘direct’ and ‘indirect’ interactions. The direct interactions are captured by the dependence of the growth rate Λ𝑖 of the ith functional group (see Eq. (1a)) on the aboveground biomass values of all functional groups, Λ𝑖 = Λ𝑖(𝐵1,𝐵2,… , 𝐵𝑁) (see Eq. (2)). This dependence represents competition for light (taller plants are better competitors) and includes the effect of self-shading. The indirect interactions are through the water uptake term in the soil-water equation (1b) (2nd term from right) and the water dependence of the biomass growth term in Eq. (1a). These terms represent competition for water. For a given precipitation value 𝑃 the net effect of these interspecific interactions results in a particular functional group 𝜒𝑚𝑎𝑥 being the most abundant. For spatially uniform vegetation, as 𝑃 is increased 𝜒𝑚𝑎𝑥 moves to lower values. The precipitation increases surface water (Eq. (1c)) and consequently the amount of water 𝐼𝐻 infiltrating into the soil. The increased soil water gives a competitive advantage to species investing in growth, mainly because they compete better for light as they grow taller, and therefore 𝜒𝑚𝑎𝑥 decreases.
  
  … it is then not immediately clear why the most successful trait class is not outcompeting the other classes.
  
  With the current model and parameter set the most successful trait does eventually outcompete all other traits when trait diffusion is set to zero, 𝐷𝜒 = 0. This is, however, a very long process because the most successful trait suffers from self-shading at late growth stages, which slows down its growth and allows nearby traits to survive for a long time. Choosing finite but very small 𝐷𝜒 values that represent mutations occurring on evolutionarily long times counteracts the exclusion process and results in a stationary asymptotic community, as Fig. 3 in the revised manuscript shows (this behavior is reminiscent of optical solitons, where self-focusing instability is balanced by dispersion). We note that modeling stronger growth-inhibiting factors, such as pathogens, by including a factor of the form (1 − 𝐵𝑖/𝐾) in the growth rate, results in an asymptotic stationary community also for 𝐷𝜒 = 0 (see also earlier studies Nathan et al. 2016, Yizhaq et al. 2020).
  
  We revised original Fig. 4 (now Fig. 3) by adding a new part (Fig. 3a) that shows the exclusion process for 𝐷𝜒 = 0, and the effect of the counter-acting process of trait diffusion, which results in an asymptotic distribution of finite width (Fig. 3b) from which community level properties such as functional diversity can be derived. We also extended the text in section “Modeling spatial assembly of dryland plant communities” (last paragraph) to clarify the two counter-acting processes of exclusion because of interspecific competition for water and light, and trait diffusion driven by mutations, which together culminate in an asymptotic biomass distribution along the 𝜒 axis of finite width.
  
  The authors model trait adaptation through a diffusion approximation between trait classes. That is, every timestep, a small amount of biomass flows from the class with higher biomass to the neighboring trait class with lower biomass. From an ecological point of view, it seems that this process is describing adaptation of vegetation that is already present, so this process seems to be limited to intraspecific phenotypic plasticity. From the text, however, it seems that the trait classes correspond to higher taxonomic levels of organization, when describing shifts from fast growing to stress-tolerant species, for example. It is not entirely clear, however, how biomass flows as assumed in the model could occur at these higher levels of organization.
  
  In this work we do not study adaptation through diffusion in trait space. That kind of adaptive dynamics can indeed be studied with the current model, but with different initial conditions, namely, initial conditions corresponding to a single resident trait where the biomass of all other traits is zero. The resulting dynamics of mutations and succession are then very slow, occurring on evolutionarily long time scales set by the small value of 𝐷𝜒 (e.g. 10⁻⁶). In this study the initial conditions represent the presence of all traits, even if at very low biomass values that may represent a pool of seeds that germinate once environmental conditions allow. For a given precipitation value 𝑃, the functional traits we consider determine which functional groups (of species) overcome environmental filtering and grow, and which of the growing traits survive the competition for water and light. These are relatively fast processes, occurring on ecological time scales, which determine the emerging community. At longer times this community is further shaped by slow processes of interspecific competition among species of similar traits and by trait diffusion (mutations). A final remark about phenotypic changes: although in general 𝜒 can be interpreted as representing different phenotypes, the choice of very small values for 𝐷𝜒 cannot represent relatively fast phenotypic changes and restricts the context to mutations at the taxonomic level of species.
  
  We added an explanation in the 3rd paragraph of the section “Modeling spatial assembly of dryland plant communities” of the need to consider mutations and the role they play in our study.
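
  The trait-diffusion term discussed in this exchange (a small biomass flow from a trait class to its lower-biomass neighbor at each time step) can be sketched as an explicit finite-difference step of 𝐷𝜒 ∂²𝐵/∂𝜒². This isolates the diffusion term only; growth, mortality and water dynamics of the full model are deliberately left out, and the no-flux boundaries, grid spacing and time step are assumptions of this sketch.

```python
def trait_diffusion_step(B, D_chi, d_chi, dt):
    """One explicit Euler step of trait diffusion with no-flux boundaries.
    Biomass flows between neighboring trait classes in proportion to their
    difference, so the total biomass is conserved by this term alone.
    Numerical stability requires D_chi * dt / d_chi**2 <= 0.5."""
    new = list(B)
    for i in range(len(B) - 1):
        # Flow from class i to class i + 1 (negative when i + 1 holds more biomass)
        flux = D_chi * (B[i] - B[i + 1]) / d_chi ** 2 * dt
        new[i] -= flux
        new[i + 1] += flux
    return new


# Single resident trait in the middle of the trait axis, small D_chi
B0 = [0.0, 0.0, 1.0, 0.0, 0.0]
B1 = trait_diffusion_step(B0, D_chi=1e-6, d_chi=0.1, dt=1.0)
print(B1, sum(B1))
```

  With a diffusion coefficient of order 10⁻⁶ the flow per step is tiny, consistent with the statement that trait diffusion acts on much longer time scales than environmental filtering and competition.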
  
  Combining the observations from the previous two points, there is a concern that for a given level of precipitation, there is a single trait class with optimal biomass/lowest soil water level that is dominant, with the neighboring trait classes being sustained by the diffusion of biomass from the optimal class to neighboring inferior classes. This would seem a bit problematic, as it would mean that most classes are not a true fit for the environment, and only persist due to the continuous inflow of biomass. Taking a clue from the previous papers of the authors, it seems this may not be the case, though. Specifically, in the paper by Nathan et al. (2016) it seems that all trait classes are started at low initial biomass density, and the resulting steady state (in the absence of biomass flows between classes) seems to show similar biomass profiles as shown in Figs. 4, 5 and 7 of the current paper. While the current model formulation seems slightly different, similar results may apply here. Indeed, keeping all trait classes at non-zero (but low) density and, when the (abiotic and biotic) environment permits, letting each class increase in biomass seems like the most straightforward approach to model community assembly dynamics. Given the above discussion about these trait classes competing for a single resource (soil water), and one trait class being able to drive this resource availability to the lowest level, it would then be useful to readers to explain why multiple trait classes can coexist here, and how (for spatially uniform solutions) the equilibrium soil water level with multiple trait classes present compares to the equilibrium soil water level when only the optimal trait class is present. Furthermore, if results as presented in Nathan et al. (2016) indeed hold in the current case, perhaps it means that the biomass profile responses as shown in e.g. Fig. 5 would also occur if there was no biomass flow between trait classes included, but that the time needed to adjust the profile would be much longer as compared to when the drift term/second trait derivative is included. In summary, further clarification of what the biomass flows between classes represent, and the role they play in driving the presented results, would be useful for readers.
  
  As explained in the reply to previous comments the asymptotic community is tuned by a balance between two slow counter-acting processes, interspecific competition among similar traits and mutations over evolutionarily long time scales. However, the community structure is largely determined by much faster processes of environmental filtering and interspecific competition among widely distinct traits, as all traits are initially present. Indeed, comparing the biomass distributions in new Fig. 3, with and without trait diffusion indicates that the community composition, as measured by 𝜒𝑚𝑎𝑥, is the same. Trait diffusion, however, does affect functional diversity, along with environmental factors. In that sense the emerging community is a true fit for the environment.
  
  We thank the reviewer for these thoughtful comments, which helped us realize that our presentation of these issues was too concise and unclear. We believe that the new extended section on modeling spatial assembly of dryland plant communities, and the new figure 3a clarify these issues.
  
  In addition, it would be useful for readers to understand to what extent the shifts in average trait values and functional diversity can be decoupled from the biomass and soil water responses to changes in precipitation that would occur in a model with only a single biomass variable. For example, early studies on self-organization in semi-arid ecosystems already showed that the shift toward a patterned state involved the formation of patches with higher biomass, and higher soil water availability, as compared to the preceding spatially uniform state, and that the biomass in these patches remains relatively stable under decreasing rainfall, while their geometry changes (e.g. Rietkerk et al. 2002). It has also been observed that for a given environmental condition, biomass in vegetation patches tends to increase with pattern wavelength (e.g. Bastiaansen and Doelman 2018; Bastiaansen et al. 2018). Given the model formulation, one wonders whether higher biomass in the single variable model is not automatically corresponding to higher abundance of faster growing species and a higher functional diversity (as the diffusion of biomass can cover a broader range when starting from higher mass in the optimal trait class). There are some indications in the current work that the linkage is more complicated, for example, the biomass peak in Fig. 7c is lower, but also broader as compared to the distribution of Fig. 7b, but it is currently not entirely clear how this result can be explained (for example, it might be the case that in the spatially patterned states, the biomass profiles also vary in space).
  
  We are not sure we understand what the reviewer means by “decoupled”, but much insight indeed can be gained from a study of a model for a single functional group (trait) and observing the behaviors described by the reviewer. In fact, these behaviors, which some of us are familiar with from numerical studies, motivated parts of the current study. Higher biomass in vegetation patches (compared to uniform vegetation) in the single trait model does not automatically imply a shift to faster growing species; in principle the stress-tolerant species that already reside in the system when uniform vegetation destabilizes to a periodic pattern can simply grow denser. To answer this and additional questions we need to take into account interspecific interactions by studying the full community model. As to Fig. 7b,c, the behavior appears to be opposite to that described by the reviewer: the biomass peak in Fig. 7c is higher and narrower than that in Fig. 7b, not lower and broader. This is because of the much larger domain of the patterned state as compared with that of the uniform state, which increases the abundance of low-𝜒 species, i.e. species investing in growth.
  
  The increase of biomass in vegetation patches with pattern wavelength for given environmental conditions, as observed by Bastiaansen et al. 2018, is actually another mechanism for increasing functional diversity. This is because the water stress at the patch center is higher than that in the outer patch areas and thus forms favorable conditions for stress tolerant species while the outer areas form favorable conditions for fast growing species.
  
  We added a new paragraph in the Discussion and Conclusion section (last paragraph in the subsection Insight III) where we discuss the effect of coexisting periodic patterns of different wavelengths on functional diversity and ecosystem management. We also added citations to the references the reviewer mentioned.
  
  The possibility of hybrid states, where part of the landscape is in a spatially uniform state, while the other part of the landscape is in a patterned state, is quite interesting. To better understand how such states could be leveraged in management strategies, it would be useful if a bit more information could be provided on how these hybrid states emerge, and whether one can anticipate whether a perturbation will grow until a fully patterned state, or whether the expansion will halt at some point, yielding the hybrid state. It seems that being able to distinguish this case would be necessary in the design of planning and management strategies.
  
  The hybrid states appear in the bistability range of the uniform and patterned vegetation states, and typically occupy most of this range. Their appearance is related to the behavior of ‘front pinning’ in bistability ranges of uniform and patterned states in general. Front pinning refers to fronts that separate a uniform domain and a periodic-pattern domain, which remain stationary in a range of a control parameter (precipitation in our case). This is unlike fronts that separate two uniform states, which always propagate in one direction or another and can be stationary only at a single parameter value – the Maxwell point. Thus, an indication that a given landscape may have the whole multitude of hybrid states is the presence of a front (ecotone) that separates uniform and patterned vegetation. If that front appears stationary over long periods of time (on average), this is a strong indication.
  
  We added a new paragraph in the subsection Insight III of the Discussion and conclusion section to clarify this point.
  
  Also, in Fig. 3a, the region of parameter space in which hybrid states occur is not very large; it is not entirely clear whether the full range of hybrid states is left out here for visual considerations, or whether these states only occur within this narrow range in the vicinity of the Turing instability point.
  
  As pointed out in the reply to the previous comment, the hybrid states are limited to the bistability range of uniform and patterned vegetation, which is not wide. However, this should not necessarily restrict management of ecosystem services by nonuniform biomass removal, as such management will have similar effects on community structure also outside the bistability range, where fronts propagate slowly.
  
  The new paragraph we added also addresses this point.
  
  Reviewer #2 (Public Review):
  
  We thank the reviewer for carefully reading the manuscript and for the constructive criticisms and comments. In the following we address them point by point.
  
  1) Model presentation.
  
  It would be better to explain the model in ecological terms first, clarifying parameter biological meaning and justifying their choice. In doing so, creating a specific 'Methods' section, which now is lacking, would be of help too. Authors should clarify whether and how the model follows the conservation of mass principle involving precipitation and evapotranspiration. Are root growth and seed dispersal included for this purpose? Why are they not referred to any further in the analysis and discussion? Why is a specific term for plant transpiration not included, or is it somehow phenomenologically incorporated into the growth-tolerance tradeoff? In doing so, authors should also pay attention to water balance as above (H) and below (W) ground water are not independent from each other.
  
  We added a Methods section, which in eLife is placed at the end of the manuscript. The section includes the model equations and more detailed explanations in ecological terms of various parts of the model. We also added Table 1 with a list of all model parameters, their descriptions, units and numerical values used in the simulations. Placing the model at the end of the manuscript suits the more technical details, but not the essential information needed to understand the results. We therefore kept the subsection “A model for spatial assembly of dryland plant communities” in the Results section, where we present that essential information.
  
  There is no conservation of mass in the model (or in any other model of this kind) simply because the system that we consider is open. In particular, it does not include the atmosphere, which constitutes part of the system’s environment. Including the atmosphere as additional state variables in the model, capturing the feedback of evapotranspiration on the atmosphere, would make the model too complicated for the kind of analysis we perform. So, although the model contains parts that represent mass conservation, such as the terms describing below- and above-ground water transport, water mass is not conserved. The biomass variables represent aboveground biomass of living plants or plant parts and are not conserved either, as biomass production involves biochemical reactions that convert inorganic substances coming from the system’s environment (atmosphere and the soil) into organic ones, while plant mortality involves organic matter that leaves the system.
  
  Roots in the model platform we consider are modeled indirectly through their relation to aboveground biomass. That relation constitutes one of the scale-dependent feedbacks that produce a Turing instability to vegetation patterns, the so-called root-augmentation feedback (see Meron 2019, Physics Today), but in this particular study we eliminate this feedback for simplicity. The scale-dependent feedback that we do consider is the so-called infiltration feedback, associated with biomass-dependent infiltration rate that produces overland water flow towards vegetation patches, as explained in the subsection “A model for spatial assembly of dryland plant communities”. It will be interesting indeed to extend the study in the future to include also the root-augmentation feedback.
  
  We assume short-range seed dispersal and take it into account through biomass “diffusion” terms (obtained as approximations of dispersal kernels assuming narrow kernels). These terms play important roles in the scale-dependent feedback that induces the Turing instability, as is explained in earlier papers which we cite. Plant transpiration is modeled through the water uptake term in the equation for the soil water 𝑊. Indeed, above-ground water 𝐻 and below-ground water 𝑊 are not independent; the infiltration term IH in the equations for both state variables accounts for this dependence in a unidirectional manner (loss of 𝐻 and gain of 𝑊). As we do not include the atmosphere in the model, the other direction, namely, evapotranspiration that increases air humidity and affects rainfall, is not accounted for. The neglect of this effect can be justified for sparse dryland vegetation.
  
  These good points have already been discussed in many earlier papers as well as in the book Nonlinear Physics of Ecosystems (Meron 2015), and we cannot address them all in this paper. We did however add several clarifications in the section Modeling spatial assembly of dryland plant communities and in the new Methods section, including the consideration of the atmosphere as the system’s environment quantified by the precipitation parameter 𝑃.
  
  Another unclear point is that growth rates for the same plant functional groups are assumed to be constant among different species within the same group and are confounded by biomass production. Why is that the case? Furthermore, how many different species are characterizing each functional group? How are interspecific interactions accounted for (more specifically, see comment below)?
  
  In the trait-based approach we focus on just two functional traits, related to growth rate and tolerance to water stress, ignoring differences in other traits that distinguish species. That is, a given functional group consists of species that share the same values of the two selected functional traits (to a given precision determined by 𝑁), taking all other traits represented in the model to be equal. In this approach we do not care about how many species belong to each functional group, only their total biomass. We wish to add that simplifying assumptions of this kind are necessary if we want the model to be mathematically tractable and capable of providing deep insights by mathematical analysis.
  
  We expanded the discussion of the trait-based approach in the section Modeling spatial assembly of dryland plant communities and added relevant references (second paragraph).
  
  Finally, stress tolerance is purely phenomenological. There is no actual mechanism/parameter describing it. Rather, it "simply" appears as low/high mortality, which in turn is said to be due to high/low tolerance. This leads to a sort of circularity between mortality and tolerance. Yet, mortality can occur due to other biophysical factors (e.g. disturbance, fire, herbivory, pathogens). A drawback of this assumption is that a mechanism of drought tolerance is often to invest in belowground organs, including roots. However, according to the proposed model, it turns out that fast growing species with low investment in tolerance also have high investment in roots; vice versa, tolerant species have low investment in roots. This is a bit counterintuitive and not well biologically supported.
  
  First, we agree with the reviewer that our approach is purely phenomenological, as we model tolerance to water stress by a single parameter that lumps together the effects of various physiological mechanisms. That parameter can be distinguished from other factors affecting mortality by regarding the constant 𝑀𝑚𝑎𝑥 in Eq. (3) as representing several contributions. Since we do not study the effects of these other factors, we can absorb them in 𝑀𝑚𝑎𝑥 for mathematical simplicity. Tolerance to water stress is not necessarily associated with roots. Plants can better tolerate water stress by reducing transpiration through stomatal closure, regulating leaf water potential, or developing hydraulically independent multiple stems that lead to a redundancy of independent conduits and higher resistance to drought (see Schenk et al. 2008 - https://doi.org/10.1073/pnas.0804294105).
  
  We added a discussion in the Methods section (5th paragraph, “Tolerance to water stress …”) of the simple form by which we model tolerance to water stress through the mortality parameter.
  
  2) Parameter choice.
  
  N = 128 is an extremely high number for plant functional groups. It is even quite unrealistic to have 128 species per square meter, so this value is not very reasonable. Please run the model and report results with more realistic N (e.g from 4-64) as well as with different sets of N values keeping all other parameters constant.
  
  We wish to clarify two points: 1) N=128 does not imply 128 functional groups per square meter; the emerging community has much lower functional richness (FR), as the average FR is around 0.25, meaning only 128 × 0.25 = 32 functional groups. 2) The model results, as reflected by the key metrics 𝜒𝑚𝑎𝑥, 𝐹𝑅, and 𝐹𝐸, are independent of the particular value of N (for N values sufficiently large), as Figures IA and IB below show. The biomass 𝐵𝑖 of each functional group, however, does change (Figure IA) because by changing N we change the range of traits Δ𝜒 = 1/𝑁 that belong to a given functional group. But if we look at the biomass density in trait space 𝑏𝑖, related to 𝐵𝑖 through the relation 𝐵𝑖 = 𝑏𝑖Δ𝜒, then the biomass density is also independent of 𝑁, as Figure IB shows. So, even if in practice there are fewer functional groups, and thus species, than considered in the model studies, the results are not affected by that. On the other hand, choosing higher 𝑁 values provides smoother curves and a nicer presentation of our results.
  
  Figure IA
  
  Figure IB
  
  We added a discussion of this issue in the Methods section after Eq. (2).
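  The relation 𝐵𝑖 = 𝑏𝑖Δ𝜒 with Δ𝜒 = 1/𝑁 can be illustrated with a minimal sketch. The Gaussian-shaped density below, its centre and width, and the function names are hypothetical choices made only for illustration; they are not the model's output. The sketch shows that the per-group biomass shrinks roughly as 1/𝑁, while the density and the total biomass stay essentially unchanged.

```python
import numpy as np

def group_biomass(density, n_groups):
    """Discretize a trait-space density b(chi) on [0, 1] into N equal-width groups."""
    d_chi = 1.0 / n_groups                       # trait range per functional group
    chi = (np.arange(n_groups) + 0.5) * d_chi    # group midpoints in trait space
    b = density(chi)                             # biomass density b_i
    B = b * d_chi                                # per-group biomass B_i = b_i * d_chi
    return chi, b, B

def example_density(chi):
    # Hypothetical localized distribution (assumption, not taken from the model)
    return np.exp(-((chi - 0.4) ** 2) / (2 * 0.05 ** 2))

results = {}
for n in (32, 64, 128):
    chi, b, B = group_biomass(example_density, n)
    results[n] = {"B_max": B.max(), "b_max": b.max(), "total": B.sum()}
    print(n, results[n])
```

  Comparing the printed values across 𝑁 shows which quantities depend on the discretization (the per-group biomass) and which do not (the density and the total).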
  
  Gamma (rate of water uptake by plants' roots): why is it in that unit of m^2/kg * y? Why are you now considering the area (and not the volume) per biomass unit?
  
  The vegetation pattern formation model we study, like most other models of this kind, does not explicitly capture the soil depth dimension. Accordingly, W is interpreted as the soil-water content in the soil volume below a unit ground area within the reach of the plant roots. In practice W has units of kg/m², like B, and since Γ𝑊𝐵 should have the same units as 𝜕𝑊/𝜕𝑡 (see Eq. 1b), Γ must have the units of (𝐵𝑡)⁻¹.
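  The unit argument can be written out explicitly. The following is a short worked check, assuming time is measured in years (as in the unit quoted by the reviewer) and that both 𝑊 and 𝐵 are in kg/m²:

```latex
\[
[\Gamma W B] = \left[\frac{\partial W}{\partial t}\right]
\;\Longrightarrow\;
[\Gamma]\,\frac{\mathrm{kg}}{\mathrm{m}^2}\,\frac{\mathrm{kg}}{\mathrm{m}^2}
= \frac{\mathrm{kg}}{\mathrm{m}^2\,\mathrm{y}}
\;\Longrightarrow\;
[\Gamma] = \frac{\mathrm{m}^2}{\mathrm{kg}\,\mathrm{y}} .
\]
```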
  
  A is not defined in the text.
  
  We now define it in Table 1 (see Methods section).
  
  M min: why 0.5 mortality? Having M max set to 0.9, please consider a lower mortality value set to 0.1, and please report evidence (hopefully) demonstrating the robustness of results to such change.
  
  The results are robust to the particular values of 𝑀𝑚𝑖𝑛 and 𝑀𝑚𝑎𝑥, except that there are combinations of these two parameters for which the biomass distributions are pushed towards the edge of the 𝜒 domain, which makes the presentation of the results less clear. Figure II shows results of recalculations of the distribution 𝐵 = 𝐵(𝜒) for 𝑀𝑚𝑖𝑛 = 0.1, as requested (using 𝑀𝑚𝑎𝑥 = 0.15), for 3 different precipitation values. As the reviewer can see, there is no qualitative change in the results: lower precipitation pushes a uniform community to stress-tolerant species (higher 𝜒), while the formation of patterns at yet lower precipitation pushes the community back to fast-growing species (low 𝜒).
  
  Figure II
  
  K_min and K_max are in two different units, and should both be kg/m^2.
  
  Thanks, we fixed this typo in Table 1.
  
  Values of precipitation (P, mean annual precipitation) are not reported.
  
  The precipitation parameter is variable, as is now stated in Table 1, and therefore we did not include it in the list of parameter values used. Whenever a particular precipitation value has been used, our intention was to state it in the caption of the corresponding figure. This was done in Figs. 5,6,7, but indeed not in Fig. 4 (Fig. 3 in revised ms.). The insets on the right side of Fig. 3 (Fig. 4 in revised ms.) were also calculated for particular precipitation values, but that information is not essential, as the intention is to show typical forms of the various solution branches, which do not qualitatively change along the branches (i.e. at different P values).
  
  We added the precipitation value (P=180mm/y) at which all the biomass distributions shown in new Fig. 3 (Fig. 4 in original ms) were calculated.
  
  3) Results presentation and interpretation.
  
  Parameter range of precipitation in figure 3 is odd. Why in one case precipitation ranges from 0 to 160 while in another it is only 60-120? Furthermore, in paragraph 198-213 and associated results in fig. 5, the choice of precipitation values is somehow discordant from the previous model. Please provide motivation for this choice, clarify and uniformize it.
  
  In Fig. 3b (Fig. 4b in revised ms) we restricted the precipitation range to 60-120 as the curves, which are limited to 0 < 𝜒 < 1 (by the definition of 𝜒), do not extend to 𝑃 < 60 and to 𝑃 > 120. Extending the range to 0 < 𝑃 < 160 would make the figure less compact, as it would contain blank parts with no information.
  
  We are not sure we understand what the reviewer means by “is somehow discordant from the previous model”. The motivation of the choices we made for the precipitation values P=150, 100 and 80 was to show the shift of a spatially uniform community to a higher 𝜒 value as the precipitation is decreased to a lower value (from 150 to 100), and the shift back to a lower 𝜒 value at yet lower precipitation (80) past the Turing instability.
  
  Finally, authors seem to create confusion around community composition, which is defined as the (taxonomic) identity of all different species inhabiting a community. Notably, it is remarkably different from the x_max parameter used in the model, which as a matter of fact is just the value of the most productive (notably, not necessarily the most abundant) functional group.
  
  We thank the reviewer for this comment. Since all the emerging communities in the model studies are well localized around the value of 𝜒𝑚𝑎𝑥, that value does contain information about the identity of other functional groups in the community when complemented by FR (functional richness) and FE (functional evenness). More significantly for our study, shifts in 𝜒𝑚𝑎𝑥 represent the shifts in community composition we focus on in this study, i.e. shifts towards fast-growing species or towards stress-tolerant species.
  
  We modified the description of the community-level properties that can be derived from the biomass distribution in trait space (see modified text towards the end of the section “Modeling spatial assembly …” and also the caption of Fig. 3b), explaining that both functional diversity and community composition can be described by several metrics, and clarifying the significance of 𝜒𝑚𝑎𝑥 in describing community-composition shifts.
  
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.20.449156v1
www.biorxiv.org www.biorxiv.org

Decoding the genetic and chemical basis of sexual attractiveness in parasitic wasps

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors convincingly show in this study the effects of the fas5 gene on changes in the CHC profile and the importance of these changes toward sexual attractiveness.
  
  The main strength of this study lies in its holistic approach (from genes to behaviour) showing a full and convincing picture of the stated conclusions. The authors succeeded in putting a very interdisciplinary set of experiments together to support the main claims of this manuscript.
  
  We appreciate the kind comments from the reviewer.
  
  The main weakness stems from the lack of transparency behind the statistical analyses conducted in the study. Detailed statistical results are never mentioned in the text, nor is it always clear what was compared to what. I also believe that some tests that were conducted are not adequate for the given data. I am therefore unable to properly assess the significance of the results from the presented information. Nevertheless, the graphical representations are convincing enough for me to believe that a revision of the statistics would not significantly affect the main conclusions of this manuscript.
  
  We apologize for neglecting a detailed description of the statistical tests that were performed. We wrote additional paragraphs in the Methods section specifically explaining the statistical analyses (lines 435-445; 489-502; 559-561; 586-591).
  
  The second major problem I had with the study was how it brushes over the somewhat contradicting results they found in males (Fig S2). These are only mentioned twice in the main text and in both cases as being "similarly affected", even though their own stats seem to indicate otherwise for many of the analysed compound groups. This also should affect the main conclusion concerning the effects of fas5 genes in the discussion, a more careful wording when interpreting the results is therefore necessary.
  
  Thank you for pointing this out. Our focus clearly lay on the female CHC profiles, as a function in sexual signaling has thus far only been described for them. Nevertheless, we have now elaborated on the results and discussion for the fas5 RNAi males (lines 167-178; 258-268).
  
  Reviewer #2 (Public Review):
  
  Insects have long been known to use cuticular hydrocarbons for communication. While the general pathways for hydrocarbon synthesis have been worked out, their specificity and in particular the specificity of the different enzymes involved is surprisingly little understood. Here, the authors convincingly demonstrate that a single fatty acid synthase gene is responsible for a shift in the positions of methyl groups across the entire alkane spectrum of a wasp, and that the wasp males recognize females specifically based on these methyl group positions. The strength of the study is the combination of gene expression manipulations with behavioural observations evaluating the effect of the associated changes in the cuticular hydrocarbon profiles. The authors make sure that the behavioural effect is indeed due to the chemical changes by not only testing live animals, but also dead animals and corpses with manipulated cuticular hydrocarbons.
  
  I find the evidence that the hydrocarbon changes do not affect survival and desiccation resistance less convincing (due to the limited set of conditions and relatively small sample size), but the data presented are certainly congruent with the idea that the methyl alkane changes do not have large effects on desiccation.
  
  We appreciate the kind comments from the reviewer.
  
  Reviewer #3 (Public Review):
  
  In this manuscript, the authors are aiming to demonstrate that a fatty-acyl synthase gene (fas5) is involved in the composition of the blend of surface hydrocarbons of a parasitoid wasp and that it affects the sexual attractiveness of females for males. Overall, the manuscript reads very well, it is very streamlined, and the authors' claims are mostly supported by their experiments and observations.
  
  We appreciate the kind comments from the reviewer.
  
  However, I find that some experiments, information and/or discussion are absent to assess how the effects they observe are, at least in part, not due to other factors than fas5 and the methyl-branched (MB) alkanes. I'm also wondering if what the authors observe is only a change in the sexual attractiveness of females and not related to species recognition as well.
  
  We appreciate the interesting point that the reviewer raises in sexual attractiveness and species recognition and now expand upon this potential aspect in the discussion (lines 327-330). However, in this manuscript, we very much focused on the effect of fas5 knockdown on the conveyance of female sexual attractiveness in a single species (Nasonia vitripennis). Therefore, we argue that species recognition constitutes a different communication modality here, and we currently cannot infer whether and how species recognition is exactly encoded in Nasonia CHC profiles despite some circumstantial evidence for species-specificity (Buellesbach et al. 2013; Mair et al. 2017). Thus, we would like to refrain from any further speculation on species recognition before this can be unambiguously demonstrated, and remain within the mechanism of sexual attractiveness within a single species which we clearly show is mediated by the female MB-alkane fraction governed by the fatty acid synthase genes. We however still consider potential alternative explanations (e.g., n-alkenes acting as a deterrent of homosexual mating attempts).
  
  The authors explore the function of cuticular hydrocarbons (CHCs) and a fatty-acyl synthase in Nasonia vitripennis, a parasitic wasp. Using RNAi, they successfully knockdown the expression of the fas5 gene in wasps. The authors do not justify their choice of fatty-acyl synthase candidate gene. It would have been interesting to know if that is one of many genes they studied or if there was some evidence that drove them to focus their interest in fas5.
  
  In a previous study, 5 fas candidate genes orthologous to Drosophila melanogaster fas genes were identified and mapped in the genome of Nasonia vitripennis (Buellesbach et al. 2022). We actually investigated the effects of all of these fas genes on CHC variation, but only fas5 led to such a striking, traceable pattern shift. We are currently preparing another manuscript discussing the effects of the other fas genes, but decided to focus exclusively on fas5 here, due to its significance for revealing how sexual attractiveness can be encoded and conveyed in complex chemical profiles, maintained and governed by a surprisingly simple genetic basis.
  
  The authors observe large changes in the cuticular hydrocarbon (CHC) profiles of males and females. These changes are mostly a reduction of some MB alkanes and an increase in others, as well as an increase of n-alkenes in fas5 knockdown females. For male fas5 knockdowns, the overall quantity of CHC is increased and consequently, multiple types of compounds are increased compared to wild-type, with only one compound appearing to decrease compared to wild-type. Insects are known to rely on ratios of compounds in blends to recognize odors. Authors address this by showing a plot of the relative ratios, but it seems to me that they do not show statistical tests of those changes in the proportions of the different types of compounds. In the results section, the authors give percentages while referring to figures showing the absolute amount of CHCs. They should also test if the ratios are significantly different or not between experimental conditions. Similar data should be displayed for the males as well.
  
  We appreciate your suggestions. We kindly refer you to our response to reviewer 1, where we addressed the statistical tests. Specifically, we generated separate subplots to display the proportions of different compound classes and performed statistical tests to compare these proportions between different treatments for both males and females. Additionally, we have revised the results section to replace relative abundances with absolute quantity, as depicted in Figure 2C-G.
  
  Furthermore, the authors didn't use an internal standard to measure the quantity of CHCs in the extracts, which, to me, is the gold standard in the field. If I understood correctly, the authors check the abundance measured for known quantities of n-alkanes. I'm sure this method is fine, but I would have liked to be reassured that the quantities measured through this method are good by either testing some samples with an internal standard, or referring to work that demonstrates that this method is always accurate to assess the quantities of CHC in extracts of known volumes.
  
  We actually did include 7.5 ng/μl dodecane (C12) as an “internal” standard in the hexane resuspensions of all of our processed samples (line 456, Materials and Methods). This was primarily done to allow for visually inspecting and comparing the congruence of all chromatograms in the subsequent data analysis and to immediately detect any variation from sample preparation, the injection process and instrument fluctuation. In our study, we use a very elaborate and standardized CHC extraction method in which the volume of solvent and the duration of extraction are strictly controlled to minimize variation from the sample preparation steps. Furthermore, we calibrated each individual CHC compound quantity with a dilution series of external standards (C21-C40) of known concentration. By constructing a calibration curve based on this dilution series, we achieved the most accurate compound quantification, also taking into account and counteracting the generally diminishing quantities of compounds with higher chain lengths.
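  The idea of quantifying a compound from an external-standard dilution series can be sketched as follows. This is only an illustration under stated assumptions: the linear detector response, the concentrations, the peak areas and the function names are all hypothetical, and the response does not specify the form of the calibration curve that was actually fitted.

```python
import numpy as np

def fit_calibration(concentrations, peak_areas):
    """Least-squares line: peak_area = slope * concentration + intercept."""
    slope, intercept = np.polyfit(concentrations, peak_areas, deg=1)
    return slope, intercept

def quantify(peak_area, slope, intercept):
    """Invert the fitted line to estimate a concentration from a measured peak area."""
    return (peak_area - intercept) / slope

# Hypothetical dilution series for one external standard (values are made up)
std_conc = np.array([1.0, 2.5, 5.0, 10.0, 20.0])   # ng/ul
std_area = 1200.0 * std_conc + 300.0               # assumed linear response

slope, intercept = fit_calibration(std_conc, std_area)
estimate = quantify(9300.0, slope, intercept)      # peak area of an unknown sample
print(estimate)
```

  In such a scheme, one curve would be fitted per standard, so that chain-length-dependent differences in detector response are absorbed into each compound's own slope and intercept.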
  
  The authors provide a sensible control for their RNAi experiments: targeting an unrelated gene, absent in N. vitripennis (the GFP). This allows us to see if the injection of RNAi might affect CHC profiles, which it appears to do in some cases in males, but not in females. The authors also show to the reader that their RNAi experiments do reduce the expression of the target gene. However, one of the caveats of their experiments, is that the authors don't provide evidence or information to allow the (non-expert) reader to assess whether the fas5 RNAi experiments did affect the expression of other fatty-acyl synthase genes. I'm not an expert in RNAi, so maybe this suggestion is not relevant, but it should, at least, be addressed somewhere in the manuscript that such off-target effects are very unlikely or impossible, in that case, or more generally.
  
  We acknowledge the reviewer’s concern about potential off-target effects of the fas5 knockdown. We actually did check initially for off-target effects on the other four previously published fas genes in N. vitripennis (Lammers et al. 2019; Buellesbach et al. 2022) and did not find any effects on their respective expression levels. We now include these results as supplementary data (Figure 2-figure supplement 1). However, as mentioned in the cover letter to the editor, we discovered a previously uncharacterized fas gene in the most recent N. vitripennis genome assembly (NC_045761.1), fas6, most likely constituting a tandem gene duplication of fas5. These two genes turned out to have such high sequence similarity (> 90 %, Figure 2-figure supplement 2) that both were simultaneously downregulated by our fas5 dsRNAi construct, which we confirmed with qPCR and have now incorporated into our manuscript (Fig. 2H). Therefore, we now explicitly mention that the knockdown affects both genes, and either one or both could have the observed phenotypic effects. Recognizing this RNAi off-target effect, we have now also incorporated a discussion of this issue in the appropriate section of the manuscript (lines 364-377), as well as of the potential off-target effects of our GFP dsRNAi controls (lines 262-274).
  
  The authors observe that the modified CHCs profiles of RNAi females reduce courtship and copulation attempts, but not antennation, by males toward live and (dead) dummy females. They show that the MB alkanes of the CHC profile are sufficient to elicit sexual behaviors from males towards dummy females and that the same fraction from extracts of fas5 knockdown females does so significantly less. From the previous data, it seems that dummy females with fas5 female's MB alkanes profile elicit more antennation than CHC-cleared dummy females, but the authors do not display data for this type of target on the figure for MB alkane behavioral experiments.
  
  Actually, similar proportions of males performed antennation behavior towards female dummies with the MB alkane fraction of fas5 RNAi females and towards CHC-cleared female dummies (55% and 50%, respectively; see Author response image 1 for the corresponding parts of the sub-figures 3 E and 4 D). We did not deem it necessary to show the same data on CHC-cleared female dummies in Figure 3 as well.
  
  Author response image 1.
  
  Unfortunately, the authors don't present experiments testing the effect of the non-MB alkanes fractions of the CHC extracts on male behavior toward females. As such, they are not able to (and didn't) conclude that the MB-alkane is necessary to trigger the sexual behaviors of males. I believe testing this would have significantly enhanced the significance of this work. I would also have found it interesting for the authors to comment on whether they observe aggressive behavior of males towards females (live or dead) and/or whether such behavior is expected or not in inter-individual interactions in parasitoids wasps.
  
  In our experiment, we focus on the function of the MB-alkane fraction in female CHC profiles, and we comprehensively demonstrate in figure 4 that the MB-alkane fraction from WT females alone is sufficient to trigger mating behavior consistent with that elicited by live females and untreated female dummies. Therefore, we do not completely understand the reviewer’s concern about us not being “able to (and didn't) conclude that the MB-alkane is necessary to trigger the sexual behaviors of males”. We appreciate the suggestion from the reviewer of testing the non-MB alkanes (n-alkanes and n-alkenes). However, due to the experimental procedure of separating the CHC compound class fractions through elution with molecular sieves, it was not possible for us to retrieve either the whole n-alkane or n-alkene fraction, which remained bound to the sieves after separation. The role of n-alkenes in N. vitripennis is however considered in the discussion, as a deterrent for homosexual interactions between males (Wang et al. 2022a). Moreover, we did not observe aggressive behavior of males towards live or dead females.
  
  CHCs are used by insects to signal and/or recognize various traits of targets of interest, including species or groups of origin, fertility, etc. The authors claim that their experiments show the sexual attractiveness of females can be encoded in the specific ratio of MB alkanes. While I understand how they come to this conclusion, I am somewhat concerned. The authors very quickly discuss their results in light of the literature about the role of CHCs (and notably MB alkanes) in various recognition behaviors in Hymenoptera, including conspecific recognition. Previous work (cited by the authors) has shown that males recognize males from females using an alkene (Z9C31). As such, it remains possible that the "sexual attractiveness" of N. vitripennis females for males relies on them not being males and being from the right species as well. The authors do not address the question of whether the CHCs (and the MB alkanes in particular) of females signal their sex or their species. While I acknowledge that responding to this question is beyond the scope of this work, I also strongly believe that it should be discussed in the manuscript. Otherwise, non-specialist readers would not be able to understand what I believe is one of the points that could temper the conclusions from this work.
  
  We acknowledge the reviewer’s insight about the MB alkanes in signaling sex or species in N. vitripennis, and now include this aspect in our revised discussion (lines 324-330). Moreover, we clearly demonstrate that n-alkenes have been reduced to minute trace components after our compound class separation, and the males still do not display courtship and copulation behaviors similar to those displayed towards WT females. This strongly indicates that the n-alkenes do not play a role when males rely solely on the changed MB-alkane patterns, further strengthening our main argument.
  
  References
  
  Benjamini, Y. and D. Yekutieli. 2001. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29:1165-1188.
  
  Buellesbach, J., J. Gadau, L. W. Beukeboom, F. Echinger, R. Raychoudhury, J. H. Werren, and T. Schmitt. 2013. Cuticular hydrocarbon divergence in the jewel wasp Nasonia: Evolutionary shifts in chemical communication channels? J. Evol. Biol. 26:2467-2478.
  
  Buellesbach, J., C. Greim, and T. Schmitt. 2014. Asymmetric interspecific mating behavior reflects incomplete prezygotic isolation in the jewel wasp genus Nasonia. Ethology 120:834-843.
  
  Buellesbach, J., H. Holze, L. Schrader, J. Liebig, T. Schmitt, J. Gadau, and O. Niehuis. 2022. Genetic and genomic architecture of species-specific cuticular hydrocarbon variation in parasitoid wasps. Proc. R. Soc. B 289:20220336.
  
  Engl, T., N. Eberl, C. Gorse, T. Krüger, T. H. P. Schmidt, R. Plarre, C. Adler, and M. Kaltenpoth. 2018. Ancient symbiosis confers desiccation resistance to stored grain pest beetles. Mol. Ecol. 27:2095-2108.
  
  Ferveur, J. F., J. Cortot, K. Rihani, M. Cobb, and C. Everaerts. 2018. Desiccation resistance: effect of cuticular hydrocarbons and water content in Drosophila melanogaster adults. PeerJ 6.
  
  Lammers, M., K. Kraaijeveld, J. Mariën, and J. Ellers. 2019. Gene expression changes associated with the evolutionary loss of a metabolic trait: lack of lipogenesis in parasitoids. BMC Genom. 20:309.
  
  Mair, M. M., V. Kmezic, S. Huber, B. A. Pannebakker, and J. Ruther. 2017. The chemical basis of mate recognition in two parasitoid wasp species of the genus Nasonia. Entomol. Exp. Appl. 164:1-15.
  
  Wang, Y., W. Sun, S. Fleischmann, J. G. Millar, J. Ruther, and E. C. Verhulst. 2022a. Silencing Doublesex expression triggers three-level pheromonal feminization in Nasonia vitripennis males. Proc. R. Soc. B 289:20212002.
  
  Wang, Z., J. P. Receveur, J. Pu, H. Cong, C. Richards, M. Liang, and H. Chung. 2022b. Desiccation resistance differences in Drosophila species can be largely explained by variations in cuticular hydrocarbons. eLife 11:e80859.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.01.09.523239v1
www.biorxiv.org

New submission 09/09/2022, 13:51:14

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In this manuscript, the authors investigate the genes involved in the retention of eggs in Aedes aegypti females. They do so by identifying two candidate genes that are differentially expressed across the different reproductive phases and also show that the transcripts of those two genes are present in ovaries and in the proteome. Overall, I think this is interesting and impressive work that characterizes the function of those two specific protein-coding genes thoroughly. I also really enjoyed the figures. Although they were a bit packed, the visuals made it easy to follow the authors' arguments. I have a few concerns and suggested changes, listed below.
  
  1) These two genes/loci are definitely rapidly evolving. However, that does not automatically imply that positive selection has occurred in these genes. Clearly, you have demonstrated that these gene sequences might be important for fitness in Aedes aegypti. However, if these happen to be disordered proteins, then they would evolve rapidly, i.e., under fewer sequence constraints. In such a scenario, dN/dS values are likely to be high. Another possibility is that as these are expressed only in one tissue and most likely not expressed constitutively, they could be under relaxed constraints relative to all other genes in the genome. For instance, we know that average expression levels of protein-coding genes are highly correlated with their rate of molecular evolution (Drummond et al., 2005). Moreover, there have clearly been genome rearrangements and/or insertion/deletions in the studied gene sequences between closely-related species (as you have nicely shown), thus again dN/dS values will naturally be high. Thus, high values of dN/dS are neither surprising nor do they directly imply positive selection in this case. If the authors really want to investigate this further, they can use the McDonald Kreitman test (McDonald and Kreitman 1991) to ask if non-synonymous divergence is higher than expected. However, this test would require population-level data. Alternatively, the authors can simply discuss adaptation as a possibility along with the others suggested above. A discussion of alternative hypotheses is extremely important and must be clearly laid out.
  
  We agree with the reviewer’s point that rapid evolution is not the same as positive selection. We also agree with the reviewer’s point that the McDonald-Kreitman test (MK test) is more powerful than dN/dS analysis. We took advantage of a large population dataset from Rose et al. 2020. After filtering the data, we kept 454 genomes for MK tests. We found that both genes are marginally significant or marginally non-significant (tweedledee p = 0.068; tweedledum p = 0.048), even though these are small genes with low Pn values. This suggests that the genes likely evolve under positive selection.
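
  For readers unfamiliar with the MK test mentioned above, a minimal sketch follows. It contrasts non-synonymous and synonymous counts for fixed differences (Dn, Ds) and polymorphisms (Pn, Ps) in a 2x2 table. The counts are hypothetical and are not the values behind the reported p-values; the use of a two-sided Fisher's exact test is an assumption, since the response does not state which test was applied to the table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Return (neutrality index, alpha, p-value); assumes dn, ds and ps are non-zero."""
    ni = (pn / ps) / (dn / ds)   # NI < 1 indicates an excess of non-synonymous divergence
    alpha = 1 - ni               # rough estimate of the adaptive fraction of substitutions
    return ni, alpha, fisher_exact_two_sided(dn, ds, pn, ps)


# Hypothetical counts for one small gene (illustration only).
ni, alpha, p = mk_test(dn=12, ds=4, pn=2, ps=8)
print(f"NI={ni:.3f} alpha={alpha:.3f} p={p:.4f}")
```

  With few polymorphic sites (low Pn and Ps), the table has little power, which is consistent with the authors' remark that small genes make significance hard to reach.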
  
  In line with the reviewer’s suggestion, we performed another analysis using a large amount of population data. We asked if the SNP frequencies of tweedledee and tweedledum are correlated with environmental variables. We found that when compared to a distribution of 10,000 simulated genes with randomly-sampled genetic variants, both tweedledee and tweedledum showed significant correlation to multiple ecological variables reflecting climate variability, such as mean diurnal range, temperature seasonality, and precipitation seasonality (p<0.05). These results are now incorporated into the manuscript in Figure 5 and Figure 5 – Figure supplement 1.
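
  The comparison against 10,000 simulated genes described above amounts to an empirical p-value. The sketch below is a rough illustration under stated assumptions: the test statistic (an absolute correlation), the null distribution, and the observed value are all hypothetical stand-ins, and the add-one correction is a common convention that the response does not confirm.

```python
import random


def empirical_p_value(observed, null_values):
    """One-sided empirical p-value: share of null statistics >= observed.

    The +1 in numerator and denominator keeps the estimate above zero
    when no simulated value reaches the observed one.
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Hypothetical null: absolute SNP-environment correlations for 10,000
# simulated genes built from randomly sampled variants.
rng = random.Random(0)
null_stats = [abs(rng.gauss(0.0, 0.1)) for _ in range(10_000)]

observed_stat = 0.35  # hypothetical statistic for a gene of interest
p_emp = empirical_p_value(observed_stat, null_stats)
print(f"empirical p = {p_emp:.4f}")
```

  Because several ecological variables were tested, a multiple-testing correction would normally be layered on top of p-values computed this way.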
  
  2) The authors show that the two genes under study are important for the retention of viable eggs. However, as these genes are close to two other conserved genes (scratch and peritrophin-like gene), it is unclear to me how it is possible to rule out the contribution of the conserved genes to the same phenotype. Is it possible that the CRISPR deletion leads to the disruption of expression of one of the other important genes nearby (i.e., in a scratch or peritrophin-like gene) as the deleted region could have included a promoter region for instance, which is causing the phenotype you observe? Since all of these genes are so close to each other, it is possible that they are co-regulated and that tweedledee and tweedledum are expressed and translated along with the scratch and peritrophin-like gene. Do we know whether their expression patterns diverge and that scratch and peritrophin-like genes do not play a role in the retention of viable eggs?
  
  This is a fair criticism; however, we think the chance that the phenotypes are caused by interrupting nearby genes is very low. First, peritrophin-like acts in the immune response, and scratch is a brain-biased transcription factor. Neither of the genes shows expression in the ovary before or after blood feeding (genes with TPM < 1 or 2 are generally considered unexpressed, while scratch and peritrophin-like expression levels are overall lower than 0.1 TPM).
  
  This suggests that peritrophin-like and scratch are not likely to function in the ovary. Thus, although we cannot completely rule out that the gene knockout impacts regulation of very distant genes, it is unlikely. Given the mounting evidence we show in this manuscript that tweedledee and tweedledum are highly translated in the ovary after blood feeding, under the principle of parsimony, we expect that the phenotypes came from knocking out the highly expressed and translated genes.
  
  Reviewer #2 (Public Review):
  
  This manuscript is overall quite convincing, presenting a well-thought-out approach to candidate gene detection and systemic follow-ups on two genes that meet their candidate gene criteria. There are several major claims made by the authors, and some have more compelling evidence than others, but in general, the conclusions are quite sound. My main issues stem from how the strategy to identify genes playing a role in egg retention success has led to very particular genes being examined, and so I question some of the elements of the discussion focusing on the rapid evolution and taxon-uniqueness of the identified genes. In short, while I believe the authors have demonstrated that tweedledee and tweedledum play an important role in egg retention, I'm not sure whether this study should be taken as evidence that taxon-specific or rapidly evolving genes, in general, are responsible for this adaptation, or simply play an important role in it.
  
  We have revised the paper to make it clearer that the focus is indeed on these two genes and not on the greater question of taxon-specific or rapidly-evolving genes.
  
  First, the authors present evidence that Aedes aegypti females can retain eggs when a source of fresh water is lacking, confirming that females are not attracted to human forearms while retaining eggs and that up to 70% of the retained eggs hatch after retaining them for nearly a week. This ability is likely an important adaptation that allows Aedes aegypti to thrive in a broad range of conditions. The data here seem fairly compelling.
  
  Based on this observation, the authors reason that genes responsible for the ability to retain eggs must: 1) be highly expressed in ovaries during retention, but not before or after. 2) be taxon-specific (as this behavior seems limited to Aedes aegypti). While this approach to enriching candidate genes has proven fruitful in this particular case, I'm not sure I agree with the authors' rationale. First, even genes at a low expression in the ovaries may be crucial to egg retention. Second, while egg-laying behavior is vastly varied in insects, I'm not sure focusing on taxon-restricted genes is necessary. It is entirely possible that many of the genes identified in Figure 2E play a crucial role in egg retention evolution. These are minor issues, but they are relevant to some later points made by the authors.
  
  We regret framing the discovery of tweedledee and tweedledum in the original submission using this somewhat artificial set of filtering criteria. The reality is that the genes caught our attention for their novel sequence, tight genetic linkage, and interesting expression profile. That really is the focus of the paper, not these other peripheral questions that have been the focus of attention of the reviews. We really do apologize for all of the confusion about what this paper is about.
  
  Nonetheless, the authors provide very compelling evidence that the two genes meeting their criteria - tweedledee and tweedledum, play an important role in egg retention. The genes seem to be expressed primarily in ovaries during egg retention (some observed expression in brain/testes is expected for any gene), and the proteins they code seem to be found in elevated quantities in both ovaries and hemolymph during and immediately after egg retention. RNA for the genes is detected in follicles within the ovary, and CRISPR knockouts of both the genes lead to a large decrease in egg viability post retention.
  
  My earlier qualms about their search strategy carry over into some issues with Figure 4, which describes how the two genes are 1) taxon-restricted and 2) have evolved very rapidly. Neither of the two statements is unexpected given the authors' search strategy. Of course, the genes examined precisely for their lack of homologs do not have any homologs. Similarly, by limiting themselves to genes that show a lack of homology (i.e. low sequence similarity) to other genes as well as genes with high expression levels in the ovaries, a higher rate of evolution is almost inevitable to infer (as ovary expressed genes tend to evolve more rapidly in mosquitoes). I agree with the authors that inferences of the evolutionary history of these genes are quite difficult because of their uniqueness, and I especially appreciate their attempts to identify homologs (although I really dislike the term "conceptualog").
  
  We have removed our term “conceptualog” and replaced it with the more conventional “putative ortholog”.
  
  This leads to my main (fairly minor) issue of the paper - the discussion on the evolutionary history of these genes and its implications (sections "Taxon-restricted genes underlie tailored adaptations in a diverse world" and "Evolutionary histories and catering to different natural histories"). As noted, inferring this history is very difficult because the authors have focused on two rapidly evolving, taxon-restricted genes. The analyses they have performed here definitely demonstrate that the genes play an important role in egg retention, however, they do not show that taxon-restricted genes play a disproportionate role in egg retention evolution. Indeed, the only data relevant to this point would be the proportion of genes in Figure 2E that are taxon-restricted (3/9), but I'm not sure what the null expectation for this proportion for highly expressed ovary genes is to begin with. Furthermore, the extremely rapid evolution of this gene makes it hard to judge how truly taxon-restricted it is. My own search of tweedle homologs identified multiple as previously having been predicted to be "Knr4/Smi1-like", and while no similar genes are located in a similar location in melanogaster, there is generally little synteny conservation in Drosophila (for instance Bhutkar et al 2008), so I'm unsure what can really be said about their evolutionary origins/lack of homologs in Drosophila.
  
  In short - the manuscript makes clear that tweedledee and tweedledum play an important role in egg retention in A. aegypti, nonetheless, it is not clear that this is a demonstration of how important taxon-restricted genes are to understanding the evolution of life-history strategies.
  
  Again, we should have never framed the paper the way we did in the original version. We make no claims whatsoever that taxon-restricted genes in general should play a role in this biology, only that the two candidate genes under study influence egg viability after extended retention. We hope that the framing is clearer in this revision.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.01.482582v1
www.biorxiv.org

Gene age predicts the transcriptional landscape of sexual morphogenesis in multicellular fungi

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This study sought to systematically identify the components and driving forces of transcriptome evolution in fungi that exhibit complex multicellularity (CM). The authors examined a series of parameters or expression signatures (i.e. natural antisense transcripts, allele-specific expression, RNA-editing) concluding that the best predictor of a gene behavior in the CM transcriptome was evolutionary age.
  
  Thus, the transcriptomes of fruiting bodies showed a distinct gene-age-related stratification, where it was possible to sort out genes related to general sexual processes from those likely linked to morphogenetic aspects of the CM fruiting bodies. Notably, their results did not support a developmental hourglass, which is the rather predominant hypothesis in metazoans, including some analysis in fungi.
  
  The studies involved analyses of new transcriptomic datasets for different developmental stages (and tissue types in some cases) of Pleurotus ostreatus and Pterula gracilis, as well as the analyses of existing datasets for other fungi.
  
  There are diverse interesting observations such as ones regarding Allele Specific Expression (ASE), suggesting that in P. ostreatus ASE mainly occurs due to cis-regulatory allele divergence, possibly in fast evolving genes that are not under strong selection constraints, such as those grouped in the youngest gene age categories. In addition, a large number of conserved unannotated genes among CM-specific orthogroups highlights the rather cryptic nature of CM in fungi and emerges as an important area for future research.
  
  Some of the key aspects of the analyses would need to be better exemplified such as:
  
  – Providing a better description of the developmentally expressed TFs only in CM species
  
  – Providing clear examples of the promoter divergence that could be the underlying mechanism behind ASE. In particular, for some cases, there may be enough information in the literature/databases to predict the appearance or disappearance of relevant cis-elements in the promoters showing the highest divergence in genes depicting the highest levels of ASE.
  
  We appreciate the constructive comments of the Reviewer and have revised the ms in accordance with the suggestions. In particular, we linked different parts of the ms better to each other and provided a more detailed discussion of developmentally expressed TFs (lines 615-621). We also provide case studies of ASE genes with cis-regulatory divergence (Figure 5 and see below), although we note that these analyses are based on inferred and not directly determined motifs, so they should be considered as preliminary.
  
  We had considered using TF binding motifs previously, and we have now attempted to analyze potential transcription factor binding sites in divergent promoters. We find that there are no P. ostreatus transcription factors for which motifs based on direct evidence are available; rather, all P. ostreatus motifs are based on extrapolations from experimentally determined motifs (typically in Neurospora crassa). Therefore, to avoid overly general motifs, we used only those where at least 5 nucleotides show at least 80% expected frequency in the PWMs. This left us with 158 motifs (126 excluded). A high motif binding score (>=4) and self-rate (>=0.9) were also required to exclude false positive hits. Different binding ability and lack of binding in one of the parental genomes were counted for each promoter. We found that genes with allele specific expression (ASE S2 and S4) show significantly higher differences in motif binding (lacking motifs, or different binding ability) than non-ASE genes (Fig. A1). These observations show that not only promoter divergence but also differential predicted TF binding ability is more common among ASE genes than among non-ASE genes. This supports our conjecture that ASE arises from cis-regulatory divergence.
  
  Fig A1: The left plot below shows the number of cases when the promoter of one allele of an allele pair in the two parent genomes has, but the other lacks a motif. The right plot shows the same in terms of difference in binding score.
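
  The motif pre-filter described above (at least 5 positions at 80% or higher frequency) can be sketched roughly as follows. This is one reading of the criterion, not the authors' pipeline: each PWM is assumed to be a list of per-position nucleotide frequency tables, a position counts as informative when its most frequent nucleotide reaches the threshold, and both example motifs are hypothetical.

```python
def informative_positions(pwm, min_freq=0.8):
    """Count motif positions whose most frequent nucleotide reaches min_freq."""
    return sum(1 for column in pwm if max(column.values()) >= min_freq)


def keep_motif(pwm, min_positions=5, min_freq=0.8):
    """Keep a motif only if enough of its positions are well defined."""
    return informative_positions(pwm, min_freq) >= min_positions


# Hypothetical columns: one strongly constrained, one uninformative.
STRONG = {"A": 0.90, "C": 0.04, "G": 0.03, "T": 0.03}
FLAT = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# Hypothetical motifs (names and contents are illustrative only).
motifs = {
    "specific_motif": [STRONG] * 6 + [FLAT] * 2,  # 6 informative positions
    "general_motif": [STRONG] * 3 + [FLAT] * 5,   # 3 informative positions
}

kept = [name for name, pwm in motifs.items() if keep_motif(pwm)]
print(kept)
```

  A filter of this kind removes short or degenerate motifs that would match almost any promoter, which is the stated reason for excluding 126 of the motifs.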
  
  We could find examples, such as the allele specific expression of PleosPC15_2_1031042, a Hemerythrin-like (IPR012312) protein which might be regulated by the conserved c2h2 transcription factor, containing a zinc finger domain of the C2H2 type (Fig. A2). C2h2 has already been shown, using targeted gene inactivation, to be important during the initiation of primordia formation (Ohm et al 2011, https://pubmed.ncbi.nlm.nih.gov/21815946/). A binding site of c2h2 was detected in the upstream region of PleosPC15_2_1031042. There is a mismatch in the inferred binding motif which causes a reduced binding score in PC15 (Fig. A2/c). Indeed, the PC9 nuclei contribute more to the total expression of this gene.
  
  Despite this, and other (not shown) examples that we have found, we were not convinced about the reliability of this approach. There are many assumptions and uncertainties in this analysis: the positional weight matrices (PWMs) that we used are all based on indirect evidence, these PWMs identify a high number of loci, the position of the binding site relative to the transcriptional start site is uncertain, and so is the relation between differences in binding motif and expression changes. We consider these factors to potentially contribute too much noise for the analyses to be robust; therefore, we are hesitant to include these results in the ms.
  
  Fig A2: An example of promoter divergence. a) Expression of c2h2 transcription factor (TF) in P. ostreatus. b) Allele-specific expression pattern of PleosPC15_2_1031042 from the two parental genomes. c) Inferred binding motif of c2h2 TF and a detected potential binding site in the upstream region of the PleosPC15_2_1031042 gene.
  
  Reviewer #2 (Public Review):
  
  The evolution of complex multicellularity represents a major developmental reprogramming, and comparing related species which differ in multicellular structures may shed light on the mechanisms involved. Here, the authors compare species of Basidiomycete fungi and focus on analyzing developmental transcriptomes to identify patterns across species. Deep RNA-Seq data is generated for two species, P. ostreatus and Pt. gracilis, sampling different developmental stages. The authors report conflicting evidence for a "developmental hourglass" using a weighted transcription index vs gene age categories. There is substantial allele-specific expression in P. ostreatus, and these genes tend to have a more recent origin, have more divergent upstream regions and coding sequences, and are enriched for developmentally regulated transcripts. Antisense transcripts have low overlap with coding regions and low conservation, and a subset show a positive or negative correlation with the overlapping gene. Comparison to a species without complex multicellular development is used to further classify the developmental program.
  
  Overall the new transcriptional data and extensive analysis provide a thorough view of the types of transcripts that appear differentially regulated, their age, and associated gene function enrichment. The gene sets identified from this analysis as well as the potential to re-analyze this data will be useful to the community studying multicellularity in fungi. The primary insights drawn in this study relate to the dating of the developmental transcriptome, however some patterns observed with young genes and noncoding transcripts are primarily reflective of expected patterns of evolutionary time.
  
  We appreciate the Reviewer’s kind words about our ms; we think the revised version is substantially improved in many of the aspects listed above.
  
  Reviewer #3 (Public Review):
  
  Fungi are unique in forming complex 3D multicellular reproductive structures from 2D mycelium filaments, a property used in this paper to study the genetic changes associated with the evolution of complex 3D multicellularity. The manuscript by Merenyi et al. investigates the evolution of gene expression and genome regulation during the formation of reproductive structures (fruiting bodies) in the Agaricomycetes lineage of Fungi. Transcriptome and multicellularity evolution are very exciting fundamental questions in biology that only become accessible with recent technological developments and the appropriate analysis framework. Important perspectives include understanding how genes acquire new functions and what role transcriptional regulation plays in adaptation. The study gathers a very useful dataset to this end, and relies on generally relevant hypothesis-driven analyses.
  
  Analysis of fruiting body transcriptome in nine species revealed that the prediction from the developmental hourglass model (that young genes are expressed in early and late but not intermediate phases of development) was verified only in a few species, including Pleurotus ostreatus. An allele-specific expression (ASE) analysis in P. ostreatus showed that young genes frequently show ASE during fruiting body development. A comparative analysis with C. neoformans, which reproduces sexually without forming a fruiting body, indicates that young and old (but not intermediate) genes are likely involved specifically in fruiting body morphogenesis. A number of underlying hypotheses could be presented better, and importantly the connection between the various analyses did not appear obvious to me. Some hypotheses and reasoning therefore need clarification. Some important results from the analyses are not provided and not commented on, although they are required to fully meet the manuscript's objectives.
  
  We appreciate the Reviewer’s suggestions and have revised the ms as explained below.
  
  I do not clearly see the connection between the developmental hourglass model studied in the first part of the ms, and the allele-specific expression patterns in the second half of the ms. Which "phase" of the hourglass is expected to contain true CM-related genes (by contrast to general sexual processes)? Was P. ostreatus chosen for the ASE analysis because evidence for a developmental hourglass pattern was detected in this species? The conclusion that "evolutionary age predicts, to a large extent, the behaviour of a gene in the CM transcriptome" was established thanks to ASE in P. ostreatus, which was also found to be rather an exception for conforming to the hourglass model of developmental evolution. To what extent is this conclusion transferable to other Agaricomycete/fungal species?
  
  We chose P. ostreatus because this is the only species for which the genomes of both parental strains (PC9 and PC15) are available. Although the hourglass concept is indeed a central hypothesis in animal developmental biology (though see recent critiques, e.g., Piasecka et al 2013), our results suggest that it simply does not generally apply to fungal development. This may be due to the unique developmental mechanisms of fungi, or the independent origin(s) of CM in fungi. Our ms might have been misleading in this respect; in the revision we clarify that the ASE and hourglass analyses are independent of each other. Our interpretation of the hourglass results is that this model is not or hardly applicable to fungal development, and the fact that P. ostreatus was the only species that in fact showed an hourglass did not drive our selection of this species. We inserted a note on this in the ms.
  
  The authors acknowledge that fruiting body-expressed genes may relate either to CM or to more general sexual functions, and that disentangling these functions is a major challenge in their study. An overview of which gene was assigned to which function is not explicit in the ms (proposed to be described in a separate publication). Do these functional gene classes show distinct transcriptome evolution patterns (hourglass model, ASE...)?
  
  We made accessible the complete list of CM-related genes and genes with more general sexual functions in Table S2/b-c. Due to length restrictions, we do not discuss many or each of these genes here, but provide gene ontology-based overviews (Fig 8/c-d, from line 631). To answer the question of whether CM vs shared genes show distinct transcriptomic patterns, we analyzed ASE, NATs and the hourglass model separately for CM-specific and shared genes, as follows:
  
  -hourglass: We calculated and visualised the TAI for CM-specific and Shared gene sets of P. ostreatus separately. The average value of TAI decreased a lot in Shared genes, possibly due to the overrepresentation of ancient genes here, but the patterns remained similar to the original, which implies that these patterns are not driven simply by one or the other gene set (Fig A3).
  
  Fig A3: Transcriptome Age Index for CM-specific and Shared gene sets of P. ostreatus separately
  
  -ASE: As we detailed in the ms, allele specific expression occurs mainly in young genes. Indeed, only 13.1% of ASE genes belong to the conserved gene sets (CM-specific: 200 and Shared: 144). Although there are more ASE genes (>2FC) among CM-specific genes, they are still underrepresented compared to young genes that are neither shared, nor CM-specific. This indicates that ASE is generally a feature of non-conserved genes and is not particularly characteristic for either conserved or CM-specific genes.
  
  -NAT: We found that 17.3% of CM-specific (141 genes) and 18.3% of Shared genes (165 genes) overlap with antisense transcripts. Since these numbers don't differ substantially from 17.6%, which is the proportion of NATs corresponding to all protein coding genes, it implies an independent occurrence between NATs and these gene conservation groups.
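
  For the hourglass item in the list above, the Transcriptome Age Index (TAI) of a developmental stage is commonly defined as the expression-weighted mean gene age (phylostratum). A minimal sketch under that standard definition follows; the phylostratum numbering (1 = oldest), the stage names, and all expression values are hypothetical, and the authors' exact normalisation is not stated in the response.

```python
def transcriptome_age_index(phylostrata, expression):
    """Expression-weighted mean phylostratum for one developmental stage.

    phylostrata: gene age ranks (here 1 = oldest; larger = younger).
    expression:  expression levels of the same genes, in the same order.
    """
    if len(phylostrata) != len(expression):
        raise ValueError("phylostrata and expression must have equal length")
    total = sum(expression)
    if total <= 0:
        raise ValueError("total expression must be positive")
    return sum(ps * e for ps, e in zip(phylostrata, expression)) / total


# Hypothetical gene ages and per-stage expression values.
phylostrata = [1, 1, 2, 8, 10]
expression_by_stage = {
    "mycelium": [50.0, 40.0, 30.0, 5.0, 5.0],
    "primordium": [20.0, 20.0, 20.0, 20.0, 20.0],
}

tai = {
    stage: transcriptome_age_index(phylostrata, values)
    for stage, values in expression_by_stage.items()
}

# Restricting to the three oldest genes lowers the index, mirroring the
# drop in average TAI reported for the Shared gene set.
tai_old_only = transcriptome_age_index(phylostrata[:3], [20.0, 20.0, 20.0])
print(tai, tai_old_only)
```

  A lower TAI means that the stage's transcriptome is dominated by older genes; an hourglass pattern would appear as a dip in TAI at intermediate stages.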
  
  3.) As far as I understand, major functions of the fruiting body transcriptome are either CM or general sexual functions. Could these genes, notably those showing ASE, play a role in general processes other than sexual development (hyphal growth, environment sensing, cell homeostasis, pathogenicity)?
  
  Certainly, ASE might also occur in genes related to these processes. However, the processes mentioned by the Reviewer are likely associated with very conserved genes (except pathogenicity, which we can’t examine here) and our results suggest that ASE is more typical of young genes that are under weak selection. We detected ASE in 931/343 genes (S2/S4) expressed in the vegetative mycelium stage of P. ostreatus. We also note that by the definition of developmentally regulated genes, we do not expect very basic fungal processes, like hyphal growth, to be among the functions of the genes we identified. Genes related to such basic (housekeeping) processes usually (exceptions exist) show flat expression profiles (because they are equally important in mycelia and all fruiting body stages) and will not be picked up by our pipelines for identifying shared developmentally regulated genes.
  
  As stated by the authors, "the goal of this study was to systematically tease apart the components and driving forces of transcriptome evolution in CM fungi". What drives the interesting ASE pattern discovered however remains an open question at the end of the ms. The authors appropriately discuss that these patterns could be either adaptive or neutral but there is no direct evidence for any scenario in P. ostreatus. Is the expression of (some of) the young genes showing ASE required for CM? One or two case studies would provide support for such scenarios.
  
  We respectfully disagree. We provide evidence that the driving force of ASE is promoter divergence (and consequently differential transcription factor binding) in genes in which it is tolerated (see conclusions, lines 708-712). Our results suggest that ASE is mostly a neutrally arising phenomenon. To get to the mechanistic bases of how promoter divergence can cause ASE (following the suggestion of Reviewer 1), we analysed putative, inferred transcription factor binding motifs in P. ostreatus and found that ASE genes had more divergence in putative TF binding sites. However, it is important to emphasise that all TF motifs we analyzed are inferred motifs and therefore these results are indicative at best.
  
  Reviewer #4 (Public Review):
  
  This work develops a comparative framework to test genes which support complex morphological structures and complex multicellularity. This expands beyond simple gene sharing and phylogenomics by incorporating comparison of gene expression profiling of development of multicellular structures during sexual reproduction. This approach tests the hypothesis that genes underlying sexual reproductive structure formation are homologous and the molecular evolutionary processes that control transcriptome evolution which underlie complex multicellularity.
  
  The approaches used are appropriate and employ modern comparative and transcriptome analyses to examine allele-specific expression and evaluate the evolutionary ages of genes. This work produced additional new RNAseq data to examine developmental processes and combined them with existing published data to contrast fungi with either complex morphologies or yeast forms.
  
  The strengths of work are well selected comparison organisms and efforts to have developmental stages which are appropriate comparisons.
  
  We appreciate the Reviewer’s positive comments.
  
  A weakness is that, while the NAT descriptions are interesting, it isn't clear how they link directly to morphology variation or development. I am unclear if these are arising from new de novo promoters, are ferried by transposable elements, or if any other understanding of their genesis indicates they are more than very recent gains in a species for the most part and not part of any conserved developmental process (outside a few exemplars).
  
  Originally, we assayed natural antisense transcripts (NAT) based on the assumption that they regulate developmental processes (e.g. Kim et al 2018 https://doi.org/10.1128/mBio.01292-18). Our analyses showed that although NATs are abundant in CM transcriptomes of fungi, they show no homology across species and so are unlikely to drive conserved developmental processes, which we are after in this ms. Rather, our data are compatible with most (but likely not all) NATs being transcriptional noise, arising from novel or random promoters. We therefore shortened this section and moved much of it to the Appendix 1.
  
  The impact of this work will reside in how gene age intersects with variability and relative importance in CM. It will be interesting to see future work examine the functions of these genes and test how allele-specific expression and specific alleles are contributing to the formation of these tissues and growth forms. I am still not sure of the molecular mechanisms by which high variability in gene expression still produces relatively uniform morphologies; if it does not, a quantification of morphological variation would be nice, to link it to whether ASE underlies that variation.
  
  We agree that allele specific expression could influence morphologies significantly, but investigating that is beyond the scope of the current work (it would require a population genomics project). More direct evidence on allelic differences can be seen in monokaryon phenotypes, which only express one of the parental alleles. Phenotypic differences are obvious in the mycelium of the two parental monokaryons: the mycelium of PC9 is more fluffy and grows faster than that of PC15. This was reported recently by Lee et al 2021 (https://doi.org/10.1093/g3journal/jkaa008). We agree with the Reviewer that this is a very exciting future research direction.
  
  To my read of the work, the authors achieved their goals and confirmed hypotheses about the age of genes and the variability of gene expression. I still feel there is some clarity lost in whether the findings across the large number of species compared here help inform predictions or classifications of types of genes which either have ASE or are implicated in CM. This is really work for the future as the authors have provided a detailed analysis and approach that can fuel further direction in this research area.
  
  To address this issue we reworked the ms to make connections between ASE and CM clearer. Because ASE appears based on our results to (mostly) arise neutrally, predictions for other species are expected to be hard. On the other hand, we think we can make confident predictions on what types of genes are implicated in CM in other species, at least for conserved aspects of fruiting body development.
  
  scietyType:AuthorResponse
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.04.447176v1
www.biorxiv.org www.biorxiv.org

Task-dependent and automatic tracking of hierarchical linguistic structure

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In this study, the authors set out to clarify the relationship between brain oscillations and different levels of speech (syllables, words, phrases) using MEG. They presented word lists and sentences and used task instructions to attempt to focus listeners' attention on different levels of linguistic analysis (syllables, words, phrases).
  
  1) I came away with mixed feelings about the task design: following each stimulus (sentence or word list), participants were asked to (a) press a button (i.e. nothing related to what they heard), (b) indicate which of two syllables was heard, (c) indicate which of two words was heard, (d) indicate which pair of words was present in the correct order. This task is the critical manipulation in the study, as it is intended to encourage (or in the authors' words, "require") participants to focus on different timescales of speech (syllable, word, and phrase, respectively). I very much like the idea of keeping the physical stimuli unchanged, and manipulating attention through task demands - an elegant and effective approach. At the same time, I have reservations about the degree to which these task instructions altered attention during listening. My intuition is that, if I were a participant, I would just listen attentively, and then answer the question about the specific level. For example, I don't know that knowing I would be doing a "word pair" task, I would be attending at a slower rate than a "word" task, as in both cases I would be motivated to understand all of the words in the sentence. I fully acknowledge my introspection (n=1) may be flawed here, but nevertheless, any additional support validating the effect of these instructions would help the interpretation of the MEG results.
  
  The reviewer points out that to do any task on sentences (such as a word task and a syllable task) participants’ strategy could be to understand the full meaning of the sentence and infer the lower level properties based on the understanding of the full sentence. We fully share this introspection, which would suggest that extracting sentence meaning is partly automatic (or at least a default mode of processing) and independent of the behavioral relevance. While the reviewer sees this as a downside of the design, this is part of what our study tried to disentangle (automatic versus task-dependent processing at lower frequency time-scales). If, as the reviewer points out, all processing of sentences were automatic, we should not find any effect of task (as the task should not affect the tracking response at all). We found that overall the tracking response is robust to task-induced manipulation of attention: the main effect that MI to phrases is higher for sentences than for word lists is robust across passive and task conditions. But that is not the whole story on the source level, where we do find some task effects, which indicates that task instructions do matter. This means that participants changed their strategy depending on the instructions, but that overall, tracking of linguistic structures such as phrases is automatic. We show that in the IFG, phrasal time scales are tracked more strongly (as measured with MI) during the phrase task than during the other tasks. This is also reflected in stronger STG-IFG connectivity during the phrasal versus passive task. These results speak against the reviewer's interpretation that the task instructions did not alter “attention during listening”. While there are these subtle task differences, especially in IFG, overall our findings do speak for an automatic tracking of phrasal rate structure in sentences independent of task.
  We therefore concluded that “automatic understanding of linguistic information, and all the processing that this entails, cannot be countered to substantially change the consequences for neural readout, even when explicitly instructing participants to pay attention to particular time-scales” (line 548-549).
  
  The analysis steps generally seem sensible and well-suited to answering the main claims of the study. Controlling for power differences between conditions through matching was a nice feature.
  
  2) I had a concern about accuracy differences (as seen in Figure 1) across stimulus materials and tasks. In particular, for the phrase task, participants were more accurate for sentence stimuli than word list stimuli. I think this makes a lot of sense, as a coherent sentence will be easier to remember in order than a list of words. But, I did not see accuracy taken into account in any of the analyses. These behavioral differences raise the possibility that the MEG results related to the sentence > word list contrast in phrases (which seems one of the most interesting findings in IFG) simply reflect differences in accuracy.
  
  With the caveat of the concern regarding accuracy differences, the research goals were clear and the conclusions were generally supported by the analyses.
  
  Thank you for pointing this out. We have now taken accuracy into account in our analysis. It did not change any of our main findings or conclusions, and strengthened the argument that tracking of phrases in sentences vs. word lists is stronger. The influence of task difficulty is a relevant point to investigate (also see point 1 of reviewer 2 and point 4 of reviewer 3). To do so we added accuracy (per participant per condition) as a factor in the mixed model (as well as all interactions with task and condition) for the MI, power, and connectivity analyses at the phrasal rate/delta band. Note that as for the passive task there is no accuracy, we removed the passive task from the analyses. We could also only run models with random intercepts (not random slopes), due to the reduced number of degrees of freedom when adding the factor accuracy to the models.
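
As a reading aid, the model structure described above (accuracy added as a factor together with all its interactions with task and condition, random intercepts only) can be written out as follows. This is a sketch under assumptions: the text does not state the grouping factor or the coding of the predictors, so the participant-level intercept $u_i$ and the symbols $T$, $C$, $A$ are illustrative notation, not the authors' specification.

```latex
% y_{itc}: MEG measure (MI, power, or WPLI) for participant i, task t, condition c
% T: task (passive task excluded), C: condition (sentence vs. word list),
% A: accuracy per participant per condition (assumed to enter as a covariate)
\begin{aligned}
y_{itc} &= \beta_0 + \beta_T[t] + \beta_C[c] + \beta_A A_{itc} \\
        &\quad + \beta_{TC}[t,c] + \beta_{TA}[t]\,A_{itc} + \beta_{CA}[c]\,A_{itc}
               + \beta_{TCA}[t,c]\,A_{itc} \\
        &\quad + u_i + \varepsilon_{itc},
\qquad u_i \sim \mathcal{N}(0,\sigma_u^2), \quad
\varepsilon_{itc} \sim \mathcal{N}(0,\sigma^2)
\end{aligned}
```

Only $u_i$ varies by participant; a random-slope model would additionally let coefficients such as $\beta_C$ vary by participant, which is what the reduced degrees of freedom ruled out.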
  
  For the MI analysis we only found an effect in MTG. Specifically, there was a three-way interaction between task, condition and accuracy (F(2, 91.9) = 3.4591, p = 0.036). To follow up on this three-way interaction we split the data per task. The condition*accuracy interaction was only (uncorrected) significant for the word combination task (F(1,24.8) = 5.296, p = 0.03 (uncorrected)) and not for any other task (p>0.1). In the word combination task, we found that the difference between sentences and word lists was the strongest at high accuracies (see the figure below for the predicted values of the model). One way to interpret this finding is that stronger phrasal-rate MI tracking in MTG promotes phrasal-rate processing (as indicated by accuracy) more in sentences than in word lists.
  
  MEG – behavioral performance relation. A) Predicted values for the phrasal band MI in the MTG for the word combination task separately for the two conditions. B) Predicted values for the delta band WPLI in the STG-MTG connection separately for the two conditions. Error bars indicate the 95% confidence interval of the fit. Colored lines at the bottom indicate individual datapoints.
  
  For power we did not find any effect of accuracy. For the connectivity analysis we found in the STG-MTG connectivity a significant condition*accuracy interaction (F(1, 80.23)=5.19, p = 0.025). The condition*accuracy interaction showed that lower accuracies were generally associated with stronger differences between the sentences and word lists (see figure; the opposite of the MI analysis). Thus, functional connections in the delta band are stronger during sentence processing when participants have difficulty with the task (independent of the task performed). This could indicate that low-frequency connections are more relevant for the sentence than the word list condition (as the reviewer also indicated in point 1).
  
  After correcting for accuracy there was also a significant task*condition interaction (F(2,80.01) = 3.348, p = 0.040) and a main effect of condition (F(1,80.361) = 5.809, p = 0.018). While overall there was a stronger WPLI for the sentence compared to the word list condition, the interaction seemed to indicate that this was especially the case during the word task (p = 0.005 corrected), but not for the other tasks (p>0.1).
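
For readers less familiar with the connectivity measure reported here, the following is a minimal sketch of the debiased squared WPLI estimator in its standard form (pairwise products of the imaginary parts of single-trial cross-spectra, excluding each trial's product with itself). It is not the authors' pipeline: the function name and the input format (one complex cross-spectrum value per trial at a single frequency) are assumptions made for illustration.

```python
import math


def debiased_wpli_squared(cross_spectra):
    """Debiased squared WPLI from single-trial cross-spectra (complex numbers).

    Uses the identity sum_{j != k} a_j * a_k = (sum a)^2 - sum(a^2)
    to avoid an explicit double loop over trial pairs.
    """
    imag = [complex(c).imag for c in cross_spectra]
    sum_imag = sum(imag)
    sum_abs = sum(abs(v) for v in imag)
    sum_sq = sum(v * v for v in imag)
    denominator = sum_abs ** 2 - sum_sq
    if denominator == 0:
        # Fewer than two trials with a non-zero imaginary part: undefined.
        return math.nan
    return (sum_imag ** 2 - sum_sq) / denominator


# Consistent phase lead across trials: all imaginary parts share one sign.
consistent = debiased_wpli_squared([1 + 1j, 2 + 0.5j, 3 + 2j])
# Lead and lag cancel exactly: the debiased estimate can fall below zero.
cancelling = debiased_wpli_squared([0 + 1j, 0 - 1j])
```

Because the self-products are removed, the estimate is not bounded below by zero, which is why small negative values can appear when there is no consistent phase lead or lag.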
  
  We added the results of the accuracy analyses in the main manuscript as well as adding a dedicated section in our discussion section (page 21-22). Adding accuracy did not remove any of the effects we report in the original analyses. Therefore, none of these findings change the interpretation of the results as the task still had an influence on the MI responses of MTG and IFG. The effect of accuracy in the MTG refined the results showing that the effect was strongest there for participants with high accuracies. This relationship suggests a functional role of tracking through phase alignment for understanding phrasal structure.
  
  The methods now read: “MEG-behavioural performance analysis: To investigate the relation between the MEG measures and the behavioural performance we repeated the analyses (MI, power, and connectivity) but added accuracy as a factor (together with the interactions with the task and condition factor). As there is no accuracy for the passive task, we removed this task from the analysis. We then followed the same analysis steps as before. Since we reduced our degrees of freedom, we could however only create random intercept and not random slope models”.
  
  The results now read: “MEG-behavioural performance relation. We found for the MI analysis a significant effect of accuracy only in the MTG. Here, we found a three-way accuracy*task*condition interaction (F(2, 91.9) = 3.459, p = 0.036). Splitting up for the three different tasks we found only an uncorrected significant effect for the condition*accuracy interaction for the phrasal task (F(1, 24.8) = 5.296, p = 0.03) and not for the other two tasks (p>0.1). In the phrasal task, we found that when accuracy was high, there was a stronger difference between the sentence and the word list condition compared to when accuracy was low, with stronger accuracy for the sentence condition (Figure 7A).
  
  No relation between accuracy and power was found. For the connectivity analysis we found a significant condition*accuracy interaction for the STG-MTG connection (F(1,80.23) = 5.19, p = 0.025; Figure 7B). Independent of task, when accuracy was low the difference between sentence and word lists was stronger with higher WPLI fits for the sentence condition. After correcting for accuracy there was also a significant task*condition interaction (F(2,80.01) = 3.348, p = 0.040) and a main effect of condition (F(1,80.361) = 5.809, p = 0.018). While overall there was a stronger WPLI for the sentence compared to the word list condition, the interaction seemed to indicate that this was especially the case during the word task (p = 0.005), but not for the other tasks (p>0.1).”
  
  The discussion now reads: “We found that across participants both the MI and the connectivity in temporal cortex influenced behavioural performance. Specifically, MTG-STG connections were, independent of task, related to accuracy. There was higher connectivity between MTG and STG for sentences compared to word lists at low accuracies. At high accuracies, we found stronger MTG tracking at phrasal rates (measured with MI) for sentences compared to word lists during the word combination task. These results suggest that indeed tracking of phrasal structure in MTG is relevant to understand sentences compared to word lists. This was reflected in a general increase in delta connectivity differences when the task was difficult (Figure 7B). Participants might compensate for the difficulty using phrasal structure present in the sentence condition. When phrasal structure in sentences is accurately tracked (as measured with MI) performance is better when these rates are relevant (Figure 7A). These results point to a role for phrasal tracking for accurately understanding the higher order linguistic structure in sentences even though more research is needed to verify this. It is evident that the connectivity and tracking correlations to behaviour do not explain all variation in the behavioural performance (compare Figure 1 with 3). Plainly, temporal tracking does not explain everything in language processing. Besides tracking there are many other components important for our designated tasks, such as memory load and semantic context which are not captured by our current analyses.”
  
  Reviewer #2 (Public Review):
  
  In a MEG study, the authors investigate as their main question whether neural tracking at the phrasal time scale reflects linguistic structure building (testing different conditions: sentences vs. word-lists) or an attentional focus on the phrasal time scale (testing different tasks, passive listening, syllable task, word task, word combination/phrasal scale task). They perform the following analyses at brain areas (ROIs: STG, IFG, MTG) of the language network: (1) Mutual information (MI) between the acoustic envelope and the delta band neuronal signals is analyzed. (2) Power in the delta band is analyzed. (3) Connectivity is analyzed using debiased WPLI. For all analyses, linear mixed-models are separately conducted for each ROI. The main finding is that the sentence compared to the word-list condition is more strongly tracked at the phrasal scale (MI). In STG the effect was task-independent; in MTG the effect only occurred for active tasks; and in IFG additionally, the word-combining/phrasal scale task resulted in higher tracking compared to all other tasks. The authors conclude that phrasal scale neural tracking reflects linguistic processing which takes place automatically, while task-related attention contributes additionally at IFG (interpreted as combinatorial hub involved in language and non-language processing). The findings are stable when power differences are controlled. The connectivity analysis showed increased connectivity in the delta band (phrasal time scale) between IFG-STG in the phrasal-scale compared to the passive task (adding to the IFG MI findings). (Additionally, they separately analyze neural tracking at the syllabic and word time scale, which however is not in the main focus).
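
To make the first analysis in the summary above concrete, here is a minimal sketch of a mutual information estimate between two signals, for example a band-limited speech envelope and a band-limited neural time course. It uses a generic rank-binned (equal-count) histogram estimator chosen for illustration only; the estimator actually used in the study is not specified in this text, and the function names and the default of four bins are assumptions.

```python
import math
from collections import Counter


def rank_bin(values, n_bins):
    """Assign each sample to one of n_bins equal-count bins by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        bins[index] = rank * n_bins // len(values)
    return bins


def mutual_information(x, y, n_bins=4):
    """Binned mutual information (in bits) between two equal-length signals."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    n = len(x)
    bx, by = rank_bin(x, n_bins), rank_bin(y, n_bins)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with counts
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


# A signal shares all of its (binned) information with itself.
identical = mutual_information(list(range(8)), list(range(8)), n_bins=4)
# Two signals whose binned values are independent share none.
independent = mutual_information([0, 1, 2, 3], [0, 2, 1, 3], n_bins=2)
```

Binned estimators of this kind are biased upward for short signals, so in practice a bias-corrected or parametric estimator is usually preferred; the sketch only shows what quantity is being compared across conditions.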
  
  Major strength/weaknesses of the methods and results:
  
  1) A major strength of the results is that part of them replicate the authors' earlier findings (i.e. higher tracking at the phrasal time scale for sentences compared to word-lists; Kaufeld et al., 2020), while they complement this earlier work by showing that the effects are due to linguistic processing and not to an attentional focus on the phrasal time scale due to the task (at least in STG and MTG; while the task plays a role for the IFG tracking). Another strength is that a power control analysis is applied, which allows excluding spurious results due to condition differences in power. A weakness of the method is that analyses were applied separately per ROI, and combined across correct/incorrect trials (if I understood correctly), no trial-based analysis was conducted (which is related to how MI is computed). Furthermore, several methodological details could be clarified in the manuscript.
  
  The authors achieved their aims by providing evidence that neuronal tracking at the phrasal time scale in STG and MTG depends on the presence of linguistic information at this scale rather than indicating an attentional focus on this time scale due to a specific task. Their results support the conclusion. Results would be strengthened by showing that these effects are not impacted by different amounts of correct/incorrect trials across conditions (if I understood that correctly).
  
  We thank the reviewer for her comments. It is correct that we collapsed across the correct and incorrect trials. This had various reasons (also see point 2 and 9 of reviewer 1 and point 4 of reviewer 3). First, our tasks function solely to direct participants’ attention to the various linguistic representations (syllables, words, phrases) and the timescales that they occur on. The three tasks are in a sense more control tasks to study the tracking response, and manipulate attention as tracking during spoken language comprehension occurs, rather than a case where the neural response to the tasks is itself to be studied. For example, in a typical working memory paradigm, it is only during correct trials that the relevant cognitive process occurs. In contrast, in our paradigm, it is likely that spoken stimuli are heard and processed, in other words, that sentence comprehension and word list perception occur, even during incorrect trials in the syllable condition. As such, we do not expect MI tracking responses to explain the behavioral data. However, we agree it is crucially important to show that MI differences are not a function of task performance differences.
  
  Second, there are clear differences in difficulty level of the trials within conditions. For example, if the target question was related to the last part of the audio fragment, the task was much easier than when it was at the beginning of the audio fragment. In the syllable task, if syllables also were (by chance) a part-word, the trial was also much easier. If we were to split up in correct and incorrect we would not really infer solely processes due to accurately processing the speech fragments, but would also confound the analysis with the individual difficulty level of the trials.
  
  To acknowledge this, we added this limitation to the methods. The methods now read: “Note that different trials within a task were not matched for task difficulty. For example, in the syllable task syllables that make a word are much easier to recognize than syllables that do not make a word. Additionally, trials pertaining to the beginning of the sentence are more difficult than ones related to the end of the sentence due to recency effects.”.
  
  To still investigate if overall accuracy influenced the results we did add accuracy (across participants) to the mixed models. Note that as for the passive task there is no accuracy, we removed the passive task from the analyses. We could also only run models with random intercepts (not random slopes), due to the reduced number of degrees of freedom when adding the factor accuracy to the models.
  
  For the MI analysis we only found an effect in MTG. Specifically, there was a three-way interaction between task, condition and accuracy (F(2, 91.9) = 3.4591, p = 0.036). To follow up on this three-way interaction we split the data per task. The condition*accuracy interaction was only (uncorrected) significant for the word combination task (F(1,24.8) = 5.296, p = 0.03 (uncorrected)) and not for any other task (p>0.1). In the word combination task, we found that the difference between sentences and word lists was the strongest at high accuracies (see the attached figure for the predicted values of the model). One way to interpret this finding is that stronger phrasal-rate MI tracking in MTG promotes phrasal-rate processing (as indicated by accuracy) more in sentences than in word lists.
  
  For power we did not find any effect of accuracy. For the connectivity analysis we found in the STG-MTG connectivity a significant condition*accuracy interaction (F(1, 80.23)=5.19, p = 0.025). The condition*accuracy interaction showed that lower accuracies were generally associated with stronger differences between the sentences and word lists (see figure below; the opposite of the MI analysis). Thus, functional connections in the delta band are stronger during sentence processing when participants have difficulty with the task (independent of the task performed). This could indicate that low-frequency connections are more relevant for the sentence than the word list condition.
  
  MEG – behavioral performance relation. A) Predicted values for the phrasal band MI in the MTG for the word combination task separately for the two conditions. B) Predicted values for the delta band WPLI in the STG-MTG connection separately for the two conditions. Error bars indicate the 95% confidence interval of the fit. Colored lines at the bottom indicate individual datapoints.
  
  After correcting for accuracy there was also a significant task*condition interaction (F(2,80.01) = 3.348, p = 0.040) and a main effect of condition (F(1,80.361) = 5.809, p = 0.018). While overall there was a stronger WPLI for the sentence compared to the word list condition, the interaction seemed to indicate that this was especially the case during the word task (p = 0.005 corrected), but not for the other tasks (p>0.1).
  
  We added the results of the accuracy analyses in the main manuscript as well as adding a dedicated section in our discussion section (page 21-22). Adding accuracy did not remove any of the effects we report in the original analyses. Therefore, none of these findings change the interpretation of the results as the task still had an influence on the MI responses of MTG and IFG. The effect of accuracy in the MTG refined the results showing that the effect was strongest there for participants with high accuracies. This relationship suggests a functional role of tracking through phase alignment for understanding phrasal structure.
  
  The methods now read: “MEG-behavioural performance analysis: To investigate the relation between the MEG measures and the behavioural performance we repeated the analyses (MI, power, and connectivity) but added accuracy as a factor (together with the interactions with the task and condition factor). As there is no accuracy for the passive task, we removed this task from the analysis. We then followed the same analysis steps as before. Since we reduced our degrees of freedom, we could however only create random intercept and not random slope models”.
  
  The results now read: “MEG-behavioural performance relation. We found for the MI analysis a significant effect of accuracy only in the MTG. Here, we found a three-way accuracy*task*condition interaction (F(2, 91.9) = 3.459, p = 0.036). Splitting up for the three different tasks we found only an uncorrected significant effect for the condition*accuracy interaction for the phrasal task (F(1, 24.8) = 5.296, p = 0.03) and not for the other two tasks (p>0.1). In the phrasal task, we found that when accuracy was high, there was a stronger difference between the sentence and the word list condition compared to when accuracy was low, with stronger accuracy for the sentence condition (Figure 7A).
  
  No relation between accuracy and power was found. For the connectivity analysis we found a significant condition*accuracy interaction for the STG-MTG connection (F(1,80.23) = 5.19, p = 0.025; Figure 7B). Independent of task, when accuracy was low the difference between sentence and word lists was stronger with higher WPLI fits for the sentence condition. After correcting for accuracy there was also a significant task*condition interaction (F(2,80.01) = 3.348, p = 0.040) and a main effect of condition (F(1,80.361) = 5.809, p = 0.018). While overall there was a stronger WPLI for the sentence compared to the word list condition, the interaction seemed to indicate that this was especially the case during the word task (p = 0.005), but not for the other tasks (p>0.1).”
  
  The discussion now reads: “We found that across participants both the MI and the connectivity in temporal cortex influenced behavioural performance. Specifically, MTG-STG connections were, independent of task, related to accuracy. There was higher connectivity between MTG and STG for sentences compared to word lists at low accuracies. At high accuracies, we found stronger MTG tracking at phrasal rates (measured with MI) for sentences compared to word lists during the word combination task. These results suggest that indeed tracking of phrasal structure in MTG is relevant to understand sentences compared to word lists. This was reflected in a general increase in delta connectivity differences when the task was difficult (Figure 7B). Participants might compensate for the difficulty using phrasal structure present in the sentence condition. When phrasal structure in sentences is accurately tracked (as measured with MI) performance is better when these rates are relevant (Figure 7A). These results point to a role for phrasal tracking for accurately understanding the higher order linguistic structure in sentences even though more research is needed to verify this. It is evident that the connectivity and tracking correlations to behaviour do not explain all variation in the behavioural performance (compare Figure 1 with 3). Plainly, temporal tracking does not explain everything in language processing. Besides tracking there are many other components important for our designated tasks, such as memory load and semantic context which are not captured by our current analyses.”
  
  The findings are an important contribution to the ongoing debate in the field whether neuronal tracking at the phrasal time scale indicates linguistic structure processing or more general processes (e.g. chunking).
  
  Reviewer #3 (Public Review):
  
  This manuscript presents a MEG study aiming to investigate whether neural tracking of phrasal timescales depends on automatic language processing or specific tasks related to temporal attention. The authors collected MEG data of 20 participants as they listened to naturally spoken sentences or word lists during four different tasks (passive listening vs. syllable task vs. word task vs. phrase task). Based on mutual information and connectivity analysis, the authors found that (1) neural tracking at the phrasal band (0.8-1.1 Hz) was significantly stronger for the sentence condition compared to the word list condition across the classical language network, i.e., STG, MTG, and IFG; (2) neural tracking at the phrasal band was (at least at trend level) stronger for the phrase task than for the other tasks in the IFG; (3) the IFG-STG connectivity was increased in the delta-band for the phrase task. Ultimately, the authors concluded that neural tracking of phrasal timescales relied on both automatic language processing and specific tasks.
  
  Overall, this study is trying to tackle an interesting question related to the contributing factors for neural tracking of linguistic structures. The study procedure and analyses are well executed, and the conclusions of this paper are mostly well supported by data. However, I do have several major concerns.
  
  The title of the manuscript uses the description "tracking of hierarchical linguistic structure". In general, hierarchical linguistic structures involve multiple linguistic units with different timescales, such as syllables, words, phrases, and sentences. In this study, however, the main analysis only focused on the phrasal band (0.8-1.1 Hz). It seemed that there was no significant stimulus- or task-effect on the word band or syllabic band (supplementary figures). Therefore, it is highly recommended that the authors modify the related descriptions, or explain why neural tracking of phrases can represent neural tracking of hierarchical linguistic structures in the current study.
  
  We thank the reviewer for this comment. We meant to refer to the task manipulation directing attention to different levels of representation across the linguistic hierarchy. We have changed the title to “Neural tracking of phrases during spoken language comprehension is automatic and task-dependent.” We hope this resolves any inadvertent confusion we created. Furthermore, throughout the manuscript we now make sure to refer to effects occurring for phrasal tracking at low frequency bands and not across any hierarchical linguistic structure. We agree that our findings cannot speak for any task-dependent effects along the hierarchy, only that at the phrasal level there is a difference between sentences and word lists.
  
  In Methods, the authors employed MI analyses on three frequency bands: 0.8-1.1 Hz for the phrasal band, 1.9-2.8 Hz for the word band, and 3.5-5.0 Hz for the syllabic band (line 191-192). As the timescales of linguistic units are varied and overlapping in natural speech, I wonder how the authors define the boundaries of these frequency bands, and whether these bands are proper for the naturally spoken stimuli in the current study. These important details should be clarified.
  
  The frequency bands of the MI analysis were based on the stimuli; in other words, they are data driven. They reflect the syllabic, word, and phrasal rates in our stimulus set (calculated in Kaufeld et al., 2020). They were calculated by annotating the sentences for syllables, words, and phrases and converting the rate of the linguistic units to frequency ranges. This information has been added to the manuscript. We acknowledge that, unlike in our stimulus set, in natural speech the boundaries of these bands can overlap, and we now also state this (“While in our stimulus set the boundaries of the linguistic levels did not overlap, in natural speech the brain has an even more difficult task as there is no one-to-one match between band and linguistic unit [26]”, lines 211-213).
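
As a rough illustration of how annotated unit counts could be converted into a frequency band, the sketch below computes a per-sentence rate (units per second) and takes the range across sentences. The function name, the example annotations, and the use of a simple min-max range are assumptions made for illustration only; the actual procedure is the one described in Kaufeld et al. (2020).

```python
def rate_band(unit_counts, durations_s):
    """Return the (lowest, highest) unit rate in Hz across sentences.

    unit_counts[i] is the number of annotated units (e.g. phrases) in
    sentence i; durations_s[i] is that sentence's duration in seconds.
    """
    if len(unit_counts) != len(durations_s):
        raise ValueError("need exactly one duration per sentence")
    # Rate of a linguistic unit = number of units / elapsed time.
    rates = [n / d for n, d in zip(unit_counts, durations_s)]
    return min(rates), max(rates)


# Hypothetical annotations for three sentences (not the study's data).
phrase_counts = [2, 3, 3]
durations = [2.5, 3.0, 2.75]

low, high = rate_band(phrase_counts, durations)
print(f"phrasal band: {low:.2f}-{high:.2f} Hz")
```

Running the same function on syllable or word counts would give the corresponding syllabic and word bands.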
  
  What is missing in the manuscript are the explanations of the correlation between behavioral performance and neural tracking. In Results, the behavioral performance shows significant differences across the active tasks (Figure 1), but the MI differences across the tasks are relatively weak in IFG (Figure 3). In addition, the behavioral performance only shows significant differences between the sentence and word list conditions during the phrasal task, but the MI differences between the conditions are significant in MTG during the syllabic, word, and phrasal tasks. Explanations for these inconsistent results are expected.
  
  We answer this point together with point 4 below where we analyze the behavioral performance and the MEG responses.
  
  Since the behavioral performance of these active tasks is likely related to the temporal attention to relevant timescales of different linguistic units, I wonder whether there exist underlying neural correlates of behavioral performance (e.g., significant correlation between performance and mutual information). If so, it may be interesting and bring a new bright spot for the current study.
  
  The influence of task difficulty is a relevant point to investigate (also see point 1 of reviewer 2 and point 4 of reviewer 3). To do so we added accuracy (per participant per condition) as a factor in the mixed model (as well as all interactions with task and condition) for the MI, power, and connectivity analyses at the phrasal rate/delta band. Note that as for the passive task there is no accuracy, we removed the passive task from the analyses. We could also only run models with random intercepts (not random slopes), due to the reduced number of degrees of freedom when adding the factor accuracy to the models.
  
  For the MI analysis we only found an effect in MTG. Specifically, there was a three-way interaction between task, condition and accuracy (F(2, 91.9) = 3.4591, p = 0.036). To follow up on this three-way interaction we split the data per task. The condition*accuracy interaction was only (uncorrected) significant for the word combination task (F(1,24.8) = 5.296, p = 0.03 (uncorrected)) and not for any other task (p>0.1). In the word combination task, we found that the difference between sentences and word lists was the strongest at high accuracies (see the figure below for the predicted values of the model). One way to interpret this finding is that stronger phrasal-rate MI tracking in MTG promotes phrasal-rate processing (as indicated by accuracy) more in sentences than in word lists.
  
  MEG – behavioral performance relation. A) Predicted values for the phrasal band MI in the MTG for the word combination task separately for the two conditions. B) Predicted values for the delta band WPLI in the STG-MTG connection separately for the two conditions. Error bars indicate the 95% confidence interval of the fit. Colored lines at the bottom indicate individual datapoints.
  
  For power we did not find any effect of accuracy. For the connectivity analysis we found in the STG-MTG connectivity a significant condition*accuracy interaction (F(1, 80.23)=5.19, p = 0.025). The condition*accuracy interaction showed that lower accuracies were generally associated with stronger differences between the sentences and word lists (see figure attached; the opposite of the MI analysis). Thus, functional connections in the delta band are stronger during sentence processing when participants have difficulty with the task (independent of the task performed). This could indicate that low-frequency connections are more relevant for the sentence than the word list condition.
  
  After correcting for accuracy there was also a significant task*condition interaction (F(2,80.01) = 3.348, p = 0.040) and a main effect of condition (F(1,80.361) = 5.809, p = 0.018). While overall there was a stronger WPLI for the sentence compared to the word list condition, the interaction seemed to indicate that this was especially the case during the word task (p = 0.005 corrected), but not for the other tasks (p>0.1).
  
  We added the results of the accuracy analyses to the main manuscript and added a dedicated section to our discussion (page 21-22). Adding accuracy did not remove any of the effects we report in the original analyses. Therefore, none of these findings change the interpretation of the results, as the task still had an influence on the MI responses of MTG and IFG. The effect of accuracy in the MTG refined the results, showing that the effect was strongest there for participants with high accuracies. This relationship suggests a functional role of tracking through phase alignment for understanding phrasal structure.
  
  While the findings can explain some behavioral effects, we agree with the reviewer that the behavioral results and the MI results don’t align. We note that our use of tasks to guide attention to different timescales and linguistic representations differs from the use of, for example, a working memory task where only the correct trials contain the relevant cognitive process. In working memory type paradigms, the MEG data should indeed explain the behavioral response. Our study was designed to test for effects of task demands on the neural tracking response to speech and language. As we are only using the tasks to control attention, we do not attempt to explain behavior through the MEG data or differences in MI.
  
  Thus, the phrasal tracking cannot explain all of the behavioral results (point 3). It is at this point unclear what could have caused this effect, but it is quite likely that neural sources outside the speech and language ROIs we selected are in play. We discuss this now.
  
  The methods now read: “MEG-behavioural performance analysis: To investigate the relation between the MEG measures and the behavioural performance we repeated the analyses (MI, power, and connectivity) but added accuracy as a factor (together with the interactions with the task and condition factor). As there is no accuracy for the passive task, we removed this task from the analysis. We then followed the same analysis steps as before. Since we reduced our degrees of freedom, we could however only create random intercept and not random slope models”.
  
  The results now read: “MEG-behavioural performance relation. We found for the MI analysis a significant effect of accuracy only in the MTG. Here, we found a three-way interaction between accuracy*task*condition (F(2, 91.9) = 3.459, p = 0.036). Splitting up for the three different tasks we found only an uncorrected significant effect for the condition*accuracy interaction for the phrasal task (F(1, 24.8) = 5.296, p = 0.03) and not for the other two tasks (p>0.1). In the phrasal task, we found that when accuracy was high, there was a stronger difference between the sentence and the word list condition compared to when accuracy was low, with stronger accuracy for the sentence condition (Figure 7A).
  
  No relation between accuracy and power was found. For the connectivity analysis we found a significant condition*accuracy interaction for the STG-MTG connection (F(1,80.23) = 5.19, p = 0.025; Figure 7B). Independent of task, when accuracy was low the difference between sentence and word lists was stronger with higher WPLI fits for the sentence condition. After correcting for accuracy there was also a significant task*condition interaction (F(2,80.01) = 3.348, p = 0.040) and a main effect of condition (F(1,80.361) = 5.809, p = 0.018). While overall there was a stronger WPLI for the sentence compared to the word list condition, the interaction seemed to indicate that this was especially the case during the word task (p = 0.005), but not for the other tasks (p>0.1).”
  
  The discussion now reads: “We found that across participants both the MI and the connectivity in temporal cortex influenced behavioural performance. Specifically, MTG-STG connections were, independent of task, related to accuracy. There was higher connectivity between MTG and STG for sentences compared to word lists at low accuracies. At high accuracies, we found stronger MTG tracking at phrasal rates (measured with MI) for sentences compared to word lists during the word combination task. These results suggest that tracking of phrasal structure in MTG is indeed relevant for understanding sentences compared to word lists. This was reflected in a general increase in delta connectivity differences when the task was difficult (Figure 7B). Participants might compensate for the difficulty using phrasal structure present in the sentence condition. When phrasal structure in sentences is accurately tracked (as measured with MI), performance is better when these rates are relevant (Figure 7A). These results point to a role for phrasal tracking in accurately understanding the higher order linguistic structure in sentences, even though more research is needed to verify this. It is evident that the connectivity and tracking correlations to behaviour do not explain all variation in the behavioural performance (compare Figure 1 with 3). Plainly, temporal tracking does not explain everything in language processing. Besides tracking there are many other components important for our designated tasks, such as memory load and semantic context, which are not captured by our current analyses.”
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.02.08.479571v1
www.biorxiv.org www.biorxiv.org

Conserved structural elements specialize ATAD1 as a membrane protein extraction machine

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #3 (Public Review):
  
  AAA proteins are involved in a variety of cellular activities. They all share the same structural fold, and yet they are all incredibly specialised. This study works towards understanding the unique specialisation of the AAA protein ATAD1. While the general mechanism of substrate threading by AAA proteins is by now fairly well elucidated, it remains to describe and understand the finer structural details that make each specific AAA protein unfold (thread) certain substrates rather than others. Additionally, the regulation and stabilisation of each AAA protein is finely controlled by specific subdomains.
  
  This work is definitely strong in addressing these two points for ATAD1.
  
  The structural data are solid, and the analysis of the pore-loop residues and the role of a11 is overall convincing.
  
  1) The cell fluorescence microscopy assay is a very good tool for checking in the cell the hypotheses raised by the analysis of the structure. However, the assay is currently only based on the localisation of the Gos28 substrate, which leaves open the possibility that ATAD1 a11 mutants will have a different phenotype on different substrates.
  
  We agree with the reviewer that it would be interesting to test ATAD1’s activity on other known substrates. To do that, we picked Pex26, an established tail-anchored protein substrate of ATAD1. We stably expressed EGFP-Pex26 in ATAD1-/- cells and tested the effect of ATAD1 expression on Pex26 mislocalization. As shown in the figure below, we found that although the general trend observed for Gos28 also holds true for Pex26, the measured PCC values clearly have a bimodal distribution, with some cells showing complete mislocalization (PCC = 1.0) of Pex26. One exciting possibility to explain this result is that Pex26 is important in peroxisome biogenesis. Once enough Pex26 is mislocalized to the mitochondria, peroxisomal biogenesis becomes impaired, thus causing less Pex26 to be correctly inserted. A partial impairment in Pex26 peroxisomal insertion in turn creates a vicious cycle that leads to the complete mislocalization of Pex26. It will be interesting to follow up on the cause of this bimodal distribution, which, however, is beyond the scope of this paper.
  
  *Quantification of live-cell imaging using the localization of EGFP-Pex26 as a readout. Mean Pearson correlation coefficient (PCC) values and the SEM between EGFP-Pex26 and the mitochondria when expressing the ATAD1 variants indicated. Each dot represents the PCC value of an individual cell.*
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.09.24.461712v1
www.biorxiv.org www.biorxiv.org

Mutation saturation for fitness effects at human CpG sites

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  [...] While the study is addressing an interesting topic, I also felt this manuscript was limited in novel findings to take away. Certainly the study clearly shows that substitution saturation is achieved at synonymous CpG sites. However, subsequent main analyses do not really show anything new: the depletion of segregating sites in functional versus neutral categories (Fig 2) has been extensively shown in the literature and polymorphism saturation is not a necessary condition for observing this pattern.
  
  We agree with the reviewer that many of the points raised were appreciated previously and did not mean to convey another impression. Our aim was instead to highlight some unique opportunities provided by being at or very near saturation for mCpG transitions. In that regard, we note that although depletion of variation in functional categories is to be expected at any sample size, the selection strength that this depletion reflects is very different in samples that are far from saturated, where invariant sites span the entire spectrum from neutral to lethal. Consider the depletion per functional category relative to synonymous sites in the adjoining plot in a sample of 100k: ~40% of mCpG LOF sites do not have T mutations. From our Fig. 4a and b, it can be seen that these sites are associated with a much broader range of hs values than sites invariant at 780k, so that information about selection at an individual site is quite limited (indeed, in our p-value formulation, these sites would be assigned p≤0.35, see Fig. 1). Thus, only now can we really start to tease apart weakly deleterious mutations from strongly deleterious or even embryonic lethal mutations. This allows us to identify individual sites that are most likely to underlie pathogenic mutations and functional categories that harbor deleterious variation at the extreme end of the spectrum of possible selection coefficients. More generally, saturation is useful because it allows one to learn about selection with many fewer untested assumptions than previously feasible.
  
  Similarly, the diminishing returns on sampling new variable sites has been shown in previous studies, for example the first "large" human datasets ca. 2012 (e.g. Fig 2 in Nelson et al. 2012, Science) have similar depictions as Figure 3B although with smaller sample sizes and different approaches (projection vs simulation in this study).
  
  We agree completely: diminishing returns is expected on first principles from coalescent theory, which is why we cited a classic theory paper when making that point in the previous version of the manuscript. Nonetheless, the degree of saturation is an empirical question, since it depends on the unknown underlying demography of the recent past. In that regard, we note that Nelson et al. predict that at sample sizes of 400K chromosomes in Europeans, approximately 20% of all synonymous sites will be segregating at least one of three possible alleles, when the observed number is 29%. Regardless, not citing Nelson et al. 2012 was a clear oversight on our part, for which we apologize; we now cite it in that context and in mentioning the multiple merger coalescent.
  
  There are some simulations presented in Fig 4, but this is more of a hypothetical representation of the site-specific DFE under simulation conditions roughly approximating human demography than formal inference on single sites. Again, these all describe the state of the field quite well, but I was disappointed by the lack of a novel finding derived from exploiting the mutation saturation properties at methylated CpG sites.
  
  As noted above, in our view, the novelty of our results lies in their leveraging saturation in order to identify sites under extremely strong selection and make inferences about selection without the need to rely on strong, untested assumptions.
  
  However, we note that Fig 4 is not simply a hypothetical representation, in that it shows the inferred DFE for single mCpG sites for a fixed mutation rate and a plausible demographic model, given data summarized in terms of three ranges of allele frequency (i.e., 0 copies, between 1 and 10 copies, or above 10 copies). One could estimate a DFE across all sites from those summaries of the data (i.e., from the proportion of mCpG sites in each of the three frequency categories), by weighting the three densities in Fig 4 by those proportions. That is, in fact, what is done in a recent preprint by Dukler et al. (2021, BioRxiv): they infer the DFE from two summaries of the allele frequency spectrum (in bins of sites), the proportion of invariant sites and the proportion of alleles at 1-70 copies, in a sample of 70K chromosomes.
  
  To illustrate how something similar could be done with Fig. 4 based on individual sites, we obtain an estimate of the DFE for LOF mutations (shown in Panel B and D for two different prior distributions on hs) by weighting the posterior densities in Panel A by the fraction of LOF mutations that are segregating (73% at 780K; 9% at 15K) and invariant (27% and 91% respectively); in panel C, we show the same for a different choice of prior. For the smaller sample size considered, the posterior distribution recapitulates the prior, because there is little information about selection in whether a site is observed to be segregating or invariant, and particularly about strong selection. In the sample of 780K, there is much more information about selection in a site being invariant and therefore, there is a shift towards stronger selection coefficients for LOF mutations regardless of the prior.
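
The weighting step described above amounts to a two-component mixture of per-site posterior densities. The sketch below shows that arithmetic for the 780K sample, using the segregating and invariant fractions quoted in the text (73% and 27%). The Gaussian shapes on a log10(hs) grid are placeholders chosen only so the code runs; they are not the posteriors from Fig. 4, which also differ between sample sizes, and all function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, used here only as a placeholder shape."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def mix(density_seg, density_inv, frac_seg):
    """Weight two per-site posterior densities by the share of sites in each class."""
    frac_inv = 1.0 - frac_seg
    return [frac_seg * s + frac_inv * i for s, i in zip(density_seg, density_inv)]


def trapezoid(ys, dx):
    """Trapezoidal integral of evenly spaced samples."""
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


# Grid over log10(hs) from -6 to 0 in steps of 0.01.
step = 0.01
grid = [-6 + step * k for k in range(601)]

# Placeholder posteriors (assumed shapes, NOT the densities in Fig. 4).
post_segregating = [normal_pdf(x, -4.0, 0.5) for x in grid]
post_invariant = [normal_pdf(x, -2.0, 0.5) for x in grid]

# Fraction of LOF sites segregating at 780K, as quoted in the text.
dfe_780k = mix(post_segregating, post_invariant, frac_seg=0.73)

# A mixture of normalized densities should itself integrate to about 1.
area = trapezoid(dfe_780k, step)
print(f"area under mixture: {area:.4f}")
```

Repeating the calculation for the 15K sample would require the posteriors computed for that sample size, weighted by 9% and 91%.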
  
  Our goal was to highlight these points rather than infer a DFE using these two summaries, which throw out much of the information in the data (i.e., the allele frequency differences among segregating sites). In that regard, we note that the DFE inference would be improved by using the allele frequency at each of 1.1 million individual mCpG sites in the exome. We outline this next step in the Discussion but believe it is beyond the scope of our paper, as it is a project in itself – in particular it would require careful attention to robustness with regard to both the demographic model (and its impact on multiple hits), biased gene conversion and variability in mutation rates among mCpG sites. We now make these points explicitly in the Outlook.
  
  Similarly, I felt the authors posed a very important point about limitations of DFE inference methods in the Introduction but ended up not really providing any new insights into this problem. The authors argue (rightly so) that currently available DFE estimates are limited by both the sparsity of polymorphisms and limited flexibility in parametric forms of the DFE. However, the nonsynonymous human DFE estimates in the literature appear to be surprisingly robust to sample size: older estimates (Eyre-Walker et al. 2006 Genetics, Boyko et al. 2008 PLOS Genetics) seem to at least be somewhat consistent with newer estimates (assuming the same mutation rate) from samples that are orders of magnitude larger (Kim et al. 2017 Genetics).
  
  We are not quite sure what the reviewer has in mind by “somewhat consistent,” as Boyko et al. estimate that 35% of non-synonymous mutations have s>10^-2 while Kim et al. find that proportion to be “0.38–0.84 fold lower” than the Boyko et al. estimate (see, e.g., Fig. 4 in Kim et al., 2017). Moreover, the preprint by Dukler et al. mentioned above, which infers the DFE based on ~70K chromosomes, finds estimates inconsistent with those of Kim et al. (see SOM Table 2 and SOM Figure S5 in Dukler et al., 2021).
  
  More generally, given that even 70K chromosomes carry little information about much of the distribution of selection coefficients (see our Fig. 4), we expect that studies based on relatively small sample sizes will basically recover something close to their prior; therefore, they should agree when they use the same or similar parametric forms for the distribution of selection coefficients and disagree otherwise. The dependence on that choice is nicely illustrated in Kim et al., who consider different choices and then perform inference on the same data set and with the same fixed mutation rate for exomes; depending on their choice, anywhere between 5% and 28% of non-synonymous changes are inferred to be under strong selection with s>=10^-2 (see their Table S4).
  
  Whether a DFE inferred under polymorphism saturation conditions with different methods is different, and how it is different, is an issue of broad and immediate relevance to all those conducting population genomic simulations involving purifying selection. The analyses presented as Fig 4A and 4B kind of show this, but they are more a demonstration of what information one might have at 1M+ sample sizes rather than an analysis of whether genome-wide nonsynonymous DFE estimates are accurate. In other words, this manuscript makes it clear that a problem exists, that it is a fundamental and important problem in population genetics, and that with modern datasets we are now poised to start addressing this problem with some types of sites, but all of this is already very well-appreciated except for perhaps the last point.
  
  At least a crude analysis to directly compare the nonsynonymous genome-wide DFE from smaller samples to the 780K sample would be helpful, but it should be noted that these kinds of analyses could be well beyond the scope of the current manuscript. For example, if methylated nonsynonymous CpG sites are under a different level of constraint than other nonsynonymous sites (Fig. S14) then comparing results to a genome-wide nonsynonymous DFE might not make sense and any new analysis would have to try and infer a DFE independently from synonymous/nonsynonymous methylated CpG sites.
  
  We are not sure what would be learned from this comparison, given that Figure 4 shows that, at least with an uninformative prior, there is little information about the true DFE in samples, even of tens of thousands of individuals. Thus, if some of the genome-wide nonsynonymous DFE estimates based on small sample sizes turn out to be accurate, it will be because the guess about the parametric shape of the DFE was an inspired one. In our view, that is certainly possible but not likely, given that the shape of the DFE is precisely what the field has been aiming to learn and, we would argue, what we are now finally in a position to do for CpG mutations in humans.
  
  Reviewer #2 (Public Review):
  
  This manuscript presents a simple and elegant argument that neutrally evolving CpG sites are now mutationally saturated, with each having a 99% probability of containing variation in modern datasets containing hundreds of thousands of exomes. The authors make a compelling argument that for CpG sites where mutations would create genic stop codons or impair DNA binding, about 20% of such mutations are strongly deleterious (likely impairing fitness by 5% or more). Although it is not especially novel to make such statements about the selective constraint acting on large classes of sites, the more novel aspect of this work is the strong site-by-site prediction it makes that most individual sites without variation in UK Biobank are likely to be under strong selection.
  
  The authors rightly point out that since 99% of neutrally evolving CpG sites contain variation in the data they are looking at, a CpG site without variation is likely evolving under constraint with a p value significance of 0.01. However, a weakness of their argument is that they do not discuss the associated multiple testing problem. In other words, how likely is it that a given non-synonymous CpG site is devoid of variation but actually not under strong selection? Since one of the most novel and useful deliverables of this paper is single-base-pair-resolution predictions about which sites are under selection, such a multiple testing correction would provide important "error bars" for evaluating how likely it is that an individual CpG site is actually constrained, not just the proportion of constrained sites within a particular functional category.
  
  We thank the reviewer for pointing this out. One way to think about this problem might be in terms of false discovery rates, in which case the FDR would be 16% across all non-synonymous mCpG sites that are invariant in current samples, and ~4% for the subset of those sites where mutations lead to loss-of-function of genes.
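
To make the false discovery rate framing concrete, the sketch below uses a generic estimator: the number of sites expected to be invariant by chance under neutrality, divided by the number of sites actually observed to be invariant. The 1% chance of a neutral site being invariant comes from the review above; the site counts are hypothetical round numbers, and the function is an illustrative assumption, not the calculation behind the 16% and ~4% figures.

```python
def expected_fdr(n_tested, p_invariant_if_null, n_invariant, null_fraction=1.0):
    """Expected share of invariant sites that are invariant by chance alone.

    n_tested            -- number of sites examined
    p_invariant_if_null -- chance that a site not under selection is invariant
    n_invariant         -- number of sites observed to be invariant
    null_fraction       -- assumed share of tested sites not under selection
                           (1.0 gives a conservative upper bound)
    """
    if n_invariant <= 0:
        raise ValueError("need at least one invariant site")
    expected_false = n_tested * null_fraction * p_invariant_if_null
    return min(1.0, expected_false / n_invariant)


# Hypothetical counts for illustration (not the study's numbers).
fdr = expected_fdr(n_tested=500_000, p_invariant_if_null=0.01, n_invariant=40_000)
print(f"expected FDR: {fdr:.3f}")
```

Lowering `null_fraction` reflects the belief that many tested sites are in fact under selection, which reduces the estimated FDR.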
  
  Another way to address this issue, which we had included but not emphasized previously, is by examining how one’s beliefs about selection should be updated after observing a site to be invariant (i.e., using Bayes odds). At current sample sizes and assuming our uninformative prior, for a non-synonymous mCpG site that does not have a C>T mutation, the Bayes odds are 15:1 in favor of hs>0.5x10^-3; thus the chance that such a site is not under strong selection is 1/16, given our prior and demographic model. These two approaches (FDR and Bayes odds) are based on somewhat distinct assumptions.
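
The step from odds of 15:1 to a chance of 1/16 is the standard conversion from odds to probability. A minimal sketch follows, with a hypothetical helper name:

```python
from fractions import Fraction


def prob_against(odds_for, odds_against=1):
    """Probability of the disfavoured outcome given odds of odds_for:odds_against."""
    return Fraction(odds_against, odds_for + odds_against)


# Odds of 15:1 in favour of strong selection at an invariant site.
p_not_strong = prob_against(15)
print(p_not_strong, float(p_not_strong))
```

Exact fractions are used so that the 1/16 appears without floating-point rounding.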
  
  We have now added and/or emphasized these two points in the main text.
  
  The paper provides a comparison of their functional predictions to CADD scores, an older machine-learning-based attempt at identifying site by site constraint at single base pair resolution. While this section is useful and informative, I would have liked to see a discussion of the degree to which the comparison might be circular due to CADD's reliance on information about which sites are and are not variable. I had trouble assessing this for myself given that CADD appears to have used genetic variation data available a few years ago, but obviously did not use the biobank scale datasets that were not available when that work was published.
  
  We apologize for the lack of clarity in the presentation. We meant to emphasize that de novo mutation rates vary across CADD deciles when considering all CpG sites (Fig. 2-figure supplement 5c), which confounds CADD precisely because it is based in part on which sites are variable. We have edited the manuscript to clarify this.
  
  Reading this paper left me excited about the possibility of examining individual invariant CpG sites and deducing how many of them are already associated with known disease phenotypes. I believe the paper does not mention how many of these invariant sites appear in Clinvar or in databases of patients with known developmental disorders, and I wondered how close to saturation disease gene databases might be given that individuals with developmental disorders are much more likely to have their exomes sequenced compared to healthy individuals. One could imagine some such analyses being relatively low hanging fruit that could strengthen the current paper, but the authors also make several references to a companion paper in preparation that deals more directly with the problem of assessing clinical variant significance. This is a reasonable strategy, but it does give the discussion section of the paper somewhat of a "to be continued" feel.
  
  We apologize for the confusion that arose from our references to a second manuscript in prep. The companion paper is not a continuation of the current manuscript: it contains an analysis of fitness and pathogenic effects of loss-of-function variation in human exomes.
  
  Following the reviewer’s suggestion to address the clinical significance of our results, we have now examined the relationship of mCpG sites invariant in current samples with Clinvar variants. We find that of the approximately 59,000 non-synonymous mCpG sites that are invariant, only ~3.6% overlap with C>T variants associated with at least one disease and classified as likely pathogenic in Clinvar (~5.8% if we include those classified as uncertain or with conflicting evidence as pathogenic). Approximately 2% of invariant mCpGs have C>T mutations in what is, to our knowledge, the largest collection of de novo variants ascertained in ~35,000 individuals with developmental disorders (DDD, Kaplanis et al. 2020). At the level of genes, of the 10k genes that have at least one invariant non-synonymous mCpG, only 8% (11% including uncertain variants) have any non-synonymous hits in Clinvar, and ~8% in DDD. We think it highly unlikely that the large number of remaining invariant sites are not seen with mutations in these databases because such mutations are lethal; rather it seems to us to be the case that these disease databases are far from saturation as they contain variants from a relatively small number of individuals, are subject to various ascertainment biases both at the variant level and at the individual level, and only contain data for a small subset of existing severe diseases.
  
  With a view to assessing clinical relevance, however, we can ask a related question, namely how informative being invariant in a sample of 780k is about pathogenicity in Clinvar. Although the relationship between selection and pathogenicity is far from straightforward, being an invariant non-synonymous mCpG in current samples not only substantially increases (15-10fold) the odds of hs > 0.5x10^-3 (see Fig. 4b), it also increases the odds of being classified as pathogenic vs. benign in Clinvar 8-51 fold. In the DDD sample, we don’t know which variants are pathogenic; however, if we consider non-synonymous mutations that occur in consensus DDD genes as pathogenic (a standard diagnostic criterion), being invariant increases the odds of being classified as pathogenic 6-fold. We caution that both Clinvar classifications and the identification of consensus genes in DDD rely in part on whether a site is segregating in datasets like ExAC, so this exercise is somewhat circular. Nonetheless it illustrates that there is some information about clinical importance in mCpG sites that are invariant in current samples, and that the degree of enrichment (6 to 51-fold) is very roughly on par with the Bayes odds that we estimate of strong selection conditional on a site being invariant. We have added these findings to the main text and added the plot as Supplementary Figure 13.
  
  Reviewer #3 (Public Review):
  
  [...] The authors emphasize several times how important an accurate demographic model is. While we may be close to a solid demographic model for humans, this is certainly not the case for many other organisms. Yet we are not far off from sufficient sample sizes in a number of species to begin to reach saturation. I found myself wondering how different the results/inference would be under a different model of human demographic history. Though likely the results would be supplemental, it would be nice in the main text to be able to say something about whether results are qualitatively different under a somewhat different published model.
  
  We had previously examined the effect of a few demographic scenarios with large increases in population size towards the present on the average length of the genealogy of a sample (and hence the expected number of mutations at a site) in Figure 3-figure supplement 1b, but without quantifying the effect on our selection inference. Following this suggestion, we now consider a widely used model of human demography inferred from a relatively small sample, and therefore not powered to detect the huge increase in population size towards the present (Tennessen et al. 2012). Using this model, we find a poor fit to the proportion of segregating CpG sites (the observed fraction is 99% in 780k exomes, whereas the model predicts 49%). Also, as expected, inferences about selection depend on the accuracy of the demographic model (as can be seen by comparing panel B to Fig 4B in the main text).
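
The link between genealogy length and the fraction of segregating sites can be illustrated with a minimal sketch. It assumes a neutral site, mutations arriving as a Poisson process along the genealogy, and a fixed total branch length; the function name `prob_segregating` and the numerical values are hypothetical and are not taken from the manuscript. Because the function is concave in branch length, plugging in the average genealogy length gives an upper bound on the expected segregating fraction rather than the exact value.

```python
import math

def prob_segregating(mu, total_branch_length):
    """Probability of at least one mutation on a genealogy.

    Assumes mutations fall as a Poisson process with rate `mu` per site per
    generation along a genealogy of `total_branch_length` generations.
    """
    return 1.0 - math.exp(-mu * total_branch_length)

# Illustrative rate only (an assumption, not a value from the manuscript).
mu = 1e-7
for length in (1e6, 1e7, 1e8):
    p = prob_segregating(mu, length)
    print(f"total length = {length:.0e} generations -> P(segregating) = {p:.3f}")
```

A demographic model without recent population growth implies shorter genealogies for large samples, which under these assumptions lowers the predicted segregating fraction.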
  
  On a similar note, while a fixed hs simplifies much of the analysis, I wondered how results would differ for 1) completely recessive mutations and 2) under a distribution of dominance coefficients, especially one in which the most deleterious alleles were more recessive. Again, though I think it would strengthen the manuscript by no means do I feel this is a necessary addition, though some discussion of variation in dominance would be an easy and helpful add.
  
  There's some discussion of population structure, but I also found myself wondering about GxE. That is, another reason a variant might be segregating is that it's conditionally neutral in some populations and only deleterious in a subset. I think no analysis to be done here, but perhaps some discussion?
  
  We agree that our analysis ignores the possibilities of complete recessivity in fitness (h=0) as well as more complicated selection scenarios, such as spatially-varying selection (of the type that might be induced by GxE). We note however that so long as there are any fitness effects in heterozygotes, the allele dynamics will be primarily governed by hs; one might also imagine that under some conditions, the mean selection effect across environments would predict allele dynamics reasonably well even in the presence of GxE. Also worth exploring in our view is the standard assumption that hs remains fixed even as Ne changes dramatically. We now mention these points in the Outlook.
  
  Maybe I missed it, but I don't think the acronym DNM is explained anywhere. While it was fairly self-explanatory, I did have a moment of wondering whether it was methylation or mutation and can't hurt to be explicit.
  
  We apologize for the oversight and have updated the text accordingly.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.02.446661v1
www.biorxiv.org www.biorxiv.org

Role of YAP in early ectodermal specification and a Huntington’s Disease model of human neurulation

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  The manuscript by Piccolo and colleagues employs an in vitro neuruloid system to investigate the role of Hippo/YAP signaling pathway in early ectodermal fate specification. The authors examine YAP expression in forming neuruloids and test how manipulation of Hippo/Yap signaling affects their cellular composition. They observe that YAP expression is dynamic and enriched in cells occupying periphery of the neuruloid. Overactivation of the YAP activity by the Lats-kinase inhibitor TRULI leads to an expansion of TFAP2A+ cells (NNE) at early stages and of KRT18+ cells (epidermal) at later stages of development. Accordingly, the authors propose that YAP acts as a lineage determinant that (i) promotes a NNE fate during early development and (ii) impacts the fate of NNE cells by promoting an epidermal instead of a neural crest fate. Finally, the authors report that neuruloids developed with cells harboring mutations characteristic of Huntington's disease display elevated Yap activity.
  
  The study takes advantage of the neuruloid system to examine the role of Hippo-Yap in early development and disease. A strength of the study is the use of the neuruloid as a proxy for the human embryo, which allows the authors to examine the control of spatial patterning in early development (in both wild type and altered cellular states). Yet, this model also presents significant limitations. Some of the results indicate a high degree of variability in YAP activity (and ectodermal patterning) in neuruloids obtained from different inductions. This raises the concern that the neuruloid system may interfere with Hippo/YAP. Furthermore, the model proposed by the authors is not consistent with the functional manipulations with pharmacological agents (e.g., pharmacological activation of YAP results in an increase of both neural and NNE cells; inhibition of YAP does not result in the expected phenotypes).
  
  We thank the reviewer for her/his compliments on our work. The reviewer also points to the limitations of our neuruloid models and asks for clarifications.
  
  The authors propose that YAP activation promotes a non-neural ectodermal (NNE) fate in early neuruloids, and subsequently drives NNE to differentiate into epidermis. However, manipulation of Hippo signaling with pharmacological inhibitors does not entirely support this, as treatment of neuruloids with agonist TRULI leads to expansion of both the PAX6 neural population and the NNE Tfap2a population. A prediction of the model is that treatment with verteporfin should neuralize the organoids, which is not the case (Fig 6A). This disconnect between the model presented in Figure 6D and the experimental results should be addressed by the authors.
  
  We would like to thank the reviewer for this request. In our experiments we observed a dual effect of YAP activation (or HD mutation). As noted by the reviewer, ectodermal lineage specification occurs both early (increased NNE induction) and late (enhanced epidermis differentiation and contraction of NC differentiation). Moreover, we observed a structural consequence of increased YAP activation in neuruloids: failure of the NE domain to fully close. Following the reviewer's suggestion, we have now included an additional panel in Figure 6 to illustrate the phenotype alongside the difference in ectodermal lineage specification (panel E). We have also added in the Discussion a paragraph that highlights the architectural aspect of the observed phenotype.
  
  Regarding the interpretation of the effect of pharmacological inhibition of YAP, we believe that the result of verteporfin treatment on WT neuruloids indicates that YAP activity is not required for this specification but can skew the differentiation towards NNE and epidermis. This is now included in the Results, and a new paragraph has been added in the Discussion directly addressing this point.
  
  The study at times conflates YAP expression with activation of the Hippo-YAP pathway. While the images in figures 1,2, and 4 show changes in YAP expression, confirmation of Hippo-YAP pathway activity should include the use of a reporter (e.g., HOP-Flash) or at least high magnification images showing translocation of YAP to the nucleus. Overall, inclusion of better quantification of YAP-activity is crucial to support the manuscript's conclusions (the authors should also state the number of micropatterns used in each quantitative experiment).
  
  Our evidence correlating YAP nuclear localization with activity is based on: (i) Immunoblots (Figures 1D and 2B); (ii) Confocal image analysis (Figures 1E, 2D, and 4B); and (iii) Induction of YAP target-gene expression, as demonstrated by our scRNA-seq analysis, occurs in the same epidermal (KRT18+) lineage cells that display the highest levels of YAP nuclear accumulation (Figure 2). However, to strengthen this argument and following the reviewer’s advice, we have now added magnified confocal microscope images of YAP/DAPI staining used to measure nuclear YAP localization at D4 (Figure 1—figure supplement 5). We have also added slowed and magnified videos of the YAP-GFP/H2B-mCherry (and YAP-GFP alone) at D3-D4, which illustrate the dynamic accumulation of YAP in the nucleus of cells upon BMP4 stimulation (Figure 1—video 2, Figure 1—video 3, Figure 1—video 5, Figure 1—video 6, Figure 4—video 2 and Figure 4—video 3). Finally, the number of colonies analyzed for each experiment is now added in the Figure Legends.
  
  A limitation of the study is that it does not investigate the possibility that Hippo/Yap could be affecting cell proliferation in the different lineages, instead of acting as a cell fate determinant. This is particularly important since Hippo is affected by cell density, which varies from the center to the periphery of the neuruloid. Different rates of proliferation over several days could potentially lead to drastic changes in neuruloid cellular composition.
  
  To address the reviewer’s legitimate point, and assess the potential effects of YAP activation in HD-neuruloids, we performed three sets of experiments. First, we performed RNA-velocity analysis to determine the cellular trajectories within each lineage (Figure XA, below), and calculated the velocity of Seurat’s “cell cycle-associated” genes in each cell population in our scRNA-seq dataset at D4. This analysis indicates that the three ectodermal progenitors have a comparable rate of cell division, with NE being slightly faster than the others and epidermis being the slowest (Figure XB). However, these differences are subtle: the mean velocity of these cell-cycle genes within each population is not significantly different across the three ectodermal lineages (Figure XC). Second, comparison of velocity values between WT and HD highlighted a significant HD-associated increase in the dynamics of cell-cycle associated genes only within the NE population (Figure XD), consistent with the observation that YAP is ectopically active in this lineage. This increase is also not very dramatic, for the mean velocity of these genes is not significantly different in any comparison at this time (Figure XE).
  
  Figure X. Proliferation rate analysis of D4 neuruloid from scRNAseq dataset. A) Transcriptional trajectories were identified in the three ectodermal lineages. B) Velocity of cell cycle associated genes shows that NNE lineages (NC and E) are slightly faster than NE. C) However, this is not significant at the mean population level. D) NE in HD D4 neuruloids displays a subtle increase in the velocity of cell cycle associated genes. E) Such effect disappears at the mean population level.
  
  Finally, quantification of the number of mitotic nuclei per colony as marked by phospho-histone H3 (Kim et al., 2017) at different time points, demonstrated that YAP activation by TRULI leads to an increase in cell proliferation, especially in late neuruloids. This evidence is now presented in the new Supplemental Figure 4—figure supplement 3. We thank the reviewer for bringing this point to our attention.
  
  It is also important to note that our study does not suggest that YAP is a bona fide cell-fate determinant, but rather that the global phenotypic signature of YAP activation is influenced by differential regulation of cell-cycle dynamics. Moreover, inasmuch as YAP inhibition with verteporfin does not affect neuruloid formation, we believe that YAP is more of a booster signal operating on top of differentiation programs.
  
  The results of the study contradict previous reports, and some of these contradictions are not sufficiently addressed. The authors state that the activation of YAP in culture leads to a "complete loss of NC-like SOX10+ colonies"; however, a number of studies in in vivo models support a role for YAP as a positive regulator of neural crest specification.
  
  We thank the reviewer for pointing to the results observed in model systems. We have now included a paragraph in which we acknowledge that YAP has been previously associated with NC specification and survival. However, it should be noted that these conclusions are based on data obtained from non-human model organisms such as Xenopus, or relied on differentiation protocols that are independent of BMP4 stimulation. We believe that the phenotype of unbalanced specification of the NNE depends on an epistatic relationship between BMP4 and Hippo-YAP pathway, which might play a crucial role during human neurulation.
  
  Furthermore, the authors briefly speculate on the finding that Huntington's disease neuruloids have high YAP activity (whereas tissues from patients have low activity), but there is no real clear link to the pathophysiology of the disease.
  
  In our in vitro assay that recapitulates aspects of human neurulation, we observed an early increase (D4) followed by a later decline (D7) in YAP activity associated with HD mutation, which is comparable to a dysregulation of the Hippo pathway that was observed in HD patients. To better clarify this aspect and its potential implication during embryogenesis we have now expanded our Discussion on the possible connection between HD and embryonic development.
  
  Experimental results presented in different figures are often inconsistent throughout the manuscript. This should be examined by the authors since it suggests a lack of reproducibility in the neuruloid protocol. For example, the expression of TFAP2A at D4 neuruloids is a sparse halo at D4 in Fig4D, but robust in Fig1E.
  
  The reviewer is correct in pointing to a certain degree of variability between experiments, especially during the period (D4) when the first NNE lineages begin to emerge (i.e., Supplemental Figure 4—figure supplement 2). Because parallel experimental conditions such as comparison with HD samples or TRULI treatment show consistent trends, however, we believe that our interpretation of these results is fundamentally correct.
  
  The western blot in fig1D shows bands for tYAP and pYAP at D4, while in Fig2B the bands are not present (Fig1D also shows double bands for both markers while fig2B presents single bands).
  
  There are several alternatively spliced isoforms of human YAP (Vrbský et al., 2021). Immunoblots for YAP in YAP-GFP biallelically tagged cell lines (Figure 1—figure supplement 1B) show that two isoforms are detectable at pluripotency. During neural induction (D1-D3) both isoforms are downregulated, and upon BMP4 stimulation the larger isoform (top band) is primarily upregulated, so that from D4 onwards only the top band is visible (Figure 2B). To better clarify this point, we now discuss this in the Results and include Supplemental Figures with the quantification of the top and bottom bands (D1-D4, Figure 1D) and only of the top band (D4-D7, Figure 2B and Figure 1—figure supplement 4).
  
  As Hippo responds very quickly to cell density, mechanical forces, etc., these inconsistencies could affect the proposed analyses.
  
  As previously mentioned, we have assessed the effect on proliferation rate due to YAP activation by TRULI or HD mutation in neuruloids by scRNA-seq analysis and by counting the number of mitotic cells at different times. Our manuscript leaves open the relationship between HTT mutation and YAP hyperactivation, which likely is mediated in part by these factors, but we do address possible connections in the discussion.
  
  Reviewer #2:
  
  This manuscript by Piccolo et al identifies YAP signalling as a key player in lineage determination during development of early human ectoderm. Additionally, the authors show that neuruloids generated using cells engineered to express penetrant levels of CAG repeats in the HTT gene display aberrant YAP signalling during ectodermal specification and that this phenotype can be partially rescued by inhibition of this pathway. This is an interesting study and the similarity of the YAP-activated neuruloids and the HD neuruloids is striking. The value of this work would be increased by providing experiments to definitively demonstrate the role of YAP signalling in NNE specification and in HD neuruloids.
  
  We also thank this reviewer for her/his compliments on our work. The reviewer also expresses specific recommendations listed below:
  
  Specific comment: The authors describe the emergence of non-neuronal ectoderm (NNE) at the edges of the printed island cell colony and neuronal ectoderm (NE) within this circular colony. However, they do not show images of any lineage markers confirming that these regions are, in fact, NNE and NE.
  
  We show in Figure 1E that the edges of the neuruloids are positive for TFAP2A, a marker for the NNE lineage. In Figure 4D we also show TFAP2A at the edge (NNE) and PAX6 at the center (NE). Additionally, the spatial identity of the various ectodermal lineages was fully characterized in our previous study (Haremaki et al., 2018).
  
  They also don't show that this YAP-GFP cell line recapitulates endogenous fix-and-stains of YAP in these colonies.
  
  Figure 1E shows YAP expression at D4 by immunolabeling for YAP/DAPI acquired by confocal microscopy, which recapitulates that of immunofluorescence detection of nuclear YAP, shown in Figure 4B, and the results obtained by live fluorescence (YAP-GFP/H2B, Figure 4A).
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.08.11.455964v1
www.biorxiv.org www.biorxiv.org

New submission 16/06/2022, 16:31:11

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors evaluate the involvement of the hippocampus in a fast-paced time-to-contact estimation task. They find that the hippocampus is sensitive to feedback received about accuracy on each trial and has activity that tracks behavioral improvement from trial to trial. Its activity is also related to a tendency for time estimation behavior to regress to the mean. This is a novel paradigm to explore hippocampal activity and the results are thus novel and important, but the framing as well as discussion about the meaning of the findings obscures the details of the results or stretches beyond them in many places, as detailed below.
  
  We thank the reviewer for their constructive feedback and were happy to read that s/he considered our approach and results as novel and important. The comments led us to conduct new fMRI analyses, to clarify various unclear phrasings regarding our methods, and to carefully assess our framing of the interpretation and scope of our results. Please find our responses to the individual points below.
  
  1) Some of the results appear in the posterior hippocampus and others in the anterior hippocampus. The authors do not motivate predictions for anterior vs. posterior hippocampus, and they do not discuss differences found between these areas in the Discussion. The hippocampus is treated as a unitary structure carrying out learning and updating in this task, but the distinct areas involved motivate a more nuanced picture that acknowledges that the same populations of cells may not be carrying out the various discussed functions.
  
  We thank the reviewer for pointing this out. We split the hippocampus into anterior and posterior sections because prior work suggested a different whole-brain connectivity and function of the two. This was mentioned in the methods section (page 15) in the initial submission but unfortunately not in the main text. Moreover, when discussing the results, we did indeed refer mostly to the hippocampus as a unitary structure for simplicity and readability, and because statements about subcomponents are true for the whole. However, we agree with the reviewer that the differences between anterior and posterior sections are very interesting, and that describing these effects in more detail might help to guide future work more precisely.
  
  In response to the reviewer's comment, we therefore clarified at various locations throughout the manuscript whether the respective results were observed in the posterior or anterior section of the hippocampus, and we extended our discussion to reflect the idea that different functions may be carried out by distinct populations of hippocampal cells. In addition, we also now motivate the split into the different sections better in the main text. We made the following changes.
  
  Page 3: “Second, we demonstrate that anterior hippocampal fMRI activity and functional connectivity tracks the behavioral feedback participants received in each trial, revealing a link between hippocampal processing and timing-task performance.”
  
  Page 3: “Fourth, we show that these updating signals in the posterior hippocampus were independent of the specific interval that was tested and activity in the anterior hippocampus reflected the magnitude of the behavioral regression effect in each trial.”
  
  Page 5: “We performed both whole-brain voxel-wise analyses as well as regions-of-interest (ROI) analysis for anterior and posterior hippocampus separately, for which prior work suggested functional differences with respect to their contributions to memory-guided behavior (Poppenk et al., 2013, Strange et al. 2014).”
  
  Page 9: “Because anterior and posterior sections of the hippocampus differ in whole-brain connectivity as well as in their contributions to memory-guided behavior (Strange et al. 2014), we analyzed the two sections separately.”
  
  Page 9: “We found that anterior hippocampal activity as well as functional connectivity reflected the feedback participants received during this task, and its activity followed the performance improvements in a temporal-context-dependent manner. Its activity reflected trial-wise behavioral biases towards the mean of the sampled intervals, and activity in the posterior hippocampus signaled sensorimotor updating independent of the specific intervals tested.”
  
  Page 10: “Intriguingly, the mechanisms at play may build on similar temporal coding principles as those discussed for motor timing (Yin & Troger, 2011; Eichenbaum, 2014; Howard, 2017; Palombo & Verfaellie, 2017; Nobre & van Ede, 2018; Paton & Buonomano, 2018; Bellmund et al., 2020, 2021; Shikano et al., 2021; Shimbo et al., 2021), with differential contributions of the anterior and posterior hippocampus. Note that our observation of distinct activity modulations in the anterior and posterior hippocampus suggests that the functions and coding principles discussed here may be mediated by at least partially distinct populations of hippocampal cells.”
  
  Page 11: Interestingly, we observed that functional connectivity of the anterior hippocampus scaled negatively (Fig. 2C) with feedback valence [...]
  
  2) Hippocampal activity is stronger for smaller errors, which makes the interpretation more complex than the authors acknowledge. If the hippocampus is updating sensorimotor representations, why would its activity be lower when more updating is needed?
  
  Indeed, we found that absolute (univariate) activity of the hippocampus scaled with feedback valence, the inverse of error (Fig. 2A). We see multiple possibilities for why this might be the case, and we discussed some of them in a dedicated discussion section (“The role of feedback in timed motor actions”). For example, prior work showed that hippocampal activity reflects behavioral feedback also in other tasks, which has been linked to learning (e.g. Schönberg et al., 2007; Cohen & Ranganath, 2007; Shohamy & Wagner, 2008; Foerde & Shohamy, 2011; Wimmer et al., 2012). In our understanding, sensorimotor updating is a form of ‘learning’ in an immediate and behaviorally adaptive manner, and we therefore consider our results well consistent with this earlier work. We agree with the reviewer that in principle activity should be stronger if there was stronger sensorimotor updating, but we acknowledge that this intuition builds on an assumption about the relationship between hippocampal neural activity and the BOLD signal, which is not entirely clear. For example, prior work revealed spatially informative negative BOLD responses in the hippocampus as a function of visual stimulation (e.g. Szinte & Knapen 2020), and the effects of inhibitory activity - a leading motif in the hippocampal circuitry - on fMRI data are not fully understood. This raises the possibility that the feedback modulation we observed might also involve negative BOLD responses, which would then translate to the observed negative correlation between feedback valence and the hippocampal fMRI signal, even if the magnitude of the underlying updating mechanism was positively correlated with error. This complicates the interpretation of the direction of the effect, which is why we chose to avoid making strong conclusions about it in our manuscript. Instead, we tried discussing our results in a way that was agnostic to the direction of the feedback modulation. 
  Importantly, hippocampal connectivity with other regions did scale positively with error (Fig. 2B), which we again discussed in the dedicated discussion section.
  
  In response to the reviewer’s comment, we revisited this section of our manuscript and felt the latter result deserved a better discussion. We therefore took this opportunity to extend our discussion of the connectivity results (including their relationship to the univariate-activity results as well as the direction of these effects), all while still avoiding strong conclusions about directionality. The following changes were made to the manuscript.
  
  Page 11: Interestingly, we observed that functional connectivity of the anterior hippocampus scaled negatively (Fig. 2C) with feedback valence, unlike its absolute activity, which scaled positively with feedback valence (Fig. 2A,B), suggesting that the two measures may be sensitive to related but distinct processes.
  
  Page 11: Such network-wide receptive-field re-scaling likely builds on a re-weighting of functional connections between neurons and regions, which may explain why anterior hippocampal connectivity correlated negatively with feedback valence in our data. Larger errors may have led to stronger re-scaling, which may be grounded in a corresponding change in functional connectivity.
  
  3) Some tests were one-tailed without justification, which reduces confidence in the robustness of the results.
  
  We thank the reviewer for pointing us to the fact that our choice of statistical tests was not always clear in the manuscript. In the analysis the reviewer is referring to, we predicted that stronger sensorimotor updating should lead to stronger activity as well as larger behavioral improvements across the respective trials. This is because a stronger update should translate to a more accurate “internal model” of the task and therefore to a better performance. We tested this one-sided hypothesis using the appropriate test statistic (contrasting trials in which behavioral performance did improve versus trials in which it did not improve), but we did not motivate our reasoning well enough in the manuscript. The revised manuscript therefore includes the two new statements shown below to motivate our choice of test statistic more clearly.
  
  Page 7: [...] we contrasted trials in which participants had improved versus the ones in which they had not improved or got worse (see methods for details). Because stronger sensorimotor updating should lead to larger performance improvements, we predicted to find stronger activity for improvements vs. no improvements in these tests (one-tailed hypothesis).
  
  Page 18: These two regressors reflect the tests for target-TTC-independent and target-TTC-specific updating, respectively. Because we predicted to find stronger activity for improvements vs. no improvements in behavioral performance, we here performed one-tailed statistical tests, consistent with the direction of this hypothesis. Improvement in performance was defined as receiving feedback of higher valence than in the corresponding previous trial.
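
The improvement definition quoted above (feedback of higher valence than in the corresponding previous trial) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `improvement_flags`, the integer valence coding, and the interpretation that the target-TTC-specific variant compares against the previous trial with the same target are all hypothetical and are not confirmed by the manuscript text shown here.

```python
def improvement_flags(valence, targets=None):
    """Flag trials whose feedback valence exceeds that of a reference trial.

    valence: feedback valence per trial (higher = better), hypothetical coding.
    targets: optional per-trial target label. If omitted, the reference is the
             immediately preceding trial; if given, it is the most recent
             earlier trial with the same target (an assumed interpretation).
    Returns a list with None where no reference trial exists.
    """
    flags = []
    last_seen = {}  # most recent valence per target, or under one shared key
    for i, v in enumerate(valence):
        key = targets[i] if targets is not None else "any"
        prev = last_seen.get(key)
        flags.append(None if prev is None else v > prev)
        last_seen[key] = v
    return flags

# Toy sequence, invented for illustration only.
valence = [0, 1, 1, 2, 0, 2]
targets = ["a", "b", "a", "b", "a", "b"]
independent = improvement_flags(valence)        # versus the preceding trial
specific = improvement_flags(valence, targets)  # versus the previous same-target trial
```

Trials flagged `True` would enter the "improved" condition and trials flagged `False` the "not improved or got worse" condition of the one-tailed contrast.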
  
  4) The introduction motivates the novelty of this study based on the idea that the hippocampus has traditionally been thought to be involved in memory at the scale of days and weeks. However, as is partially acknowledged later in the Discussion, there is an enormous literature on hippocampal involvement in memory at a much shorter timescale (on the order of seconds). The novelty of this study is not in the timescale as much as in the sensorimotor nature of the task.
  
  We thank the reviewer for this helpful suggestion. We agree that a key part of the novelty of this study is the use of the task that is typically used to study sensorimotor integration and timing rather than hippocampal processing, along with the new insights this task enabled about the role of the hippocampus in sensorimotor updating. As mentioned in the discussion, we also agree with the reviewer that there is prior literature linking hippocampal activity to mnemonic processing on short time scales. We therefore rephrased the corresponding section in the introduction to put more weight on the sensorimotor nature of our task instead of the time scales.
  
  Note that the new statement still includes the time scale of the effects, but it is no longer at the center of the argument. We chose to keep it in because we do think that the majority of studies on hippocampal-dependent memory functions focus on longer time scales than our study does, and we expect that many readers will be surprised about the immediacy of how hippocampal activity relates to ongoing behavioral performance (on ultrashort time scales).
  
  We changed the introduction to the following.
  
  Page 2: Here, we approach this question with a new perspective by converging two parallel lines of research centered on sensorimotor timing and hippocampal-dependent cognitive mapping. Specifically, we test how the human hippocampus, an area often implicated in episodic-memory formation (Schiller et al., 2015; Eichenbaum, 2017), may support the flexible updating of sensorimotor representations in real time and in concert with other regions. Importantly, the hippocampus is not traditionally thought to support sensorimotor functions, and its contributions to memory formation are typically discussed for longer time scales (hours, days, weeks). Here, however, we characterize in detail the relationship between hippocampal activity and real-time behavioral performance in a fast-paced timing task, which is traditionally believed to be hippocampal-independent. We propose that the capacity of the hippocampus to encode statistical regularities of our environment (Doeller et al. 2005, Shapiro et al. 2017, Behrens et al., 2018; Momennejad, 2020; Whittington et al., 2020) situates it at the core of a brain-wide network balancing specificity vs. regularization in real time as the relevant behavior is performed.
  
  5) The authors used three different regressors for the three feedback levels, as opposed to a parametric regressor indexing the level of feedback. The predictions are parametric, so a parametric regressor would be a better match, and would allow for the use of all the medium-accuracy data.
  
  The reviewer raises a good point that overlaps with question 3 by reviewer 2. In the current analysis, we model the three feedback levels with three independent regressors (high, medium, low accuracy). We then contrast high vs. low accuracy feedback, obtaining the results shown in Fig. 2AB. The beta estimates obtained for medium-accuracy feedback are being ignored in this contrast. Following the reviewer’s feedback, we therefore re-ran the model, this time modeling all three feedback levels in one parametric regressor. All other regressors in the model stayed the same. Instead of contrasting high vs. low accuracy feedback, we then performed voxel-wise t-tests on the beta estimates obtained for the parametric feedback regressor.
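
The difference between the two modeling choices can be sketched with a toy design matrix. This is a simplified illustration, not the SPM12 pipeline used in the manuscript: it omits HRF convolution, nuisance regressors, and noise, and the function name `feedback_regressors`, the 0/1/2 level coding, and the toy signal are assumptions made only for this example.

```python
import numpy as np

def feedback_regressors(levels, n_levels=3):
    """Build categorical and parametric design columns for feedback levels.

    levels: integer feedback level per trial, 0 = low ... n_levels - 1 = high
            (hypothetical coding).
    Returns (categorical, parametric): one indicator column per level, and a
    single mean-centered column carrying the level as a graded value.
    """
    levels = np.asarray(levels)
    categorical = np.eye(n_levels)[levels]
    parametric = (levels - levels.mean())[:, None]
    return categorical, parametric

# Noise-free toy response that changes linearly with feedback level.
levels = np.array([0, 2, 1, 2, 0, 1, 1, 2])
signal = 1.0 + 0.5 * levels

cat, par = feedback_regressors(levels)

# Categorical model: one beta per level; the contrast uses only high and low.
beta_cat = np.linalg.lstsq(cat, signal, rcond=None)[0]
contrast_high_vs_low = beta_cat[2] - beta_cat[0]

# Parametric model: intercept plus one slope fitted across all three levels.
X_par = np.column_stack([np.ones(levels.size), par[:, 0]])
slope = np.linalg.lstsq(X_par, signal, rcond=None)[0][1]
```

In the categorical model the medium-level beta is estimated but does not enter the high-versus-low contrast, whereas the parametric slope is informed by trials at all three levels.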
  
  The results we observed were highly consistent across the two analyses, and all conclusions presented in the initial manuscript remain unchanged. While the exact t-scores differ slightly, we replicated the effects for all clusters on the voxel-wise map (on whole-brain FWE-corrected levels) as well as for the regions-of-interest analysis for anterior and posterior hippocampus. These results are presented in a new Supplementary Figure 3C.
  
  Note that the new Supplementary Figure 3B shows another related new analysis we conducted in response to question 4 of reviewer 2. Here, we re-ran the initial analysis with three feedback regressors, but without modeling the inter-trial interval (ITI) and the inter-session interval (ISI, i.e. the breaks participants took) to avoid model over-specification. Again, we replicated the results for all clusters and the ROI analysis, showing that the initial results we presented are robust.
  
  The following additions were made to the manuscript.
  
  Page 5: Note that these results were robust even when fewer nuisance regressors were included to control for model over-specification (Fig. S3B; two-tailed one-sample t tests: anterior HPC, t(33) = -3.65, p = 8.9x10-4, pfwe = 0.002, d=-0.63, CI: [-1.01, -0.26]; posterior HPC, t(33) = -1.43, p = 0.161, pfwe = 0.322, d=-0.25, CI: [-0.59, 0.10]), and when all three feedback levels were modeled with one parametric regressor (Fig. S3C; two-tailed one-sample t tests: anterior HPC, t(33) = -3.59, p = 0.002, pfwe = 0.005, d=-0.56, CI: [-0.93, -0.20]; posterior HPC, t(33) = -0.99, p = 0.329, pfwe = 0.659, d=-0.17, CI: [-0.51, 0.17]). Further, there was no systematic relationship between subsequent trials on a behavioral level [...]
  
  Page 17: Moreover, instead of modeling the three feedback levels with three independent regressors, we repeated the analysis modeling the three feedback levels as one parametric regressor with three levels. All other regressors remained unchanged, and the model included the regressors for ITIs and ISIs. We then conducted t-tests implemented in SPM12 using the beta estimates obtained for the parametric feedback regressor (Fig. 2C). Compared to the initial analyses presented above, this has the advantage that medium-accuracy feedback trials are considered for the statistics as well.
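
  As a rough illustration of the difference between the two modeling choices, the following minimal sketch fits a toy signal once with three indicator regressors (followed by a high vs. low contrast) and once with a single mean-centered parametric regressor. It is not the SPM12 pipeline used in the study: the coding of the levels as -1, 0, 1, the noise-free toy signal, and the absence of any hemodynamic convolution or nuisance regressors are all simplifying assumptions made for illustration.

```python
import numpy as np

def design_three_levels(levels):
    """One indicator column per feedback level (low=-1, medium=0, high=1)."""
    levels = np.asarray(levels)
    return np.column_stack([(levels == k).astype(float) for k in (-1, 0, 1)])

def design_parametric(levels):
    """Intercept plus one mean-centered column coding the feedback level."""
    levels = np.asarray(levels, dtype=float)
    return np.column_stack([np.ones_like(levels), levels - levels.mean()])

def fit(X, y):
    """Ordinary least-squares beta estimates."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical trial sequence and toy signal that scales linearly with feedback level.
levels = np.array([-1, 0, 1, 1, 0, -1, 0, 1, -1])
y = 1.0 + 0.5 * levels

# Three-regressor model: the high vs. low contrast ignores the medium-level beta.
b3 = fit(design_three_levels(levels), y)
high_vs_low = b3[2] - b3[0]

# Parametric model: one slope estimated from all trials, including medium-accuracy ones.
bp = fit(design_parametric(levels), y)
slope = bp[1]
```

  In this toy case both models describe the same linear trend; the parametric slope is simply estimated from all three levels rather than from the two extremes.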
  
  6) The authors claim that the results support the idea that the hippocampus is finding an "optimal trade-off between specificity and regularization". This seems overly speculative given the results presented.
  
  We understand the reviewer's skepticism about this statement and agree that the manuscript does not show that the hippocampus is finding the trade-off between specificity and regularization. However, this is also not exactly what the manuscript claims. Instead, it suggests that the hippocampus “may contribute” to solving this trade-off (page 3) as part of a “brain-wide network“ (pages 2,3,9,12). We also state that “Our [...] results suggest that this trade-off [...] is governed by many regions, updating different types of task information in parallel” (Page 11). To us, these phrasings are not equivalent, because we do not think that the role of the hippocampus in sensorimotor updating (or in any process really) can be understood independently from the rest of the brain. We do however think that our results are in line with the idea that the hippocampus contributes to solving this trade-off, and that this is exciting and surprising given the sensorimotor nature of our task, the ultrashort time scale of the underlying process, and the relationship to behavioral performance. We tried expressing that some of the points discussed remain speculation, but it seems that we were not always successful in doing so in the initial submission. We apologize for the misunderstanding, adapted corresponding statements in the manuscript, and we express even more carefully that these ideas are speculation.
  
  The following changes were made to the introduction and discussion.
  
  Page 2: Here, we approach this question with a new perspective by converging two parallel lines of research centered on sensorimotor timing and hippocampal-dependent cognitive mapping. Specifically, we test how the human hippocampus, an area often implicated in episodic-memory formation (Schiller et al., 2015; Eichenbaum, 2017), may support the flexible updating of sensorimotor representations in real time and in concert with other regions.
  
  Page 12: Because hippocampal activity (Julian & Doeller, 2020) and the regression effect (Jazayeri & Shadlen, 2010) were previously linked to the encoding of (temporal) context, we reasoned that hippocampal activity should also be related to the regression effect directly. This may explain why hippocampal activity reflected the magnitude of the regression effect as well as behavioral improvements independently from TTC, and why it reflected feedback, which informed the updating of the internal prior.
  
  Page 12: This is in line with our behavioral results, showing that TTC-task performance became more optimal in the face of both of these two objectives. Over time, behavioral responses clustered more closely between the diagonal and the average line in the behavioral response profile (Fig. 1B, S1G), and the TTC error decreased over time. While different participants approached these optimal performance levels from different directions, either starting with good performance or strong regularization, the group approached overall optimal performance levels over the course of the experiment.
  
  Page 13: This is in line with the notion that the hippocampus [...] supports finding an optimal trade off between specificity and regularization along with other regions. [...] Our results show that the hippocampus supports rapid and feedback-dependent updating of sensorimotor representations, suggesting that it is a central component of a brain-wide network balancing task specificity vs. regularization for flexible behavior in humans.
  
  Note that in response to comment 1 by reviewer 2, the revised manuscript now reports the results of additional behavioral analyses that support the notion that participants find an optimal trade-off between specificity and regularization over time (independent of whether the hippocampus was involved or not).
  
  7) The authors find that hippocampal activity is related to behavioral improvement from the prior trial. This seems to be a simple learning effect (participants can learn plenty about this task from a prior trial that does not have the exact same timing as the current trial) but is interpreted as sensitivity to temporal context. The temporal context framing seems too far removed from the analyses performed.
  
  We agree with the reviewer that our observation that hippocampal activity reflects TTC-independent behavioral improvements across trials could have multiple explanations. Critically, i) one of them is that the hippocampus encodes temporal context, ii) it is only one of multiple observations that we build our interpretation on, and iii) our interpretation builds on multiple earlier reports.
  
  Interval estimates regress toward the mean of the sampled intervals, an effect that is often referred to as the “regression effect”. This effect, which we observed in our data too (Fig. 1B), has been proposed to reflect the encoding of temporal context (e.g. Jazayeri & Shadlen 2010). Moreover, there is a large body of literature on how the hippocampus may support the encoding of spatial and temporal context (e.g. see Bellmund, Polti & Doeller 2020 for review).
  
  Because both hippocampal activity and the regression effect were linked to the encoding of (temporal) context, we reasoned that hippocampal activity should also be related to the regression effect directly. If so, one would expect that hippocampal activity should reflect behavioral improvements independently from TTC, it should reflect the magnitude of the regression effect, and it should generally reflect feedback, because it is the feedback that informs the updating of the internal prior.
  
  All three observations may have independent explanations indeed, but they are all also in line with the idea that the hippocampus does encode temporal context and that this explains the relationship between hippocampal activity and the regression effect. It therefore reflects a sparse and reasonable explanation in our opinion, even though it necessarily remains an interpretation. Of course, we want to be clear on what our results are and what our interpretations are.
  
  In response to the reviewer’s comment, we therefore toned down two of the statements that mention temporal context in the manuscript, and we removed an overly speculative statement from the result section. In addition, the discussion now describes more clearly how our results are in line with this interpretation.
  
  Abstract: This is in line with the idea that the hippocampus supports the rapid encoding of temporal context even on short time scales in a behavior-dependent manner.
  
  Page 13: This is in line with the notion that the hippocampus encodes temporal context in a behavior-dependent manner, and that it supports finding an optimal trade off between specificity and regularization along with other regions.
  
  Page 12: Because hippocampal activity (Julian & Doeller, 2020) and the regression effect (Jazayeri & Shadlen, 2010) were previously linked to the encoding of (temporal) context, we reasoned that hippocampal activity should also be related to the regression effect directly. This may explain why hippocampal activity reflected the magnitude of the regression effect as well as behavioral improvements independently from TTC, and why it reflected feedback, which informed the updating of the internal prior.
  
  The following statement was removed, overlapping with comment 2 by Reviewer 3:
  
  Instead, these results are consistent with the notion that hippocampal activity signals the updating of task-relevant sensorimotor representations in real-time.
  
  8) I am not sure the term "extraction of statistical regularities" is appropriate. The term is typically used for more complex forms of statistical relationships.
  
  We agree with the reviewer that this expression may be interpreted differently by different readers and are grateful to be pointed to this fact. We therefore removed it and instead added the following (hopefully less ambiguous) statement to the manuscript.
  
  Page 9: This study investigated how the human brain flexibly updates sensorimotor representations in a feedback-dependent manner in the service of timing behavior.
  
  Reviewer #2 (Public Review):
  
  The authors conducted a study involving functional magnetic resonance imaging and a time-to-contact estimation paradigm to investigate the contribution of the human hippocampus (HPC) to sensorimotor timing, with a particular focus on the involvement of this structure in specific vs. generalized learning. Suggestive of the former, it was found that HPC activity reflected time interval-specific improvements in performance while in support of the latter, HPC activity was also found to signal improvements in performance, which were not specific to the individual time intervals tested. Based on these findings, the authors suggest that the human HPC plays a key role in the statistical learning of temporal information as required in sensorimotor behaviour.
  
  By considering two established functions of the HPC (i.e., temporal memory and generalization) in the context of a domain that is not typically associated with this structure (i.e., sensorimotor timing), this study is potentially important, offering novel insight into the involvement of the HPC in everyday behaviour. There is much to like about this submission: the manuscript is clearly written and well-crafted, the paradigm and analyses are well thought out and creative, the methodology is generally sound, and the reported findings push us to consider HPC function from a fresh perspective. A relative weakness of the paper is that it is not entirely clear to what extent the data, at least as currently reported, reflects the involvement of the HPC in specific and generalized learning. Since the authors' conclusions centre around this observation, clarifying this issue is, in my opinion, of primary importance.
  
  We thank the reviewer for these positive and extremely helpful comments, which we will address in detail below. In response to these comments, the revised manuscript clarifies why the observed performance improvements are not at odds with the idea that an optimal trade-off between specificity and regularization is found, and how the time course of learning relates to those reported in previous literature. In addition, we conducted two new fMRI analyses, ensuring that our conclusions remain unchanged even if feedback is modeled with one parametric regressor, and if the number of nuisance regressors is reduced to control for overparameterization of the model. Please find our responses underneath each individual point below.
  
  1) Throughout the manuscript, the authors discuss the trade-off between specific and generalized learning, and point towards Figure S1D as evidence for this (i.e., participants with higher TTC accuracy exhibited a weaker regression effect). What appears to be slightly at odds with this, however, is the observation that the deviation from true TTC decreased with time (Fig S1F) as the regression line slope approached 0.5 (Fig S1E) - one would have perhaps expected the opposite i.e., for deviation from true TTC to increase as generalization increases. To gain further insight into this, it would be helpful to see the deviation from true TTC plotted for each of the four TTC intervals separately and as a signed percentage of the target TTC interval (i.e., (+) or (-) deviation) rather than the absolute value.
  
  We thank the reviewer for raising this important question and for the opportunity to elaborate on the relationship between the TTC error and the magnitude of the regression effect in behavior. Indeed, we see that the regression slopes approach 0.5 and that the TTC error decreases over the course of the experiment. We do not think that these two observations are at odds with each other for the following reasons:
  
  First, while the reviewer is correct in pointing out that the deviation from the TTC should increase as “generalization increases”, that is not what we found. It was not the magnitude of the regularization per se that increased over time, but the overall task performance became more optimal in the face of both objectives: specificity and generalization. This optimum is at a regression-line slope of 0.5. Generalization (or regularization, as we refer to it in the present manuscript) therefore did not increase per se on group level.
  
  Second, the regression slopes approached 0.5 on the group-level, but the individual participants approached this level from different directions: Some of them started with a slope value close to 1 (high accuracy), whereas others started with a slope value close to 0 (near full regression to the mean). Irrespective of which slope value they started with, over time, they got closer to 0.5 (Rebuttal Figure 1A). This can also be seen in the fact that the group-level standard deviation in regression slopes becomes smaller over the course of the experiment (Rebuttal Figure 1B, SFig 1G). It is therefore not generally the case that the regression effect becomes stronger over time, but that it becomes more optimal for longer-term behavioral performance, which is then also reflected in an overall decrease in TTC error. Please see our response to the reviewer’s second comment for more discussion on this.
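
  To make the slope values discussed here concrete, the sketch below computes the regression-line slope of estimated against target TTC for three idealized response profiles: veridical responses (slope 1), full regression to the mean (slope 0), and responses halfway between the two (slope 0.5). The target values and the use of an ordinary least-squares line fit are assumptions for illustration, not the study's actual stimuli or model.

```python
import numpy as np

def regression_slope(target_ttc, estimated_ttc):
    """Slope of the least-squares line relating estimated to target TTC."""
    slope, _intercept = np.polyfit(target_ttc, estimated_ttc, deg=1)
    return slope

# Hypothetical target TTCs in seconds (four levels, as in the task).
targets = np.array([0.5, 0.75, 1.0, 1.25])
grand_mean = targets.mean()

veridical = targets.copy()                           # estimates match the targets
full_regression = np.full_like(targets, grand_mean)  # every estimate equals the mean
halfway = grand_mean + 0.5 * (targets - grand_mean)  # midway between the two

slope_veridical = regression_slope(targets, veridical)
slope_full = regression_slope(targets, full_regression)
slope_halfway = regression_slope(targets, halfway)
```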
  
  Third, the development of task performance is a function of two behavioral factors: a) the accuracy and b) the precision in TTC estimation. Accuracy describes how similar the participant’s TTC estimates were to the true TTC, whereas precision describes how similar the participant’s TTC estimates were relative to each other (across trials). Our results are a reflection of the fact that participants became both more accurate over time on average, but also more precise. To demonstrate this point visually, we now plotted the Precision and the Accuracy for the 8 task segments below (Rebuttal Figure 1C, SFig 1H), showing that both measures increased as the time progressed and more trials were performed. This was the case for all target durations.
  
  In response to the reviewer’s comment, we clarified in the main text that these findings are not at odds with each other. Furthermore, we made clear that regularization per se did not increase over time on group level. We added additional supporting figures to the supplementary material to make this point. Note that in our view, these new analyses and changes more directly address the overall question the reviewer raised than the figure that was suggested, which is why we prioritized those in the manuscript.
  
  However, we appreciated the suggestion a lot and added the corresponding figure for the sake of completeness.
  
  The following additions were made.
  
  Page 5: In support of this, participants' regression slopes converged over time towards the optimal value of 0.5, i.e. the slope value between veridical performance and the grand mean (Fig. S1F; linear mixed-effects model with task segment as a predictor and participants as the error term, F(1) = 8.172, p = 0.005, ε2=0.08, CI: [0.01, 0.18]), and participants' slope values became more similar (Fig. S1G; linear regression with task segment as predictor, F(1) = 6.283, p = 0.046, ε2 = 0.43, CI: [0, 1]). Consequently, this also led to an improvement in task performance over time on group level (i.e. task accuracy and precision increased (Fig. S1I), and the relationship between accuracy and precision became stronger (Fig. S1H), linear mixed-effect model results for accuracy: F(1) = 15.127, p = 1.3x10-4, ε2=0.06, CI: [0.02, 0.11], precision: F(1) = 20.189, p = 6.1x10-5, ε2 = 0.32, CI: [0.13, 1]), accuracy-precision relationship: F(1) = 8.288, p =0.036, ε2 = 0.56, CI: [0, 1], see methods for model details).
  
  Page 12: This suggests that different regions encode distinct task regularities in parallel to form optimal sensorimotor representations to balance specificity and regularization. This is in line with our behavioral results, showing that TTC-task performance became more optimal in the face of both of these two objectives. Over time, behavioral responses clustered more closely between the diagonal and the average line in the behavioral response profile (Fig. 1B, S1G), and the TTC error decreased over time. While different participants approached these optimal performance levels from different directions, either starting with good performance or strong regularization, the group approached overall optimal performance levels over the course of the experiment.
  
  Page 15: We also corroborated this effect by measuring the dispersion of slope values between participants across task segments using a linear regression model with task segment as a predictor and the standard deviation of slope values across participants as the dependent variable (Fig. S1G). As a measure of behavioral performance, we computed two variables for each target-TTC level: sensorimotor timing accuracy, defined as the absolute difference in estimated and true TTC, and sensorimotor timing precision, defined as coefficient of variation (standard deviation of estimated TTCs divided by the average estimated TTC). To study the interaction between these two variables for each target TTC over time, we first normalized accuracy by the average estimated TTC in order to make both variables comparable. We then used a linear mixed-effects model with precision as the dependent variable, task segment and normalized accuracy as predictors and target TTC as the error term. In addition, we tested whether accuracy and precision increased over the course of the experiment using separate linear mixed-effects models with task segment as predictor and participants as the error term.
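
  The two behavioral measures defined in this passage can be sketched as follows. The definitions follow the text (absolute difference between estimated and true TTC; coefficient of variation of the estimates), but averaging the absolute error across trials and using the population standard deviation are assumptions, since the passage does not specify either. Note that, as defined, smaller values indicate better performance on both measures.

```python
import numpy as np

def timing_accuracy(estimates, target):
    """Mean absolute difference between estimated and true TTC (assumed trial average)."""
    est = np.asarray(estimates, dtype=float)
    return float(np.mean(np.abs(est - target)))

def timing_precision(estimates):
    """Coefficient of variation: SD of estimated TTCs divided by their mean."""
    est = np.asarray(estimates, dtype=float)
    return float(np.std(est) / np.mean(est))  # population SD (ddof=0) is an assumption

def normalized_accuracy(estimates, target):
    """Accuracy divided by the average estimated TTC, to make it comparable to precision."""
    est = np.asarray(estimates, dtype=float)
    return timing_accuracy(est, target) / float(np.mean(est))

# Hypothetical estimates (in seconds) for one target TTC of 1.0 s.
estimates = [0.9, 1.0, 1.1]
acc = timing_accuracy(estimates, target=1.0)
prec = timing_precision(estimates)
norm_acc = normalized_accuracy(estimates, target=1.0)
```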
  
  2) Generalization relies on prior experience and can be relatively slow to develop as is the case with statistical learning. In Jazayeri and Shadlen (2010), for instance, learning a prior distribution of 11-time intervals demarcated by two briefly flashed cues (compared to 4 intervals associated with 24 possible movement trajectories in the current study) required ~500 trials. I find it somewhat surprising, therefore, that the regression line slope was already relatively close to 0.5 in the very first segment of the task. To what extent did the participants have exposure to the task and the target intervals prior to entering the scanner?
  
  We thank the reviewer for raising the important question about the time course of learning in our task and how our results relate to prior work on this issue. Addressing the specific reviewer question first, participants practiced the task for 2-3 minutes prior to scanning. During the practice, they were not specifically instructed to perform the task as well as they could nor to encode the intervals, but rather to familiarize themselves with the general experimental setup and to ask potential questions outside the MRI machine. While they might have indeed started encoding the prior distribution of intervals during the practice already, we have no way of knowing, and we expect the contribution of this practice on the time course of learning during scanning to be negligible (for the reasons outlined above).
  
  However, in addition to the specific question the reviewer asked, we feel that the comment raises two more general points: 1) How long does it take to learn the prior distribution of a set of intervals as a function of the number of intervals tested, and 2) Why are the learning slopes we report quite shallow already in the beginning of the scan?
  
  Regarding (1), we are not aware of published reports that answer this question directly, and we expect that this will depend on the task that is used. Regarding the comparison to Jazayeri & Shadlen (2010), we believe the learning time course is difficult to compare between our study and theirs. As the reviewer mentioned, our study featured only 4 intervals compared to 11 in their work, based on which we would expect much faster learning in our task than in theirs. We did indeed sample 24 movement directions, but these were irrelevant in terms of learning the interval distribution. Moreover, unlike Jazayeri & Shadlen (2010), our task featured moving stimuli, which may have added additional sensory, motor and proprioceptive information in our study which the participants of the prior study could not rely on.
  
  Regarding (2), and overlapping with the reviewer’s previous comment, the average learning slope in our study is indeed close to 0.5 already in the first task segment, but we would like to highlight that this is a group-level measure. The learning slopes of some subjects were closer to 1 (i.e. the diagonal in Fig 1B), and those of others were closer to 0 (i.e. the mean) in the beginning of the experiment. The median slope was close to 0.65. Importantly, the slopes of most participants still approached 0.5 in the course of the experiment, and so did even the group-level slope the reviewer is referring to. This also means that participants’ slopes became more similar in the course of the experiment, and they approached 0.5, which we think reflects the optimal trade-off between regressing towards the mean and regressing towards the diagonal (in the data shown in Fig. 1B). This convergence onto the optimal trade-off value can be seen in many measures, including the mean slope (Rebuttal Figure 1A, SFig 1F), the standard deviation in slopes (Rebuttal Figure 1B, SFig 1G) as well as the Precision vs. Accuracy tradeoff (Rebuttal Figure 1C, SFig 1H). We therefore think that our results are well in line with prior literature, even though a direct comparison remains difficult due to differences in the task.
  
  In response to the reviewer’s comment, and related to their first comment, we made the following addition to the discussion section.
  
  Page 12: This suggests that different regions encode distinct task regularities in parallel to form optimal sensorimotor representations to balance specificity and regularization. This is well in line with our behavioral results, showing that TTC-task performance became more optimal in the face of both of these two objectives. Over time, behavioral responses clustered more closely between the diagonal and the average line in the behavioral response profile (Fig. 1B, S1G), and the TTC error decreased over time. While different participants approached these optimal performance levels from different directions, either starting with good performance or strong regularization, the group approached overall optimal performance levels over the course of the experiment.
  
  3) I am curious to know whether differences between high-accuracy and medium-accuracy feedback as well as between medium-accuracy and low-accuracy feedback predicted hippocampal activity in the first GLM analysis (middle page 5). Currently, the authors only present the findings for the contrast between high-accuracy and low-accuracy feedback. Examining all feedback levels may provide additional insight into the nature of hippocampal involvement and is perhaps more consistent with the subsequent GLM analysis (bottom page 6) in which, according to my understanding, all improvements across subsequent trials were considered (i.e., from low-accuracy to medium-accuracy; medium-accuracy to high-accuracy; as well as low-accuracy to high-accuracy).
  
  We thank the reviewer for this thoughtful question, which relates to question 5 by reviewer 1. The reviewer is correct that the contrast shown in Fig 2 does not consider the medium-accuracy feedback levels, and that the model in itself is slightly different from the one used in the subsequent analysis presented in Fig. 3. To reply to this comment as well as to a related one by reviewer 1 together, we therefore repeated the full analysis while modeling the three feedback levels in one parametric regressor, which includes the medium-accuracy feedback trials, and is consistent with the analysis shown in Fig. 3. The results of this new analysis are presented in the new Supplementary Fig. 3C.
  
  In short, the model included one parametric regressor with three levels reflecting the three types of feedback, and all nuisance regressors remained unchanged. Instead of contrasting high vs. low accuracy feedback, we then performed voxel-wise t-tests on the beta estimates obtained for the parametric feedback regressor. We found that our results presented initially were very robust: Both the observed clusters in the voxel-wise analysis (on whole-brain FWE-corrected levels) as well as the ROI results replicated across the two analyses, and our conclusions therefore remain unchanged.
  
  We made multiple textual additions to the manuscript to include this new analysis, and we present the results of the analysis including a direct comparison to our initial results in the new Supplementary Fig. 3. The following textual additions were made.
  
  Page 5: Note that these results were robust even when fewer nuisance regressors were included to control for model over-specification (Fig. S3B; two-tailed one-sample t tests: anterior HPC, t(33) = -3.65, p = 8.9x10-4, pfwe = 0.002, d=-0.63, CI: [-1.01, -0.26]; posterior HPC, t(33) = -1.43, p = 0.161, pfwe = 0.322, d=-0.25, CI: [-0.59, 0.10]), and when all three feedback levels were modeled with one parametric regressor (Fig. S3C; two-tailed one-sample t tests: anterior HPC, t(33) = -3.59, p = 0.002, pfwe = 0.005, d=-0.56, CI: [-0.93, -0.20]; posterior HPC, t(33) = -0.99, p = 0.329, pfwe = 0.659, d=-0.17, CI: [-0.51, 0.17]). Further, there was no systematic relationship between subsequent trials on a behavioral level [...]
  
  Page 17: Moreover, instead of modeling the three feedback levels with three independent regressors, we repeated the analysis modeling the three feedback levels as one parametric regressor with three levels. All other regressors remained unchanged, and the model included the regressors for ITIs and ISIs. We then conducted t-tests implemented in SPM12 using the beta estimates obtained for the parametric feedback regressor (Fig. S2C). Compared to the initial analyses presented above, this has the advantage that medium-accuracy feedback trials are considered for the statistics as well.
  
  4) The authors modeled the inter-trial intervals and periods of rest in their univariate GLMs. This approach of modelling all 'down time' can lead to model over-specification and inaccurate parameter estimation (e.g. Pernet, 2014). A comment on this approach as well as consideration of not modelling the inter-trial intervals would be useful.
  
  This is an important issue that we did not address in our initial manuscript. We are aware of and agree with the reviewer’s general concern about model over-specification, which can be a big problem in regression as it leads to biased estimates. We did examine whether our model was overspecified before running it, but we did not report a formal test of it in the manuscript. We are grateful to be given the opportunity to do so now.
  
  In response to the reviewer’s comment, we repeated the full analysis shown in Fig. 2 while excluding the nuisance regressors for inter-trial intervals (ITI) and breaks (or inter-session intervals, ISI). All other regressors and analysis steps stayed unchanged relative to the one reported in Fig. 2. The new results are presented in a new Supplementary Figure 3B.
  
  Like for our previous analysis, we again see that the results we initially presented were extremely robust even on whole-brain FWE corrected levels, as well as on ROI level. Our conclusions therefore remain unchanged, and the results we presented initially are not affected by potential model overspecification. In addition to the new Supplementary Figure 3B, we made multiple textual changes to the manuscript to describe this new analysis and its implications. Note that we used the same nuisance regressors in all other GLM analyses too, meaning that it is also very unlikely that model overspecification affects any of the other results presented. We thank the reviewer for suggesting this analysis, and we feel including it in the manuscript has further strengthened the points we initially made.
  
  The following additions were made to the manuscript.
  
  Page 16: The GLM included three boxcar regressors modeling the feedback levels, one for ITIs, one for button presses and one for periods of rest (inter-session interval, ISI) [...]
  
  Page 16: ITIs and ISIs were modeled to reduce task-unrelated noise, but to ensure that this did not lead to over-specification of the above-described GLM, we repeated the full analysis without modeling the two. All other regressors including the main feedback regressors of interest remained unchanged, and we repeated both the voxel-wise and ROI-wise statistical tests as described above (Fig. S2B).
  
  Page 17: Note that these results were robust even when fewer nuisance regressors were included to control for model over-specification (Fig. S3B; two-tailed one-sample t tests: anterior HPC, t(33) = -3.65, p = 8.9x10-4, pfwe = 0.002, d=-0.63, CI: [-1.01, -0.26]; posterior HPC, t(33) = -1.43, p = 0.161, pfwe = 0.322, d=-0.25, CI: [-0.59, 0.10]), and when all three feedback levels were modeled with one parametric regressor (Fig. S3C; two-tailed one-sample t tests: anterior HPC, t(33) = -3.59, p = 0.002, pfwe = 0.005, d=-0.56, CI: [-0.93, -0.20]; posterior HPC, t(33) = -0.99, p = 0.329, pfwe = 0.659, d=-0.17, CI: [-0.51, 0.17]). Further, there was no systematic relationship between subsequent trials on a behavioral level [...]
  
  Reviewer #3 (Public Review):
  
  This paper reports the results of an interesting fMRI study examining the neural correlates of time estimation with an elegant design and a sensorimotor timing task. Results show that hippocampal activity and connectivity are modulated by performance on the task as well as the valence of the feedback provided. This study addresses a very important question in the field which relates to the function of the hippocampus in sensorimotor timing. However, a lack of clarity in the description of the MRI results (and associated methods) currently prevents the evaluation of the results and the interpretations made by the authors. Specifically, the model testing for timing-specific/timing-independent effects is questionable and needs to be clarified. In the current form, several conclusions appear to not be fully supported by the data.
  
  We thank the reviewer for pointing us to many methodological points that needed clarification. We apologize for the confusion about our methods, which we clarify in the revised manuscript. Please find our responses to the individual points below.
  
  Major points
  
  Some methodological points lack clarity which makes it difficult to evaluate the results and the interpretation of the data.
  
  We really appreciate the many constructive comments below. We feel that clarifying these points improved our manuscript immensely.
  
  1) It is unclear how the 3 levels of accuracy and feedback (high, medium, and low performance) were computed. Please provide the performance range used for this classification. Was this adjusted to the participants' performance?
  
  The formula that describes how the response window was computed for the different speed levels was reported in the methods section of the original manuscript on page 13. It reads as follows:
  
  “The following formula was used to scale the response window width: d ± ((k ∗ d)/2) where d is the target TTC and k is a constant proportional to 0.3 and 0.6 for high and medium accuracy, respectively.”
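
The quoted formula can be made concrete with a short worked example. The sketch below is illustrative only: it assumes that k takes the values 0.3 and 0.6 directly, that a response outside the medium window counts as low accuracy, and it uses a made-up target TTC of 1.0 s. The function names are hypothetical and are not taken from the authors' code.

```python
# Illustrative sketch of the response-window formula d ± (k * d) / 2.
# Assumptions: k is used as 0.3 / 0.6 directly, and anything outside the
# medium window is treated as low-accuracy feedback.

K_BY_FEEDBACK = {"high": 0.3, "medium": 0.6}


def response_window(target_ttc, k):
    """Return the (lower, upper) bounds of the window around the target TTC."""
    half_width = (k * target_ttc) / 2
    return target_ttc - half_width, target_ttc + half_width


def feedback_level(response_time, target_ttc):
    """Classify a response as 'high', 'medium' or 'low' accuracy."""
    for level in ("high", "medium"):  # narrowest window is checked first
        lower, upper = response_window(target_ttc, K_BY_FEEDBACK[level])
        if lower <= response_time <= upper:
            return level
    return "low"


if __name__ == "__main__":
    example_ttc = 1.0  # seconds; hypothetical value for illustration
    for level, k in K_BY_FEEDBACK.items():
        print(level, response_window(example_ttc, k))
    print(feedback_level(1.25, example_ttc))
```

Because the half-width is proportional to d, the windows widen with longer target TTCs, which is what allows feedback to be calibrated across durations.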
  
  In response to the reviewer’s comment, we now additionally report the exact ranges of the different response windows in a new Supplementary Table 1 and refer to it in the Methods section as follows.
  
  Page 10: To calibrate performance feedback across different TTC durations, the precise response window widths of each feedback level scaled with the speed of the fixation target (Table S1).
  
  2) The description of the MRI results lacks details. It is not always clear in the results section which models were used and whether parametric modulators were included or not in the model. This makes the results section difficult to follow. For example,
  
  a) Figure 2: According to the description in the text, it appears that panels A and B report the results of a model with 3 regressors, ie one for each accuracy/feedback level (high, medium, low) without parametric modulators included. However, the figure legend for panel B mentions a parametric modulator suggesting that feedback was modelled for each trial as a parametric modulator. The distinction between these 2 models must be clarified in the result section.
  
  We thank the reviewer very much for spotting this discrepancy. Indeed, Figure 2 shows the results obtained for a GLM in which we modeled the three feedback levels with separate regressors, not with one parametric regressor. Instead, the latter was the case for Figure 3. We apologize for the confusion and corrected the description in the figure caption, which now reads as follows. The description in the main text and the methods remain unchanged.
  
  Caption Fig. 2: We plot the beta estimates obtained for the contrast between high vs. low feedback.
  
  Moreover, note that in response to comment 5 by reviewer 1 and comment 3 by reviewer 2, the revised manuscript now additionally reports the results obtained for the parametric regressor in the new Supplementary Figure 3C. All conclusions remain unchanged.
  
  Additionally, it is unclear how Figure 2A supports the following statement: "Moreover, the voxel-wise analysis revealed similar feedback-related activity in the thalamus and the striatum (Fig. 2A), and in the hippocampus when the feedback of the current trial was modeled (Fig. S3)." This is confusing as Figure 2A reports an opposite pattern of results between the striatum/thalamus and the hippocampus. It appears that the statement highlighted above is supported by results from a model including current trial feedback as a parametric modulator (reported in Figure S3).
  
  We agree with the reviewer that our result description was confusing and changed it. It now reads as follows.
  
  Page 5: Moreover, the voxel-wise analysis revealed feedback-related activity also in the thalamus and the striatum (Fig. 2A) [...]
  
  Also, note that it is unclear from Figure 2A what is the direction of the contrast highlighting the hippocampal cluster (high vs. low according to the text but the figure shows negative values in the hippocampus and positive values in the thalamus). These discrepancies need to be addressed and the models used to support the statements made in the results sections need to be explicitly described.
  
  The description of the contrast is correct. Negative values indicate smaller errors and therefore better feedback, which is mentioned in the caption of Fig. 2 as follows:
  
  “Negative values indicate that smaller errors, and higher-accuracy feedback, led to stronger activity.”
  
  Note that the timing error determined the feedback, and that we predicted stronger updating and therefore stronger activity for larger errors (similar to a prediction error). We found the opposite. We mention the reasoning behind this analysis at various locations in the manuscript e.g. when talking about the connectivity analysis:
  
  “We reasoned that larger timing errors and therefore low-accuracy feedback would result in stronger updating compared to smaller timing errors and high-accuracy feedback”
  
  In response to the reviewer’s remark, we clarified this further by adding the following statement to the result section.
  
  Page 5: “Using a mass-univariate general linear model (GLM), we modeled the three feedback levels with one regressor each plus additional nuisance regressors (see methods for details). The three feedback levels (high, medium and low accuracy) corresponded to small, medium and large timing errors, respectively. We then contrasted the beta weights estimated for high-accuracy vs. low-accuracy feedback and examined the effects on group-level averaged across runs.”
  
  b) Connectivity analyses: It is also unclear here which model was used in the PPI analyses presented in Figure 2. As it appears that the seed region was extracted from a high vs. low contrast (without modulators), the PPI should be built using the same model. I assume this was the case as the authors mentioned "These co-fluctuations were stronger when participants performed poorly in the previous trial and therefore when they received low-accuracy feedback." if this refers to low vs. high contrast. Please clarify.
  
  Yes, the PPI model was built using the same model. We clarified this in the methods section by adding the following statement to the PPI description.
  
  Page 17: “The PPI model was built using the same model that revealed the main effects used to define the HPC sphere.”
  
  Yes, the reviewer is correct in thinking that the contrast shows the difference between low vs. high-accuracy feedback. We clarified this in the main text as well as in the caption of Fig. 2.
  
  Caption Fig 2: [...] We plot results of a psychophysiological interactions (PPI) analysis conducted using the hippocampal peak effects in (A) as a seed for low vs. high-accuracy feedback. [...]
  
  Page 17: The estimated beta weight corresponding to the interaction term was then tested against zero on the group-level using a t-test implemented in SPM12 (Fig. 2C). The contrast reflects the difference between low vs. high-accuracy feedback. This revealed brain areas whose activity was co-varying with the hippocampus seed ROI as a function of past-trial performance (n-1).
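
As a rough illustration of what the interaction term in such an analysis represents, the sketch below multiplies a mean-centered seed time course by a psychological variable coding low vs. high-accuracy feedback. It is a deliberately simplified sketch, not the SPM12 procedure: it works directly on the measured signal and omits the deconvolution and re-convolution steps that a full PPI analysis typically involves. The coding (+1 low, -1 high, 0 otherwise) and all values are assumptions made for illustration.

```python
# Simplified sketch of a PPI interaction regressor.
# Assumptions: the psychological variable is coded +1 for low-accuracy,
# -1 for high-accuracy and 0 for all other time points; the seed signal is
# used as measured (no deconvolution to the neural level).


def ppi_interaction(seed_timecourse, psych_variable):
    """Element-wise product of the mean-centered seed and the task coding."""
    mean_seed = sum(seed_timecourse) / len(seed_timecourse)
    return [(s - mean_seed) * p for s, p in zip(seed_timecourse, psych_variable)]


if __name__ == "__main__":
    seed = [1.0, 2.0, 3.0, 2.0]   # hypothetical hippocampal seed signal
    psych = [1, -1, 1, 0]         # hypothetical low vs. high coding
    print(ppi_interaction(seed, psych))
```

A positive beta for this regressor would indicate stronger coupling with the seed after low-accuracy than after high-accuracy feedback, matching the direction of the contrast described above.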
  
  c) It is unclear why the model testing TTC-specific / TTC-independent effects (results presented in Figure 3) used 2 parametric modulators (as opposed to building two separate models with a different modulator each). I wonder how the authors dealt with the orthogonalization between parametric modulators with such a model. In SPM, the orthogonalization of parametric modulators is based on the order of the modulators in the design matrix. In this case, parametric modulator #2 would be orthogonalized to the preceding modulator so that a contrast focusing on the parametric modulator #2 would highlight any modulation that is above and beyond that explained by modulator #1. In this case, modulation of brain activity that is TTC-specific would have to be above and beyond a modulation that is TTC-independent to be highlighted. I am unsure that this is what the authors wanted to test here (or whether this is how the MRI design was built). Importantly, this might bias the interpretation of their results as - by design - it is less likely to observe TTC-specific modulations in the hippocampus as there is significant TTC-independent modulation. In other words, switching the order of the modulators in the model (or building two separate models) might yield different results. This is an important point to address as this might challenge the TTC-specific/TTC-independent results described in the manuscript.
  
  We thank the reviewer for raising this important issue. When running the respective analysis, we made sure that the regressors were not collinear and we therefore did not expect substantial overlap in shared variance between them. However, we agree with the reviewer that orthogonalizing one regressor with respect to the other could still affect the results. To make sure that our expectations were indeed met, we therefore repeated the main analysis twice: 1) switching the order of the modulators and 2) turning orthogonalization off (which is possible in SPM12 unlike in previous versions). In all cases, our key results and conclusions remained unchanged, including the central results of the hippocampus analyses.
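
The order dependence at issue here can be illustrated with a minimal two-regressor sketch. The code below applies a single Gram-Schmidt step to two mean-centered modulators and shows that whichever modulator comes second loses the variance it shares with the first. The modulator values are invented, the names are hypothetical, and the sketch only covers the two-vector case; it is not a reproduction of SPM's implementation.

```python
# Sketch of serial orthogonalization between two parametric modulators.
# Assumption: a plain Gram-Schmidt step on mean-centered vectors; the data
# are made up and the variable names are illustrative only.


def mean_center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(later, earlier):
    """Remove from `later` the component that is explained by `earlier`."""
    scale = dot(later, earlier) / dot(earlier, earlier)
    return [l - scale * e for l, e in zip(later, earlier)]


ttc_independent = mean_center([1.0, 2.0, 3.0, 4.0, 5.0])
ttc_specific = mean_center([2.0, 1.0, 4.0, 3.0, 5.0])

# Original order: the second modulator keeps only its unique variance.
specific_after_independent = orthogonalize(ttc_specific, ttc_independent)
# Switched order: now the other modulator is the one that is reduced.
independent_after_specific = orthogonalize(ttc_independent, ttc_specific)

if __name__ == "__main__":
    print(dot(ttc_specific, ttc_specific))
    print(dot(specific_after_independent, specific_after_independent))
    print(dot(independent_after_specific, independent_after_specific))
```

With orthogonalization turned off, both modulators enter the model unchanged and each beta reflects only the variance unique to that regressor, regardless of order.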
  
  Anterior (ant.) / Posterior (post.) Hippocampus ROI analysis with A) original order of modulators, B) switching the order of the modulators and C) turning orthogonalization of modulators off. ABC) Orange color corresponds to the TTC-independent condition whereas light-blue color corresponds to the TTC-specific condition. Statistics reflect p<0.05 at Bonferroni-corrected levels (*) obtained using a group-level one-tailed one-sample t-test against zero; A) pfwe = 0.017, B) pfwe = 0.039, C) pfwe = 0.039.
  
  Because orthogonalization did not affect the conclusions, the new manuscript simply reports the analysis for which it was turned off. Note that these new figures are extremely similar to the original figures we presented, which can be seen in the exemplary figure below showing our key results at a liberal threshold for transparency. In addition, we clarified that orthogonalization was turned off in the methods section as follows.
  
  Page 18: These two regressors reflect the tests for target-TTC-independent and target-TTC-specific updating, respectively, and they were not orthogonalized to each other.
  
  Comparison of old & new results: also see Fig. 3 and Fig. S5 in manuscript
  
  d) It is also unclear how the behavioral improvement was coded/classified: "we contrasted trials in which participants had improved versus the ones in which they had not improved or got worse". It appears that improvement computation was based on the change of feedback valence (between high, medium and low). It is unclear why performance wasn't used instead? This would provide a finer-grained modulation?
  
  We thank the reviewer for the opportunity to clarify this important point. First, we chose to model feedback because it is the feedback that determines whether participants update their “internal model” or not. Without feedback, they would not know how well they performed, and we would not expect to find activity related to sensorimotor updating. Second, behavioral performance and received feedback are tightly correlated, because the former determines the latter. We therefore do not expect to see major differences in results obtained between the two. Third, we did in fact model both feedback and performance in two independent GLMs, even though the way the results were reported in the initial submission made it difficult to compare the two.
  
  Figure 4 shows the results obtained when modeling behavioral performance in the current trial as an F-contrast, and Supplementary Fig 4 shows the results when modeling the feedback received in the current trial as a t-contrast. While the voxel-wise t-maps/F-maps are also quite similar, we now additionally report the t-contrast for the behavioral-performance GLM in a new Supplementary Figure 4C. The t-maps obtained for these two different analyses are extremely similar, confirming that the direction of the effects as well as their interpretation remain independent of whether feedback or performance is modeled.
  
  The revised manuscript refers to the new Supplementary Figure 4C as follows.
  
  Page 17: In two independent GLMs, we analyzed the time courses of all voxels in the brain as a function of behavioral performance (i.e. TTC error) in each trial, and as a function of feedback received at the end of each trial. The models included one mean-centered parametric regressor per run, modeling either the TTC error or the three feedback levels in each trial, respectively. Note that the feedback itself was a function of TTC error in each trial [...] We estimated weights for all regressors and conducted a t-test against zero using SPM12 for our feedback and performance regressors of interest on the group level (Fig. S4A). [...]
  
  Page 17: In addition to the voxel-wise whole-brain analyses described above, we conducted independent ROI analyses for the anterior and posterior sections of the hippocampus (Fig. S2A). Here, we tested the beta estimates obtained in our first-level analysis for the feedback and performance regressors of interest (Fig. S4B; two-tailed one-sample t tests: anterior HPC, t(33) = -5.92, p = 1.2×10⁻⁶, pfwe = 2.4×10⁻⁶, d=-1.02, CI: [-1.45, -0.6]; posterior HPC, t(33) = -4.07, p = 2.7×10⁻⁴, pfwe = 5.4×10⁻⁴, d=-0.7, CI: [-1.09, -0.32]). See section "Regions of interest definition and analysis" for more details.
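
For readers who want to relate the reported quantities to one another, the following sketch shows one common way to obtain an effect size and a family-wise corrected p-value from a one-sample t statistic. It assumes Cohen's d = t / sqrt(n) with n = 34 (implied by the 33 degrees of freedom) and a Bonferroni correction over two ROIs. Whether the authors computed their values exactly this way is an assumption, although the figures quoted above are numerically consistent with it.

```python
import math

# Sketch relating t, Cohen's d and Bonferroni-corrected p-values.
# Assumptions: d = t / sqrt(n) for a one-sample test, n = 34 participants
# (from t(33)), and a correction over two ROIs (anterior, posterior HPC).


def cohens_d_from_t(t_value, n):
    """Effect size of a one-sample t-test, derived from t and sample size."""
    return t_value / math.sqrt(n)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    n_participants = 34
    n_rois = 2
    for label, t_value, p_value in [
        ("anterior HPC", -5.92, 1.2e-6),
        ("posterior HPC", -4.07, 2.7e-4),
    ]:
        d = round(cohens_d_from_t(t_value, n_participants), 2)
        print(label, d, bonferroni(p_value, n_rois))
```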
  
  If the feedback valence was used to classify trials as improved or not, how was this modelled (one regressor for improved, one for no improvement? As opposed to a parametric modulator with performance improvement?).
  
  We apologize for the lack of clarity regarding our regressor design. In response to this comment, we adapted the corresponding paragraph in the methods to express more clearly that improvement trials and no-improvement trials were modeled with two separate parametric regressors - in line with the reviewer’s understanding. The new paragraph reads as follows.
  
  Page 18: One regressor modeled the main effect of the trial and two parametric regressors modeled the following contrasts: Parametric regressor 1: trials in which behavioral performance improved vs. parametric regressor 2: trials in which behavioral performance did not improve or got worse relative to the previous trial.
  
  Last, it is also unclear how ITI was modelled as a regressor. Did the authors mean a parametric modulator here? Some clarification on the events modelled would also be helpful. What was the onset of a trial in the MRI design? The start of the trial? Then end? The onset of the prediction time?
  
  The inter-trial intervals (ITIs) were modeled as a boxcar regressor convolved with the hemodynamic response function. They describe the time between the feedback-phase offset and the subsequent trial onset. Moreover, the start of the trial was the moment when the visual-tracking target started moving after the ITI, whereas the trial end was the offset of the feedback phase (i.e. the moment in which the feedback disappeared from the screen). The onset of the “prediction time” was the moment in which the visual-tracking target stopped moving, prompting participants to estimate the time-to-contact. We now explain this more clearly in the methods as shown below.
  
  Page 16: The GLM included three boxcar regressors modeling the feedback levels, one for ITIs, one for button presses and one for periods of rest (inter-session interval, ISI), which were all convolved with the canonical hemodynamic response function of SPM12. The start of the trial was taken as the trial onset for modeling (i.e. the time when the visual-tracking target started moving). The trial end was the offset of the feedback phase (i.e. the moment in which the feedback disappeared from the screen). The ITI was the time between the offset of the feedback-phase and the subsequent trial onset.
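
To make the regressor construction more tangible, here is a minimal sketch of a boxcar regressor convolved with a hemodynamic response function. It is not SPM12 code. The HRF is a generic double-gamma approximation (response shape 6, undershoot shape 16, ratio 1/6), and the TR, onsets and durations are invented for illustration.

```python
import math

# Sketch: boxcar regressor convolved with a double-gamma HRF.
# Assumptions: the HRF only approximates a canonical HRF; TR, onsets and
# durations below are hypothetical values, not taken from the study.


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(tr, duration=32.0):
    """HRF sampled every `tr` seconds and normalized to unit sum."""
    n_samples = int(duration / tr)
    hrf = [gamma_pdf(i * tr, 6.0) - gamma_pdf(i * tr, 16.0) / 6.0
           for i in range(n_samples)]
    total = sum(hrf)
    return [h / total for h in hrf]


def boxcar(n_scans, tr, onsets, durations):
    """1.0 while an event is ongoing at the scan time, 0.0 otherwise."""
    box = [0.0] * n_scans
    for onset, duration in zip(onsets, durations):
        for i in range(n_scans):
            if onset <= i * tr < onset + duration:
                box[i] = 1.0
    return box


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


tr = 1.0
iti_box = boxcar(60, tr, onsets=[10.0, 30.0], durations=[4.0, 4.0])
iti_regressor = convolve(iti_box, double_gamma_hrf(tr))
```

The convolved regressor peaks several seconds after each interval begins, reflecting the delay of the hemodynamic response relative to the modeled event.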
  
  On a related note, in response to question 4 by reviewer 2, we now repeated one of the main analyses (Fig. 2) without modeling the ITI (as well as the Inter-session interval, ISI). We found that our key results and conclusions are independent of whether or not these time points were modeled. These new results are presented in the new Supplementary Figure 3B.
  
  Page 16: ITIs and ISIs were modeled to reduce task-unrelated noise, but to ensure that this did not lead to over-specification of the above-described GLM, we repeated the full analysis without modeling the two. [...]
  
  Perhaps as a result of a lack of clarity in the result section and the MRI methods, it appears that some conclusions presented in the result section are not supported by the data. E.g. "Instead, these results are consistent with the notion that hippocampal activity signals the updating of task-relevant sensorimotor representations in real-time." The data show that hippocampal activity is higher during and after an accurate trial. This pattern of results could be attributed to various processes such as e.g. reward or learning etc. I would recommend not providing such interpretations in the result section and addressing these points in the discussion.
  
  Similar to above, statements like "These results suggest that the hippocampus updates information that is independent of the target TTC". The data show that higher hippocampal activity is linked to greater improvement across trials independent of the timing of the trial. The point about updating is rather speculative and should be presented in the discussion instead of the result section.
  
  The reviewer is referring to two statements in the results section that reflect our interpretation rather than a description of the results. In response to the reviewer’s comment, we therefore removed the following statement from the results.
  
  Instead, these results are consistent with the notion that hippocampal activity signals the updating of task-relevant sensorimotor representations in real-time.
  
  In addition, we replaced the remaining statement by the following. We feel this new statement makes clear why we conducted the analysis that is described without offering an interpretation of the results that were presented before.
  
  Page 8: We reasoned that updating TTC-independent information may support generalization performance by means of regularizing the encoded intervals based on the temporal context in which they were encoded.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.08.03.454928v3
www.biorxiv.org www.biorxiv.org

Task-specific roles of local interneurons for inter- and intraglomerular signaling in the insect antennal lobe

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  The manuscript provides very high quality single-cell physiology combined with population physiology to reveal distinctive roles for two anatomically different LN populations in the cockroach antennal lobe. The conclusion that non-spiking LNs with graded responses show glomerular-restricted responses to odorants and spiking LNs show similar responses across glomeruli is generally supported with strong and clean data, although the possibility of selective interglomerular inhibition has not been ruled out. On balance, the single-cell biophysics and physiology provides foundational information useful for well-grounded mechanistic understanding of how information is processed in insect antennal lobes, and how each LN class contributes to odor perception and behavior.
  
  Thank you for this positive feedback.
  
  Reviewer #2 (Public Review):
  
  The manuscript "Task-specific roles of local interneurons for inter- and intraglomerular signaling in the insect antennal lobe" evaluates the spatial distribution of calcium signals evoked by odors in two major classes of olfactory local neurons (LNs) in the cockroach P. americana, which are defined by their physiological and morphological properties. Spiking type I LNs have a patchy innervation pattern of a subset of glomeruli, whereas non-spiking type II LNs innervate almost all glomeruli. The authors' overall conclusion is that odors evoke calcium signals globally and relatively uniformly across glomeruli in type I spiking LNs, and LN neurites in each glomerulus are broadly tuned to odor. In contrast, the authors conclude that they observe odor-specific patterns of calcium signals in type II nonspiking LNs, and LN neurites in different glomeruli display distinct local odor tuning. Blockade of action potentials in type I LNs eliminates global calcium signaling and decorrelates glomerular tuning curves, converting their response profile to be more similar to that of type II LNs. From these conclusions, the authors infer a primary role of type I LNs in interglomerular signaling and type II LNs in intraglomerular signaling.
  
  The question investigated by this study - to understand the computational significance of different types of LNs in olfactory circuits - is an important and significant problem. The design of the study is straightforward, but methodological and conceptual gaps raise some concerns about the authors' interpretation of their results. These can be broadly grouped into three main areas.
  
  1) The comparison of the spatial (glomerular) pattern of odor-evoked calcium signals in type I versus type II LNs may not necessarily be a true apples-to-apples comparison. Odor-evoked calcium signals are an order of magnitude larger in type I versus type II cells, which will lead to a higher apparent correlation in type I cells. In type IIb cells, and type I cells with sodium channel blockade, odor-evoked calcium signals are much smaller, and the method of quantification of odor tuning (normalized area under the curve) is noisy. Compare, for instance, ROI 4 & 15 (Figure 4) or ROI 16 & 23 (Figure 5) which are pairs of ROIs that their quantification concludes have dramatically different odor tuning, but which visual inspection shows to be less convincing. The fact that glomerular tuning looks more correlated in type IIa cells, which have larger, more reliable responses compared to type IIb cells, also supports this concern.
  
  We agree with the reviewer that "the comparison of the spatial (glomerular) pattern of odor-evoked calcium signals is not necessarily a true apples-to-apples comparison". Type I and type II LNs are different neuron types. Given their different physiology and morphology, this is not even close to a "true apples-to-apples comparison" - and a key point of the manuscript is to show just that.
  
  As we have emphasized in response to Essential Revision 1, the differences in Ca2+ signals are not an experimental shortcoming but a physiologically relevant finding per se. These data, especially when combined with the electrophysiological data, contribute to a better understanding of these neurons’ physiological and computational properties.
  
  It is physiologically determined that the Ca2+ signals during odorant stimulation in the type II LNs are smaller than in type I LNs. And yes, the signals are small because small postsynaptic Ca2+ currents predominantly cause the signals. Regardless of the imaging method, this naturally reduces the signal-to-noise ratio, making it more challenging to detect signals. To address this issue, we used a well-defined and reproducible method for analyzing these signals. In this context, we do not agree with the very general criticism of the method. The reviewer questions whether the signals are odorant-induced or just noise (see also minor point 12). If we had recorded only noise, we would expect all tuning curves (for each odorant and glomerulus) to be the same. In this context, we disagree with the reviewer's statement that the tuning curves do not represent the Ca2+ signals in Figure 4 (ROI 4 and 15) and Figure 5 (ROI 16 and 23). This debate reflects precisely the kind of 'visual inspection bias' that our clearly defined analysis aims to avoid. On close inspection, the differences in Ca2+ signals can indeed be seen. Figure II (of this letter) shows the signals from the glomeruli in question at higher magnification. The sections of the recordings that were used for the tuning curves are marked in red.
  
  Figure II: Ca2+ signals of selected glomeruli that were questioned by the reviewer.
  
  2) An additional methodological issue that compounds the first concern is that calcium signals are imaged with wide-field imaging, and signals from each ROI likely reflect out of plane signals. Out of plane artifacts will be larger for larger calcium signals, which may also make it impossible to resolve any glomerular-specific signals in the type I LNs.
  
  Thank you for allowing us to clarify this point. The reviewer comment implies that the different amplitudes of the Ca2+ signals indicate some technical-methodological deficiency (poorly chosen odor concentration). But in fact, this is a key finding of this study that is physiologically relevant and crucial for understanding the function of the neurons studied. These very differences in the Ca2+ signals are evidence of the different roles these neurons play in AL. The different signal amplitudes directly show the distinct physiology and Ca2+ sources that dominate the Ca2+ signals in type I and type II LNs. Accordingly, it is impractical to equalize the magnitude of Ca2+ signals under physiological conditions by adjusting the concentration of odor stimuli.
  
  In the following, we address these issues in more detail: 1) Imaging Method 2) Odorant stimulation 3) Cell type-specific Ca2+ signals
  
  1) Imaging Method:
  
  Of course, we agree with the reviewer's comment that out-of-focus and out-of-glomerulus fluorescence can potentially affect measurements, especially in widefield optical imaging in thick tissue. This issue was carefully addressed in initial experiments. In type I LNs, which innervate a subset of glomeruli, we detected fluorescence signals, which matched the spike pattern of the electrophysiological recordings 1:1, only in the innervated glomeruli. In the non-innervated ROIs (glomeruli), we detected no or comparatively very little fluorescence, even in glomeruli directly adjacent to innervated glomeruli.
  
  To illustrate this, Figure I (of this response letter) shows measurements from an AL in which a uniglomerular projection neuron was investigated in a set of experiments that were not directly related to the current study. In this experiment, a train of action potentials was induced by depolarizing current. The traces show the action potential-induced fluorescent signals from the innervated glomerulus (glomerulus #1) and the directly adjacent glomeruli.
  
  These results do not entirely exclude that the large Ca2+ signals from the innervated LN glomeruli may include out-of-focus and out-of-glomerulus fluorescence, but they do show that the bulk of the signal is generated from the recorded neuron in the respective glomeruli.
  
  Figure I: Simultaneous electrophysiological and optophysiological recordings of a uniglomerular projection neuron using the ratiometric Ca2+ indicator fura-2. The projection neuron has its arborization in glomerulus 1. The train of action potentials was induced with a depolarizing current pulse (grey bar).
  
  2) Odorant Stimulation: It is important to note that the odorant concentration cannot be varied freely. For these experiments, the odorant concentrations have to be within a 'physiologically meaningful' range, which means: On the one hand, they have to be high enough to induce a clear response in the projection neurons (the antennal lobe output). On the other hand, however, the concentration was not allowed to be so high that the ORNs were stimulated nonspecifically. These criteria were met with the used concentrations since they induced clear and odorant-specific activity in projection neurons.
  
  3) Cell type-specific Ca2+ signals:
  
  The differences in Ca2+ signals are described and discussed in some detail throughout the text (e.g., page 6, lines 119-136; page 9, lines 193-198; page 10-11, lines 226-235; page 14-15, line 309-333). Briefly: In spiking type I LNs, the observed large Ca2+ signals are mediated mainly by voltage-dependent Ca2+ channels activated by the Na+-driven action potential's strong depolarization. These large Ca2+ signals mask smaller signals that originate, for example, from excitatory synaptic input (i.e., evoked by ligand-activated Ca2+ conductances). Preventing the firing of action potentials can unmask the ligand-activated signals, as shown in Figure 4 (see also minor comments 8. and 10.). In nonspiking type II LNs, the action potential-generated Ca2+ signals are absent; accordingly, the Ca2+ signals are much smaller. In our model, the comparatively small Ca2+ signals in type II LNs are mediated mainly by (synaptic) ligand-gated Ca2+ conductances, possibly with contributions from voltage-gated Ca2+ channels activated by the comparatively small depolarization (compared with type I LNs).
  
  Accordingly, our main conclusion, that spiking LNs play a primary role in interglomerular signaling, while nonspiking LNs play an essential role in intraglomerular signaling, can be DIRECTLY inferred from the differences in odorant-induced Ca2+ signals alone.
  
  a) Type I LN: The large, simultaneous, and uniform Ca2+ signals in the innervated glomeruli of an individual type I LN clearly show that they are triggered in each glomerulus by the propagated action potentials, which conclusively shows lateral interglomerular signal propagation.
  
  b) Type II LNs: In the type II LNs, we observed relatively small Ca2+ signals in single glomeruli or a small fraction of glomeruli of a given neuron. Importantly, the time course and amplitude of the Ca2+ signals varied between different glomeruli and different odors. Considering that type II LNs can, in principle, generate large voltage-activated Ca2+ currents (larger than type I LNs; page 4, lines 82-86, Husch et al. 2009a,b; Fusca and Kloppenburg 2021), these data suggest that in type II LNs electrical or Ca2+ signals spread only within the same glomerulus; and laterally only to glomeruli that are electrotonically close to the odorant-stimulated glomerulus.
  
  Taken together, this means that our conclusions regarding inter- and intraglomerular signaling can be derived from the simultaneously recorded amplitudes and the dynamics of the membrane potential and Ca2+ signals alone. This also means that although the correlation analyses support this conclusion nicely, the actual conclusion does not ultimately depend on the correlation analysis. We had tried to express this with the wording, “Quantitatively, this is reflected in the glomerulus-specific odorant responses and the diverse correlation coefficients across…” (page 10, lines 216-217) and “…This is also reflected in the highly correlated tuning curves in type I LNs and low correlations between tuning curves in type II LNs” (page 13, lines 293-295).
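
The kind of correlation analysis referred to here can be sketched as follows: each glomerulus is represented by a tuning curve (one normalized response value per odorant), and the mean pairwise Pearson correlation across glomeruli summarizes how uniform the responses are within a neuron. The tuning values below are invented for illustration and the function names are hypothetical. This is not the authors' analysis pipeline.

```python
import math
from itertools import combinations

# Sketch: mean pairwise correlation between glomerular tuning curves.
# Assumption: each inner list holds one normalized response per odorant for
# one glomerulus; all numbers are made up for illustration.


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_pairwise_correlation(tuning_curves):
    """Average Pearson r over all pairs of glomeruli."""
    pairs = list(combinations(tuning_curves, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Uniform responses across glomeruli (the pattern described for type I LNs).
uniform_like = [[0.9, 0.5, 0.7, 0.2], [0.8, 0.4, 0.6, 0.1], [1.0, 0.6, 0.8, 0.3]]
# Glomerulus-specific responses (the pattern described for type II LNs).
specific_like = [[0.9, 0.1, 0.2, 0.1], [0.1, 0.8, 0.1, 0.2], [0.2, 0.1, 0.1, 0.9]]

if __name__ == "__main__":
    print(mean_pairwise_correlation(uniform_like))
    print(mean_pairwise_correlation(specific_like))
```

Because Pearson correlation is insensitive to overall amplitude and offset, a high value reflects a shared tuning shape across glomeruli, not large signals as such; noisy small signals can still lower the estimate.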
  
  3) Apart from the above methodological concerns, the authors' interpretation of these data as supporting inter- versus intra-glomerular signaling is not well supported. The odors used in the study are general odors that presumably excite feedforward input to many glomeruli. Since the glomerular source of excitation is not determined, it is not possible to assign the signals in type II LNs as arising locally; selective interglomerular signal propagation is entirely possible. Likewise, the study design does not allow the authors to rule out the possibility that significant intraglomerular inhibition may be mediated by type I LNs.
  
  The reviewer addresses an important point. However, from the comment, we get the impression that he/she has not taken into account the entire data set and the DISCUSSION. In fact, this topic has already been discussed in some detail in the original version (page 12, lines 268-271; page 15-16; lines 358-374). This section even has a respective heading: "Inter- and intraglomerular signaling via nonspiking type II LNs" (page 15, line 338). We apologize if our explanations regarding this point were unclear, but we also feel that the reviewer is arguing against statements that we did not make in this way.
  
  a) In 11 out of 18 type II LNs we found 'relatively uncorrelated' (r=0.43±0.16, N=11) glomerular tuning curves. These experiments argue strongly for a 'local excitation' with restricted signal propagation and do not provide support for interglomerular signal propagation. Thus, these results support our interpretation of intraglomerular signaling in this set of neurons.
  
  b) In 7 out of 18 experiments, we observed 'higher correlated' glomerular tuning curves (r=0.78±0.07, N=7). We agree with the reviewer that this could be caused by various mechanisms, including simultaneous input to several glomeruli or by interglomerular signaling. Both possibilities were mentioned and discussed in the original version of the manuscript (page 12, lines 268-271; page 15-16; lines 358-374). In the Discussion, we considered the latter possibility in particular (but not exclusively) for the type IIa1 neurons that generate spikelets. Their comparatively stronger active membrane properties may be particularly suitable for selective signal transduction between glomeruli.
  
  c) We have not ruled out that local signaling exists in type I LNs in addition to interglomerular signaling. The highly localized Ca2+ signals in type I LNs, which we observed when Na+-driven action potential generation was prevented, may support this interpretation. However, we would like to reiterate that the simultaneous electrophysiological and optophysiological recordings, which show highly correlated glomerular Ca2+ dynamics that match 1:1 with the simultaneously recorded action potential pattern, clearly suggest interglomerular signaling. We also want to emphasize that this interpretation is in agreement with previous models derived from electrophysiological studies (Assisi et al., 2011; Fujiwara et al., 2014; Hong and Wilson, 2015; Nagel and Wilson, 2016; Olsen and Wilson, 2008; Sachse and Galizia, 2002; Wilson, 2013).
  
  In light of the reviewer's comment(s), we have modified the text to clarify these points (page 14, lines 317-319).
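
  The tuning-curve correlations discussed in points a) and b) above can be illustrated with a minimal sketch. It assumes that each glomerulus is summarized by one response amplitude per odorant and that the reported r is the mean pairwise Pearson correlation between the glomeruli of one neuron. The function names and the toy amplitudes are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_pairwise_correlation(tuning_curves):
    """Average Pearson r over all pairs of glomerular tuning curves."""
    pairs = list(combinations(tuning_curves, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical amplitudes: rows are glomeruli, columns are odorants.
# "uniform" mimics a neuron whose glomeruli share one scaled response profile.
uniform = [
    [1.0, 0.2, 0.8, 0.1, 0.5],
    [2.0, 0.4, 1.6, 0.2, 1.0],
    [1.5, 0.3, 1.2, 0.15, 0.75],
]
# "glomerulus_specific" mimics a neuron whose glomeruli respond differently.
glomerulus_specific = [
    [1.0, 0.2, 0.8, 0.1, 0.5],
    [0.1, 0.9, 0.2, 0.7, 0.4],
    [0.5, 0.5, 0.1, 0.9, 0.2],
]

r_uniform = mean_pairwise_correlation(uniform)
r_specific = mean_pairwise_correlation(glomerulus_specific)
```

  In this toy case the scaled profiles give a mean correlation of 1, while the glomerulus-specific profiles give a much lower value. A high mean correlation by itself does not distinguish simultaneous input to several glomeruli from interglomerular propagation, which is the ambiguity acknowledged in point b).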
  
  Reviewer #3 (Public Review):
  
  To elucidate the role of the two types of LNs, the authors combined whole-cell patch clamp recordings with calcium imaging via single cell dye injection. This method makes it possible to monitor calcium dynamics of the different axons and branches of single LNs in identified glomeruli of the antennal lobe, while the membrane potential can be recorded at the same time. The authors recorded in total from 23 spiking (type I LN) and 18 non-spiking (type II LN) neurons to a set of 9 odors and analyzed the firing pattern as well as calcium signals during odor stimulation for individual glomeruli. On the one hand, the recordings reveal that odor-evoked calcium responses of type I LNs are odor-specific but homogeneous across glomeruli, and therefore highly correlated regarding the tuning curves. In contrast, odor-evoked responses of type II LNs show less correlated tuning patterns and rather specific odor-evoked calcium signals for each glomerulus. Moreover, the authors demonstrate that both LN types exhibit distinct glomerular branching patterns, with type I innervating many, but not all glomeruli, while type II LNs branch in all glomeruli.
  
  From these results and further experiments using pharmacological manipulation, the authors conclude that type I LNs play a role in interglomerular inhibition in the form of lateral inhibition between different glomeruli, while type II LNs are involved in intraglomerular signaling by developing microcircuits in individual glomeruli.
  
  In my opinion the methodological approach is quite challenging and all subsequent analyses have been carried out thoroughly. The obtained data are highly relevant, but provide rather an indirect proof regarding the distinct roles of the two LN types investigated. Nevertheless, the conclusions are convincing and the study generally represents a valuable and important contribution to our understanding of the neuronal mechanisms underlying odor processing in the insect antennal lobe. I think the authors should emphasize their take-home messages and resulting conclusions even stronger. They do a good job in explaining their results in their discussion, but need to improve and highlight the outcome and meaning of their individual experiments in their results section.
  
  Thank you for this positive feedback.
  
  References:
  
  Assisi, C., Stopfer, M., Bazhenov, M., 2011. Using the structure of inhibitory networks to unravel mechanisms of spatiotemporal patterning. Neuron 69, 373–386. https://doi.org/10.1016/j.neuron.2010.12.019
  
  Das, S., Trona, F., Khallaf, M.A., Schuh, E., Knaden, M., Hansson, B.S., Sachse, S., 2017. Electrical synapses mediate synergism between pheromone and food odors in Drosophila melanogaster. Proc Natl Acad Sci U S A 114, E9962–E9971. https://doi.org/10.1073/pnas.1712706114
  
  Fujiwara, T., Kazawa, T., Haupt, S.S., Kanzaki, R., 2014. Postsynaptic odorant concentration dependent inhibition controls temporal properties of spike responses of projection neurons in the moth antennal lobe. PLOS ONE 9, e89132. https://doi.org/10.1371/journal.pone.0089132
  
  Fusca, D., Husch, A., Baumann, A., Kloppenburg, P., 2013. Choline acetyltransferase-like immunoreactivity in a physiologically distinct subtype of olfactory nonspiking local interneurons in the cockroach (Periplaneta americana). J Comp Neurol 521, 3556–3569. https://doi.org/10.1002/cne.23371
  
  Fuscà, D., Kloppenburg, P., 2021. Odor processing in the cockroach antennal lobe-the network components. Cell Tissue Res.
  
  Hong, E.J., Wilson, R.I., 2015. Simultaneous encoding of odors by channels with diverse sensitivity to inhibition. Neuron 85, 573–589. https://doi.org/10.1016/j.neuron.2014.12.040
  
  Husch, A., Paehler, M., Fusca, D., Paeger, L., Kloppenburg, P., 2009a. Calcium current diversity in physiologically different local interneuron types of the antennal lobe. J Neurosci 29, 716–726. https://doi.org/10.1523/JNEUROSCI.3677-08.2009
  
  Husch, A., Paehler, M., Fusca, D., Paeger, L., Kloppenburg, P., 2009b. Distinct electrophysiological properties in subtypes of nonspiking olfactory local interneurons correlate with their cell type-specific Ca2+ current profiles. J Neurophysiol 102, 2834–2845. https://doi.org/10.1152/jn.00627.2009
  
  Nagel, K.I., Wilson, R.I., 2016. Mechanisms Underlying Population Response Dynamics in Inhibitory Interneurons of the Drosophila Antennal Lobe. J Neurosci 36, 4325–4338. https://doi.org/10.1523/JNEUROSCI.3887-15.2016
  
  Neupert, S., Fusca, D., Kloppenburg, P., Predel, R., 2018. Analysis of single neurons by perforated patch clamp recordings and MALDI-TOF mass spectrometry. ACS Chem Neurosci 9, 2089–2096.
  
  Olsen, S.R., Bhandawat, V., Wilson, R.I., 2007. Excitatory interactions between olfactory processing channels in the Drosophila antennal lobe. Neuron 54, 89–103. https://doi.org/10.1016/j.neuron.2007.03.010
  
  Olsen, S.R., Wilson, R.I., 2008. Lateral presynaptic inhibition mediates gain control in an olfactory circuit. Nature 452, 956–960. https://doi.org/10.1038/nature06864
  
  Sachse, S., Galizia, C., 2002. Role of inhibition for temporal and spatial odor representation in olfactory output neurons: a calcium imaging study. J Neurophysiol 87, 1106–1117.
  
  Shang, Y., Claridge-Chang, A., Sjulson, L., Pypaert, M., Miesenbock, G., 2007. Excitatory Local Circuits and Their Implications for Olfactory Processing in the Fly Antennal Lobe. Cell 128, 601–612.
  
  Wilson, R.I., 2013. Early olfactory processing in Drosophila: mechanisms and principles. Annu Rev Neurosci 36, 217–241. https://doi.org/10.1146/annurev-neuro-062111-150533
  
  Yaksi, E., Wilson, R.I., 2010. Electrical coupling between olfactory glomeruli. Neuron 67, 1034–1047. https://doi.org/10.1016/j.neuron.2010.08.041
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.11.26.369686v1
www.biorxiv.org www.biorxiv.org

New submission 07/06/2022, 16:19:01

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  In this MEG work employing two types of bistable perception test and unique regression analyses, the authors linked different neural frequency bands to different components of visual perception: its content and its stability.
  
  Strengths:
  
  This study has a nice set of three different experiments to clarify neural differences between content, memory and stability of visual perception.
  
  The state space analysis appears to be powerful to identify such different neural signatures for different cognitive components as well.
  
  Weaknesses:
  
  Despite such strengths, this work may have the somewhat critical weakness specified in the recommendations for the authors.
  
  First, in the analysis to identify content-specific neural frequency, the authors concluded that the SCP is more relevant to the visual perceptual content compared to the neural activity in the alpha and beta-band frequencies. In my impression, to claim this, it would be necessary to show statistically significant differences in the prediction accuracy between the SCP and the other frequencies. Given the not-so-high prediction accuracy seen in the SCP-based analysis, such statistical supports appear essential.
  
  We have now directly compared decoding accuracy for SCP and alpha/beta oscillations, which showed statistically significant differences in both the ambiguous and unambiguous conditions for both ambiguous images. We have added these results as a supplementary figure (new Figure 2—figure supplement 1).
  
  Second, two behavioural metrics in the neural state space analysis-i.e., Switch and Direction-may be too arbitrary. As suggested by the power-law distribution of the percept duration, the neural dynamics during seemingly stable percept may not be able to be described in linear functions. Instead, the brain may go back and forth between several neural states even when we are thinking we're experiencing stable visual consciousness. If so, the current definition of the Switch metric and Direction index, which seems to be based on the behaviour of the Switch index, may be arbitrary. In other words, I feel the authors may have to elaborate the rationale for the definitions of such metrics.
  
  First, we note it is generally accepted in the field that the distribution of percept durations follows a gamma distribution instead of a power-law distribution (e.g., Sterzer et al., TiCS 2009; Blake & Logothetis Nature Rev. Neurosci 2002; Kleinschmidt et al., 1998; Leopold et al., TiCS 1999), and microswitches have not been reported using either the more classic task employed here or the more recently developed ‘no-report’ task, which uses eye-tracking statistics to deduce perceptual switches without overt report (e.g., Frassle et al., J Neurosci 2014).
  
  Second, while brain activity may fluctuate during these time periods, it never crosses the threshold of evoking a conscious report, and thus we would expect that such fluctuations, if they do occur, would be of a lower magnitude than those that do produce a conscious report.
  
  Most importantly, our goal here is to define behavioral metrics in order to identify components of neural dynamics underpinning the relevant aspect of behavior. As such, our definition of the behavioral metric should not be directly informed by observed spontaneous dynamics of brain activity (especially those that may be observed in the data but are of unclear relevance to perceptual switching); otherwise the analysis would be prone to circularity and spurious correlations (i.e., using observed brain dynamics to inform construction of behavioral metrics might pick up aspect of brain dynamics not really relevant to behavior in the analysis results).
  
  Finally, the timing characteristics of the ‘Switch’ and ‘Direction’ behavioral metrics are not arbitrary; instead, they are the simplest behavioral functions that allow a comparison of pre- and post-switching periods (or when the percepts might be in the ‘stabilizing’ phase vs. the ‘destabilizing’ phase). Nevertheless, the regression analysis can pick up on other temporal patterns of change that are not exactly the same as our defined behavioral metric. This can be seen for SCP and beta activity projected onto the Direction axis, where it has the lowest value at the ~20th percentile of the trial (not the 50th percentile as assumed by the behavioral metric). To confirm that the analysis is not highly dependent on the precise timing definition of the behavioral metrics, we ran a control analysis in which the switching point was set at the 30th percentile (rather than the 50th percentile as in the original analysis). This control analysis resulted in a similar pattern of neural results (Figure R1).
  
  Figure R1: Changing the temporal behavior definition (switching point moved from the 50th percentile to the 30th percentile of percept duration) does not significantly alter the neural results. Compare to Figure 4—figure supplement 1, ‘Switch’ and ‘Direction’ columns.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.18.484861v1
www.biorxiv.org www.biorxiv.org

New submission 01/06/2022, 14:48:28

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In computational modeling studies of behavioral data using reinforcement learning models, it has been implicitly assumed that parameter estimates generalize across tasks (generalizability) and that each parameter reflects a single cognitive function (interpretability). In this study, the authors examined the validity of these assumptions through a detailed analysis of experimental data across multiple tasks and age groups. The results showed that some parameters generalize across tasks, while others do not, and that interpretability is not sufficient for some parameters, suggesting that the interpretation of parameters needs to take into account the context of the task. Some researchers may have doubted the validity of these assumptions, but to my knowledge, no study has explicitly examined their validity. Therefore, I believe this research will make an important contribution to researchers who use computational modeling. In order to clarify the significance of this research, I would like the authors to consider the following points.
  
  1) Effects of model misspecification
  
  In general, model parameter estimates are influenced by model misspecification. Specifically, if components of the true process are not included in the model, the estimates of other parameters may be biased. The authors mentioned a little about model misspecification in the Discussion section, but they do not mention the possibility that the results of this study itself may be affected by it. I think this point should be discussed carefully.
  
  The authors stated that they used state-of-the-art RL models, but this does not necessarily mean that the models are correctly specified. For example, it is known that if there is history dependence in the choice itself and it is not modeled properly, the learning rates depending on valence of outcomes (alpha+, alpha-) are subject to biases (Katahira, 2018, J Math Psychol). In the authors' study, the effect of one previous choice was included in the model as choice persistence, p. However, it has been pointed out that not including the effect of a choice made more than two trials ago in the model can also cause bias (Katahira, 2018). The authors showed that the learning rate for positive RPE, alpha+, was inconsistent across tasks. But since choice persistence was included only in Task B, it is possible that the bias of alpha+ was different between tasks due to individual differences in choice persistence, and thus did not generalize.
  
  However, I do not believe that it is necessary to perform a new analysis using the model described above. As for extending the model, I don't think it is possible to include all combinations of possible components. As is often said, every model is wrong, and only to varying degrees. What I would like to encourage the authors to do is to discuss such issues and then consider their position on the use of the present model. Even if the estimation results of this model are affected by misspecification, it is a fact that such a model is used in practice, and I think it is worthwhile to discuss the nature of the parameter estimates.
  
  We thank the reviewer for this thoughtful question, and have added the following paragraph to the discussion section that aims to address it:
  
  “Another concern relates to potential model misspecification and its effects on model parameter estimates: If components of the true data-generating process are not included in a model (i.e., a model is misspecified), estimates of existing model parameters may be biased. For example, if choices have an outcome-independent history dependence that is not modeled properly, learning rate parameters have been shown to be biased [63]. Indeed, we found that learning rate parameters were inconsistent across the tasks in our study, and two of our models (A and C) did not model history dependence in choice, while the third (model B) only included the effect of one previous choice (persistence parameter), but no multi-trial dependencies. It is hence possible that the differences in learning rate parameters between tasks were caused by differences in the bias induced by misspecification of history dependence, rather than a lack of generalization. Though pressing, however, this issue is difficult to resolve in practice, because it is impossible to include all combinations of possible parameters in all computational models, i.e., to exhaustively search the space of possible models ("Every model is wrong, but to varying degrees"). Furthermore, even though our models were likely affected by some degree of misspecification, the research community is currently using models of this kind. Our study therefore sheds light on generalizability and interpretability in a realistic setting, which likely includes models with varying degrees of misspecification. Lastly, our models were fitted using robust computational tools and achieved good behavioral recovery (Fig. D.7), which also reduces the likelihood of model misspecification.”
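
  To make the role of the persistence term concrete, the following is a minimal sketch of a two-armed bandit RL model with valence-dependent learning rates and a one-trial choice-persistence bonus. It is an illustration under stated assumptions, not the models fitted in the study: the names `alpha_pos`, `alpha_neg`, `beta`, and `persistence`, the initial values of 0.5, and the toy choice and reward sequences are all hypothetical.

```python
import math


def softmax_choice_probs(q_values, beta, persistence, previous_choice):
    """Softmax over Q-values, with a bonus for repeating the previous choice."""
    logits = [beta * q for q in q_values]
    if previous_choice is not None:
        logits[previous_choice] += persistence
    max_logit = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - max_logit) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(choices, rewards, alpha_pos, alpha_neg, beta,
                            persistence, n_actions=2):
    """Negative log-likelihood of a choice sequence under the sketched model."""
    q_values = [0.5] * n_actions  # assumed neutral initial values
    previous_choice = None
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        probs = softmax_choice_probs(q_values, beta, persistence, previous_choice)
        nll -= math.log(probs[choice])
        # Valence-dependent update: alpha_pos for positive RPEs, alpha_neg otherwise.
        rpe = reward - q_values[choice]
        alpha = alpha_pos if rpe >= 0 else alpha_neg
        q_values[choice] += alpha * rpe
        previous_choice = choice
    return nll


# Hypothetical choice and reward sequences for illustration only.
choices = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
rewards = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0]

nll_with_persistence = negative_log_likelihood(choices, rewards, 0.3, 0.3, 3.0, 1.0)
nll_without_persistence = negative_log_likelihood(choices, rewards, 0.3, 0.3, 3.0, 0.0)
```

  Setting `persistence` to zero removes the history term from the model. If the data were generated with such a term, the fit would then have to account for the repetition of choices through the remaining parameters, which is the kind of bias in learning rates that the reviewer describes.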
  
  2) Issue of reliability of parameter estimates
  
  I think it is important to consider not only the bias in the parameter estimates, but also the issue of reliability, i.e., how stable the estimates will be when the same task is repeated with the same individual. For the task used in this study, has test-retest reliability been examined in previous studies? I think that parameters with low reliability will inevitably have low generalizability to other tasks. In this study, the use of three tasks seems to have addressed this issue without explicitly considering the reliability, but I would like the author to discuss this issue explicitly.
  
  We thank the reviewer for this useful comment, and have added the following paragraph to the discussion section to address it:
  
  “Furthermore, parameter generalizability is naturally bounded by parameter reliability, i.e., the stability of parameter estimates when participants perform the same task twice (test-retest reliability) or when estimating parameters from different subsets of the same dataset (split-half reliability). The reliability of RL models has recently become the focus of several parallel investigations [...], some employing very similar tasks to ours [...]. The investigations collectively suggest that excellent reliability can often be achieved with the right methods, most notably by using hierarchical model fitting. Reliability might still differ between tasks or models, potentially being lower for learning rates than other RL parameters [...], and differing between tasks (e.g., compare [...] to [...]). In this study, we used hierarchical fitting for tasks A and B and assessed a range of qualitative and quantitative measures of model fit for each task [...], boosting our confidence in high reliability of our parameter estimates, and the conclusion that the lack of between-task parameter correlations was not due to a lack of parameter reliability, but a lack of generalizability. This conclusion is further supported by the fact that larger between-task parameter correlations (r>0.5) than those observed in humans were attainable---using the same methods---in a simulated dataset with perfect generalization.”
  
  3) About PCA
  
  In this paper, principal component analysis (PCA) is used to extract common components from the parameter estimates and behavioral features across tasks. When performing PCA, were each parameter estimate and behavioral feature standardized so that the variance would be 1? There was no mention about this. It seems that otherwise the principal components would be loaded toward the features with larger variance. In addition, Moutoussis et al. (Neuron, 2021, 109 (12), 2025-2040) conducted a similar analysis of behavioral parameters of various decision-making tasks, but they used factor analysis instead of PCA. Although the authors briefly mentioned factor analysis, it would be better if they also mentioned the reason why they used PCA instead of factor analysis, which can consider unique variances.
  
  To answer the reviewer's first question: We indeed standardized all features before performing the PCA. We apologize for omitting this information and have now added a corresponding sentence to the methods section.
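
  The reviewer's concern about unstandardized features can be illustrated with a small NumPy sketch. It is not the analysis pipeline used in the study: the helper `pca_loadings`, the simulated latent factor, and the two features on different scales are assumptions made only for this example.

```python
import numpy as np


def pca_loadings(features, standardize=True):
    """Return principal axes (rows) and explained-variance ratios via SVD."""
    x = features - features.mean(axis=0)
    if standardize:
        # Scale every feature to unit variance before extracting components.
        x = x / x.std(axis=0, ddof=1)
    _, singular_values, components = np.linalg.svd(x, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return components, explained


rng = np.random.default_rng(0)
n_participants = 200
shared = rng.normal(size=n_participants)  # hypothetical latent factor

# Two hypothetical features driven by the same factor, on very different scales.
small_scale = shared + 0.5 * rng.normal(size=n_participants)
large_scale = 100.0 * (shared + 0.5 * rng.normal(size=n_participants))
features = np.column_stack([small_scale, large_scale])

raw_components, raw_explained = pca_loadings(features, standardize=False)
std_components, std_explained = pca_loadings(features, standardize=True)
```

  Without standardization, the first component in this toy dataset is almost entirely aligned with the large-scale feature. After standardization, both features load on it with equal magnitude, so the component reflects their shared variance rather than their units.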
  
  We also thank the reviewer for the mentioned reference, which is very relevant to our findings and can help explain the roles of different PCs. Like in our study, Moutoussis et al. found a first PC that captured variability in task performance, and subsequent PCs that captured task contrasts. We added the following paragraph to our manuscript:
  
  “PC1 therefore captured a range of "good", task-engaged behaviors, likely related to the construct of "decision acuity" [...]. Like our PC1, decision acuity was the first component of a factor analysis (variant of PCA) conducted on 32 decision-making measures on 830 young people, and separated good and bad performance indices. Decision acuity reflects generic decision-making ability, and predicted mental health factors, was reflected in resting-state functional connectivity, but was distinct from IQ [...].”
  
  To answer the reviewer's question about PCA versus FA, both approaches are relatively similar conceptually, and oftentimes share the majority of the analysis pipeline in practice. The main difference is that PCA breaks up the existing variance in a dataset in a new way (based on PCs rather than the original data features), whereas FA aims to identify an underlying model of latent factors that explain the observable features. This means that PCs are linear combinations of the original data features, whereas Factors are latent factors that give rise to the observable features of the dataset with some noise, i.e., including an additional error term.
  
  However, in practice, both methods share the majority of computation in the way they are implemented in most standard statistical packages: FA is usually performed by conducting a PCA and then rotating the resulting solution, most commonly using the Varimax rotation, which maximizes the variance between feature loadings on each factor in order to make the result more interpretable, thereby foregoing the optimal solution that has been achieved by the PCA (which lacks the error term). Maximum variance in feature loadings means that as many features as possible will have loadings close to 0 and 1 on each factor, reducing the number of features that need to be taken into account when interpreting this factor. Most relevant in our situation is that PCA is usually a special case of FA, with the only difference that the solution is not rotated for maximum interpretability. (Note that this rotation can be minor if feature loadings already show large variance in the PCA solution.)
  
  To determine how much our results would change in practice if we used FA instead of PCA, we repeated the analysis using FA. Both are shown side-by-side below, and the results are quite similar:
  
  We therefore conclude that our specific results are robust to the choice of method used, and that there is reason to believe that our PC1 is related to Moutoussis et al.’s F1 despite the differences in method.
  
  Reviewer #2 (Public Review):
  
  I am enthusiastic about the comprehensive approach, the thorough analysis, and the intriguing findings. This work makes a timely contribution to the field and warrants a wider discussion in the community about how computational methods are deployed and interpreted. The paper is also a great and rare example of how much can be learned from going beyond a meta-analytic approach to systematically collect data that assess commonly held assumptions in the field, in this case in a large data-driven study across multiple tasks. My only criticism is that at times, the paper misses opportunities to be more constructive in pinning down exactly why authors observe inconsistencies in parameter fits and interpretation. And the somewhat pessimistic outlook relies on some results that are, in my view at least, somewhat expected based on what we know about human RL. Below I summarize the major ways in which the paper's conclusions could be strengthened.
  
  One key point the authors make concerns the generalizability of absolute vs. relative parameter values. It seems that at least in the parameter space defined by +LRs and exploration/noise (which are known to be mathematically coupled), subjects clustered similarly for tasks A and C. In other words, as the authors state, "both learning rate and inverse temperature generalized in terms of the relationships they captured between participants". This struck me as a more positive and important result than it was made out to be in the paper, for several reasons:
  
  As the authors point out in the discussion, a large literature on variable LRs has shown that people adapt their learning rates trial-by-trial to the reward function of the environment; given this, and given that all models tested in this work have fixed learning rates, while the three tasks vary on the reward function, the comparison of absolute values seems a bit like a red herring.
  
  We thank the reviewers for this recommendation and have reworked the paper substantially to address the issue. We have modified the highlights, abstract, introduction, discussion, conclusion, and relevant parts of the results section to provide equal weight to the successes and failures of generalization.
  
  Highlights:
  
  ● “RL decision noise/exploration parameters generalize in terms of between-participant variation, showing similar age trajectories across tasks.”
  
  ● “These findings are in accordance with previous claims about the developmental trajectory of decision noise/exploration parameters.”
  
  Abstract:
  
  ● “We found that some parameters (exploration / decision noise) showed significant generalization: they followed similar developmental trajectories, and were reciprocally predictive between tasks.”
  
  The introduction now introduces different potential outcomes of our study with more equal weight:
  
  “Computational modeling enables researchers to condense rich behavioral datasets into simple, falsifiable models (e.g., RL) and fitted model parameters (e.g., learning rate, decision temperature) [...]. These models and parameters are often interpreted as a reflection of ("window into") cognitive and/or neural processes, with the ability to dissect these processes into specific, unique components, and to measure participants' inherent characteristics along these components.
  
  For example, RL models have been praised for their ability to separate the decision making process into value updating and choice selection stages, allowing for the separate investigation of each dimension. Crucially, many current research practices are firmly based on these (often implicit) assumptions, which give rise to the expectation that parameters have a task- and model-independent interpretation and will seamlessly generalize between studies. However, there is growing---though indirect---evidence that these assumptions might not (or not always) be valid.
  
  The following section lays out existing evidence in favor of and in opposition to model generalizability and interpretability. Building on our previous opinion piece, which---based on a review of published studies---argued that there is less evidence for model generalizability and interpretability than expected based on current research practices [...], this study seeks to directly address the matter empirically.”
  
  We now also provide more even evidence for both potential outcomes:
  
  “Many current research practices are implicitly based on the interpretability and generalizability of computational model parameters (despite the fact that many researchers explicitly distance themselves from these assumptions). For our purposes, we define a model variable (e.g., fitted parameter, reward-prediction error) as generalizable if it is consistent across uses, such that a person would be characterized with the same values independent of the specific model or task used to estimate the variable. Generalizability is a consequence of the assumption that parameters are intrinsic to participants rather than task dependent (e.g., a high learning rate is a personal characteristic that might reflect an individual's unique brain structure). One example of our implicit assumptions about generalizability is the fact that we often directly compare model parameters between studies---e.g., comparing our findings related to learning-rate parameters to a previous study's findings related to learning-rate parameters. Note that such a comparison is only valid if parameters capture the same underlying constructs across studies, tasks, and model variations, i.e., if parameters generalize. The literature has implicitly equated parameters in this way in review articles [...], meta-analyses [...], and also most empirical papers, by relating parameter-specific findings across studies. We also implicitly evoke parameter generalizability when we study task-independent empirical parameter priors [...], or task-independent parameter relationships (e.g., interplay between different kinds of learning rates [...]), because we presuppose that parameter settings are inherent to participants, rather than task specific.
  
  We define a model variable as interpretable if it isolates specific and unique cognitive elements, and/or is implemented in separable and unique neural substrates. Interpretability follows from the assumption that the decomposition of behavior into model parameters "carves cognition at its joints", and provides fundamental, meaningful, and factual components (e.g., separating value updating from decision making). We implicitly invoke interpretability when we tie model variables to neural substrates in a task-general way (e.g., reward prediction errors to dopamine function [...]), or when we use parameters as markers of psychiatric conditions (e.g., working-memory parameter and schizophrenia [...]). Interpretability is also required when we relate abstract parameters to aspects of real-world decision making [...], and generally, when we assume that model variables are particularly "theoretically meaningful" [...].
  
  However, amid the growing recognition of computational modeling, the focus has also shifted toward inconsistencies and apparent contradictions in the emerging literature, which are becoming apparent in cognitive [...], developmental [...], clinical [...], and neuroscience studies [...], and have recently become the focus of targeted investigations [...]. For example, some developmental studies have shown that learning rates increase with age [...], whereas others have shown that they decrease [...]. Yet others have reported U-shaped trajectories with either peaks [...] or troughs [...] during adolescence, or stability within this age range [...] (for a comprehensive review, see [...]; for specific examples, see [...]). This is just one striking example of inconsistencies in the cognitive modeling literature, and many more exist [...]. These inconsistencies could signify that computational modeling is fundamentally flawed or inappropriate to answer our research questions. Alternatively, inconsistencies could signify that the method is valid, but our current implementations are inappropriate [...]. However, we hypothesize that inconsistencies can also arise for a third reason: Even if both method and implementation are appropriate, inconsistencies like the ones above are expected---and not a sign of failure---if implicit assumptions of generalizability and interpretability are not always valid. For example, model parameters might be more context-dependent and less person-specific than we often appreciate [...]”
  
  In the results section, we now place more emphasis on findings that are compatible with generalization: “For α+, adding task as a predictor did not improve model fit, suggesting that α+ showed similar age trajectories across tasks (Table 2). Indeed, α+ showed a linear increase that tapered off with age in all tasks (linear increase: task A: β = 0.33, p < 0.001; task B: β = 0.052, p < 0.001; task C: β = 0.28, p < 0.001; quadratic modulation: task A: β = −0.007, p < 0.001; task B: β = −0.001, p < 0.001; task C: β = −0.006, p < 0.001). For noise/exploration and Forgetting parameters, adding task as a predictor also did not improve model fit (Table 2), suggesting similar age trajectories across tasks.”
  
  “For both α+ and noise/exploration parameters, task A predicted tasks B and C, and tasks B and C predicted task A, but tasks B and C did not predict each other (Table 4; Fig. 2D), reminiscent of the correlation results that suggested successful generalization (section 2.1.2).”
  
  “Noise/exploration and α+ showed similar age trajectories (Fig. 2C) in tasks that were sufficiently similar (Fig. 2D).” And with respect to our simulation analysis (for details, see next section):
  
  “These results show that our method reliably detected parameter generalization in a dataset that exhibited generalization.”
  
  We also now provide more nuance in our discussion of the findings:
  
  “Both generalizability [...] and interpretability (i.e., the inherent "meaningfulness" of parameters) [...] have been explicitly stated as advantages of computational modeling, and many implicit research practices (e.g., comparing parameter-specific findings between studies) showcase our conviction in them [...]. However, RL model generalizability and interpretability has so far eluded investigation, and growing inconsistencies in the literature potentially cast doubt on these assumptions. It is hence unclear whether, to what degree, and under which circumstances we should assume generalizability and interpretability. Our developmental, within-participant study revealed a nuanced picture: Generalizability and interpretability differed from each other, between parameters, and between tasks.”
  
  “Exploration/noise parameters showed considerable generalizability in the form of correlated variance and age trajectories. Furthermore, the decline in exploration/noise we observed between ages 8-17 was consistent with previous studies [13, 66, 67], revealing consistency across tasks, models, and research groups that supports the generalizability of exploration / noise parameters. However, for 2/3 pairs of tasks, the degree of generalization was significantly below the level of generalization expected for perfect generalization. Interpretability of exploration / noise parameters was mixed: Despite evidence for specificity in some cases (overlap in parameter variance between tasks), it was missing in others (lack of overlap), and crucially, parameters lacked distinctiveness (substantial overlap in variance with other parameters).”
  
  “Taken together, our study confirms the patterns of generalizable exploration/noise parameters and task-specific learning rate parameters that are emerging from the literature [13].”
  
  Regarding the relative inferred values, it's unclear how high we really expect correlations between the same parameter across tasks to be. E.g., if we take Task A and make a second, hypothetical, Task B by varying one feature at a time (say, stochasticity in reward function), how correlated are the fitted LRs going to be? Given the different sources of noise in the generative model of each task and in participant behavior, it is hard to know whether a correlation coefficient of 0.2 is "good enough" generalizability.
  
  We thank the reviewer for this excellent suggestion, which we think helped answer a central question that our previous analyses had failed to address, and also provided answers to several other concerns raised by both reviewers in other sections. We have conducted these additional analyses as suggested: simulating artificial behavioral data for each task, fitting these data using the models used in humans, repeating the analyses performed on humans on the newly fitted parameters, and using bootstrapping to statistically compare humans to the resulting ceiling of generalization. We have added the following section to our paper, which describes the results in detail:
  
  “Our analyses so far suggest that some parameters did not generalize between tasks, given differences in age trajectories (section 2.1.3) and a lack of mutual prediction (section 2.1.4). However, the lack of correspondence could also arise due to other factors, including behavioral noise, noise in parameter fitting, and parameter trade-offs within tasks. To rule these out, we next established the ceiling of generalizability attainable using our method.
  
  We established the ceiling in the following way: We first created a dataset with perfect generalizability, simulating behavior from agents that use the same parameters across all tasks (suppl. Appendix E). We then fitted this dataset in the same way as the human dataset (e.g., using the same models), and performed the same analyses on the fitted parameters, including an assessment of age trajectories (suppl. Table E.8) and prediction between tasks (suppl. Tables E.9, E.10, and E.11). These results provide the practical ceiling of generalizability. We then compared the human results to this ceiling to ensure that the apparent lack of generalization was valid (significant difference between humans and ceiling), and not in accordance with generalization (lack of difference between humans and ceiling).
  
  Whereas humans had shown divergent trajectories for parameter alpha- (Fig. 2B; Table 1), the simulated agents did not show task differences for alpha- or any other parameter (suppl. Fig. E.8B; suppl. Table E.8), even when controlling for age (suppl. Tables E.9 and E.10), as expected from a dataset of generalizing agents. Furthermore, the same parameters were predictive between tasks in all cases (suppl. Table E.11). These results show that our method reliably detected parameter generalization in a dataset that exhibited generalization.
  
  Lastly, we established whether the degree of generalization in humans was significantly different from agents. To this aim, we calculated the Spearman correlations between each pair of tasks for each parameter, for both humans (section 2.1.2; suppl. Fig. H.9) and agents, and compared both using bootstrapped confidence intervals (suppl. Appendix E). Human parameter correlations were significantly below the ceiling for all parameters except alpha+ (A vs B) and epsilon / 1/beta (A vs C; suppl. Fig. E.8C). This suggests that humans were within the range of maximally detectable generalization in two cases, but showed less-than-perfect generalization between other task combinations and for parameters Forgetting and alpha-.”
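
  The comparison described above (Spearman correlations between tasks, compared via bootstrapped confidence intervals) can be illustrated with a minimal sketch. It assumes a simple percentile bootstrap over participants; the procedure actually used is described in the authors' suppl. Appendix E and may differ. All function names (`ranks`, `spearman`, `bootstrap_ci`) are hypothetical and written only for this illustration.

```python
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the Spearman correlation.

    Assumption: participants are resampled with replacement, keeping each
    participant's (task 1, task 2) parameter pair together.
    """
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(spearman([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

  Under these assumptions, one would call `bootstrap_ci` once on the fitted human parameters and once on the parameters fitted to the simulated, perfectly generalizing agents, and treat a human interval lying entirely below the agents' interval as evidence of less-than-ceiling generalization.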
  
  The +LR/inverse temp relationship seems to generalize best between tasks A/C, but not B/C, a common theme in the paper. This does not seem surprising given that in A and C there is a key additional task feature over the bandit task in B -- which is the need to retain state-action associations. Whether captured via F (forgetting) or K (WM capacity), the cognitive processes involved in this learning might interact with LR/exploration in a different way than in a task where this may not be necessary.
  
  We thank the reviewer for this comment, which raises an important issue. We are adding the specific pairwise correlations and scatter plots for the pairs of parameters the reviewer asked about below (“bf_alpha” = LR task A; “bf_forget” = F task A; “rl_forget” = F task C; “rl_log_alpha” = LR task C; “rl_K” = WM capacity task C):
  
  [Figures not reproduced: pairwise parameter correlations and scatter plots, within tasks and between tasks.]
  
  To answer the question in more detail, we have expanded our section about limitations stemming from parameter tradeoffs in the following way:
  
  “One limitation of our results is that regression analyses might be contaminated by parameter cross-correlations (sections 2.1.2, 2.1.3, 2.1.4), which would reflect modeling limitations (non-orthogonal parameters), and not necessarily shared cognitive processes. For example, parameters alpha and beta are mathematically related in the regular RL modeling framework, and we observed significant within-task correlations between these parameters for two of our three tasks (suppl. Fig. H.10, H.11). This indicates that caution is required when interpreting correlation results. However, correlations were also present between tasks (suppl. Fig. H.9, H.11), suggesting that within-model trade-offs were not the only explanation for shared variance, and that shared cognitive processes likely also played a role.
  
  Another issue might arise if such parameter cross-correlations differ between models, due to the differences in model parameterizations across tasks. For example, memory-related parameters (e.g., F, K in models A and C) might interact with learning- and choice-related parameters (e.g., alpha+, alpha-, noise/exploration), but such an interaction is missing in models that do not contain memory-related parameters (e.g., task B). If this is indeed the case, i.e., parameters trade off with each other in different ways across tasks, then a lack of correlation between tasks might not reflect a lack of generalization, but just the differences in model parameterizations. Suppl. Fig. \ref{figure:S2AlphaBetaCorrelations} indeed shows significant, medium-sized, positive and negative correlations between several pairs of Forgetting, memory-related, learning-related, and exploration parameters (though with relatively small effect sizes; Spearman correlation: 0.17 < |r| < 0.22).
  
  The existence of these correlations (and differences in correlations between tasks) suggests that memory parameters likely traded off with each other, as well as with other parameters, which potentially affected generalizability across tasks. However, some of the observed correlations might be due to shared causes, such as a common reliance on age, and the regression analyses in the main paper control for these additional sources of variance, and might provide a cleaner picture of how much variance is actually shared between parameters.
  
  Furthermore, correlations between parameters within models are frequent in the existing literature, and do not prevent researchers from interpreting parameters---in this sense, the existence of similar correlations in our study allows us to address the question of generalizability and interpretability in similar circumstances as in the existing literature.”
  
  More generally, isn't relative generalizability the best we would expect given systematic variation in task context? I agree with the authors' point that the language used in the literature sometimes implies an assumption of absolute generalizability (e.g. same LR across any task). But parameter fits, interactions, and group differences are usually interpreted in light of a single task+model paradigm, precisely b/c tasks vary widely across critical features that will dictate whether different algorithms are optimal or not and whether cognitive functions such as WM or attention may compensate for ways in which humans are not optimal. Maybe a more constructive approach would be to decompose tasks along theoretically meaningful features of the underlying Markov Decision Process (which gives a generative model), and be precise about (1) which features we expect will engage additional cognitive mechanisms, and (2) how these mechanisms are reflected in model parameters.
  
  We thank the reviewer for this comment, and will address both points in turn:
  
  (1) We agree with the reviewer's sentiment about relative generalizability: If we all interpreted our models exclusively with respect to our specific task design, and never expected our results to generalize to other tasks or models, there would not be a problem. However, the current literature shows a different pattern: Literature reviews, meta-analyses, and discussion sections of empirical papers regularly compare specific findings between studies. We compare specific parameter values (e.g., empirical parameter priors), parameter trajectories over age, relationships between different parameters (e.g., balance between LR+ and LR-), associations between parameters and clinical symptoms, and between model variables and neural measures on a regular basis. The goal of this paper was really to see if and to what degree this practice is warranted. And the reviewer rightfully alerted us to the fact that our data imply that these assumptions might be valid in some cases, just not in others.
  
  (2) With regard to providing task descriptions that relate to the MDP framework, we have included the following sentence in the discussion section:
  
  “Our results show that discrepancies are expected even with a consistent methodological pipeline, and using up-to-date modeling techniques, because they are an expected consequence of variations in experimental tasks and computational models (together called "context"). Future research needs to investigate these context factors in more detail. For example, which task characteristics determine which parameters will generalize and which will not, and to what extent? Does context impact whether parameters capture overlapping versus distinct variance? A large-scale study could answer these questions by systematically covering the space of possible tasks, and reporting the relationships between parameter generalizability and distance between tasks. To determine the distance between tasks, the MDP framework might be especially useful because it decomposes tasks along theoretically meaningful features of the underlying Markov Decision Process.“
  
  Another point that merits more attention is that the paper pretty clearly commits to each model as being the best possible model for its respective task. This is a necessary premise, as otherwise, it wouldn't be possible to say with certainty that individual parameters are well estimated. I would find the paper more convincing if the authors include additional information and analysis showing that this is actually the case.
  
  We agree with the sentiment that all models should fit their respective task equally well. However, there is no good quantitative measure of model fit that is comparable across tasks and models - for example, because of the difference in difficulty between the tasks, the number of choices explained would not be a valid measure to compare how well the models are doing across tasks. To address this issue, we have added the new supplemental section (Appendix C) mentioned above that includes information about the set of models compared, and explains why we have reason to believe that all models fit (equally) well. We also created the new supplemental Figure D.7 shown above, which directly compares human and simulated model behavior in each task, and shows a close correspondence for all tasks. Because the quality of all our models was a major concern for us in this research, we also refer the reviewer and other readers to the three original publications that describe all our modeling efforts in much more detail, and hopefully convince the reviewer that our model fitting was performed according to high standards.
  
  I am particularly interested to see whether some of the discrepancies in parameter fits can be explained by the fact that the model for Task A did not account for explicit WM processes, even though (1) Task A is similar to Task C (Task A can be seen as a single condition of Task C with 4 states and 2 possible visible actions, and stochastic rather than deterministic feedback) and (2) prior work has suggested a role for explicit memory of single episodes even in stateless bandit tasks such as Task B.
  
  We appreciate this very thoughtful question, which raises several important issues. (1) As the reviewer said, the models for task A and task C are relatively different even though the underlying tasks are relatively similar (minus the differences the reviewer already mentioned, in terms of visibility of actions, number of actions, and feedback stochasticity). (2) We also agree that the model for task C did not include episodic memory processes even though episodic memory likely played a role in this task, and agree that neither the forgetting parameters in tasks A and C, nor the noise/exploration parameters in tasks A, B, and C are likely specific enough to capture all the memory / exploration processes participants exhibited in these tasks.
  
  However, this problem is difficult to solve: We cannot fit an episodic-memory model to task B because the task lacks an episodic-memory manipulation (such as, e.g., in Bornstein et al., 2017), and we cannot fit a WM model to task A because it lacks the critical set-size manipulation enabling identification of the WM component (modifying set size allows the model to identify individual participants’ WM capacities, so the issue cannot be avoided in tasks with only one set size). Similarly, we cannot model more specific forgetting or exploration processes in our tasks because they were not designed to dissociate these processes. If we tried fitting more complex models that include these processes to these tasks, they would most likely lose in model comparison because the increased complexity would not lead to additional explained behavioral variance, given that the tasks do not elicit the relevant behavioral patterns. Because the models therefore do not specify all the cognitive processes that participants likely employ, the situation described by the reviewer arises, namely that different parameters sometimes capture the same cognitive processes across tasks and models, while the same parameters sometimes capture different processes.
  
  And while the reviewer focussed largely on memory-related processes, the issue of course extends much further: Besides WM, episodic memory, and more specific aspects of forgetting and exploration, our models also did not take into account a range of other processes that participants likely engaged in when performing the tasks, including attention (selectivity, lapses), reasoning / inference, mental models (creation and use), prediction / planning, hypothesis testing, etc., etc. In full agreement with the reviewer’s sentiment, we recently argued that this situation is ubiquitous to computational modeling, and should be considered very carefully by all modelers because it can have a large impact on model interpretation (Eckstein et al., 2021).
  
  If we assume that many more cognitive processes are likely engaged in each task than are modeled, and consider that every computational model includes just a small number of free parameters, parameters then necessarily reflect a multitude of cognitive processes. The situation is additionally exacerbated by the fact that more complex models become increasingly difficult to fit from a methodological perspective, and that current laboratory tasks are designed in a highly controlled and consequently relatively simplistic way that does not lend itself to simultaneously test a variety of cognitive processes.
  
  The best way to deal with this situation, we think, is to recognize that in different contexts (e.g., different tasks, different computational models, different subject populations), the same parameters can capture different behaviors, and different parameters can capture the same behaviors, for the reasons the reviewer lays out. Recognizing this helps to avoid misinterpreting modeling results, for example by focusing our interpretation of model parameters to our specific task and model, rather than aiming to generalize across multiple tasks. We think that recognizing this fact also helps us understand the factors that determine whether parameters will capture the same or different processes across contexts and whether they will generalize. This is why we estimated here whether different parameters generalize to different degrees, which other factors affect generalizability, etc. Knowing the practical consequences of using the kinds of models we currently use will therefore hopefully provide a first step in resolving the issues the reviewer laid out.
  
  It is interesting that one of the parameters that generalizes least is LR-. The authors make a compelling case that this is related to a "lose-stay" behavior that benefits participants in Task B but not in Task C, which makes sense given the probabilistic vs deterministic reward function. I wondered if we can rule out the alternative explanation that in Task C, LR- could reflect a different interpretation of instructions vis-à-vis what rewards indicate - do the authors have an instruction check measure in either task that can be correlated with this "lose-stay" behavior and with LR-? And what does the "lose-stay" distribution look like, for Task C at least? I basically wonder if some of these inconsistencies can be explained by participants having diverging interpretations of the deterministic nature of the reward feedback in Task C. The order of tasks might matter here as well -- was task order the same across participants? It could be that due to the within-subject design, some participants may have persisted in global strategies that are optimal in Task B, but sub-optimal in Task C.
  
  The PCA analysis adds an interesting angle and a novel, useful lens through which we can understand divergence in what parameters capture across different tasks. One observation is that loadings for PC2 and PC3 are strikingly consistent for Task C, so it looks more like these PCs encode a pairwise contrast (PC2 is C with B and PC3 is C with A), primarily reflecting variability in performance - e.g. participants who did poorly on Task C but well on Task B (PC2) or Task A (PC3). Is it possible to disentangle this interpretation from the one in the paper? It also is striking that in addition to performance, the PCs recover the difference in terms of LR- on Task B, which again supports the possibility that LR- divergence might be due to how participants handle probabilistic vs. deterministic feedback.
  
  We appreciate this positive evaluation of our PCA and are glad that it could provide a useful lens for understanding parameters. We also agree with the reviewer's observation that PC2 and PC3 reflect task contrasts (PC2: task B vs task C; PC3: task A vs task C), and phrase it in the following way in the paper:
  
  “PC2 contrasted task B to task C (loadings were positive / negative / near-zero for corresponding features of tasks B / C / A; Fig. 3B). PC3 contrasted task A to both B and C (loadings were positive / negative for corresponding features on task A / tasks B and C; Fig. 3C).”
  
  Hence, the only difference between our interpretation and the reviewer’s seems to be whether PC3 contrasts task C to task B as well as task A, or just to task A. Our interpretation is supported by the fact that loadings for tasks A and C are quite similar on PC3; however, both interpretations seem appropriate.
  
  We also appreciate the reviewer's positive evaluation of the fact that the PCA reproduces the differences in LR-, and its relationship to probabilistic/deterministic feedback. The following section reiterates this idea:
  
  “alpha- loaded positively in task C, but negatively in task B, suggesting that performance increased when participants integrated negative feedback faster in task C, but performance decreased when they did the same in task B. As mentioned before, contradictory patterns of alpha- were likely related to task demands: The fact that negative feedback was diagnostic in task C likely favored fast integration of negative feedback, while the fact that negative feedback was not diagnostic in task B likely favored slower integration (Fig. 1E). This interpretation is supported by behavioral findings: "Lose-stay" behavior (repeating choices that produce negative feedback) showed the same contrasting pattern as alpha- on PC1. It loaded positively in task B, showing Lose-stay behavior benefited performance, but it loaded negatively on task C, showing that it hurt performance (Fig. 3A). This supports the claim that lower alpha- was beneficial in task B, while higher alpha- was beneficial in task C, in accordance with participant behavior and developmental differences.“
  
URL

biorxiv.org/content/10.1101/2021.05.28.446162v1
www.biorxiv.org www.biorxiv.org

Effects of arousal and movement on secondary somatosensory and visual thalamus

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  The authors make juxtacellular recordings on awake mice, which should yield clear responses of action potentials, and employ a number of manipulations to silence pathways. They also record from a "non"-whisker secondary thalamic region, LP, as a null hypothesis to establish if certain effects are related to "behavior" (read: arousal or saliency). I have no major qualms.
  
  In light of Petersen's paper (Cell Reports 2014) on cholinergic effects on spike rates in primary whisker somatosensory cortex, I can imagine that the authors considered measuring from cholinergic neurons in nucleus basalis during whisking. I'll assume that this is easier said than done. As such, the current manuscript passes my threshold for publication modulo issues raised below that are related to anatomy.
  
  The cholinergic experiments are an interesting idea. However, inactivation of S1 did not change the relationship between POm and whisking, suggesting that cholinergic modulation of S1 and thereby corticothalamic output are not the key mechanism. It is conceivable that acetylcholine modulates POm directly, but the critical experiments would involve extensive manipulations of POm (a whole additional study). Nevertheless, we have added a reference to Eggermann & Petersen and discussed this issue further in the revision.
  
  I provide a figure-by-figure critique:
  
  (1) Recent work from Deschênes et al (Neuron 2016) points to a description of whisking in terms of Angle = Set-point_angle - Whisking-amplitude [1 + cosine(Phase - Phase_0)], where Phase is a rapidly varying, typically rhythmic function of time. Why not use this notation as opposed to yet another descriptive statistic and report the kinetics as the time-averaged parameters <Set-point_angle>, i.e., the most forward position, and <Whisking-amplitude>, i.e., the half-amplitude of the average whisk?
  
  We are not entirely sure what the reviewer means by “another descriptive statistic” as we do not introduce new approaches for analyzing whisking in this paper. (Perhaps the reviewer refers to “median angle”, which is an average of all the whisker positions on a single frame. We use this measurement because our videos contain the entire whisker field rather than just a single whisker as in our other studies, e.g. Hong et al 2018, Rodgers et al 2021.) We based our parameterization of median angle on two publications: Hill et al (2011 Neuron) and Moore et al (2015 PLoS Biology). Moore et al describe whisking as a function of phase, amplitude, and midpoint:
  
  where 𝜃(t) is the median whisker angle at time 𝑡 , 𝜙 is the phase as computed by the Hilbert transform of the filtered whisker angle, 𝜃^Amplitude is the difference between the most protracted and retracted whisker positions over a single cycle, and 𝜃^midpoint is the central angle of a single whisk cycle. As we understand the reviewer, we are using the formulation they describe. We are happy to consider alternate formulations if we are missing something.
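
  As an illustration of the quantities defined above, the following minimal sketch extracts phase, amplitude, and midpoint from an angle trace and rebuilds the trace from them. Several points are assumptions rather than the authors' pipeline: centring the trace stands in for band-pass filtering, the whole trace is treated as one stretch of whisking rather than cycle by cycle, phase zero is taken to be maximal protraction, and the reconstruction midpoint + (amplitude / 2) * cos(phase) is simply one form consistent with the definitions given. The function names are hypothetical.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal (Hilbert transform) via a naive O(n^2) DFT.

    Adequate for short example traces only.
    """
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        positive = range(1, n // 2)
    else:
        positive = range(1, (n + 1) // 2)
    for k in positive:
        weights[k] = 2.0
    return [
        sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def whisk_parameters(angle):
    """Return (phase per sample, peak-to-peak amplitude, midpoint)."""
    midpoint = (max(angle) + min(angle)) / 2
    amplitude = max(angle) - min(angle)
    centred = [a - midpoint for a in angle]  # stand-in for band-pass filtering
    phase = [cmath.phase(z) for z in analytic_signal(centred)]
    return phase, amplitude, midpoint


def reconstruct(phase, amplitude, midpoint):
    """Rebuild the angle trace from phase, amplitude, and midpoint."""
    return [midpoint + (amplitude / 2) * math.cos(p) for p in phase]
```

  For a clean sinusoidal trace the reconstruction matches the input closely; for real whisking, amplitude and midpoint would have to be estimated per cycle, which this sketch does not attempt.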
  
  A critical issue is to confirm where the recordings were made. Thus the authors should supply at least a typical record of anatomy from their POm as well as VPM and LP recordings. The beauty of the juxtacellular technique is that neurons can be labeled after the recording.
  
  We used the juxtacellular recording technique for its superior recording quality. We did not label individual cells after recording because we recorded multiple cells per animal over several days. The number of cells would complicate matching of filled cells to recorded physiological data, and biotin filling is not stable over multiple days (beyond 36 hours). Instead, as described in the original manuscript, we tracked the relative locations of all inserted pipettes and labeled the final track with DiI. Cells were roughly localized along the tracks using relative microdrive depths. Due to the morphological homogeneity of thalamic neurons, filling individual cells would not be more informative than labelling the recording site with DiI. New Figure 1 – figure supplement 1 includes representative histology images from our recordings in POm, VPM, LP, and M1.
  
  (2) Did the authors make sure that the mystacial pad is not moving by imaging the pad as opposed to just the shaft of the whiskers? The top view in Figure 1A makes this hard to check.
  
  To address this concern, we provide new data, in which both the cut and uncut sides of the face of mice were imaged. We measured the movement of the mystacial pads as motion energy – the mean absolute difference in pixel values across video frames. The motor nerve surgery almost completely abolished movement of the mystacial pad. A new figure panel (Figure 2B) demonstrates the movement of the normal and paralyzed mystacial pads.
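
  The motion-energy measure described above can be sketched in a few lines. This assumes that "across video frames" means consecutive frames and that each frame has already been cropped to the mystacial pad; both are assumptions, and the function name `motion_energy` is hypothetical.

```python
def motion_energy(frames):
    """Mean absolute pixel difference between consecutive frames.

    frames: list of 2-D lists (rows of pixel intensities), all the same shape.
    Returns one value per consecutive frame pair.
    """
    energies = []
    for prev, curr in zip(frames, frames[1:]):
        total = 0.0
        count = 0
        for row_prev, row_curr in zip(prev, curr):
            for p, c in zip(row_prev, row_curr):
                total += abs(c - p)
                count += 1
        energies.append(total / count)
    return energies
```

  A pad that does not move yields values near zero (up to camera noise), whereas a moving pad yields larger values, which is the contrast the new Figure 2B is said to show.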
  
  Further, did the authors perform post-hoc anatomy to ensure that both the ramus buccolabialis inferior and ramus buccolabialis superior muscles were cut? This is critical; it is also easy to leave the maxillolabialis (external retractor) innervated if the cut is too far rostral.
  
  We did not attempt to cut muscles. We only cut the motor nerve. We did not examine the face post mortem, as it was obvious that both whisker and mystacial pad movement were absent (as in new Figure 2B).
  
  (3/4) As relevant background, the text should note that whisker primary motor cortex maintains a copy of the envelope of the whisking, i.e., an ill-defined summation of set-point and amplitudes, even if the sensory input (Ahrens & Kleinfeld J Neurophysiol 2004) or motor output (Fee et al. J Neurophysiol 1997) in the periphery are cut.
  
  The Results text now cites these papers as motivation for the experiments of Figure 3.
  
  (6/7) Same comments as in (1) regarding whisking parameters and anatomy.
  
  As we discussed in (1), we are using the conventional parameterization of others. Histological examples are now included in Figure 1 – figure supplement 1.
  
  Reviewer #3 (Public Review):
  
  Previous studies in urethane-anesthetized rats (PMID 16605304) proposed that POm cells code whisker movements. This was observed using "artificial whisking" procedures (stimulating the motor nerve to produce a whisking-like movement). It has been clear for some time now that there are substantial (obvious) differences between this procedure and natural whisking. In addition, under urethane-anesthesia animals are in a sleep-like state that is very dissimilar to waking (although some work has tested the effect of network state on artificial whisking responses in both primary thalamus and cortex; see 25505118). In the present study, the authors measured activity in POm cells during whisking in awake (head-fixed) mice to determine if they code whisking movement. However, this seems to have already been done previously. For instance, Moore et al (2015; 26393890) found that coding of whisking in the ascending paralemniscal pathway, including POm, is "relatively poor" (as stated in the abstract), which is the same conclusion reached in the present study. The authors should clarify the main differences observed in whisking coding between their study and previous work.
  
  The authors then focused on the idea that POm codes behavioral state. However, many studies have previously determined that state has a great impact on thalamocortical dynamics; thalamic cells are very sensitive to state including cells in primary whisker thalamic nuclei, such as VPM, and these effects can be produced by neuromodulators (see work by Castro-Alamancos' group, for example, 16306412). There is nothing special about VPM in this regard; other thalamic sensory nuclei are also sensitive to behavioral state and neuromodulators. Therefore, the observation that POm and LP cells are sensitive to state is unsurprising. It is also known that these thalamic state changes have a great impact on the state of the cortex (see 20053845), which seems very relevant to the main conclusion. The POm has to be doing something different than coding behavioral state since most thalamic nuclei do this. The study did not identify the role of POm, which certainly has to be different from LP (otherwise, why would these nuclei be differentiated?). POm is unlikely to be specialized for monitoring state since this is done by most of the thalamus -including VPM, which projects to the same cortical region. Thus, while it is interesting that most of the whisker-related activity in POm is state-dependent, the study does not clarify the role of POm.
  
  We have added the references we did not already include to our text and improved our discussion.
  
  Prior studies (such as Moore et al 2015 and Urbain et al 2015) have previously characterized the encoding of whisker motion in POm. Indeed, we note the consistency between our results and such studies in both the introduction and conclusion. Here we expand upon prior studies to directly test two prominent hypotheses about the role of the paralemniscal pathway: that it encodes sensory reafference, and that it inherits a motor efference copy from cortical and subcortical regions. We present the impact of several manipulations of the vibrissal system (facial paralysis, cortical silencing, and lesion of superior colliculus) on thalamic activity that, to our knowledge, have not been previously reported. Moreover, we leveraged a novel comparison of POm and LP to test whether movement‐correlations of POm reflected true motor modulation or rather state dependency. We have provided evidence that the coupling of POm activity to whisking reflects state rather than motor signals. We never suggested that POm is a unique monitor of behavioral state. We suggest instead that secondary thalamic nuclei may be state‐modulated and have specific impacts on response gain and plasticity in their respective cortical areas. While our work is consistent with previous studies, we believe these results are novel extensions of past work.
  
  The main strength of the study is that it was performed in awake mice with behavioral state monitoring, which contributes to the current understanding of active whisking coding in the complex network of the vibrissa system.
  
  In our opinion, the main strength of our study is its multiple manipulations to test the sources of modulation and the leveraging of a POm‐LP comparison. We have revised the text to reinforce these points.
  
  scietyType:AuthorResponse
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.03.04.977348v1
www.biorxiv.org www.biorxiv.org

New submission 14/07/2022, 16:32:14

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  Q1) The manuscript reports that in vitro fertilization (including in vitro culture) of mouse embryos seemingly originates metabolic alterations probably caused by enhanced oxidative stress compared to in vivo development. Such alterations apparently increase anaerobic glycolysis, as evidenced by altered pH and lactate levels, and remain after birth, as evidenced by altered protein abundance of MCT1 and LDHB.
  
  The manuscript concludes that IVF alters embryo metabolism, increasing oxidative damage and glycolytic activity. The topic is interesting but I consider that the conclusions are not well supported by the experiments:
  
  1) In vivo generated blastocysts are analyzed at a more advanced developmental stage than their in vitro counterparts as evidenced by their increased cell number (70 vs. 50 cells). In this regard, the developmental timing when in vitro generated blastocysts are collected is undisclosed in the Materials and methods. This has an obvious effect on all experiments as the differences observed may be stage-specific rather than IVF vs. in vivo.
  
  A1) Thank you for the comment. The reviewer is correct, and it is indeed well known that in vitro fertilization and embryo culture result in profound changes to the embryo. Overall, embryos generated in vitro are delayed compared to embryos generated in vivo. To control for this, as done in our past publications (Belli 2019; Bloise 2014; Delle Piane 2010; Giritharan 2012; Giritharan 2010; Giritharan 2006; Giritharan 2007; Rinaudo 2006; Rinaudo 2004) and by others (Doherty 2000; Ecker 2004; Weinerman 2016), we limited the analysis to expanded blastocysts of similar morphology (under microscopic examination) in all of the groups. Therefore, the embryos appeared morphologically similar in all of the groups. As an alternative, we could have waited longer in vitro, but this would have resulted in embryos hatching and no longer being morphologically similar to in vivo embryos. In addition, the two IVF groups provide an internal control: embryos were at the same developmental stage but showed significant changes in metabolism and cell numbers. (Time post hCG administration = 96 hours of culture + 13-14 hours for egg collection + 4 hours of fertilization.)
  
  We have added this information as follows: Line 377-382: To control for the known delay in development after culture in vitro, for all experiments, only expanded blastocysts of similar morphology were used, as done before (Doherty 2000; Rinaudo 2006; Rinaudo 2004). The in vivo-generated blastocysts were isolated by flushing 96-98 hours after hCG administration. IVF- 5% O2 and 20% O2 generated embryos reached the blastocyst stage after 96-98 hours following in vitro culture and 113-114 hours after hCG administration, respectively.
  
  Q2) Several methods are not reliable to quantify the parameters analyzed. For instance, determining protein content by immunofluorescence has been largely shown to be misleading, as immunofluorescence can be affected by multiple parameters. Intracellular pH was also analyzed by an assay based on immunofluorescence, which can also be affected by embryo size (the blastocoel is a cell-devoid cavity). These analyses are not reliable.
  
  A2) Thank you for the comments.
  
  We appreciate the comments and concerns. Any single method can result in error and possible bias. Immunofluorescence analysis is a robust method that has been used to analyze the distribution of proteins in cells or tissues. For instance, oxidative stress (Liu et al., 2022, Reprod Domest Anim), several signaling molecules (Spirkova et al., 2022, Biol Reprod) and DNA methylation level (Diaz et al, 2021, Fron Gent) have been measured by immunofluorescence in preimplantation embryos and oocytes. In our study, to minimize errors, we followed exactly the same protocol and we found immunofluorescence to be reliable. In addition, global proteomics analysis of blastocysts provides partial independent confirmation of our results. While LDH-A and MCT1 were not detected, LDH-B was detected and found to be lower in IVF blastocysts, exactly as shown by IF studies. Finally, western blot analysis of adult tissues confirmed a reduction in LDH-B and MCT-1 levels.
  
  These comments have been added to the discussion as follows:
  
  Line 299-302: Unsupervised global proteomics analysis revealed that LDH-B was downregulated in IVF embryos. We confirmed these results by performing immunofluorescence studies. In addition, we found that IVF embryos showed downregulation of both LDHA and B and of the monocarboxylate transporter, MCT 1, providing an explanation for the increase in their lactate levels.
  
  Regarding pH measurement: to control for the possible variation in blastocoel size in different embryos, we compared immunofluorescence level of only the inner cell mass and trophoblast region of blastocysts and excluded the blastocoel region.
  
  This clarification has been added to the Methods section as follows:
  
  Line 488-491: To control for the possible variation in blastocoel size in different embryos, we compared immunofluorescence level of only the inner cell mass and trophoblast region of blastocysts and excluded the blastocoel region.
  
  Q3) Identifying proteins and metabolites in such small samples is technically difficult and error-prone, requiring validation by alternative techniques.
  
  We appreciate the comments and concerns. Any single method can result in error and possible bias. Immunofluorescence analysis is a robust method that has been used to analyze the distribution of proteins in cells or tissues. For instance, oxidative stress (Liu 2022), several signaling molecules (Spirkova 2022) and DNA methylation level (Diaz 2021) have been measured by immunofluorescence in preimplantation embryos and oocytes. In our study, to minimize errors, we followed exactly the same protocol and we found immunofluorescence to be reliable. In addition, global proteomics analysis of blastocysts (triplicate for each group; n=100 blastocysts for each replicate; total 900 embryos) provides partial independent confirmation of our results. While LDH-A and MCT1 were not detected, LDH-B was detected and found to be lower in IVF blastocysts, exactly as shown by IF studies. Finally, western blot analysis of adult tissues confirmed a reduction in LDH-B and MCT-1 levels.
  
  These comments have been added to the discussion as follows:
  
  Line 299-302: Unsupervised global proteomics analysis revealed that LDH-B was downregulated in IVF embryos. We confirmed these results by performing immunofluorescence studies. In addition, we found that IVF embryos showed downregulation of both LDHA and B and of the monocarboxylate transporter, MCT 1, providing an explanation for the increase in their lactate levels.
  
  Q4) Given the small size of these embryos (~80 µm diameter), it is unclear how they can significantly alter the composition of 500 µl of medium (10^6 times their own volume).
  
  To collect 300 blastocysts, we performed multiple IVF procedures, each resulting in 10-20 blastocysts cultured in 30 microliters of media. While intracellular lactate and pyruvate were measured in the embryos collected, the media from different experiments was pooled to a final 500 microliter volume. Lactate and pyruvate levels were measured in this final volume for each group of embryos (FB, IVF5% and IVF20%).
  
  This has been clarified in the Methods section as follows:
  
  Line 516-519: To collect 300 blastocysts, we performed multiple IVF procedures, each resulting in 10-20 blastocysts cultured in 30 microliters of media. While intracellular lactate and pyruvate were measured in the embryos collected, the media from different experiments was pooled to a final 500 microliter volume.
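
The volumes at stake in this exchange can be checked with a rough calculation. The sketch below assumes, for illustration only, that a blastocyst is a sphere of 80 µm diameter (the figure quoted by the reviewer) and that each 30 µl culture drop holds between 10 and 20 embryos; the function names are hypothetical.

```python
import math


def sphere_volume_ul(diameter_um):
    """Volume of a sphere in microliters, given its diameter in micrometers."""
    radius_um = diameter_um / 2
    volume_um3 = (4 / 3) * math.pi * radius_um ** 3
    return volume_um3 * 1e-9  # 1 µl = 1 mm^3 = 1e9 µm^3


def medium_to_embryo_ratio(medium_ul, n_embryos, diameter_um=80):
    """How many times larger the medium volume is than the total embryo volume."""
    return medium_ul / (n_embryos * sphere_volume_ul(diameter_um))


# One embryo against the full 500 µl pool (the reviewer's comparison).
print(f"{medium_to_embryo_ratio(500, 1):.2e}")

# A single 30 µl culture drop holding 10 or 20 embryos (the culture condition).
print(f"{medium_to_embryo_ratio(30, 10):.2e}")
print(f"{medium_to_embryo_ratio(30, 20):.2e}")
```

Under these assumptions a single embryo is roughly a millionth of the 500 µl pool, in line with the order of magnitude quoted by the reviewer, whereas the ratio inside a culture drop shared by 10 to 20 embryos is about two orders of magnitude smaller.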
  
  Q5) The metabolic changes observed in the offspring lack a mechanistic explanation.
  
  Thank you for the comment. We can formulate a hypothesis (Figure 8) in which oxidative stress from in vitro conditions increases ROS and induces oxidative damage, resulting in a shift toward Warburg metabolism, given that lactate is a critical energy source (Brooks, 2018). The higher intracellular lactate levels will likely induce epigenetic changes that favor Warburg metabolism during development, as an embryonic attempt to optimize growth based on the environment predicted to be experienced in the future. When the environment does not match the prediction, disease risk increases (Godfrey 2007). Low lactate would be beneficial in a setting of low food resources because it could favor lipolysis (Brooks, 2020). In fact, lactate activates the hydroxycarboxylic acid receptor 1 (HCAR1), a G protein-coupled receptor, which in turn inhibits lipolysis in fat cells via cAMP and CREB (Liu 2009). However, since there is an abundance of food in our society, this mismatch could predispose IVF concepti to develop chronic diseases such as glucose intolerance.
  
  This hypothesis has been added to line 333-344:
  
  In summary, we can formulate a hypothesis (Figure 8) in which oxidative stress from in vitro conditions increases ROS and induces oxidative damage, resulting in a shift toward Warburg metabolism, given that lactate is a critical energy source (Brooks, 2018). The higher intracellular lactate levels will likely induce epigenetic changes that favor Warburg metabolism during development, as an embryonic attempt to optimize growth based on the environment predicted to be experienced in the future. When the environment does not match the prediction, disease risk increases (Godfrey 2007). Low lactate would be beneficial in a setting of low food resources because it could favor lipolysis (Brooks, 2020). In fact, lactate activates the hydroxycarboxylic acid receptor 1 (HCAR1), a G protein-coupled receptor, which in turn inhibits lipolysis in fat cells via cAMP and CREB (Liu 2009). However, since there is an abundance of food in our society, this mismatch could predispose IVF concepti to develop chronic diseases such as glucose intolerance.
  
  scietyType:AuthorResponse
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.13.488204v1
www.biorxiv.org www.biorxiv.org

New submission 18/08/2022, 14:51:36

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Public Evaluation Summary
  
  The authors aim to tackle a fundamental question with their study: whether there is a direct age-associated increase of transcriptional noise. To investigate this question, they develop tools to analyze single-cell sequencing data from mouse and human aging datasets. Ultimately, application of their novel tool (Scallop) suggests that transcriptional noise does not change with age; rather, apparent changes in transcriptional noise can be attributed to other sources such as subtle shifts in cell identity. This study is in principle of broad interest, but it currently lacks a definitive demonstration of the robustness of Scallop. Systematic testing of this new package would ultimately strengthen the key conclusion of the work and give additional users more confidence when using the tool to estimate expression noise.
  
  We have now attempted to further demonstrate the robustness of Scallop by performing a more systematic analysis and a side-by-side comparison to other existing methods using a set of artificially generated datasets. These analyses have resulted in the inclusion of six supplementary figures that are presented in the following subsections of the Results section of the revised manuscript: "Scallop membership score accurately identifies transcriptionally noisy cells", "Ability to detect noisy cells within cell types", "Effect of cellular composition", "Effect of dataset size", "Effect of feature expression" and "Effect of cell type marker expression".
  
  We have also included a supplementary figure showing an in-depth analysis of a dataset where an age-associated increase in transcriptional noise was detected using alternative methods, but whose closer dissection has revealed that the difference in noise is due to a single donor and to the choice of methods. We discuss this in the subsection "Distance-to-centroid methods detect transcriptionally stable cell subtypes as transcriptional noise" within the Results section.
  
  Finally, we have revised the manuscript to clarify the main points raised by the reviewers: the definition of transcriptional noise, the reasoning behind the choice of the single-cell aging datasets and Leiden’s rationale. Also, we have expanded the description of the method to make the definition of membership score more clear to the readers, and discussed the implications of our main findings (a lack of evidence for age-related transcriptional noise) in the broader context of theories of aging.
  
  Reviewer #1 (Public Review):
  
  In the present study, Ibanez-Sole et al evaluate transcriptional noise across aging and tissues in several publicly available mouse and human datasets. Initially, the authors compare 4 generalized approaches to quantify transcriptional noise across cell types and later implement a new approach which uses iterative clustering to assess cellular noise. Based on implementation of this approach (scallop), the authors survey noise across seven sc-seq datasets relevant for aging. Here, the authors conclude that enhanced transcriptional noise is not a hallmark of aging, rather changes in cell identity and abundances, namely immune and endothelial cells. The development of new tools to quantify transcriptional noise from sc-seq data presents appeal, as these datasets are increasing exponentially. Further, the conclusion that increased transcriptional noise is not a defined aspect of aging is clearly an important contribution; however, given the provocative nature of this claim, more comprehensive and systematic analyses should be performed. In particular, the robustness and appeal of scallop is still not sufficiently demonstrated and given the complexity (multiple tissues, species and diverse relative age ranges) of datasets analyzed, a more thorough comparison should be performed. I list a few thoughts below:
  
  Initially, the authors develop Decibel, which centralizes noise quantification methods. The authors provide schematics shown in Fig 1, and compare noise estimates with aging in Fig 2 - Supplement 2. Since the authors emphasize the necessary use of scallop as a "better" pipeline, more systematic comparisons to the other methods should be made side-by-side.
  
  We thank the reviewer for their positive assessment of the manuscript and their suggestions. We agree that side-by-side benchmarking of Scallop against the methods implemented in Decibel, as well as a more thorough analysis of the effect that different features (dataset size, cellular composition, etc.) might have on the output of Scallop, will reinforce the main points of the manuscript. To respond experimentally to these requests, we took advantage of a set of four artificial datasets previously generated by us with the R package splatter (v1.10.1; as described in Ascensión et al. [1]). In the present work, we first ran a side-by-side comparison between Scallop and two distance-to-centroid (DTC) methods on the four artificial datasets with increasing degrees of transcriptional noise present in them (the novel data are included as Figure 1 – Figure supplement 1 in the revised manuscript). Then, we compared Scallop to one DTC method regarding their ability to detect noisy cells in different cell types (Figure 1 – Figure supplement 2). Finally, we implemented four simulations to test the effect of the following features on the performance of Scallop: cellular composition (Figure 1 – Figure supplement 3), dataset size (Figure 1 – Figure supplement 4), number of genes (Figure 1 – Figure supplement 5) and marker gene expression (Figure 1 – Figure supplement 6). A summary of these results follows.
  
  Side-by-side comparison of Scallop vs DTC methods
  
  Each of the four artificial datasets used consists of 10K cells, from 9 populations, named Group1 to Group9, with the following relative abundances: 25, 20, 15, 10, 10, 7, 5.5, 4, and 3.5%, respectively. The four datasets only differ in the de.prob parameter used in their generation. The de.prob parameter determines the probability that a gene is differentially expressed between subpopulations within the dataset. The greater the de.prob value, the more differentially expressed genes there will be between clusters, meaning that the different cell types present in the dataset will cluster in a more robust way. Decreasing the value of de.prob results in datasets with noisy cells, with populations that do not have such a strong transcriptional signature. In order to study how Scallop can capture the degree of robustness with which cells of the same cell type cluster together, we selected four de.prob values (0.05, 0.016, 0.01 and 0.005) and measured transcriptional noise using Scallop and two DTC methods: the whole transcriptome-based Euclidean distance to cell type mean and the invariant gene-based Euclidean distance to tissue mean expression. These two methods were selected because GCL does not yield a transcriptional noise measure per cell, so no comparisons can be made with respect to the number of noisy cells the method is able to detect within a cluster. Similarly, comparing Scallop to the ERCC spike-in-based method was not possible for artificial datasets. Importantly, these analyses showed that Scallop, unlike DTC methods, was able to distinguish the core transcriptionally stable cells within each cell type cluster from the noisier cells that lie in between clusters (provided in Figure 1 - Supplement 1 of the revised manuscript).
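
To make the distance-to-centroid idea concrete, here is a minimal sketch of a per-cell Euclidean distance to the cell type mean. It is not the Decibel implementation: the function name, the absence of any normalization or gene selection, and the toy two-gene expression vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def distance_to_centroid(expression, labels):
    """Per-cell Euclidean distance to the mean expression of its cell type.

    `expression` is a list of per-cell expression vectors and `labels`
    gives the cell type of each cell. Illustrative only.
    """
    by_type = defaultdict(list)
    for vector, label in zip(expression, labels):
        by_type[label].append(vector)

    # Centroid = gene-wise mean over all cells of the same type.
    centroids = {
        label: [sum(gene) / len(vectors) for gene in zip(*vectors)]
        for label, vectors in by_type.items()
    }
    return [
        math.dist(vector, centroids[label])
        for vector, label in zip(expression, labels)
    ]


# Toy data: two cell types, two genes (hypothetical values).
expression = [[0, 0], [2, 0], [10, 10], [10, 14]]
labels = ["Group1", "Group1", "Group2", "Group2"]
print(distance_to_centroid(expression, labels))
```

A measure of this kind assigns every cell a distance, but a cell sitting between two clusters is not necessarily farther from its own centroid than a cell at the edge of a broad, well-separated cluster, which is the limitation discussed in the response.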
  
  Effect of dataset features on the performance of Scallop
  
  We simulated five artificial datasets with the same nine cell type populations but whose relative abundances were different between datasets. We used the imbalance degree (ID) to measure class imbalance in each of them and to make sure that the selected cell compositions represented a wide range of imbalance degrees (to this end, we explored ID values between 1.2 and 5.3). The ID provides a normalized summary of the extent of class imbalance in a dataset in so-called "multiclass" settings, that is to say, where more than two classes are present. It was specifically developed to improve the commonly used imbalance ratio (IR) measurement, whose calculation only considers the abundance of the most and the least popular classes and which gives the same summary for datasets with different numbers of minority classes. The presence of multiple minority classes is not uncommon in single-cell RNAseq datasets, as tissues might contain several rare cell types. We observed that the transcriptional noise measurements provided by Scallop were very robust to changes in imbalance degree (see Figure 1 - Supplement 3), both in qualitative and in quantitative terms. For instance, Group2 and Group8 were always detected as the most stable and noisiest cell types, respectively, regardless of their relative abundance in the dataset, and their average percentage of noise had little variation between different ID values: it ranged from 0 to 0.14% for Group2 and from 16 to 18% for Group8.
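
The limitation of the imbalance ratio mentioned above can be shown in a few lines. The sketch assumes the usual definition of IR as the abundance of the largest class divided by that of the smallest; the imbalance degree itself is not implemented here, and the two toy compositions are hypothetical.

```python
def imbalance_ratio(abundances):
    """Imbalance ratio: largest class abundance divided by the smallest."""
    return max(abundances) / min(abundances)


# Composition of the artificial datasets described earlier (percentages).
reference = [25, 20, 15, 10, 10, 7, 5.5, 4, 3.5]
print(round(imbalance_ratio(reference), 2))

# Two hypothetical compositions with one versus four minority classes.
one_minority = [60, 30, 10]
four_minorities = [60, 10, 10, 10, 10]
print(imbalance_ratio(one_minority), imbalance_ratio(four_minorities))
```

Both hypothetical compositions give the same IR even though one contains a single rare class and the other four, which is the kind of ambiguity a multiclass measure such as the ID is meant to resolve.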
  
  The effect of dataset size (number of cells) and the number of genes was evaluated by generating versions of an artificial dataset where cells/genes had been subsampled from an original artificial dataset (the one generated with de.prob=0.001). We tested datasets sized 1,000-10,000 cells and with a number of genes between 5,000 and 14,000. Dataset size had nearly no impact on the transcriptional noise measurements provided by Scallop (Figure 1 - Supplement 4 of the revised manuscript). The average percentage of transcriptional noise per cell type remained within a narrow range as we implemented a ten-fold increase in dataset size. Perhaps more strikingly, removing the expression of most genes did not substantially impact transcriptional noise measurements per cell type (Figure 1 - Supplement 5). The variation when removing half of the genes (7,000 genes) was minimal, and we did not see important changes in transcriptional noise measurements unless over 60% of the genes from the original dataset were removed. For example, Figure 1 - Supplement 5C shows that noise measurements suffer important variations when removing 8,000 and 9,000 genes (and therefore keeping 6,000 and 5,000 genes, respectively), but only some cell types (Groups 4, 7, 8 and 9) were affected by these variations.
  
  In order to measure the effect marker gene expression has on the membership with which cells are assigned to their cell type cluster, we ran a simulation where the top 10 markers for a cell type were removed from the dataset one by one, so that the first simulation lacked the expression of the Top1 marker, the second simulation had the effect of the first 2 markers removed (Top1 and Top2), and so on. Then, we ran Scallop on each of the resulting datasets and observed a steady increase in transcriptional noise associated with that cell type. This provided evidence that the strength of cell type marker expression in a cluster is directly related to its transcriptional stability (or lack of transcriptional noise). We included the result of this experiment in the revised version of the manuscript (Figure 1 - Supplement 6).
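
The cumulative marker-removal design described above can be sketched as follows. This only builds the nested gene sets (Top1 removed, then Top1 and Top2, and so on); running Scallop on each resulting dataset is not shown, and the function name and gene names are hypothetical.

```python
def cumulative_marker_removal(genes, ranked_markers, n_top=10):
    """Return one gene list per simulation, where simulation k lacks the
    k highest-ranked markers of the cell type under study."""
    simulations = []
    for k in range(1, n_top + 1):
        removed = set(ranked_markers[:k])
        simulations.append([gene for gene in genes if gene not in removed])
    return simulations


# Hypothetical gene universe and marker ranking for one cell type.
genes = [f"gene{i}" for i in range(1, 21)]
ranked_markers = ["gene3", "gene7", "gene1", "gene12", "gene5",
                  "gene9", "gene15", "gene2", "gene18", "gene4"]

simulations = cumulative_marker_removal(genes, ranked_markers)
for k, kept in enumerate(simulations, start=1):
    print(k, len(kept))
```

Because each simulation removes a superset of the markers removed in the previous one, any trend in the noise measurement across simulations can be read as a dose-response to the loss of marker expression.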
  
  In conclusion, by using artificially generated datasets where the ground truth (cell type labels, degree of noise, etc) was known, the newly provided systematic analyses showed that Scallop had a remarkably robust response to said changes in dataset features, further reinforcing the manuscript conclusions.
  
  For example, scallop noise estimates (Fig 2) compared to other Euclidean distance-based measures (Fig 2 supplement 2) look fairly similar.
  
  It is true that some datasets show similar trends regardless of the transcriptional noise quantification method. For instance, the murine brain dataset by Ximerakis et al. shows no overall change in noise between the age groups across different methods. However, we do observe important differences in other examples. This is the case of the human pancreas dataset by Enge et al. and the human skin dataset by Solé-Boldo et al., where not only the magnitude but also the directionality of the trend differs depending on the method used to measure noise. In the former, three methods (Scallop, invariant gene-based Euclidean distance to average tissue expression and GCL) show an age-related increase in noise, whereas one method (whole transcriptome-based Euclidean distance to the cell type mean) shows a decrease in noise. In the latter, two methods (Scallop and GCL) yield a decrease in noise and the two DTC methods measure a mild increase in noise. These inconsistencies can now be reconciled with our proposed explanation that said "noise" may actually refer to substantially different biology in the diverse experimental settings.
  
  Are downstream observations (ex lung immune composition changes more than noise) supported from these methods as well? If so, this would strengthen the overall conclusion on noise with age, but if not, it would be relevant to understand why.
  
  Studying changes in cell type composition in the lung and other aged tissues would be highly pertinent. Nevertheless, we have measured changes in cell type composition using only one method that is based on Generalized Linear Models, covered in the subsection Age-related cell type enrichment of the Methods. The methods that we have compared in our study (DTC methods, ERCC-based methods, GCL, etc.) were all designed to measure transcriptional noise, but not changes in cell type composition.
  
  Whether the effects of cell type composition changes are bigger than changes in noise for the rest of the methods used to measure noise was probably not clear enough in the original manuscript. We found no evidence for an increase in noise associated with aging, regardless of the method used. Although not included in the manuscript, we did generate heatmaps similar to the one shown in Figure 3B for each of the noise quantification methods. However, as the heatmap on the right side (the one showing cell type enrichment) was identical in each figure, we considered them to be redundant and decided not to include them, since they did not provide any additional insight besides giving more examples of lack of evidence for transcriptional noise, this time at the cell type level. We consider that the lack of evidence was already well demonstrated in the previous analyses (Figure 2 and Figure 2 - Supplement 2).
  
  Similarly, the validation of scallop seems mostly based on the ability to localize noisy vs stable cells in Fig 1 supplement 1 and relative robustness within dataset to input parameters (Fig 1 supplement 2). A more systematic analysis should be performed to robustly establish this method. For example, noise cell clustering comparisons across the 7 datasets used. In addition, Levy et al. 2020 implemented a pathway-based approach to validate. Specifically, surrogate genes were derived from GCL value where KEGG preservation was used as an output. Similar additional types of analyses should be performed in scallop.
  
  We believe that this legitimate concern is now solved with the newly included data. In particular, with the systematic comparison between Scallop and DTC methods on three artificially generated datasets with different degrees of transcriptional noise provided in Figure 1 - Supplement 2. The ability of Scallop to detect cells that are particularly noisy within a cell type, or cells that lie between cell types, may represent its biggest advantage with respect to other methods. DTC methods fail to discern between stable and noisy cells within cell types. Also, in our analysis, DTC methods were unable to distinguish between cell types that have a marked transcriptional program (which systematically cluster together) and those that have a less clear transcriptomic identity (which have at least part of their cells assigned to other cell types across bootstrap iterations). However, comparing the performance of Scallop on the same datasets showed that our method was able to distinguish between the two cases.
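
The notion of cells being assigned to different cell types across bootstrap iterations can be illustrated with a simplified membership score. The sketch assumes that membership is the fraction of iterations in which a cell receives its most frequent label and that noise is one minus that fraction; it also assumes cluster labels are already matched across iterations, a step that a real implementation has to handle. Function and variable names are hypothetical and this is not the Scallop code.

```python
from collections import Counter


def membership_scores(assignments):
    """Per-cell membership: share of bootstrap iterations in which the cell
    receives its most frequently assigned cluster label.

    `assignments[i][j]` is the label of cell j in iteration i, or None if
    cell j was not sampled in that iteration.
    """
    n_cells = len(assignments[0])
    scores = []
    for cell in range(n_cells):
        labels = [it[cell] for it in assignments if it[cell] is not None]
        if not labels:
            scores.append(None)  # never sampled, no score available
            continue
        top_count = Counter(labels).most_common(1)[0][1]
        scores.append(top_count / len(labels))
    return scores


# Toy example: three cells, four bootstrap iterations (hypothetical labels).
iterations = [
    ["A", "A", "B"],
    ["A", "B", "B"],
    ["A", "A", "B"],
    ["A", None, "B"],
]
scores = membership_scores(iterations)
noise = [None if s is None else 1 - s for s in scores]
print(scores)
print(noise)
```

In this toy example the second cell switches cluster in one of the three iterations in which it was sampled, so it receives a lower membership than the two cells that are always assigned the same label.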
  
  The conclusion that immune and endothelial cell transcriptional shifts associate more with age than noise is quite compelling, but seems entirely restricted to the mouse and human lung datasets. It would be interesting to know whether these same cell types show age-related enrichment across tissues or whether this phenomenon is localized.
  
  We agree with the reviewer that it would be very interesting to see whether a change in cell type composition (and particularly, an increase in abundance of immune cell types) is observed in aged tissues other than the lung. Qualitative cell type composition changes in the aging lung have been described in the literature [5]. Specifically, the higher abundance of immune cell types was observed in a single-nucleus RNAseq dataset of cardiopulmonary cells in Macaca fascicularis [6]. However, we believe that trying to answer the question whether this phenomenon holds in other tissues would require a systematic analysis of several datasets for each tissue with a sufficient number of donors/individuals in each of them. This is because our approach to measure age-associated cell type enrichment using generalized linear models relies heavily on having multiple biological replicates for each age group. Unfortunately, this is not the case for most published single-cell RNAseq datasets of aging. In any case, we have toned down the last sentence in the subsection Changes in the abundance of the immune and endothelial cell repertoires characterize the human aging lung by making it more clear that our claim regarding changes in the cellular composition of aged tissues is based on lung datasets (the text in italics represents what was added in the revised version of the manuscript):
  
  "Even though the evidence for changes in tissue composition are based on a single tissue, we hypothesize that these facts may have influenced previous analyses of transcriptional noise associated with aging."
  
  As discussed in the original manuscript, there is evidence published by other groups pointing to pan-tissue changes in cellular composition with age, which undoubtedly will influence those analyses that did not pay attention to cellular composition changes in the datasets that they compared. Cellular composition is in fact a very important aspect that has been greatly overlooked. Indeed, only one [7] out of the seven articles that had measured transcriptional noise in aging (the datasets used in Figure 2) had attempted to remove its effect by subsampling cells to balance compositions between age groups prior to their noise analysis. In any case, we do not believe this is the only phenomenon underlying the purported increase in transcriptional noise associated with age. Each dataset will most probably have different issues that the authors originally misread as an increase in noise or loss of cellular identity of a particular organ or tissue. As an additional example of such phenomena, we have now included a re-analysis of the data by Enge et al. [3] on “noisy” β-cells in the aged human pancreas (Figure 5–Figure supplement 2 of the revised manuscript). In this case, rather than observing an age-dependent pattern, the 21-year-old donor presents much lower transcriptional noise values than the rest of the donors. However, there is no significant difference between the 22-year-old donor and the rest of the donors. We conclude that the statistically significant differences between the “young” and “old” age categories can be attributed to the abnormal noise values obtained for the 21-year-old donor, of uncertain origin. Finding out all causes of apparent transcriptional noise in other organs and tissues would be too lengthy, and certainly out of scope for the present manuscript.
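
  The subsampling strategy mentioned above (balancing cell type composition between age groups before measuring noise) can be illustrated with a short sketch. It assumes that, for each cell type, every age group is downsampled to the size of the smallest group. The function name `balance_composition` and the toy data are hypothetical, and the exact procedure used in [7] may differ.

```python
import random
from collections import defaultdict

def balance_composition(cells, seed=0):
    """Downsample so each cell type has equally many cells in every age group.

    cells: list of (cell_id, age_group, cell_type) tuples
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for cell_id, age_group, cell_type in cells:
        by_key[(cell_type, age_group)].append(cell_id)

    age_groups = sorted({g for _, g, _ in cells})
    cell_types = sorted({t for _, _, t in cells})

    kept = []
    for cell_type in cell_types:
        # The smallest age group sets how many cells of this type are kept.
        n_keep = min(len(by_key[(cell_type, g)]) for g in age_groups)
        for g in age_groups:
            for cell_id in rng.sample(by_key[(cell_type, g)], n_keep):
                kept.append((cell_id, g, cell_type))
    return kept

# Toy data: T cells are over-represented in young, B cells in old.
cells = [
    ("c1", "young", "T"), ("c2", "young", "T"), ("c3", "young", "T"),
    ("c4", "old", "T"),
    ("c5", "young", "B"),
    ("c6", "old", "B"), ("c7", "old", "B"),
]
balanced = balance_composition(cells, seed=1)
```

  After balancing, any noise measure computed per age group is no longer driven by one group simply containing more cells of a given type. A cell type that is absent from one age group is dropped entirely under this scheme.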
  
  Related to these, there does not seem to be a specific rationale for why these datasets (the seven used in total or the lung for deep-dive), were selected. Clearly, many mouse and human sc-RNA-seq datasets exist with large variations in age so expanding the datasets analyzed and/or providing sufficient rationale as to why these ones were chosen for noise analyses would be helpful. For example, querying “aging” across sc-seq datasets in Single cell portal yields 79 available datasets: https://singlecell.broadinstitute.org/single_cell?type=study&page=1&terms=aging&facets=organism_age%3A0%7C103%7Cyears.
  
  We now realize that the reasoning behind our selection of aging datasets was not sufficiently clear in the original manuscript. We thank the reviewer for pointing out this omission. We have made a more explicit reference to Appendices 2, 3, 4 and 6 in the revised manuscript. The seven selected scRNAseq datasets are those where transcriptional noise had originally been measured by the authors, using the computational methods that we later implemented in Decibel. Our aim was to first recapitulate previous reports of transcriptional noise using our novel method (Scallop). Thus, we downloaded all publicly available scRNAseq datasets of aged tissues where transcriptional noise had explicitly been measured. Some of them had reported an increase in transcriptional noise only in some cell types (for instance, the human aged pancreas dataset by Enge et al. [3]), whereas others found an increase in most cell types [7]. Appendix 2 summarizes the main features of those seven datasets (tissue, organism and number of cells) and provides information on whether an increase in transcriptional noise was observed in the original article where they were published. Additionally, the “Scope” column indicates where that increase was found (in which cell types), and the “Method” column briefly describes the computational method used to measure transcriptional noise in that article. Appendix 3 provides information on the final datasets that were used in our analysis (Figure 2). Not every sample from the original dataset was included, so the inclusion criteria are specified there, as well as the number of cells, individuals and age of each of the cohorts. Appendix 4 shows the abnormal count distribution of two samples that were discarded from the Kimmel lung dataset. As for the selection of lung for the deep dive, the reason was that this was the organ with most datasets available, both for mouse and human. Appendix 6 provides information on the number of cells and donors per age cohort in the human lung datasets included in this study.
  
  We have included the following sentence in the Increased transcriptional noise is not a universal hallmark of aging subsection in the Results:
  
  "We provide a summary of the main characteristics of each dataset, as well as the findings regarding transcriptional noise obtained in each of the original studies, whether changes in transcriptional noise were restricted to particular cell types, and the computational method used to measure noise (see Appendix 2)."
  
  The analysis that noise is indistinguishable from cell fate shifts is compelling, but again relies on one specific example where alternative surfactant genes are used as markers. The same question arises of whether this observation holds for other cell types within other organs. For example, the human cell atlas contains dozens of tissues with large variations in age (https://www.science.org/doi/10.1126/science.abl4290).
  
  We sympathize with this comment but hope that the reviewer will agree with us that providing an additional example of a different phenomenon originally reported as “transcriptional noise” (in this case in aged human pancreas; see Figure 5 – Figure supplement 2), but actually reflecting something else, may be sufficient to alert interested readers. In our opinion, it is likely that diverse phenomena will underlie the purported increases in transcriptional noise, and a re-analysis should be made case-by-case. We can only hope that researchers in the field re-analyze the available aging datasets in this new light.
  
  Reviewer #2 (Public Review):
  
  In this manuscript, Ibanez-Sole et al. focus on an important open question in ageing research: “how does transcriptional noise increase at the cellular level?”. They developed two Python toolkits, one for comparison of previously described methods to measure transcriptional noise, Decibel, and another one implementing a new method of variability measure based on cluster memberships, Scallop. Using published datasets and comparing multiple methods, they suggest that increased transcriptional noise is not a fundamental property of ageing, but instead, previous reports might have been driven by age-related changes in cell type compositions.
  
  I would like to congratulate the authors on openly providing all code and data associated with the manuscript. The authors did not restrict their paper to one dataset or one approach but instead provided a comprehensive analysis of diverse biology across murine and human tissues.
  
  While the results support their main conclusions, the lack of robustness/sensitivity measures for the methods used makes it difficult to judge the biology. The authors use real data to compare between methods but using synthetic data with known artificial ‘variability’ across cell clusters can first establish the methods, which would make the results more convincing and easier to interpret. Despite the comprehensive analysis of biological data, a detailed prior description of how the methods behave against e.g. the number of cells in each cell type cluster, the number of cell types in the dataset, and % feature expression, would make the paper more convincing. Once the details of the method are provided, the Python toolkit can be widely used, not limited to the ageing research community. I am also concerned that a definition of ‘transcriptional noise’ (e.g. genome-wide noise, transcriptional dysregulation in cell-type-specific genes, noise in certain pathways) and its interpretation with regard to the biology of ageing is missing. Differences in different methods could be explained by the different biology they capture. Moreover, the interpretation of a lack of different types of variability may not be the same for the biology of ageing.
  
  Increased transcriptional noise is compatible with genomic instability, loss of proteostasis and epigenetic regulation. Showing a lack of consistent transcriptional noise can challenge the widespread assumptions about how these hallmarks affect the organism. Overall, I found the paper very interesting and central to the field of ageing biology. However, I believe it requires a more detailed description of the methods and interpretations in the context of biology and theories of ageing.
  
  We thank the reviewer for their positive assessment of the manuscript and their suggestions. We respond to each of the specific comments below.
  
  Major comments
  
  1) The concept of transcriptional noise is central to the manuscript; however, what the authors consider as transcriptional noise and why is not clear. Genome-wide vs. function or cell-type specific noise could have different implications for the biology of ageing. In line with this, a discussion of the findings in the context of theories of ageing is necessary to understand its implications.
  
  We thank the reviewer for pointing out the lack of clarity in this key point. The use of the term “transcriptional noise” in the literature is quite heterogeneous, and we agree that the lack of a consensus definition may be confusing to the reader. For this reason, we adopted in the introduction the definition by Raser and O’Shea [8] as “the measured level of variation in gene expression among cells supposed to be identical”, i.e. the sum of both intrinsic and extrinsic noise as previously defined by Swain and colleagues [9, 10]. In our opinion, this is generally what the literature of age-associated transcriptional noise is referring to.
  
  With Scallop, we aimed to translate this concept to the context of single-cell RNAseq datasets, where clusters obtained using a community detection algorithm are typically annotated as distinct cell types.
  
  Therefore, we aimed to measure transcriptional noise here defined as “lack of membership to cell type clusters”. When running a clustering algorithm iteratively, if a cell is not unambiguously assigned to the same cluster, we consider it to be noisy. Conversely, when a cell consistently clusters with the same group of cells, we consider it to be stable. The membership score we use as a measure of stability is the frequency with which any given cell was assigned to the same cluster across all iterations.
  
  We have included in the Results section an explicit reference to the Methods subsection that explains how Scallop works in detail, so that the readers can easily find that information:
  
  "A detailed description of the three steps of the method (bootstrapping, cluster relabeling and computation of the membership score) is provided in the Scallop subsection in the Methods."
  
  Additionally, we have now realized that the formula to compute the membership score might be more easily understood if we renamed the freq_score as freq_score(c), to make it clear that each cell is assigned a score. Also, we have used n and m instead of i and j in this notation, to avoid confusing the readers with the notation used in the previous section, where i and j represented the i-th and j-th bootstrap iterations. Finally, we have included a small paragraph to clarify what each component of the formula refers to. Below we show the formula and text included in the Methods section of the revised manuscript:
  
  "Where $|c_n|$ is the number of times cell $c$ was assigned to the $n$-th cluster, and $\sum_{m \in \text{clusters}} |c_m|$ is the sum of all assignments made on cell $c$, which is the same as the number of times cell $c$ was clustered across bootstrap iterations."
  
  Thus, and in order to accommodate this reviewer’s concerns, we have now included this exact definition of how we measure noise plus a statement making clear that we refer to the sum of both intrinsic and extrinsic noise aspects, with no distinction among them.
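
  The membership score described above can be sketched in a few lines. The sketch assumes that the score for a cell is the count for its most frequently assigned cluster divided by the total number of times the cell was clustered. Taking the maximum over clusters is an assumption made here for illustration, as is reporting noise as one minus the membership. The function below reuses the name `freq_score` from the text, but the implementation is hypothetical and not taken from Scallop.

```python
from collections import Counter

def freq_score(assignments):
    """Membership score for one cell from its cluster labels across bootstraps.

    assignments: list of cluster labels, one per bootstrap iteration in which
    the cell was sampled and clustered. Illustrative implementation only.
    """
    if not assignments:
        raise ValueError("cell was never clustered")
    counts = Counter(assignments)      # |c_n| for every cluster n
    total = sum(counts.values())       # sum over m of |c_m|
    # Assumption: membership is the share of the most frequent cluster.
    return max(counts.values()) / total

# Toy data: cell_a is stable, cell_b switches between clusters.
bootstrap_labels = {
    "cell_a": ["T", "T", "T", "T"],
    "cell_b": ["T", "B", "T", "B", "NK"],
}
scores = {cell: freq_score(labels) for cell, labels in bootstrap_labels.items()}
# Assumption: noise is reported as the complement of membership.
noise = {cell: 1.0 - score for cell, score in scores.items()}
```

  Note that the two cells have different denominators (four and five assignments), which reflects that a cell is only clustered in the bootstrap iterations where it was sampled.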
  
  Similarly, we had discussed our findings in the framework of different theories of aging, such as their potential relationship to some of the established hallmarks of aging (genomic instability, epigenetic deregulation and loss of proteostasis), as well as with more recent theories of aging such as cell type imbalance in aged organs [11] and inter-tissue convergence [12]. However, it is now clear to us that this was not enough, so we have expanded these paragraphs to better convey our understanding of the implications of the work. More specifically:
  
  "Our results suggest that transcriptional noise is not a bona fide hallmark of aging. Instead, we posit that previous analyses of noise in aging scRNAseq datasets have been confounded by a number of factors, including both computational methods used for analysis as well as other biology-driven sources of variability."
  
  2) While I found the suggested method, Scallop, quite exciting and valuable, I would suggest including a number of performance/robustness measures (primarily based on simulations) on how sensitive the method is to the number of cells in each cell type (cellular composition), misannotations, % feature expression (number of 0s) etc.:
  
  We have analyzed the effect of cellular composition and the percentage of feature expression by using artificially generated datasets (see Figure 1 - Supplements 3 and 5, respectively; and section Effect of dataset features on the performance of Scallop in the response to reviewer #1). Although studying the effect of misannotations on downstream analysis is important, we believe that Scallop was already designed so that its effects could be avoided, since the membership is measured for each cluster (and not for each cell type label). That is to say, a reference clustering is obtained at the beginning of the pipeline and memberships are computed using that output as a reference, which means Scallop noise values attributed to each cell are not affected by the original labeling of the dataset.
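
  To make the role of the reference clustering concrete, the relabeling step can be sketched as follows. This is a simplified illustration that assumes each bootstrap cluster is mapped to the reference cluster with which it shares the most cells. The greedy majority-overlap rule and the function name `relabel_to_reference` are assumptions for this sketch, and Scallop's actual relabeling procedure may differ.

```python
from collections import Counter, defaultdict

def relabel_to_reference(reference, bootstrap):
    """Rename bootstrap clusters after the reference cluster they overlap most.

    reference: dict mapping cell id -> reference cluster label (all cells)
    bootstrap: dict mapping cell id -> cluster label in one bootstrap iteration
               (may cover only a subset of the cells)
    Hypothetical helper for illustration only.
    """
    # Count, for every bootstrap cluster, how its cells are spread over
    # the reference clusters.
    overlap = defaultdict(Counter)
    for cell, boot_label in bootstrap.items():
        overlap[boot_label][reference[cell]] += 1

    # Map each bootstrap cluster to its best-overlapping reference cluster.
    mapping = {b: counts.most_common(1)[0][0] for b, counts in overlap.items()}
    return {cell: mapping[label] for cell, label in bootstrap.items()}

# Toy data: cell "c" joins a different group in this bootstrap iteration.
reference = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
bootstrap = {"a": "x", "b": "x", "c": "y", "d": "y", "e": "y"}
relabeled = relabel_to_reference(reference, bootstrap)
```

  Because the mapping is computed against the reference clustering rather than against cell type annotations, the original labels of the dataset never enter the calculation, which is the property the response above relies on.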
  
  The output of these analyses reinforced our original conclusions, and it is now included in the Results section:
  
  "In order to characterize and validate our method for transcriptional noise quantification, we conducted three types of analyses. First, we used artificially generated datasets containing various degrees of transcriptional noise to compare the performance of Scallop and DTC methods side-by-side, regarding their ability to measure transcriptional noise and detect noisy cells within cell types. Next, we ran simulations using artificial datasets in order to study the effect of a number of dataset features on the performance of Scallop: cellular composition, dataset size, number of genes and marker expression. Finally, we graphically evaluated the output of Scallop on a dataset of human T cells, we analyzed its robustness to its input parameters, and we studied the relationship between membership and robust marker expression, using a PBMC dataset."
  
  2.1) Most importantly, knowing that cell-type composition changes with age, it is important to know how sensitive community detection is to the number of cells in each cell type. While the average can be robust, I wonder if the size of the cell-type cluster affects membership (voting).
  
  We have included an analysis on a set of artificial datasets with different cellular compositions to evaluate the performance of Scallop in the presence of different degrees of class imbalance (see Figure 1 - Supplement 3). We explain the output of this analysis, which reinforces the algorithm’s robustness, in the Results section:
  
  "Next, we ran a series of simulations on artificially generated datasets to evaluate the performance of Scallop in the presence of different levels of class imbalance, dataset size, number of genes, and different degrees of expression of cell type markers. Our analysis showed that Scallop was remarkably robust to changes in cellular composition (see Figure 1 - Supplement 3). Both the average percentage of noise and the distribution remained unchanged for a wide range of class imbalance degrees. Similarly, altering the dataset size (number of cells) and the number of genes of an artificial dataset did not cause any major changes on the transcriptional noise values attributed to each cell type (see Figure 1 - Supplements 4 and 5). Additionally, we conducted an analysis where we identified the 10 most differentially expressed gene markers for a cell type and measured the transcriptional noise associated with that cell type as we removed the expression of those genes from the dataset (Figure 1 - Supplement 5). Transcriptional noise steadily increased as we removed the effect of the top marker genes that defined the cell type under study (see Figure 1 - Supplement 5B). This experiment provides further evidence on how strong marker expression is related to robust cell type identity and how the lack of it results in transcriptional noise."
  
  3) Although the Leiden algorithm is widely used by many single-cell clustering methods, since the proposed methodology is heavily dependent on clustering, I suggest including a description of the Leiden algorithm.
  
  We agree that understanding how community detection algorithms in general –and Leiden in particular– work is crucial to understand the core of the paper, so we have included a brief introduction to these methods in the Methods section, at the beginning of the Scallop subsection:
  
  Leiden is a graph-based community detection algorithm that was designed to improve the popular Louvain method [13]. Graph-community detection methods take a graph representation of a dataset as input. In the context of single-cell RNAseq data, shared nearest neighbor (SNN) graphs are commonly used. These are graphs whose nodes represent individual cells and whose edges connect pairs of cells that are part of the K-nearest neighbors of each other by some distance metric. The aim of community detection algorithms like Leiden is to find groups of nodes that are densely connected to each other, by optimizing modularity. For a graph with $C$ communities, the modularity ($Q$) is computed by taking, for each community (group of cells), the difference between the actual number of edges in that community ($e_i$) and the number of expected edges in that community ($K_i^2/2m$, with $K_i$ the sum of the degrees of the nodes in community $i$ and $m$ the total number of edges in the graph):
  
  $$Q = \frac{1}{2m} \sum_{i=1}^{C} \left( e_i - r \frac{K_i^2}{2m} \right)$$
  
  where $r$ is a resolution parameter ($r > 0$) that controls the number of communities: a greater resolution parameter gives more communities, whereas a lower resolution parameter gives fewer. Since maximizing the modularity of a graph is an NP-hard problem, different heuristics are used, and Leiden has been shown to outperform Louvain in this task both in terms of quality and speed [14]. However, users can choose to run the Louvain method instead by setting the parameter clustering="louvain" in the initialization of the Bootstrap object.
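
  The effect of the resolution parameter can be illustrated with a small, self-contained computation. The sketch assumes the resolution-weighted modularity $Q = \frac{1}{2m}\sum_i (e_i - r K_i^2/2m)$ for an undirected, unweighted graph, with the convention that each within-community edge is counted from both of its endpoints so that the value agrees with the standard definition of modularity. The function name `modularity` and the toy graph are hypothetical. The code only evaluates $Q$ for a given partition and does not implement Leiden or Louvain.

```python
def modularity(edges, community, resolution=1.0):
    """Resolution-weighted modularity of a partition of an undirected graph.

    edges: list of (u, v) pairs, one per undirected edge
    community: dict mapping node -> community label
    Illustrative implementation only.
    """
    two_m = 2 * len(edges)
    degree_sum = {}   # K_i: total degree of the nodes in community i
    internal = {}     # e_i: within-community edges, counted from both endpoints
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 2
    return sum(
        internal.get(c, 0) - resolution * k * k / two_m
        for c, k in degree_sum.items()
    ) / two_m

# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {node: "A" for node in range(6)}

q_split_default = modularity(edges, split, resolution=1.0)
q_merged_default = modularity(edges, merged, resolution=1.0)
q_split_low = modularity(edges, split, resolution=0.1)
q_merged_low = modularity(edges, merged, resolution=0.1)
```

  In this toy graph the two-community partition scores higher at the default resolution, whereas at a low resolution the single merged community scores higher, which matches the statement that a lower resolution parameter yields fewer communities.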
  
  3.1) Most importantly, the authors comment that they found stronger expression of cell-type specific markers in the cells with high membership values - is it already a product of the Leiden algorithm that it weighs highly variable (thus cell-type specific) features higher - resulting in better prediction of cell-types for cells with strong cell-marker expression? It is important to make a description of transcriptional noise at this stage as it could be genome-wide or more specific to cell-type markers. Can authors provide any support that their method can capture both?
  
  We agree with the reviewer that finding a stronger expression of cell-type markers in cells with high membership values is indeed something we expected. The graph representation of the dataset taken as input by Leiden is built after running highly variable gene detection and PCA. The neighbors of each cell are detected based on the expression of genes that are highly variable, as the reviewer pointed out, so genes that are differentially expressed between cells are more likely to contribute to the clusters found by Leiden.
  
  Whether Scallop measures genome-wide or cell type-specific noise (or a mixture of both) is a very interesting question. Clusters in single-cell RNA sequencing datasets are often mainly driven by the presence/absence of a few cell type markers, rather than changes in expression levels of broader sets of genes. Moreover, it has been shown that single-cell RNAseq datasets generally preserve the same population structure even after data binarization [15]. This is a consequence of the sparsity of single-cell RNAseq datasets. In our case, any difference in expression between one cluster vs the rest of the cells in the dataset –be it the expression of a gene that was not detected in the rest of the cells or a higher expression of a gene whose presence is weaker in other clusters– will certainly have an impact on the output of every downstream analysis, from clustering to dimensionality reduction. The influence of the expression of cell type-specific markers on Scallop membership has been demonstrated in several analyses. The first is the simulation where we measured the impact of removing the 10 most defining markers for a particular cell type on transcriptional noise measurements (included in Figure 1 - Supplement 6 of the revised manuscript). Also, Figure 5 provides evidence that the differential expression of a handful of genes (in this case, genes coding for surfactant proteins) can have an impact on the clustering solutions obtained for a set of human alveolar macrophages, and this in turn influences the membership scores obtained with Scallop. In essence, Scallop merely provides a measure of the robustness of clustering at the single-cell level, so any type of transcriptional noise might have an impact on Scallop memberships, provided it is sufficiently strong to influence the output of the clustering algorithm used. In other words, the fact that Scallop membership captures a mixture of both types of noise (genome-wide and that associated with cell type-specific markers) is a consequence of the influence both types of noise have on clustering.
  
  4) The authors conclude that Scallop outperforms other methods through the analysis of biological data, where there is no positive and negative control. I suggest creating synthetic datasets (which could be based on real data), introducing different levels of noise artificially (considering biological constraints like max/min expression levels) and then testing the performance where the truth about each dataset is known. Otherwise, the definitions of noisy and stable cells, regardless of the method, are arbitrary.
  
  Our initial focus was on biological datasets, where no positive and negative controls regarding transcriptional noise could be used, but we agree on the need to include an analysis using simulations on artificial datasets. We analyzed artificially generated datasets with known degrees of transcriptional noise in order to evaluate the performance of Scallop in a setting where the ground truth is known beforehand. We modeled transcriptional noise by tuning the de.prob parameter, which determines the probability that a gene will be differentially expressed between clusters. The creation of these datasets is explained in detail in the Methods section of the revised manuscript, specifically in the subsections Performance of Scallop and two DTC methods on four artificial datasets with increasing transcriptional noise and Ability to detect noisy cells within cell types.
  
  We have now included the following section in the Results:
  
  "We compared the output of Scallop and two DTC methods (the whole transcriptome-based Euclidean distance to average cell type expression and the invariant gene-based Euclidean distance to average tissue expression) on four artificially generated datasets containing various levels of transcriptional noise. The analysis showed that Scallop, unlike DTC methods, was able to discern between the core transcriptionally stable cells within each cell type cluster from the more noisy cells that lie in between clusters (see Figure 1 - Supplement 1). We then compared one of the DTC methods to Scallop regarding their ability to detect noisy cells within each of the cell types, by plotting the top 10% noisiest and top 10% most stable cells (see Figure 1 - Supplement 2A). Analyzing the distribution of noise values for each cell type separately revealed that Scallop can distinguish between clusters that mainly consist of transcriptionally stable cells from noisier clusters that do not have such a distinct transcriptional signature (Figure 1 - Supplement 2B)."
  
  Reviewer #3 (Public Review):
  
  In this manuscript, Ibáñez-Solé et al aim to clarify the answer to a very basic and important question that has gained a lot of attention in the past ∼5 years due to the fast-increasing pace of research in the aging field and development/optimization of single-cell gene expression quantification techniques: how does noise in gene expression change during the course of cellular/tissue aging? As the authors clearly describe, there have been multiple datasets available in the literature but one could not say the same for the number of available analysis pipelines, especially a pipeline that quantifies membership of single cells to their assigned cell type cluster. To address these needs, Ibáñez-Solé et al developed: 1. a toolkit (named Decibel) to implement the common methods for the quantification of age-related noise in scRNAseq data; and 2. a method (named Scallop) for obtaining membership information for single cells regarding their assigned cell type cluster. Their analyses showed that previously-published aging datasets had large variability between tissues and datasets, and importantly the authors’ results show that noise-increase in aging could not be claimed as a universal phenotype (as previously suggested by various studies).
  
  We thank the reviewer for their positive assessment of the manuscript and their suggestions.
  
  Comments:
  
  1) In two relevant papers (doi.org/10.1038/s41467-017-00752-9 and doi.org/10.1016/j.isci.2018.08.011), previous work had already shown what haploid/diploid genetic backgrounds could show in terms of intercellular/intracellular noise. Due to the direct nature of age/noise quantification in these papers, one cannot blame any computational pipeline-related issues for the “unconventional” results. The authors should cite and sufficiently discuss the noise-related results of these papers in their Discussion section. These two papers collectively show how the specific gene, its protein half-life and ploidy can lead to similar/different noise outcomes.
  
  We agree that we have failed to mention and sufficiently discuss the effects of measuring transcriptional noise from data generated via destructive experimentation, where no longitudinal analyses are possible. As mentioned in the responses to the other reviewers, the body of literature on transcriptional noise is quite wide and based on heterogeneous assumptions. We have focused our efforts on measuring actual noise in scRNAseq aging datasets, which by definition imply sampling of different cells and thus make assumptions at the population level. We believe our results provide a different and interesting perspective into transcriptional noise and aging, but we agree with this reviewer on the need to discuss our findings in the context of other attempts to measure transcriptional noise in a more direct way. We have now included a brief discussion of the work by Sarnoski et al. and Liu et al. This point is explained in more detail later in the letter.
  
  2) While the authors correctly put a lot of emphasis on studying the same cell type or tissue for a faithful interpretation of noise-related results, they ignore another important factor: tracking the same cell over time instead of calculating noise from single-cell populations at supposedly-different age points. Obviously, scRNAseq cannot analyze the same cell twice, but inability to assess noise-in-aging in the same cell over time is still an important concern. Noise could/does affect the generation durations and therefore neighboring cells in the same cluster may not have experienced the same amount of mitotic aging, for example. Also, perhaps a cell has already entered senescence at early age in the same tissue. This caveat should be properly discussed.
  
  The distinction between intrinsic and extrinsic noise and the impossibility to discern between the two in destructive experiments is a relevant point that we have now included in the Discussion (the newly added text is shown in italics):
  
  "Transcriptional noise could be related to genomic instability [18], epigenetic deregulation [19, 20] or loss of proteostasis [21], all established hallmarks of aging. Some authors consider transcriptional noise to be a hallmark of aging in and of itself [22]. In any case, the origin of transcriptional noise is unclear, as it could arise from many different sources. Most importantly, it is not possible to distinguish between intrinsic and extrinsic noise from a snapshot of cellular states, i.e., one cannot tell whether the observed differences between cells in a single-cell RNA experiment reflect time-dependent variations in gene expression or differences between cells across a population [23]. Interestingly, recent work by Liu et al. measuring intrinsic noise in S. cerevisiae showed that aging is associated with a steady decrease in noise, with a sudden increase in soon-to-die cells. Another longitudinal study found an increase in extrinsic noise and a lack of change in intrinsic noise in diploid yeast [16]."
  
  Regarding the caveat of cells of individuals in the Young groups showing signs of aging, we can only agree that this is correct: some sampled cells will already show signs of cellular damage in the absence of chronological aging. However, this applies to every study of aging that samples cells in a destructive manner, and it is generally assumed by the field that this is a discrete phenomenon that does not affect the overall results in a meaningful way.
  
  3) Another weakness of this study is that the authors did not show the source/cause of decreasing/stable/increasing noise during aging. Understanding the source of loss of cell type identity is also important but this manuscript was about noise in aging, so it would have been nice if there could be some attempts to explain why noise is having this/that trend in differentially aged cell types in specific tissues.
  
  The reviewer raises here a very important point that we would like to discuss in detail. The papers that we have re-analyzed generally assume that an increase in transcriptional noise and a loss in cell type identity are equivalent terms. However, as this reviewer points out, you could theoretically have cells that lose their cell type identity without a concomitant increase in transcriptional noise, for instance by a sharp decrease in a limited number of marker genes that collectively define that cell within a given cell type/cluster. Thus, transcriptional noise can certainly arise from different sources and several mechanisms have been proposed to explain its presence in the context of cellular aging. We agree with the reviewer that discussing how transcriptional noise could be related to aging is of interest to the readers. However, as pointed out in the responses to similar concerns by the other reviewers, our main finding is that we don’t detect meaningful and reliable increases in transcriptional noise associated with cell aging. Instead, what we see is a number of different technical and biological issues/phenomena that have been interpreted as transcriptional noise. We hope this reviewer will agree that the manuscript now presents a full and robust story and that finding the causes of up/down “noise” trends in the different datasets may be more appropriately tackled by follow-up studies.
  
  4) In the discussion section, the authors say that “Most importantly, Scallop measures transcriptional noise by membership to cell type-specific clusters which is a re-definition of the original formulation of noise by Raser and O’Shea.” It is not clear what the authors refer to by “the original formulation of noise by Raser and O’Shea”. Intrinsic/extrinsic noise formulations? Please be more specific.
  
  We thank the reviewer for pointing this out; we agree that the sentence needed to be reformulated for clarity. By the definition of Raser and O’Shea we meant “the measured level of variation in gene expression among cells supposed to be identical”, which does not make any distinction between intrinsic and extrinsic noise. Since their definition predates the development of single-cell technologies, our aim was to bring this classic concept into the context of single-cell RNAseq. Nowadays, cell clusters produced by a community detection algorithm are given cell type annotations depending on their expression of known cell type markers. What Scallop aims to measure is the extent of membership each individual cell has in its cluster, as evidence of its transcriptional stability. To make this point clearer, we have now rewritten the paragraph as follows:
  
  Most importantly, Scallop measures transcriptional noise by membership to cell type-specific clusters which is a re-definition of the original formulation of noise by Raser and O’Shea: measurable variation among cells that should share the same transcriptome. This is in stark contrast to measurements of noise including other phenomena (as demonstrated in Figure 5) by the distance-to-centroid methods prevalent in the literature.
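
As an illustration only, the contrast between a membership-based score and a distance-to-centroid score can be sketched as follows. This is not Scallop's implementation: the function names, the toy data, and the assumption that cluster labels are already aligned across repeated clustering runs are all hypothetical.

```python
from collections import Counter
import math


def membership_scores(label_runs):
    """Fraction of runs in which each cell receives its most frequent label.

    label_runs: list of clustering runs, each a list with one label per cell.
    Assumption: labels are already aligned across runs.
    """
    n_cells = len(label_runs[0])
    scores = []
    for cell in range(n_cells):
        counts = Counter(run[cell] for run in label_runs)
        top_count = counts.most_common(1)[0][1]
        scores.append(top_count / len(label_runs))
    return scores


def distance_to_centroid(expression, labels):
    """Euclidean distance of each cell to the mean profile of its cluster."""
    groups = {}
    for vec, lab in zip(expression, labels):
        groups.setdefault(lab, []).append(vec)
    centroids = {
        lab: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lab, vecs in groups.items()
    }
    return [math.dist(vec, centroids[lab]) for vec, lab in zip(expression, labels)]


# Toy data: four cells, four hypothetical clustering runs.
label_runs = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["A", "B", "B", "B"],  # cell 1 switches cluster in one run
    ["A", "A", "B", "B"],
]
expression = [[1.0, 0.0], [3.0, 0.0], [0.0, 5.0], [0.0, 5.0]]

membership = membership_scores(label_runs)
distances = distance_to_centroid(expression, label_runs[0])
```

In this toy case the two cells of cluster A sit equally far from their centroid, yet only one of them is unstable across runs, so the two scores need not rank cells the same way.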
  
  References
  
  [1] M. Alex Ascensión, Olga Ibáñez-Solé, Iñaki Inza, Ander Izeta, and Marcos J Araúzo-Bravo. Triku: A feature selection method based on nearest neighbors for single-cell data. GigaScience, 11, 2022. doi: 10.1093/gigascience/giac017.
  
  [2] M. Ximerakis, S. L. Lipnick, B. T. Innes, S. K. Simmons, X. Adiconis, D. Dionne, B. A. Mayweather, L. Nguyen, Z. Niziolek, C. Ozek, V. L. Butty, R. Isserlin, S. M. Buchanan, S. S. Levine, A. Regev, G. D. Bader, J. Z. Levin, and L. L. Rubin. Single-cell transcriptomic profiling of the aging mouse brain. Nat Neurosci, 22(10), 2019. doi: 10.1038/s41593-019-0491-3.
  
  [3] M. Enge, H. E. Arda, M. Mignardi, J. Beausang, R. Bottino, S. K. Kim, and S. R. Quake. Single-cell analysis of human pancreas reveals transcriptional signatures of aging and somatic mutation patterns. Cell, 171(2), 2017. doi: 10.1016/j.cell.2017.09.004.
  
  [4] L. Solé-Boldo, G. Raddatz, S. Schütz, et al. Single-cell transcriptomes of the human skin reveal age-related loss of fibroblast priming. Commun Biol, 3(188), 2020. doi: 10.1038/s42003-020-0922-4.
  
  [5] Jaime L. Schneider, Jared H. Rowe, Carolina Garcia-de Alba, Carla F. Kim, Arlene H. Sharpe, and Marcia C. Haigis. The aging lung: Physiology, disease, and immunity. Cell, 184(8):1990–2019, 2021. doi: 10.1016/j.cell.2021.03.005.
  
  [6] Shuai Ma, Shuhui Sun, Jiaming Li, Yanling Fan, Jing Qu, Liang Sun, Si Wang, Yiyuan Zhang, Shanshan Yang, Zunpeng Liu, et al. Single-cell transcriptomic atlas of primate cardiopulmonary aging. Cell Research, 31(4):415–432, 2020. doi: 10.1038/s41422-020-00412-6.
  
  [7] I. Angelidis, L. M. Simon, I. E. Fernandez, et al. An atlas of the aging lung mapped by single cell transcriptomics and deep tissue proteomics. Nature Communications, 2019. doi: 10.1038/s41467-019-08831-9.
  
  [8] Jonathan M. Raser and Erin K. O’Shea. Noise in gene expression: origins, consequences, and control. Science, 309(5743):2010–2013, 2005. doi: 10.1126/science.1105891.
  
  [9] Michael B. Elowitz, Arnold J. Levine, Eric D. Siggia, and Peter S. Swain. Stochastic gene expression in a single cell. Science, 297:1183–1186, 2002. doi: 10.1126/science.1070919.
  
  [10] Peter S. Swain, Michael B. Elowitz, and Eric D. Siggia. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci U S A., 99:12795–12800, 2002. doi: 10.1073/pnas.162041399.
  
  [11] Alex Cagan, Adrian Baez-Ortega, Natalia Brzozowska, Federico Abascal, Tim H. H. Coorens, Mathijs A. Sanders, Andrew R. J. Lawson, Luke M. R. Harvey, Shriram Bhosle, David Jones, Raul E. Alcantara, Timothy M. Butler, Yvette Hooks, Kirsty Roberts, Elizabeth Anderson, Sharna Lunn, Edmund Flach, Simon Spiro, Inez Januszczak, Ethan Wrigglesworth, Hannah Jenkins, Tilly Dallas, Nic Masters, Matthew W. Perkins, Robert Deaville, Megan Druce, Ruzhica Bogeska, Michael D. Milsom, Björn Neumann, Frank Gorman, Fernando Constantino-Casas, Laura Peachey, Diana Bochynska, Ewan St. John Smith, Moritz Gerstung, Peter J. Campbell, Elizabeth P. Murchison, Michael R. Stratton, and Iñigo Martincorena. Somatic mutation rates scale with lifespan across mammals. Nature, 604:517–524, 2022. doi: 10.1038/s41586-022-04618-z.
  
  [12] Hamit Izgi, Dingding Han, Ulas Isildak, Shuyun Huang, Ece Kocabiyik, Philipp Khaitovich, Mehmet Somel, and Handan Melike Dönertas. Inter-tissue convergence of gene expression during ageing suggests age-related loss of tissue and cellular identity. eLife, 11, 2022. doi: 10.7554/eLife.68048.
  
  [13] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008. doi: 10.1088/1742-5468/2008/10/p10008. URL https://doi.org/10.1088/1742-5468/2008/10/p10008.
  
  [14] V. A. Traag, L. Waltman, and N. J. van Eck. From louvain to leiden: guaranteeing well-connected communities. Scientific Reports, 9, 2019. doi: 10.1038/s41598-019-41695-z.
  
  [15] Peng Qiu. Embracing the dropouts in single-cell rna-seq analysis. Nature Communications, 11(1), 2020. doi: 10.1038/s41467-020-14976-9.
  
  [16] Ethan A. Sarnoski, Ruijie Song, Ege Ertekin, Noelle Koonce, and Murat Acar. Fundamental characteristics of single-cell aging in diploid yeast. iScience, 7:96–109, 2018. doi: 10.1016/j.isci.2018.08.011.
  
  [17] Ping Liu, Ruijie Song, Gregory L. Elison, Weilin Peng, and Murat Acar. Noise reduction as an emergent property of single-cell aging. Nature Communications, 8(1), 2017. doi: 10.1038/s41467-017-00752-9.
  
  [18] Jan Vijg. From dna damage to mutations: All roads lead to aging. Ageing Res Rev., 68(101316), 2021. doi: 10.1016/j.arr.2021.101316.
  
  [19] Yuancheng Lu, Benedikt Brommer, Xiao Tian, Anitha Krishnan, Margarita Meer, Chen Wang, Daniel L. Vera, Qiurui Zeng, Doudou Yu, Michael S. Bonkowski, Jae-Hyun Yang, Songlin Zhou, Emma M. Hoffmann, Margarete M. Karg, Michael B. Schultz, Alice E. Kane, Noah Davidsohn, Ekaterina Korobkina, Karolina Chwalek, Luis A. Rajman, George M. Church, Konrad Hochedlinger, Vadim N. Gladyshev, Steve Horvath, Morgan E. Levine, Meredith S. Gregory-Ksander, Bruce R. Ksander, Zhigang He, and David A. Sinclair. Reprogramming to recover youthful epigenetic information and restore vision. Nature, 588(7836):124–129, 2020. doi: 10.1038/s41586-020-2975-4.
  
  [20] Giorgio Oliviero, Sergey Kovalchuk, Adelina Rogowska-Wrzesinska, Veit Schwämmle, and Ole N. Jensen. Distinct and diverse chromatin proteomes of ageing mouse organs reveal protein signatures that correlate with physiological functions. eLife, 11(e73524), 2022. doi: 10.7554/eLife.73524.
  
  [21] Jingyi Li, Yuxuan Zheng, Pengze Yan, Moshi Song, Si Wang, Liang Sun, Zunpeng Liu, Shuai Ma, Juan Carlos Izpisua Belmonte, Piu Chan, Qi Zhou, Weiqi Zhang, Guang-Hui Liu, Fuchou Tang, and Jing Qu. A single-cell transcriptomic atlas of primate pancreatic islet aging. Natl Sci Rev., 8(2):nwaa127, 2020. doi: 10.1093/nsr/nwaa127.
  
  [22] Alexander R. Mendenhall, George M. Martin, Matt Kaeberlein, and Rozalyn M. Anderson. Cell-to-cell variation in gene expression and the aging process. Geroscience, 43(1):181–196, 2021. doi: 10.1007/s11357-021-00339-9.
  
  [23] Lucy Ham, Marcel Jackson, and Michael PH Stumpf. Pathway dynamics can delineate the sources of transcriptional noise in gene expression. eLife, 10, 2021. doi: 10.7554/elife.69324.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.18.492432v1
www.biorxiv.org

Homotopic contralesional excitation suppresses spontaneous circuit repair and global network reconnections following ischemic stroke

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Bice et al. present new work using an optogenetics-based stimulation to test how this affects stroke recovery in mice. Namely, can they determine if contralateral stimulation of S1 would enhance or hinder recovery after a stroke? The study provides interesting evidence that this stimulation may be harmful, and not helpful. They found that contralesional optogenetic-based excitation suppressed perilesional S1FP remapping, and this caused abnormal patterns of evoked activity in the unaffected limb. They applied a network analysis framework and found that stimulation prevented the restoration of resting-state functional connectivity within the S1FP network, and resulted in limb-use asymmetry in the mice. I think it's an important finding. My suggestions for improvement revolve around quantitative analysis of the behavior, but the experiments are otherwise convincing and important.
  
  Thank you for the positive feedback regarding our work.
  
  Other comments - Data and paper presentation:
  
  1) Figure 1A is misleading; it appears as if optogenetic stimulation is constant (which indeed would be detrimental to the tissue). Also, the atlas map overlaps color-wise with conditions; at a glance it looks like the posterior cortex might be stimulated; consider making greyscale?
  
  We have updated Figure 1A to address these concerns.
  
  Reviewer #2 (Public Review):
  
  These studies test the effect of stimulation of the contralateral somatosensory cortex on recovery, evoked responses, functional interconnectivity and gene expression in a somatosensory cortex stroke. Using transgenic mice with ChR2 in excitatory neurons, these neurons are stimulated in somatosensory cortex from day 1 after stroke to 4 weeks. This stimulation is fairly brief: 3 min/day. Mice then received behavioral analysis, electrical forepaw stimulation and optical intrinsic signal mapping, and resting state MRI. The core finding is that this ChR2 stimulation of excitatory neurons in contralateral somatosensory cortex impairs recovery, evoked activity and interconnectivity of contralateral (to the stimulation, ipsilateral to the stroke) cortex in this localized stroke model. This is a surprising result, and resonates with some clinical findings, and a robust clinical discussion, on the role of the contralateral cortex in recovery. This manuscript addresses several important topics. The issue of brain stimulation and alterations in brain activity that the studies explore are also part of human brain stimulation protocols, and pre-clinical studies. The finding that contralateral stimulation inhibits recovery and functional circuit remapping is an important one. The rsMRI analysis is sophisticated.
  
  Thank you for the supportive comments regarding our manuscript.
  
  Concerns:
  
  1) The gene expression data is to be expected. Stimulation of the brain in almost any context alters the expression of genes.
  
  We agree with the reviewer that stimulation of the brain is expected to broadly alter gene expression. However, in this set of studies, we examined a subset of genes that are of particular interest in neuroplasticity, and compared expression in ipsi-lesional vs. contra-lesional cortex in the presence or absence of contralesional stimulation during the post stroke recovery period. Genes like Arc, for example, have been shown by our group to be necessary for perilesional plasticity and recovery (Kraft, et al., Science Translational Medicine, 2018). The finding that validated plasticity genes are suppressed by contralesional stimulation is consistent with the central finding that contralesional stimulation suppresses the recovery of normal patterns of brain organization and activity. Importantly, there were also genes associated with spontaneous recovery that were unaltered or increased by contra-lesional brain stimulation. While these data do not provide causal associations, they may prove to be useful for developing hypotheses regarding molecular mechanisms involved in spontaneous brain repair for future studies.
  
  In light of the reviewer’s comment, we have altered text throughout to not focus on specific directionality of transcripts. Instead, we indicate that relevant transcript changes are those that are altered in association with spontaneous recovery, and which are altered in the opposite direction with contralesional brain stimulation.
  
  Minor points.
  
  1) Was the behavior and the functional imaging done while the brain was being stimulated?
  
  We have updated the methods (page 17) to clarify that the only experiments during which the photostimulus occurred during neuroimaging are reported in new Figure 6, and to clarify that photostimulation did not occur during the behavioral tests of asymmetry.
  
  2) It would be useful to understand what is being stimulated. The stimulation method is not described. Is an entire cortical width of tissue stimulated, and this is what is feeding back onto the contralateral cortex? Or is this stimulation mostly affecting excitatory (CaMKII+) cells in upper or lower layers? This will be important to be able to compare to the Chen et al study that gave rise to the stimulation approach here. This gets to the issue of the circuitry that is important in recovery, or in inhibiting recovery. One might answer this question by doing the stimulation and staining tissue for immediate early gene activation, to see the circuits with evoked activity. Also, the techniques used in this study could be applied with OIS or rsMRI during stimulation, to determine the circuits that are activated.
  
  We have clarified the stimulation protocol in response to Essential point 2.2. Due to light scattering and appreciable attenuation of 473 nm light in brain tissue, only ~1% of photons penetrate to a depth of 600 microns. Experimentally, this provides superficial-layer specificity to Layer 2/3 Camk2a cells (https://doi.org/10.1016/j.neuron.2011.06.004).
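
To give a feel for what the ~1% figure implies at shallower depths, the arithmetic can be sketched as below. The sketch assumes a single-exponential (Beer-Lambert-like) fall-off with depth, which is a simplification: scattering in real tissue makes the profile depart from a pure exponential, and the effective coefficient here is back-calculated from the two numbers quoted in the response, not measured.

```python
import math

FRACTION_AT_REFERENCE = 0.01  # ~1% of photons, as quoted in the response
REFERENCE_DEPTH_UM = 600.0    # depth in microns, as quoted in the response

# Effective attenuation coefficient implied by the two quoted numbers,
# under the single-exponential assumption (hypothetical simplification).
MU_PER_UM = -math.log(FRACTION_AT_REFERENCE) / REFERENCE_DEPTH_UM


def fraction_remaining(depth_um):
    """Fraction of incident light remaining at a given depth in microns."""
    return math.exp(-MU_PER_UM * depth_um)


for depth in (100, 300, 600):
    print(f"{depth} um: {fraction_remaining(depth):.3f}")
```

Under this assumption the implied coefficient is about 7.7 per mm, and roughly 10% of the light remains at 300 microns, half the reference depth.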
  
  To answer the question of what circuits are affecting recovery, we performed 2 sets of additional experiments – Experiment 1: OISI during photostimulation before and after photothrombosis, and Experiment 2: tissue staining for IEG expression (cFOS). We describe each below:
  
  Experiment 1 New results are included from 16 Camk2a-ChR2 mice (Results, page 10-11; Methods, page 18) and reported as new Figure 6. Similar to the previously reported experiments, all mice were subject to photothrombosis of left S1FP, half of which received interventional optogenetic photostimulation beginning 1 day after photothrombosis (+Stim) while the other half recovered spontaneously (-Stim). To visualize in real time whether contralesional photostimulation differentially affected global cortical activity in these 2 groups, concurrent awake OISI during acute contralesional photostimulation was performed in +Stim and –Stim groups before, 1, and 4 weeks after photothrombosis. At baseline, all mice exhibited focal increases in right S1FP activity during photostimulation that spread to contralateral (left) S1FP and other motor regions approximately 8-10 seconds after stimulus onset. While activity increases within the targeted circuit, subtle inhibition of cortical activity can also be observed in surrounding non-targeted cortices. Thus, activity both increases and decreases in different cortical regions during and after optogenetic stimulation of the right S1FP circuit. Of note, regions that are inhibited by S1FP stimulation show more pronounced decreases in activity in +Stim mice at 1 and 4 weeks compared to baseline and were significantly larger in +Stim mice compared to –Stim mice. We conclude that focal stimulation of contralesional cortex results in significant, widespread inhibitory influences that extend well beyond the targeted circuit.
  
  Experiment 2 For experiment 2, we hypothesized that IEG expression would increase in photostimulated regions, cortical regions functionally connected to targeted areas, and potentially deeper brain regions. For the IEG experiments, healthy ChR2 naïve animals (C57 mice) or CamK2a-ChR2 mice were acclimated to the head-restraint apparatus described in the manuscript used for photostimulation treatment. Once trained, awake mice were subject to the same photostimulus protocol as described in the manuscript applied to forepaw somatosensory cortex in the right hemisphere. After stimulation, mice were sacrificed, perfused, and brains were harvested for tissue slicing and immunostaining for cFOS. Tissue slices containing right and left primary forepaw somatosensory cortex and primary and secondary motor cortices (+0.5mm A/P) or visual cortex (-2.8mm A/P) were examined for cFOS staining and compared across groups.
  
  Below is a summary table of our findings, and representative tissue slices. While c-FOS IHC was successful, results are not consistent with expectations from the mouse strains used. Only 1 ChR2+ mouse exhibited staining patterns consistent with local S1FP photostimulation, while expression in ChR2- mice was more variable and, in some instances, higher in targeted circuits than in ChR2+ mice. It is possible that awake behaving mice already exhibit high activity in sensorimotor cortex at rest, which might obscure changes specific to optogenetic photostimulation. Regardless, because the tissue staining experiments were inconclusive in healthy animals, we did not proceed with further experiments in the stroke groups, and do not report these findings in the manuscript.
  
  3) Also, it is possible that contralateral stimulation is impairing recovery, not through an effect on the contralateral cortex (the site of the stroke), but on descending projections, or theoretically even through evoking activity or subclinical movement of the contralateral limb (ipsilateral to the stroke). By more carefully mapping the distribution of the activity of the stimulated brain region, and what exactly is being stimulated, these issues can be explored.
  
  The reviewer raises an excellent point. We have added to the “Limitations and Future work” section of the Discussion on pages 15-16.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.05.02.442355v1
www.biorxiv.org

Higher order thalamus encodes correct goal-directed action

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Evaluation Summary:
  
  This study, which will be of interest to neuroscientists in the fields of learning and memory, somatosensation, and motor behavior, uses systems neuroscience tools to expand our view of how the postero-medial (POm) nucleus of the thalamus contributes to goal-directed behavior. The reviewers suggested additional optogenetic experiments to clarify the nature and specificity of those roles. They also indicated that certain alternative explanations to the experimental observations could be addressed for a more balanced presentation and interpretation of the results.
  
  We thank the editors and reviewers for their constructive comments. We have now performed additional analysis and revised the text which we believe has improved the manuscript.
  
  Reviewer #1 (Public Review):
  
  1) Fig 1 - Supp 1 suggests that virus expression was always limited to POm. Drawing borders expressing areas from epifluorescence images is probably very dependent on imaging parameters. The Methods indicate that the authors scaled so that no pixels were saturated. This could mean that there was some weak expression of GCaMP6f or ArchT outside of POm. As I understand it, the authors set exposure/gains by the brightest points in the image. The limited extent of the infection in the figures might just reflect its center, which is brightest, rather than its full extent. If there were GCaMP or ArchT in VPL, some results would need to be reinterpreted.
  
  We agree with the reviewer that the determined expression areas are dependent on imaging parameters; however, we are confident that the virus expression used for analysis in this study is confined to the POm. In this study, our analysis of targeting of POm is three-fold. First, we optimized the volume of virus loaded to the minimum necessary to observe POm projections in S1 (a single targeted injection of 60 nl). Second, we analyzed the fluorescence spread using fluorescence microscopy after every experiment. We set exposure to use the full dynamic range of the image as previously described (Gambino et al., 2014). Occasionally, the virus spread to the adjacent VPM nucleus and this was easily recognizable by the characteristic VPM projections within the barrels of the barrel cortex. These animals were excluded from this study and not further analyzed. The VPL nucleus is located further caudally with respect to the VPM and again, we were able to identify whether the virus had spread to this nucleus via posthoc fluorescence microscopy. These animals were excluded from this study and not further analyzed. We note that our stereotaxic injections were not flawless and the virus occasionally spread along the injection pipette track and into high-order visual thalamic nuclei LP and LD, superficial to POm. This is shown in Figure 1. These two nuclei, however, do not target S1 (Kamishina et al., 2009; van Groen and Wyss, 1992) and were therefore not imaged within our study. Third, we analyzed the projection profile in FPS1 to ensure that it corresponds to the projection profile of POm and not VPL. If there was fluorescence in non-targeted areas, the experiments were excluded from analysis.
  
  An additional degree of precision is offered by our imaging and optogenetic strategy. Calcium imaging was performed in layer 1 which is targeted by POm (Meyer et al., 2010), and not VPL which targets layer 4. Therefore, spillover into VPL would not influence our imaging results as we only image axons in layer 1 which is targeted by POm. Furthermore, during the optogenetic experiments, the fiber optic was targeted to the POm (not the VPL), thus providing a secondary POm localization of the photo-inhibited region. This is now discussed in the revised manuscript.
  
  2) Calcium responses are weaker during the naïve state than the expert state (Fig.1D,E), similar to the start of the reversal training (Fig.4G,H). If POm encodes correct actions, why is there any response at all in naïve mice? Is that not also a sign of stimulus encoding? Might there be another correlate of correctness with regard to the task, such as an expert mouse holding their paw more firmly or still on the stimulating rod? This could alter the effective stimulus or involve different motor signals to POm.
  
  We agree with the reviewer that the POm is encoding the stimulus in the naïve state. This is evident in our study, and others, which show that the POm increases activity during sensory input in naïve mice. In the expert state, stimulus encoding may also be performed by a subset of POm axons, however, our findings show that, overall, there is a significant increase in the POm activity which is dependent on the behavioral performance (HIT, MISS), and not on the presentation of the stimulus. This is not due to licking motion as there was similar POm activity during the action and suppression tasks which involved licking and not licking for reward (Figure 3E). Furthermore, all experiments were monitored online via a behavioral camera to examine the location of the forepaw on the stimulus during all trials, and trials where the paw was not clearly resting on the stimulating rod were excluded from analysis. However, we cannot rule out that non-detectable changes in postures/paw grip may occur which may alter the effectiveness of the stimulus. This is now discussed in the revised manuscript.
  
  3) The authors are rightly concerned that licking might contribute to POm activity and expend some good effort checking this. The reversal is a good control, but doesn't produce identical POm activity. The other licking analyses, while good, did not completely rule out licking effects. First, lines 110-111 state "…as there was no correlation between licking frequency and POm axonal activity (Figure 1I)", but Fig.1I doesn't seem to support that statement. Second, the authors analyze isolated spontaneous licks, but these probably involve less licking and less overall motion than during a real response.
  
  We thank the reviewer for acknowledging the effort we made to assess the influence of licking behavior on POm axonal activity. We now include a more direct analysis in the revised manuscript illustrating the relationship between the licking response and POm activity. This analysis shows there is no correlation between licking and POm axonal activity (linear regression, p = 0.9228), further suggesting that POm axonal activity is not simply due to licking behavior.
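
As an illustration of how a lack of association between two per-trial measures can be tested, the sketch below uses a permutation test on the Pearson correlation. This is a swapped-in technique, not the linear regression reported in the response, and the lick rates and activity values are synthetic, hypothetical data.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: how often shuffled data give |r| >= observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)  # break any pairing between x and y
        if abs(pearson_r(x, y_shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Synthetic example: one activity trace unrelated to licking, one coupled to it.
rng = random.Random(1)
lick_rate = [rng.uniform(2.0, 10.0) for _ in range(60)]
activity_unrelated = [rng.gauss(1.0, 0.3) for _ in lick_rate]
activity_coupled = [0.2 * rate + rng.gauss(0.0, 0.3) for rate in lick_rate]

p_unrelated = permutation_p_value(lick_rate, activity_unrelated)
p_coupled = permutation_p_value(lick_rate, activity_coupled)
```

A large p-value for the unrelated trace would be compatible with no association, though, as with any such test, it does not prove that none exists.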
  
  4) Many figures (Fig.1F, 2B, 3C, 4C) make it apparent that a population of axons respond very early to the stimulus itself. I understand the authors' point that many of their analyses show that on average the axons are not strongly modulated by this stimulus, but this is not true of every axon. Either some of these axons are coming from cells outside of POm (see #1) or some POm cells are stimulus driven. In either case, if some axons are strongly stimulus driven, the activity of these axons will correlate with correct choices. The stimulus and correct choices are themselves highly correlated because the animals perform so well. I do not understand how stimulus encoding and choice encoding can be disentangled by either behavior or the two behaviors in comparison. Simple stimulus encoding might be further modulated by arousal or reward expectation that increases with task learning (see #6).
  
  In this study, we are able to disentangle stimulus encoding and choice encoding by comparing the POm axonal activity with the different behavioral performance (HIT or MISS). Here, the same stimulus is always presented (tactile, 200 Hz), however, the mouse response differs. Despite receiving the same tactile stimulus, POm signaling in forepaw S1 is significantly increased during correct HIT trials compared with MISS trials in both the action and suppression task. Therefore, we do not believe POm axonal activity is predominantly encoding sensory information in this task. We agree with the reviewer that individual POm axons are heterogenous and a subset of axons may respond to the sensory stimulus during the behavior. We now state this in the revised manuscript. However, if some axons are strongly stimulus driven, the activity of these axons should correlate with both correct and incorrect choices as the same stimulus is also delivered during MISS trials. We now highlight this in the revised manuscript.
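
The logic of this argument can be illustrated with a toy simulation, sketched below under stated assumptions: responses are modeled as a fixed stimulus term plus an optional extra term on correct trials plus Gaussian noise. The gains, noise level, and trial counts are hypothetical and are not taken from the study.

```python
import random


def simulate_trials(n_trials, stimulus_gain, choice_gain, is_hit, rng):
    """Simulated axonal responses for trials that all share the same stimulus."""
    responses = []
    for _ in range(n_trials):
        response = stimulus_gain      # identical stimulus on every trial
        if is_hit:
            response += choice_gain   # extra signal only on correct trials
        responses.append(response + rng.gauss(0.0, 0.1))
    return responses


def mean(values):
    return sum(values) / len(values)


def hit_minus_miss(stimulus_gain, choice_gain, n_trials=200, seed=0):
    """Difference in mean response between HIT and MISS trials."""
    rng = random.Random(seed)
    hits = simulate_trials(n_trials, stimulus_gain, choice_gain, True, rng)
    misses = simulate_trials(n_trials, stimulus_gain, choice_gain, False, rng)
    return mean(hits) - mean(misses)


# A purely stimulus-driven axon versus one that also carries a choice signal.
stimulus_only = hit_minus_miss(stimulus_gain=1.0, choice_gain=0.0)
choice_modulated = hit_minus_miss(stimulus_gain=1.0, choice_gain=0.5)
```

In this toy model a purely stimulus-driven axon shows a HIT minus MISS difference near zero, whereas a choice-modulated axon shows a clear positive difference, which is the contrast the comparison relies on. It does not address the reviewer's separate concern that the effective stimulus itself might differ between trial types.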
  
  Simple stimulus encoding might be further modulated by arousal or reward expectation that increases with task learning. In our study, the increase in POm activity during HIT behaviour was not due to elevated task engagement as, despite similar levels of arousal (Figure 4B), POm activity in expert mice differed in comparison to chance performance (switch behaviour; Figure 4G, H). This is now discussed in detail in the revised manuscript.
  
  5) I was unable to understand the authors' conclusion about what POm is doing. They use terms like "behavioral flexibility" to describe its purpose, but the connection of this term to POm is not explained. Is a role as a flexibility switch really supported? Why does S1 need POm to signal a correct choice? Fig.6 did not seem helpful here. Couldn't S1 just detect the stimulus on its own and transmit consequent signals to wherever they need to be to generate behavior?
  
  We have now revised the manuscript to clearly define behavioral flexibility and to improve the clarity of our conclusions. We believe that S1 needs POm to signal a correct choice as behavior needs to be dynamically modulated at all times. If S1 simply detected the stimulus on its own and transmitted consequent signals to generate behavior, then important modulatory processes that lead to dynamic changes in behavior would not be processed. Along with other feedback projections, the POm targets the upper layers of the cortex, whereas external sensory information targets the layer 4 input layer. At the level of a single pyramidal neuron, this means POm input lands on the tuft dendrites whereas external sensory information lands on the proximal basal dendrites. This segregation of input provides a cellular mechanism for increasing the computational capabilities of neurons. Since the POm is most active in the expert state during correct behavior, we believe the POm plays a vital role in providing behaviorally relevant information. Our findings illustrate that the POm is not simply conveying a ‘Go’ signal, as POm activity was not increased during correct behavior in chance performance.
  
  6) Arousal or reward expectation may be better explanations than flexibility. Lines 323-324 say that POm activity increased with pupil diameter normally but reversed during reward delivery. Which data support this statement? With regards to pupil, the Results only seem to indicate that there is no difference in diameter between the two conditions (expert and 50% chance) using 3 bins of data. However, I could not find the time windows used for computing these. Pupil is known to be lagged and the timing could be critical.
  
  The statement that ‘POm activity increased with pupil diameter normally but reversed during reward delivery’ stems from data illustrated in Figure 1I and 3B. For space and flow of the manuscript, we weren’t able to show them on the same graph as per below. Here, you can see that during reward (blue), POm activity decreased compared to response (green) whereas the pupil diameter was maximum during reward delivery. We now include more information in the methods regarding pupil tracking (see line 908 to 916, Data analysis and statistical methods; Pupil tracking).
  
  7) There are other possible interpretations of the results when the authors target POm for optogenetic suppression (around lines 246-248). The effects here are also consistent with preventing tonic and evoked POm activity from reaching lots of target structures other than S1: S2, PPC, motor cortex, dorsolateral striatum, etc. Maybe one of these cannot respond to the stimulus as well and Hits decrease?
  
  We now include a discussion in the revised manuscript that ‘since the POm targets many cortical and subcortical regions (Alloway et al., 2017; Oh et al., 2014; Trageser and Keller, 2004; Yamawaki and Shepherd, 2015), target-specific photo-inhibition is required to illustrate which POm projection pathway specifically influences goal-directed behavior.’
  
  8) Line 689. What alerts the mouse that a catch trial is happening? Is there something like an audio cue for onset of stimulus trials and catch trials? If there is no cue, wouldn't mice be in a different behavioral state during catch trials than during stimulus trials? The trial types could differ by more than the presence of the stimulus.
  
  There is broadband noise during the trial that acts as a cue. This is detailed in the methods and text.
  
  Reviewer #2 (Public Review):
  
  In this manuscript, D LaTerra et al. explored the function of POm neurons during a tactile-based, goal-directed reward behavior. They target POm neurons that project to forepaw S1 and use two-photon Ca2+ imaging in S1 to monitor activity as mice performed a task where forepaw tactile stimulation (200 Hz, 500 ms) predicted a reward if mice licked at a reward port within 1.5 seconds. If mice did not lick, there was a time-out instead of a reward. The authors found that POm-S1 axons showed enhanced responses during the baseline period, the response window after the cue, and during reward delivery. They then showed that a subset of neurons were active during the response window during correct trials when the tactile stimulus served as a cue, but not on catch trials where animals spontaneously licked for a reward.
  
  They then showed that POm axonal activity in S1 increased during the response window for "HIT" trials where animals correctly responded to the tactile stimulus with licking but the activity was less during "MISS" trials where animals did not respond. In order to probe whether this activity in the response window was being driven by motor activity, they designed a suppression task in which animals had to learn to suppress licking in response to the tactile stimulus in order to receive a reward. POm neurons also showed increased activity during the response window even though action was being suppressed. However, this activity was less than during the action task. Thus, although POm activity is not encoding action, its activity is significantly different during an action-based task than an action suppression one. They then analyzed calcium activity during the training period between the action task and the suppression task in which animals were learning the new contingency and were not performing as experts. In this non-expert context there was no difference in POm axonal activity between "HIT" and "MISS" trials.
  
  Lastly, they used ArchT to inhibit POm cell body activity during the tactile stimulus and response window of some trials and showed that they reduced performance during the trials when light was on.
  
  Altogether, this paper provides evidence that POm neurons are not simply encoding sensory information. They are modulated by learning and their activity is correlated to performance in this goal-directed task. However, the actual role of the POm input to S1 is not discernable from the current experiments. Subsets of neurons show significant activity during the response window as well as reward. In addition, the role of this input is different during the switch task than during expert performance. There are a number of outstanding questions, which, if answered, would help to directly define the role of these neurons in this specific paradigm. For instance, the authors record specifically from POm axons in S1. How distinct is this activity from other neurons in the POm? Some POm neurons still show significant activity during MISS trials. Do these neurons have a different function than those that show a preferential response during HIT trials? Does POm activity during the switch task, which has a component of extinction training, differ from when the animals are first learning the action-based task? Likewise, are the same neurons that acquire a response during the initial learning of the action-based task, the same neurons that are responding during the action suppression task?
  
  The authors provide great evidence that POm neurons that project to the S1 do not simply encode sensory information or actions, but are instead signaling during correct performance. However, inhibition of cell bodies did not dramatically affect performance and it is still unclear what role this circuit actually plays in this behavior. Finer-tuned optogenetic experiments and analysis of cell bodies within POm may provide greater details that will help define this circuit's role.
  
  We thank the reviewer for their comments. We have now revised the manuscript to clearly state the role of the POm during the goal-directed behavioral tasks used in this study. We have provided more information regarding the range of activity patterns in POm axons within S1.
  
  The POm contains a heterogenous population of neurons and since it projects to multiple cortical and subcortical regions, the activity of POm axonal projections in S1 may indeed be different to other projection targets.
  
  The activity of POm axons during MISS behavior may have a different function than those that show a preferential response during HIT trials, however, this evoked rate is not significantly different to baseline and therefore is hard to differentiate from spontaneous activity (see Figure 2). Furthermore, the evoked rate of POm activity during the switch task is not significantly different compared to naïve mice (p = 0.159; Kruskal-Wallis test). This information is now included in the manuscript.
  
  It is unknown whether the same neurons that acquire a response during the initial learning of the action-based task are the same neurons that are responding during the action suppression task as we were unable to conclusively determine whether or not the same POm axons were imaged in the different protocols.
  
  Reviewer #3 (Public Review):
  
  In their paper "Higher order thalamus flexibly encodes correct goal-directed behavior", LaTerra et al. investigate the function of projections from the thalamic nucleus POm to primary somatosensory cortex (S1) in the performance of goal-directed behaviors. The authors performed in vivo calcium imaging of POm axons in layer 1 of the forepaw region of S1 (fpS1) to monitor the activity of POm-fpS1 projections while mice performed a tactile detection task. They report that the activity of POm-fpS1 axons on successful ('hit') trials was increased in trained mice relative to naïve mice. Additionally, the authors used an action suppression variant of the task to show that POm-fpS1 axon activity was higher on successful trials over unsuccessful ('miss') trials regardless of the correct motor response required. During transition between task conditions, when mice perform at chance levels, the increase of POm-fpS1 activity during correct trials is no longer seen. Finally, the authors use inhibitory optogenetic tools to suppress POm activity, revealing a modest suppression in behavioral success. The authors conclude from these data that POm-fpS1 axons preferentially "encode and influence correct action selection" during tactile goal-oriented behavior.
  
  This study presents several interesting findings, particularly with respect to the change in activity of POm-fpS1 axons during successful execution of a trained behavior. Additionally, the similarity in responses of POm-fpS1 on both the 'goal-directed action' and 'action suppression' tasks provides convincing evidence that POm-fpS1 activity is not likely to encode the motor response. Overall, these results have important implications for how activity in higher order thalamic nuclei corresponds to learning a sensorimotor behavior, and the authors use several clever experiments to address these questions. Yet, the major claim that POm encodes 'correct performance' should be defined more clearly. As is, there are alternative explanations that could be raised and should be discussed in more depth (Point 1), especially as it relates to any causal role the authors ascribe to POm (Point 2). In addition, some clarification as to which types of signals (i.e. frequency of active axons vs. amplitude of signal in the active axons) the authors feel are most informative would be helpful (Point 3).
  
  We thank the reviewer for their helpful comments and assessment of our study. We have now addressed all comments and revised the manuscript accordingly.
  
  1) The authors argue that POm activity reflects 'correct task performance' and that the increased activity of POm-fpS1 axons in the response epoch is not due to sensory encoding. An alternative explanation is that POm-fpS1 axons do convey sensory information, and these connections are facilitated with learning - meaning the activity of pathways conveying sensory signals that are correlated with task success could be facilitated with training, and this facilitation could be disrupted during the switching task. In this sense, the activity profiles do not encode 'correct action' per se, but rather represent the sensory responses whose correlation to rewarded action have been reinforced with training (which would also be a very interesting finding). This would be quite distinct from the "cognitive functions" they ascribe to this pathway (line 341). It might have helped to introduce a delay period in between the sensory stimulus and response epoch to try to distinguish responses that encode information about the sensory stimulus from those that might be involved in encoding task performance. However, as is, it is difficult to distinguish between these two scenarios with this data, and thus the interpretations the authors present could be rephrased with alternatives discussed in more depth.
  
  Based on multiple findings within this study, we suggest that the POm does not predominantly encode sensory information. This is most evident when comparing POm activity during correct (HIT) and incorrect (MISS) behavior in both the action and suppression tasks. As shown in Figures 2 and 3, there is a considerable difference in activity during correct (HIT) and incorrect (MISS) trials, even though the same stimulus was delivered in both trial types. This finding suggests that POm axons do not convey sensory information which is facilitated with learning as, if this were true, it could be expected that HIT and MISS responses would be similarly increased in expert (HIT and MISS) compared to naïve mice. This is now discussed in detail in the revised manuscript.
  
  We agree that it would have been beneficial to separate the stimulus from the response period in the behavioral paradigm. However, to avoid engaging working memory, we did not wish to enforce a delay period in this study. We have, in another study, enforced a short delay period (500 ms) between the stimulus and response epoch. Here, the evoked rate of POm axonal activity in expert mice was three-fold greater in the (now clearly separated) response epoch compared to the stimulus epoch (0.30 ± 0.02 vs. 0.099 ± 0.01, n = 196 dendrites; p < 0.0001; Wilcoxon matched-pairs signed rank test). Although out of the scope of this study, these unpublished results provide further confirmation of, and confidence in, the analysis performed and conclusions made in this study.
  
  2) Similarly, while the authors attempt to establish a causal role for POm in task performance by optogenetically inhibiting POm during the response epoch, the results are also consistent with a deficit in sensory processing, and cannot be interpreted strictly as a disruption of the encoding of 'correct action' task performance signals. Furthermore, these perturbation studies do not demonstrate that the POm-fpS1 projections they are studying are implicated in the modest behavioral deficits. As the authors state, POm projects to many targets (lines 63-66), and similar sensory-based, goal-directed behaviors do not require S1 (lines 302-305). In light of these points, some of the statements ascribing a causal role for these projections in task success could be rephrased (e.g. line 33 "to encode and influence correct action selection", line 252 "a direct influence", line 340 "plays an active role during correct performance").
  
  We agree that the decrease in correct performance during optogenetic inhibition of POm cell bodies may also be explained by a deficit in sensory processing. However, in this study, we went to great lengths to illustrate that the POm is encoding correct action, and not sensory information (detailed in response to 1). This is further expanded upon in the revised manuscript. We also agree that the perturbation studies do not directly demonstrate that the POm to S1 projections are driving the behavioral deficits. We therefore only refer to the POm itself when discussing the influence on behavior and we have now revised the manuscript accordingly.
  
  3) Event amplitude and probability were both quantified, but were not consistently reported throughout the manuscript and figures. For example, Figure 1 reports both probability and amplitude (Figure 1G and H), whereas Figure 2 only reports probability. Thus, it was not always clear as to whether the authors were ascribing biological significance to one or both of these measures, given that in some cases differences were found in one and not the other, and which of the measures were reported was occasionally switched. It would be helpful for the authors to clarify the significance they assign to each measure, and report both measures side by side for all experiments if they interpret them both as relevant.
  
  We thank the reviewer for this observation and have now included a statement discussing the reporting of Ca2+ transient probability and/or amplitude in the methods. Throughout the Figures, we typically illustrated probability of an evoked transient as this is a reliable measure which was dramatically altered within this study. We now report the Ca2+ transient peak amplitudes during HIT and MISS trials for direct comparison of both measures (Figure 2).
  
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.07.05.188821v1
www.medrxiv.org www.medrxiv.org

New submission 24/02/2023, 15:20:34

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  It is now widely accepted that the age of the brain can differ from the person's chronological age and neuroimaging methods are ideally suited to analyze the brain age and associated biomarkers. Preclinical studies of rodent models with appropriate neuroimaging do attest that lifestyle-related prevention approaches may help to slow down brain aging and the potential of BrainAGE as a predictor of age-related health outcomes. However, there is a paucity of data on this in humans. It is in this context the present manuscript receives its due attention.
  
  Comments:
  
  1) Lifestyle intervention benefits need to be analyzed using robust biomarkers which should be profiled non-invasively in a clinical setting. There is increasing evidence of the role of telomere length in brain aging. Gampawar et al (2020) have proposed a hypothesis on the effect of telomeres on brain structure and function over the life span and named it as the "Telomere Brain Axis". In this context, if the authors could measure telomere length before and after lifestyle intervention, this will give a strong biomarker utility and value addition for the lifestyle modification benefits.
  
  2) Authors should also consider measuring BDNF levels before and after lifestyle intervention.
  
  Response to comments 1+2: we agree that associating both telomere length and BDNF level with brain age would be interesting and relevant. However, we did not measure these two variables. We would certainly consider adding these in future work. Regarding telomere length, we now include a short discussion of brain age in relation to other bodily ages, such as telomere length (Discussion section):
  
  “Studying changes in functional brain aging is part of a broader field that examines changes in various biological ages, such as telomere length1, DNA methylation2, and arterial stiffness3. Evaluating changes in these bodily systems over time allows us to capture health and lifestyle-related factors that affect overall aging and may guide the development of targeted interventions to reduce age-related decline. For example, in the CENTRAL cohort, we recently reported that reducing body weight and intrahepatic fat following a lifestyle intervention was related to methylation age attenuation4. In the current work, we used RSFC for brain age estimation, which resulted in a MAE of ~8 years, which was larger than the intervention period. Nevertheless, we found that brain age attenuation was associated with changes in multiple health factors. The precision of an age prediction model based on RSFC is typically lower than a model based on structural brain imaging5. However, a higher model precision may result in a lower sensitivity to detect clinical effects6,7. Better tools for data harmonization among datasets6 and larger training sample size5 may improve the accuracy of such models in the future. We also suggest that examining the dynamics of multiple bodily ages and their interactions would enhance our understanding of the complex aging process8,9.”
  
  And
  
  “These findings complement the growing interest in bodily aging indicated, for example, by DNA methylation4 as health biomarkers and interventions that may affect them.”
  
  Reviewer #2 (Public Review):
  
  In this study, Levakov et al. investigated brain age based on resting-state functional connectivity (RSFC) in a group of obese participants following an 18-month lifestyle intervention. The study benefits from various sophisticated measurements of overall health, including body MRI and blood biomarkers. Although the data is leveraged from a solid randomized control set-up, the lack of control groups in the current study means that the results cannot be attributed to the lifestyle intervention with certainty. However, the study does show a relationship between general weight loss and RSFC-based brain age estimations over the course of the intervention. While this may represent an important contribution to the literature, the RSFC-based brain age prediction shows low model performance, making it difficult to interpret the validity of the derived estimates and the scale of change. The study would benefit from more rigorous analyses and a more critical discussion of findings. If incorporated, the study contributes to the growing field of literature indicating that weight-reduction in obese subjects may attenuate the detrimental effect of obesity on the brain.
  
  The following points may be addressed to improve the study:
  
  Brain age / model performance:
  
  1) Figure 2: In the test set, the correlation between true and predicted age is 0.244. The fitted slope looks like it would be approximately 0.11 (55-50)/(80-35); change in y divided by change in x. This means that for a chronological age change of 12 months, the brain age changes by 0.11*12 = 1.3 months. I.e., due to the relatively poor model performance, an 80-year-old participant in the plot (fig 2) has a predicted age of ~55. Hence, although the age prediction step can generate a summary score for all the RSFC data, it can be difficult to interpret the meaning of these brain age estimates and the 'expected change' since the scale is in years.
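  The slope arithmetic in point 1 can be reproduced with a short sketch. The endpoint values below are the reviewer's approximate readings from Figure 2, not the authors' fitted coefficients, and the helper name `expected_brain_age_change` is hypothetical.

```python
# Approximate values read off Figure 2 by the reviewer (assumed, not fitted).
age_lo, age_hi = 35.0, 80.0      # chronological age range on the x-axis (years)
pred_lo, pred_hi = 50.0, 55.0    # predicted brain age at each end of that range (years)

# Change in y divided by change in x.
slope = (pred_hi - pred_lo) / (age_hi - age_lo)


def expected_brain_age_change(months_elapsed: float, slope: float) -> float:
    """Expected change in predicted brain age (in months) over the elapsed time."""
    return slope * months_elapsed


print(f"slope = {slope:.3f}")
print(f"12 months elapsed -> {expected_brain_age_change(12, slope):.2f} months of brain age")
print(f"18 months elapsed -> {expected_brain_age_change(18, slope):.2f} months of brain age")
```

  With a slope this shallow, the change expected from the passage of time alone is small compared with the scale on which the estimates are reported.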
  
  2) In Figure 2 it could also help to add the x = y line to get a better overview of the prediction variance. The estimates are likely clustered around the mean/median age of the training dataset, and age is overestimated in younger subjects and underestimated in older subjects (usually referred to as "age bias"). It is important to inspect the data points here to understand what the estimates represent, i.e., is variation in RSFC potentially lost by wrapping the data in this summary measure, since the age prediction is not particularly accurate, and should age bias in the predictions be accounted for by adjusting the test data for the bias observed in the training data?
  
  Response to comment 1+2: we agree with the reviewer that due to the relatively moderate correlation between the predicted and observed age, a large change in the observed age corresponds to a small change in the predicted age. We now state this limitation in Results section 2.1:
  
  “Despite being significant and reproducible, we note that the correlations between the observed and predicted age were relatively moderate.”
  
  And discuss this point in the Discussion section:
  
  “In the current work, we used RSFC for brain age estimation, which resulted in a MAE of ~8 years, which was larger than the intervention period. Nevertheless, we found that brain age attenuation was associated with changes in multiple health factors. The precision of an age prediction model based on RSFC is typically lower than a model based on structural brain imaging5. However, a higher model precision may result in a lower sensitivity to detect clinical effects6,7. Better tools for data harmonization among dataset6 and larger training sample size5 may improve the accuracy of such models in the future.”
  
  Moreover, we now add the x=y line to Fig. 2, so the readers can better assess the prediction variance as suggested by the reviewer:
  
  We prefer not to use different scales (year/month) on the x and y axes, to avoid misleading readers, but the lists of observed and predicted ages are available as SI files with a precision of two decimal places (~3 days).
  
  We note that despite the moderate prediction accuracy, we replicated these results in three separate cohorts.
  
  Regarding the effect of “age bias” (also known as “regression attenuation” or “regression dilution” 10), we are aware of this phenomenon and agree that it must be accounted for. In fact, the “age bias” is one of the reasons we chose to use the difference between the expected and observed ages as the primary outcome of the study, as this measure already takes this bias into account. To demonstrate this effect we now compute brain age attenuation in two ways: 1. As described and used in the current study (Methods 4.9); and 2. By regressing out the effect of age on the predicted brain age at both times separately, then subtracting the adjusted predicted age at T18 from the adjusted predicted age at T0. The second method is the standard method to account for age bias as described in a previous work 11. Below is a scatter plot of both measures across all participants:
  
  The x-axis represents the first method, used in the current study, and the y-axis represents the second method, described in Smith et al., (2019). Across all subjects, we found a nearly perfect 1:1 correspondence between the two methods (r=.998, p<0.001; MAE=0.45), as the two are mathematically identical. The small gap between the two is because the brain age attenuation model also takes into account the difference in the exact time that passed between the two scans for each participant (mean=21.36m, std = 1.68m).
  
  We now note this in Methods section 4.9:
  
  “We note that the result of computing the difference between the bias-corrected brain age gap at both times was nearly identical to the brain age attenuation measure (r=.99, p<0.001; MAE=0.45). The difference between the two is because the brain age attenuation model takes into account the difference in the exact time that passed between the two scans for each participant (mean=21.36m, std = 1.68m).”
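  The bias correction described in this response (regressing chronological age out of the predicted age at each time point, then subtracting the adjusted T18 value from the adjusted T0 value) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: all numbers are invented, the function name `bias_corrected_gap` is hypothetical, and the study's own attenuation measure (Methods 4.9), which also accounts for the exact scan interval, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data only: every value here is invented for illustration.
n = 100
age_t0 = rng.uniform(35, 80, n)        # chronological age at baseline (years)
age_t18 = age_t0 + 1.5                 # chronological age 18 months later
# A weak predictor compresses predictions toward the sample mean (slope << 1).
pred_t0 = 52 + 0.11 * (age_t0 - age_t0.mean()) + rng.normal(0, 4, n)
pred_t18 = 52 + 0.11 * (age_t18 - age_t18.mean()) + rng.normal(0, 4, n)


def bias_corrected_gap(pred, age):
    """Residual of predicted age after regressing out chronological age."""
    slope, intercept = np.polyfit(age, pred, 1)
    return pred - (slope * age + intercept)


raw_gap_t0 = pred_t0 - age_t0                    # uncorrected brain age gap
adj_t0 = bias_corrected_gap(pred_t0, age_t0)     # bias-corrected gap at T0
adj_t18 = bias_corrected_gap(pred_t18, age_t18)  # bias-corrected gap at T18

# Adjusted T18 value subtracted from the adjusted T0 value, as in the response.
attenuation = adj_t0 - adj_t18

corr_raw = np.corrcoef(raw_gap_t0, age_t0)[0, 1]
corr_adj = np.corrcoef(adj_t0, age_t0)[0, 1]
print(f"raw gap vs age:      r = {corr_raw:.2f}")
print(f"adjusted gap vs age: r = {corr_adj:.2f}")
```

  In this sketch the uncorrected gap is strongly negatively correlated with age (the "age bias"), whereas the regression residual is uncorrelated with age by construction.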
  
  3) In Figure 3, some of the changes observed between time points are very large. For example, one subject with a chronological age of 62 shows a ten-year increase in brain age over 18 months. This change is twice as large as the full range of age variation in the brain age estimates (average brain age increases from 50 to 55 across the full chronological age span). This makes it difficult to interpret RSFC change in units of brain age. E.g., is it reasonable that a person's brain ages by ten years, either up or down, in 18 months? The colour scale goes from -12 years to 14 years, so some of the observed changes are 14 / 1.5 = 9 times larger than the actual time from baseline to follow-up.
  
  The questions above should be investigated and addressed in the context of potential challenges with using brain age as a marker (see e.g., https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.25837, https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.26144).
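  The ratios quoted in point 3 follow from simple arithmetic, sketched below. The values are taken from the reviewer's description of Figure 3 and of the predicted-age range; the variable names are illustrative only.

```python
follow_up_years = 18 / 12            # time from baseline to follow-up (years)
colour_scale = (-12.0, 14.0)         # range of brain age change shown in Figure 3 (years)
predicted_span = 55.0 - 50.0         # spread of average predicted age across the cohort (years)
single_subject_change = 10.0         # change reported for the 62-year-old participant (years)

max_change = max(abs(v) for v in colour_scale)
ratio_to_elapsed = max_change / follow_up_years
ratio_to_span = single_subject_change / predicted_span

print(f"largest change is {ratio_to_elapsed:.1f}x the elapsed time")
print(f"ten-year change is {ratio_to_span:.0f}x the predicted-age span")
```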
  
  We agree that our model precision was relatively low, especially compared to the period of the intervention, as also stated by reviewer #1. We now discuss this issue in light of the studies pointed out by the reviewer (Discussion section):
  
  “In the current work, we used RSFC for brain age estimation, which resulted in a MAE of ~8 years, which was larger than the intervention period. Nevertheless, we found that brain age attenuation was associated with changes in multiple health factors. The precision of an age prediction model based on RSFC is typically lower than a model based on structural brain imaging5. However, a higher model precision may result in a lower sensitivity to detect clinical effects6,7. Better tools for data harmonization among datasets6 and larger training sample size5 may improve the accuracy of such models in the future.”
  
  Again, we note that despite the moderate prediction accuracy, we replicated these results in three separate cohorts and found that both the correlation and the MAE between the predicted and observed age were significant in all of them.
  
  RSFC for age prediction:
  
  1) Several studies show better age prediction accuracy with structural MRI features compared to RSFC. If the focus of the study is to use an accurate estimate of brain ageing rather than specifically looking at changes in RSFC, adding structural MRI data could be helpful.
  
  We focused on brain structural changes in a previous work, and the focus of the current work was assessing age-related functional connectivity alterations. We have now added a few sentences in the Introduction section that we hope better motivate our choice:
  
  “We previously found that weight loss, glycemic control, lowering of blood pressure, and increment in polyphenols-rich food were associated with an attenuation in brain atrophy 12. Obesity is also manifested in age-related changes in the brain’s functional organization as assessed with resting-state functional connectivity (RSFC). These changes are dynamic13 and can be observed in short time scales14 and thus of relevance when studying lifestyle intervention.”
  
  2) If changes in RSFC are the main focus, using brain age adds a complicated layer that is not necessarily helpful. It could be easier to simply assess RSFC change from baseline to follow up, and correlate potential changes with changes in e.g., BMI.
  
  We are specifically interested in age-related changes as we described a-priori in the registration of the study: https://clinicaltrials.gov/ct2/show/NCT03020186
  
  Moreover, age-related changes in RSFC are complex, multivariate and dependent upon the choice of theoretical network measures. We think that a data-driven brain age prediction approach might better capture these multifaceted changes and their relation to aging. We now state this in the Introduction section:
  
  “Studies have linked obesity with decreased connectivity within the default mode network15,16 and increased connectivity with the lateral orbitofrontal cortex17, which are also seen in normal aging18,19. Longitudinal trials have reported changes in these connectivity patterns following weight reduction20,21, indicating that they can be altered. However, findings regarding functional changes are less consistent than those related to anatomical changes due to the multiple measures22 and scales23 used to quantify RSFC. Hence, focusing on a single measure, the functional brain age, may better capture these complex, multivariate changes and their relation to aging.”
  
  The lack of control groups
  
  1) If no control group data is available, it is important to clarify this in the manuscript, and evaluate which conclusions can and cannot be drawn based on the data and study design.
  
  We agree that this point should be made more clear, and we now state this in the limitation section of the Discussion:
  
  “We also note that the lack of a no-intervention control group limits our ability to directly relate our findings to the intervention. Hence, we can only relate brain age attenuation to the observed changes in health biomarkers.”
  
  Also, following the comments of reviewers #2 and #3, we refer to the weight loss following 18 months of lifestyle intervention instead of to the intervention itself. This is now made clear in the title, abstract, and the main text.
  
  Reviewer #3 (Public Review):
  
  The authors report on an interesting study that addresses the effects of a physical and dietary intervention on accelerated/decelerated brain ageing in obese individuals. More specifically, the authors examined potential associations between reductions in Body-Mass-Index (BMI) and a decrease in relative brain-predicted age after an 18-months period in N = 102 individuals. Brain age models were based on resting-state functional connectivity data. In addition to change in BMI, the authors also tested for associations between change in relative brain age and change in waist circumference, six liver markers, three glycemic markers, four lipid markers, and four MRI fat deposition measures. Moreover, change in self-reported consumption of food, stratified by categories such as 'processed food' and 'sweets and beverages', was tested for an association with change in relative brain age. Their analysis revealed no evidence for a general reduction in relative brain age in the tested sample. However, changes in BMI, as well as changes in several liver, glycemic, lipid, and fat-deposition markers showed significant covariation with changes in relative brain age. Three markers remained significant after additionally controlling for BMI, indicating an incremental contribution of these markers to change in relative brain age. Further associations were found for variables of subjective food consumption. The authors conclude that lifestyle interventions may have beneficial effects on brain aging.
  
  Overall, the writing is concise and straightforward, and the language and style are appropriate. A strength of the study is the longitudinal design that allows for addressing individual accelerations or decelerations in brain aging. Research on biological aging parameters has often been limited to cross-sectional analyses so inferences about intra-individual variation have frequently been drawn from inter-individual variation. The presented study allows, in fact, investigating within-person differences. Moreover, I very much appreciate that the authors seek to publish their code and materials online, although the respective GitHub project page did not appear to be set to 'public' at the time (error 404). Another strength of the study is that brain age models have been trained and validated in external samples. One further strength of this study is that it is based on a registered trial, which allows for the evaluation of the aims and motivation of the investigators and provides further insights into the primary and secondary outcomes measures (see the clinical trial identification code).
  
  One weakness of the study is that no comparison between the active control group and the two experimental groups has been carried out, which would have enabled causal inferences on the potential effects of different types of interventions on changes in relative brain age. In this regard, it should also be noted that all groups underwent a lifestyle intervention. Hence, from an experimenter's perspective, it is problematic to conclude that lifestyle interventions may modulate brain age, given the lack of a control group without lifestyle intervention. This issue is fueled by the study title, which suggests a strong focus on the effects of lifestyle intervention. Technically, however, this study rather constitutes an investigation of the effects of successful weight loss/body fat reduction on brain age among participants who have taken part in a lifestyle intervention. In keeping with this, the provided information on the main effect of time on brain age is scarce, essentially limited to a sign test comparing the proportions of participants with an increase vs. decrease in relative brain age. Interestingly, this analysis did not suggest that the proportion of participants who benefit from the intervention (regarding brain age) significantly exceeds the proportion of participants who do not benefit. So strictly speaking, the data rather indicate that it is not the lifestyle intervention per se that contributes to changes in brain age, but successful weight loss/body fat reduction. In sum, I feel that the authors' claims on the effects of the intervention cannot be substantiated very well given the lack of a control group without lifestyle intervention.
  
  We agree that this point, also raised by reviewer #2, should be made clear, and we now state this in the limitation section of the Discussion:
  
  “We also note that the lack of a no-intervention control group limits our ability to directly relate our findings to the intervention. Hence, we can only relate brain age attenuation to the observed changes in health biomarkers.”
  
  Also, following reviewers #2 and #3, we refer to the weight loss following 18 months of lifestyle intervention instead of to the intervention itself. This is now explicitly mentioned in the title, abstract, and within the text:
  
  Title: “The effect of weight loss following 18 months of lifestyle intervention on brain age assessed with resting-state functional connectivity”
  
  Abstract: “…, we tested the effect of weight loss following 18 months of lifestyle intervention on predicted brain age, based on MRI-assessed resting-state functional connectivity (RSFC).”
  
  Another major weakness is that no rationale is provided for why the authors use functional connectivity data instead of structural scans for their age estimation models. This gets even more evident in view of the relatively low prediction accuracies achieved in both the validation and test sets. My notion of the literature is that the vast majority of studies in this field implicate brain age models that were trained on structural MRI data, and these models have achieved way higher prediction accuracies. Along with the missing rationale, I feel that the low model performances require some more elaboration in the discussion section. To be clear, low prediction accuracies may be seen as a study result and, as such, they should not be considered as a quality criterion of the study. Nevertheless, the choice of functional MRI data and the relevance of the achieved model performances for subsequent association analysis needs to be addressed more thoroughly.
  
  We agree that age estimation from structural compared to functional imaging yields a higher prediction accuracy. In a previous publication using the same dataset12, we demonstrated that weight loss was associated with an attenuation in brain atrophy, as we describe in the introduction:
  
  “We previously found that weight loss, glycemic control and lowering of blood pressure, as well as increment in polyphenols rich food, were associated with an attenuation in brain atrophy 12.”
  
  Here we were specifically interested in age-related functional alterations that are associated with successful weight reduction. Compared to structural brain changes, the effect of aging on functional connectivity is more complex and multifaceted. Hence, we decided to utilize a data-driven or prediction-driven approach for assessing age-related changes in functional connectivity by predicting participants’ functional brain age. We now describe this rationale in the introduction section:
  
  “Studies have linked obesity with decreased connectivity within the default mode network15,16 and increased connectivity with the lateral orbitofrontal cortex17, which are also seen in normal aging18,19. Longitudinal trials have reported changes in these connectivity patterns following weight reduction20,21, indicating that they can be altered. However, findings regarding functional changes are less consistent than those related to anatomical changes due to the multiple measures22 and scales23 used to quantify RSFC. Hence, focusing on a single measure, the functional brain age, may better capture these complex changes and their relation to aging.”
  
  We address the point regarding the low model performance in response to reviewer #2, comment #2.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2022.09.21.22280182v1
www.biorxiv.org www.biorxiv.org

Motor planning brings human primary somatosensory cortex into action-specific preparatory states

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Evaluation Summary:
  
  The authors studied the neural correlates of planning and execution of single finger presses in a 7T fMRI study focusing on primary somatosensory (S1) and motor (M1) cortices. BOLD patterns of activation/deactivation and finger-specific pattern discriminability indicate that M1 and S1 are involved not only during execution, but also during planning of single finger presses. These results contribute to a developing story that the role of primary somatosensory cortex goes beyond pure processing of tactile information and will be of interest for researchers in the field of motor control and of systems neuroscience.
  
  We thank all reviewers and the editor for their assessment of our paper. We acknowledge that our description of the methods and some interpretation of the results can be clarified and expanded. We address every comment and suggestion below.
  
  Reviewer #1 (Public Review):
  
  This is a very important study for the field, as the involvement of S1 in motor planning has never been described. The paradigm is very elegant, the methods are rigorous and the manuscript is clearly written. However, there are some concerns about the interpretation of the data that could be addressed.
  
  We thank Reviewer #1 for the positive evaluation of our study. We clarify our methodological choices and interpretation of the data in the following response.
  
  • The authors claim that planning and execution patterns are scaled versions of each other, and that overt movement during planning is prevented by global deactivation. This is an interesting perspective; however, the presented data are not fully convincing to support this claim:
  
  (1) the PCM analysis shows that correlation models ranging from 0.4 to 1 perform similarly to the best correlation model. This correlation range is wide and suggests that the correspondence between execution/planning patterns is only partial.
  
  The reviewer is correct that the current data leave us with a certain amount of uncertainty. However, it should be noted that the maximum-likelihood estimates of correlations between noisy patterns are biased, as they are constrained to be smaller than or equal to 1. Thus, we cannot test the hypothesis that the correlation is 1 by just comparing correlation estimates to 1 (for details on this, see our recent blog on this topic: http://www.diedrichsenlab.org/BrainDataScience/noisy_correlation/). To test this idea, we therefore use a generative approach (the PCM analysis). We find that no correlation model has a higher log-likelihood than the 1-correlation model, therefore we cannot rule out that the underlying true correlation is actually 1. In other words, we have as much evidence that the correspondence is only partial as we do that the correspondence is perfect. The ambiguity given by the wide correlation range is due to the role of measurement noise in the data and should not be interpreted as if the true correlation was lower than 1. What we can confidently conclude is that activity patterns have a substantial positive correlation between planning and execution. We take this opportunity to clarify this point in the results section.
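
  As a rough illustration of why measurement noise alone can pull correlation estimates well below 1, the following sketch simulates two noisy measurements of the same underlying pattern. It is not the PCM analysis used in the paper: it uses a plain Pearson correlation, and the voxel count, noise level, and scaling factor are arbitrary assumptions chosen only for demonstration.

```python
import numpy as np


def noisy_pattern_correlation(n_voxels=2000, noise_sd=1.0, scale=0.3, seed=0):
    """Pearson correlation between two noisy measurements of one true pattern.

    All parameter values are illustrative assumptions, not taken from the study.
    """
    rng = np.random.default_rng(seed)
    true_pattern = rng.standard_normal(n_voxels)
    # "Execution": the true pattern plus independent measurement noise.
    execution = true_pattern + noise_sd * rng.standard_normal(n_voxels)
    # "Planning": a scaled copy of the same pattern plus independent noise,
    # so the true (noise-free) correlation is exactly 1.
    planning = scale * true_pattern + noise_sd * rng.standard_normal(n_voxels)
    return float(np.corrcoef(planning, execution)[0, 1])


r_noiseless = noisy_pattern_correlation(noise_sd=0.0)
r_noisy = noisy_pattern_correlation(noise_sd=1.0)
print(f"noise-free estimate: {r_noiseless:.3f}, noisy estimate: {r_noisy:.3f}")
```

  In this toy setting the noisy estimate falls far below the true value of 1, which is why a low raw correlation cannot by itself be read as evidence for a partial correspondence.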
  
  (2) in Fig.4 A-B, the distance between execution/planning patterns is much larger than the distance between fingers. How can such a big difference be explained if planning/execution correspond to scaled versions of the same finger-specific patterns? If the scaling is causing this difference, then different normalization steps of the patterns should have very specific effects on the observed results: 1) removing the mean value for each voxel (separately for execution and planning conditions) should nullify the scaling and the planning/execution patterns should perfectly align in a finger-specific way; 2) removing the mean pattern (separately for each finger conditions) should effectively disturb the finger-specific alignment shown in Fig.4C. These analyses would corroborate the authors' conclusion.
  
  The large distance between planning and execution patterns (compared to the distance between fingers) is caused by the fact that the average activity pattern associated with planning differs substantially from the average activity pattern during execution. Such a large difference is of course expected, given the substantially higher activity during execution. However, here we are testing the hypothesis that the pattern vectors that are related to a specific finger within either planning or execution are scaled versions of each other. Visually, this can be seen in Figure 4B (bottom), where the MDS plot is rotated such that the line of sight is in the direction of the mean pattern difference between planning and execution, so that this difference disappears in the projection. Relative to the baseline mean of the data (cross), you can see that the arrangement of the fingers in planning (orange) is a scaled version of the arrangement during execution (blue). The PCM model provides a likelihood-based test for this idea. The model accounts for the overall difference between planning and execution by including (and estimating) model terms related to the mean pattern of planning and execution, respectively, therefore effectively removing the mean activation of planning and execution. We have now explained this better in the results and methods sections, also referring to a Jupyter notebook example of the correlation model used (https://pcm-toolbox-python.readthedocs.io/en/latest/demos/demo_correlation.html).
  
  Regarding your analysis suggestions, removing the mean pattern for planning and execution across fingers as a fixed effect (suggestion 1) leads to the distance structure shown in Fig 4B (bottom)—showing that the finger-specific patterns during planning are scaled versions of those during execution (also see Fig. R1 below). On the other hand, subtracting the mean finger pattern across planning and execution (suggestion 2) will not fully remove the finger specific activation as the finger-specific patterns are differently scaled in planning and execution. Furthermore, neither of these subtraction analyses allows for a formal test of the hypotheses that the data can be explained by a pure scaling of the finger-specific patterns.
  
  Figure R1. RDM of left S1 activity patterns evoked by the three fingers (1, 3, 5) during no-go planning (orange) and execution (blue) after removing the mean pattern across fingers (separately for planning and execution). The bottom shows the corresponding multidimensional scaling (MDS) projection of the first two principal components. Black cross denotes mean pattern across conditions.
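
  The logic of the mean-removal argument above can be sketched with synthetic data. This is a toy construction under stated assumptions, not the authors' analysis: the pattern sizes, the scaling factor of 0.3, and the condition means are all hypothetical, and the data are noise-free.

```python
import numpy as np

rng = np.random.default_rng(1)
n_fingers, n_voxels = 3, 50

# Hypothetical finger-specific patterns shared by planning and execution.
finger = rng.standard_normal((n_fingers, n_voxels))
# Hypothetical condition means: execution is far more active than planning.
mean_exec = 5.0 + rng.standard_normal(n_voxels)
mean_plan = -1.0 + rng.standard_normal(n_voxels)
scale = 0.3  # assumed scaling of finger patterns during planning

execution = mean_exec + finger
planning = mean_plan + scale * finger


def center(patterns):
    """Remove the mean pattern across fingers within one condition."""
    return patterns - patterns.mean(axis=0, keepdims=True)


# Before centering, the planning/execution distance dwarfs the finger distance.
condition_distance = np.linalg.norm(execution.mean(axis=0) - planning.mean(axis=0))
finger_distance = np.linalg.norm(execution[0] - execution[1])

# After centering, planning patterns are exactly the scaled execution patterns.
max_deviation = np.max(np.abs(center(planning) - scale * center(execution)))
print(condition_distance, finger_distance, max_deviation)
```

  With real data, measurement noise would blur this exact match, which is why a formal, likelihood-based test is needed instead of a simple subtraction.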
  
  • A conceptual concern is related to the task used by the authors. During the planning phase, as a baseline task, participants are asked to maintain a low and constant force for all the fingers. This condition is not trivial and can even be considered a motor task itself. Therefore, the planning/execution of the baseline task might interfere with the planning/execution of the finger press task. Even more controversial, the design of the motor task might be capturing transitions between different motor tasks (force on all fingers towards single-finger press) rather than pure planning/execution of a single task. The authors claim that the baseline task was used to control for involuntary movements; however, EMG recordings could have similarly controlled for this aspect, without any confounds.
  
  Participants received training the day before scanning, which made the “additional” motor task very easy, almost trivial. In fact, the system was calibrated so that the natural weight of the hand on the keys was enough to bring the finger forces within the correct range to be maintained. Thus, very little planning/online control was required by the participants before pressing the keys. As for the concern of capturing transitions between different motor tasks, that is indeed always the case in natural behavior. Arguably there is no such thing as “pure rest” in the motor system; active effort has to be made even to maintain posture. Furthermore, if the motor system considers the hold phase as a simultaneous movement phase, it should have prevented M1 and S1 from participating in the planning of upcoming movements, as they would be busy with maintaining and controlling the pre-activation. Having found clear planning-related signals in M1 and S1 in this situation makes our argument, if anything, stronger.
  
  Finally, we specifically chose not to do EMG recordings because finger forces are a more sensitive measure of micro movements than EMG. Extensive pilot experiments for our papers studying ipsilateral representations and mirroring (e.g., Diedrichsen et al., 2012; Ejaz et al., 2018) have shown that we can pick up very subtle activations of hand muscles by measuring forces of a pre-activated hand, signals that clearly escape detection when recording EMG in the relaxed state. Based on these results, we actually consider the recording of EMG during the relaxed state as an insufficient control for the absence of cortical-spinal drive onto hand muscles. This is especially a concern when recording EMG during scanning, due to the decreased signal-to-noise ratio.
  
  • In Fig.2F, the authors show no planning-related information in high-order areas (PMd, aSPL), while such information is found in M1 and S1. This null result from premotor and parietal areas is rather surprising, considering current literature, largely cited by the authors, pointing to high-order motor or parietal areas involved in action planning.
  
  We agree with the reviewer that, to some extent, the lack of involvement of high-order areas in planning is surprising. However, we believe that task difficulty (i.e., planning demands) plays a role in the amount of observed planning activation. In other words, because participants were only asked to plan repeated movements of one finger, there was little to plan. The fact that this may have contributed to the null result in premotor and parietal areas was further confirmed by the second half of the dataset, which is not reported in the current paper. Here, we investigated the planning of multi-finger sequences, where planning demands are certainly higher. We found that high-order areas such as PMd and SPL were indeed active and involved in the planning of those, as expected. We decided to split the dataset across two publications as the multi-finger sequences have their own complexities, which would have distracted from the main finding of planning related activity in M1 and S1.
  
  Reviewer #3 (Public Review):
  
  I found the manuscript to be well written and the study very interesting. There are, however, some analytical concerns that in part arise because of a lack of clarity in describing the analyses.
  
  1) Some details regarding the methods used and results in the figures were missing or difficult to understand based on the brief description in the Methods section or figure legend.
  
  We thank Reviewer #3 for pointing out some lack of clarity in our description of the methods. We now expanded both the methods section and the figure captions (Fig. 2-3-4).
  
  2) I think the manuscript would benefit from a more balanced description of the role of S1. As the authors state, S1 is traditionally thought to process afferent tactile and proprioceptive input. However, in the past years, S1 has been shown to be somatotopically activated during touch observation, attempted movements in the absence of afferent tactile inputs, and through attentional shifts (Kikkert et al., 2021; Kuehn et al., 2014; Puckett et al., 2017; Wesselink et al., 2019). Furthermore, S1 is heavily interconnected with M1, so perhaps if such activity patterns are present in M1, they could also be expected in S1?
  
  To better characterize the role of S1 during movement planning, we now include recent research showing that S1 can be somatotopically recruited even in the absence of tactile inputs.
  
  3) Related to the previous comment: If attentional shifts on fingers can activate S1 somatotopically, could this potentially explain the results? Perhaps the participants were attending to the fingers that were cued to be moved and this would have led to the observed activity patterns. I don't think the data of the current study allows the authors to tease apart these potential contributions. It is likely that both processes contribute simultaneously.
  
  We agree that our results could also be explained by attentional shifts on the fingers. It is very likely that, during planning, participants were specifically focusing on the cued finger. However, as the reviewer points out, our current dataset cannot distinguish between planning and attention as voluntary planning requires attention. We expanded the discussion section to include this possibility.
  
  4) The authors repeatedly interpret the absence of significant differences as indicating that the tested entities are the same. This cannot be concluded based on results of frequentist statistical testing. If the authors would like to make such claims, then I think they should include Bayesian analysis to investigate the level of support for the null hypothesis.
  
  We have now clarified the parts in the manuscript that sounded as if we were interpreting the absence of significant difference (null results) as significant absence of differences (equivalence).
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.12.17.423254v3
www.biorxiv.org www.biorxiv.org

New submission 06/10/2022, 09:45:39

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This study investigates how pathogens might shape animal societies by driving the evolution of different social movement rules. The authors find that higher disease costs induce shifts away from positive social movement (preference to move towards others) to negative social movement (avoidance from others). This then has repercussions on social structure and pathogen spread.
  
  Overall, the study comprises a good mixture of intuitive and less intuitive results. One major weakness of the work, however, is that the model is constructed around one pathogen that repeatedly enters a population across hundreds of generations. While the authors provide some justification for this, it does not capture any biological realism in terms of the evolution of the pathogen itself, which would be expected. The lack of co-evolution in the model substantially limits the generality of the results. For example, a number of recent studies have reported that animals might be expected to become very social when pathogens are very infectious, because if the pathogen is unavoidable they may as well gain the benefits of being social. The authors make some arguments about being focused on introduction events, but this does not really align well with their study design that carries through many generations after the introduction. Given the rapid evolutionary dynamics, perhaps the study could have a more focused period immediately after the initial introduction of the pathogen to look at rapid evolutionary responses (albeit this may need some sensitivity analyses around the parameters such as the mutation rates).
  
  We appreciate the reviewer’s evaluation of our work, and acknowledge that we have not currently included evolutionary dynamics for the pathogen.
  
  One conceptual impediment to such inclusion is knowing how pathogen traits could be modelled in a mechanistic way. For example, it is widely held that there is a trade-off between infection cost and transmissibility, with a quadratic relationship between them, but this is a pattern and not a process per se. We are unsure which mechanisms could be modelled that impinge upon both infection cost and transmissibility.
  
  On the practical side, we feel that a mechanistic, individual-based model that includes both pathogen and host evolution would become very challenging to interpret. It might be more tractable to begin with a mechanistic, spatial model that examines pathogen trait evolution with an unchanging host (such as an adaptation of Lion and Boots, 2010). We would be happy to take this on in future work, with a view to combining models thereafter.
  
  We have taken the suggestion to focus on the period immediately after the introduction, and we now focus on the following 500 generations. While 500 generations is still a long time, we would note that our model dynamics typically stabilise within 200 generations. We show the following generations primarily to check that some stability in the dynamics has indeed been reached (but see our new scenario 2).
  
  We also appreciate the point regarding mutation rates. Our mutation rates are relatively high to account for the small size of our population. We have found that with smaller mutation rates (0.001 rather than 0.01), evolutionary shifts in our population do not occur within the first 500 generations. This is primarily because prior to pathogen introduction, the ‘agent avoiding’ strategy that becomes common later is actually quite rare. Whether a rapid transition takes place thus depends on whether there are any agent avoiding individuals in the population at the moment of pathogen introduction, or on whether such individuals emerge rapidly thereafter through mutations on the social weights. We expect that with larger population sizes, we would be able to recover our results with smaller mutation rates as well.
  
  A final, and much more minor, comment is whether this is really a paper about movement. The model does not really look at evolutionary changes in how animals move, but rather at where they move. How important is the actual movement process under this model? For example, would the results change if the model was constructed without explicit consideration of space and resources, but instead simply modelled individuals' decisions to form and break ties? (Similar to the recent paper by Ashby & Farine https://onlinelibrary.wiley.com/doi/full/10.1111/evo.14491 ). It might help to provide more information about how putting social decisions into a spatially explicit framework is expected to extend studies that have not done so (e.g., because they are analytical).
  
  This paper is indeed about movement, as where to move is a key part of the movement ecology paradigm (Nathan et al. 2008). That said, we appreciate the advice to emphasise the importance of social decisions in a spatial context, and we have added this to the Introduction (L. 79 – 81) and Discussion (L. 559 – 562). In brief, we do expect different dynamics to result from the explicit spatial context, as compared to a model in which social associations are probabilistic and could occur with any individual in the population.
  
  In our models, individual social tendency (whether they prefer moving towards others) is separated from individual sociality (whether they actually associate with other individuals). This can be seen from our (new) Fig. 3D, in which individuals of each of the social strategies can sometimes have similar numbers of associations (although modulated by movement). This separation of the pattern from the underlying process is possible, we believe, due to the heterogeneity in the social landscape created by the explicit spatial context.
  
  Reviewer #2 (Public Review):
  
  This theoretical study looks at individuals' strategies to acquire information before and after the introduction of pathogens into the system. The manuscript is well-written and gives a good summary of the previous literature. I enjoyed reading it and the authors present several interesting findings about the development of social movement strategies. The authors successfully present a model to look at the costs and benefits of sociality.
  
  I have a couple of major comments about the work in its current form that I think are very important for the authors to address. That said, I think this is a promising start and that with some revisions, this could be a valuable contribution to the literature on behavioral ecology.
  
  We appreciate the reviewer’s kind words.
  
  Before starting, I would like to make clear that, given the scope of the models and the number of parameter choices that were necessary, I am going to avoid criticisms of the decisions made when designing the models. However, there are a few assumptions that I find rather problematic and would like to give proper attention to.
  
  The first regards social vs. personal information. Most of the model argumentation is based on the reliance on social information (considering four, but to me overlapping, social strategies that are somehow static and heritable) but in fact, individuals may oscillate between relying on their personal information and/or on social information -- which may depend on the availability of resources, population density, stochastic factors, among others (Dall et al. 2005 Trends Ecol. Evol., Duboscq et al. 2016 Frontiers in Psychology). In my opinion, ignoring the influence of personal and social information decreases the significance of this work. I am aware that the authors consider the detection of food present in the model, but this is considered to a much smaller extent (as seen in their weight on individual decisions) than the social information cues.
  
  We appreciate the point that individuals can switch between relying on social and personal information. However, we would point out that in our model, the social strategies are not static. The social strategy is a convenient way of representing individuals’ position in behavioural trait-space (the ‘behavioural hypervolume’ of Bastille-Rousseau and Wittemeyer 2019). This essentially means that the importance assigned to each of the three cues available in our model varies among individuals. There are indeed individuals that are primarily guided by the density of food items, and this is the commonest ‘overall’ movement strategy before the pathogen is introduced. We represent this by showing how the importance of social information is low before pathogen introduction (Fig. 2B).
  
  While we primarily focus on the importance of social information, this is because the population quite understandably evolves a persistent preference for moving towards food items (i.e., using personal information if available). We have made this clearer in the text on lines 367 – 371.
  
  Critically, it is also unclear how, if at all, the information and pathogen traits are related to each other. If a handler gets sick, how does this affect its foraging activity (does it stop foraging, slow its activities, or does it show signs of sickness)? Perhaps this model is attempting to explore the emergence of social movement strategies only, but how they disentangle an individual's sickness status and behavioral response is unclear.
  
  We appreciate that infection may lead to physiological effects (e.g. altered metabolic rates, reduction in cognitive capacity) that may then influence behaviour. Our model aims to be a relatively simple and general one, and does not consider the explicit mechanisms by which infection imposes a cost on fitness. Thus we do not include any behavioural modifications due to infection, as we feel that these would be much too complex to include in such a model. We would be happy to explore, in future work, phenomena such as the evolution of self-isolation and infection detection, which are common among animals such as social insects (Stroeymeyt et al. 2018, Pusceddu et al. 2021).
  
  However, we have considered an alternative implementation of our model’s scenario 1 which could be interpreted as the infection reducing foraging efficiency by a certain percentage (other interpretations of the redirection of energy away from reproduction are also possible). We show how this implementation leads to very similar outcomes as those seen in our
  
  Very little is presented about the virulence of the pathogens and how they could affect the emergence of social strategies. The authors keep their main argumentation based on the introduction of novel pathogens (without distinctions on their pathogenicity), but a behavioral response is rather influenced by how fast individuals are infected and which are their chances of recovering. Besides, they consider that only one or two social interactions would be enough for pathogen transmission to occur.
  
  We have indeed considered a fixed transmission probability of 0.05, a relatively modest attack rate. Setting transmission probability to two other values (0.025, 0.1), we find that our general results are recovered - there is an evolutionary transition away from sociality, with the proportion of agent avoidance evolved increasing with the transmission probability. While we do not show these results in the main text, we have included figures showing the proportions of each social movement strategy here for the reviewers’ reference.
  
  Figures showing the proportion of social movement strategies in two simulation runs of our default implementation of scenario 1 (dE = 0.25, R = 2, pathogen introduction begins from G = 500). Top: Probability of transmission = 0.025 (half of the default). Bottom: Probability of transmission = 0.10 (double the default). Overall, the proportion of agent avoidance evolved (purple) increases with the probability of transmission. Each figure shows a single replicate of each parameter combination, for only 1,000 generations.
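
  To put the three transmission probabilities into perspective, the cumulative chance of infection over repeated encounters can be worked out directly. The sketch below assumes that each encounter with an infected individual is an independent trial with a fixed per-encounter probability; this independence assumption and the function name are ours, for illustration, and need not match how encounters are handled in the simulation.

```python
def infection_probability(p_transmission: float, n_encounters: int) -> float:
    """Chance of at least one transmission in n independent encounters."""
    return 1.0 - (1.0 - p_transmission) ** n_encounters


# Per-encounter probabilities discussed in the response (0.05 is the default).
for p in (0.025, 0.05, 0.10):
    risks = [round(infection_probability(p, k), 3) for k in (1, 2, 10, 50)]
    print(f"p = {p}: risk after 1, 2, 10, 50 encounters = {risks}")
```

  Under this assumption, one or two encounters carry only a modest risk at the default value, and the risk becomes substantial only when encounters with infected individuals are repeated many times.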
  
  Another important component is that individuals do not die, and it seems that they always have a chance (even if it is small) to reproduce. It is therefore unclear how the authors treat unsuccessful strategies in the model outputs, or how such social strategies could potentially be "dismissed" by natural selection.
  
  We appreciate the point that our simulation does not include mortality effects, and that all individuals have some small chance of reproducing. There are a few practical and conceptual challenges when incorporating this level of realism in a general model. Including mortality effects could allow for the emergence of more complex density-dependent dynamics, as dead individuals would not be able to transmit the pathogen to other foragers (although for some pathogens, this could be a valid choice), nor would they be sources of social information. This would make the model much more challenging to interpret, and we have tried to keep this model as simple as possible.
  
  We have also sought to keep the model’s focus on the evolutionary dynamics, and to not focus on mortality. In order to balance this aim with the reviewer's suggestion, we have included a new implementation of the model’s scenario 1 which has a threshold on reproduction. That means that only individuals with a positive energy balance (intake > infection costs) are allowed to reproduce. We show a potentially counter-intuitive result, that the more social ‘handler tracking’ strategy persists at a higher frequency than in our default implementation, despite having a higher infection rate than the ‘agent avoiding’ strategy. We suggest that this is because the ‘agent avoiding’ individuals have very low or no intake. This is sufficient in our default implementation to have relatively higher fitness than the more frequently infected handler tracking individuals.
  
  Reviewer #3 (Public Review):
  
  Gupte and colleagues develop an individual-based model to examine how the introduction of a novel pathogen influences the evolution of social cue use in a population of agents for which social cues can both facilitate more efficient foraging and expose individuals to infection. In their simulations, individuals move across a landscape in search of food, and their movements are guided by a combination of cues related to food patches, individuals that are currently handling food items, and individuals that are not actively handling food. The latter two cues can provide indirect information about the likely presence of food due to the patchiness of food across the landscape.
  
  The authors find that prior to introducing the novel pathogen, selection favors strategies that home in on agents, regardless of whether those agents are currently handling food items. The overall contribution of these social cues to movement decisions, however, tends to be relatively small. After pathogen introduction, agents evolve to rely more heavily on social information and either to be more selective in their use of it (attending to other agents that are currently handling food and avoiding non-handlers) or to avoid other agents altogether. Gupte and colleagues further examine the ecological consequences of these shifts in social decision-making in terms of individuals' overall movement, food consumption, and infection risk. Relative to pre-introduction conditions, individuals move more, consume less food, and are less likely to be infected due to reduced contact with others. Epidemiological models on emergent social networks confirm that evolved behavioral changes generate networks that impede the spread of disease.
  
  The introduction of novel pathogens into wild populations is expected to be increasingly common due to climate change and increasing global connectedness. The approach taken here by the authors is a potentially worthwhile avenue to explore the potential eco-evolutionary consequences of such introductions. A major strength of this study is how it couples ecological and evolutionary timescales. Dominant behavioral strategies evolve over time in response to changing environmental conditions and impact social, foraging, and epidemiological dynamics within generations. I imagine there are many further questions that could be fruitfully explored using the authors' framework. There are, however, important caveats that impact the interpretation of the authors' findings.
  
  First, reproduction bears no cost in this model. Individuals produce offspring in proportion to their lifetime net energy intake, which is increased by consuming food and decreased by a set amount per turn once infected. However, prior to reproduction, net energy intake is normalized (0-1) according to the lowest individual value within the generation. This means that individuals need not maintain a positive energy balance nor even consume food at all to successfully reproduce, so long as they perform reasonably well relative to other members of the population. Since consuming food is not necessary to reproduce, declining per capita intake due to evolved social avoidance (Fig. 1d) likely decreases the importance of food to an individual's reproductive success relative to simply avoiding infection. This dynamic could explain the delayed emergence of the 'agent avoiding' strategy (Fig. 1a), as this strategy potentially is only viable once per capita intake reaches a sufficiently low level across the population (Fig. 1d). I am curious to know what the results would be if reproduction required some minimal positive net energy, such that individuals must risk food patches in order to reproduce. It would also be useful for the authors to provide information on how net energy intake changes across generations, as well as whether (and if so, how) attraction to the food itself may change over time.
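The point about normalized energy can be made concrete with a small sketch. It is not the authors' implementation: the function name `reproduction_weights`, the min-max form of the 0-1 scaling, and the `require_positive` switch are assumptions chosen only to illustrate how an individual with a negative energy balance can still obtain a share of reproduction, and how a positive-balance threshold changes that.

```python
import numpy as np

def reproduction_weights(net_energy, require_positive=False):
    """Share of offspring per individual (hypothetical helper, for illustration).

    net_energy: lifetime net energy per individual (intake minus infection cost).
    require_positive: if True, only individuals with net energy > 0 may reproduce.
    """
    energy = np.asarray(net_energy, dtype=float)
    span = energy.max() - energy.min()
    # Assumed min-max scaling to 0-1 within the generation: the lowest value maps to 0.
    if span > 0:
        scaled = (energy - energy.min()) / span
    else:
        scaled = np.ones_like(energy)  # everyone identical: equal weights
    if require_positive:
        # Threshold variant: a non-positive energy balance gives zero weight.
        scaled = np.where(energy > 0, scaled, 0.0)
    total = scaled.sum()
    if total == 0:
        # No eligible parent in this generation; return all zeros.
        return np.zeros_like(energy)
    return scaled / total

# Without the threshold, individuals with negative balance still reproduce.
print(reproduction_weights([-3.0, -1.0, 0.0, 2.0]))
# With the threshold, only the individual with a positive balance does.
print(reproduction_weights([-3.0, -1.0, 0.0, 2.0], require_positive=True))
```

For the energies `[-3, -1, 0, 2]`, the unthresholded weights are `[0, 0.2, 0.3, 0.5]`, so two individuals with a non-positive balance together receive half of the offspring, whereas the thresholded weights are `[0, 0, 0, 1]`.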
  
  We thank the reviewer for their assessment of our work, and appreciate the point raised here (and in an earlier review) about individuals potentially reproducing without any intake. We have addressed this by running our default model [repeated introductions, R = 2, dE = 0.25], with a threshold on reproduction such that only individuals with a positive energy balance can reproduce. We mention these results in the text (L. 495 – 500), and show related figures in the SI Appendix. In brief, as the reviewer suggests, agent avoiding is less common for our default parameter combination, but becomes as common as the default combination when the infection cost is doubled (to dE = 0.5).
  
  We appreciate the reviewer’s suggestion about decreasing per-capita intake being a precondition for the proliferation of the agent avoiding strategy. With our new results, we now show that there is no overall decrease in intake, but the agent avoiding strategy still becomes a common strategy after pathogen introduction. As the reviewer suggests, this is because these individuals have an equivalent net energy as handler tracking individuals, as they are less frequently infected.
  
  We suggest that the delayed emergence of the agent avoiding strategy is primarily due to mutation limitations – such individuals are uncommon or non-existent in the simulation before pathogen introduction, and random mutations are required for them to emerge. As we have noted in response to an earlier comment, this becomes clear when the mutation rate is reduced from 0.01 to 0.001 – agent avoidance usually does not evolve at all.
  
  A second important caveat is that the evolutionary responses observed in the model only appear when novel pathogen introductions are extremely frequent. The model assumes no pathogen co-evolution, but rather that the same (or a functionally identical) pathogen is re-introduced every generation (spillover rate = 1.0). When the authors considered whether evolutionary responses were robust to less frequent introductions, however, they found that even with a per-generation spillover rate of 0.5, there was no impact on social movement strategies. The authors do discuss this caveat, but it is worth highlighting here as it bears on how general the study's conclusions may be.
  
  We appreciate the reviewer’s point entirely. We would point out that current knowledge about pathogen introductions across species and populations in the wild is very poor. However, the ongoing highly pathogenic avian influenza outbreak (Wille and Barr 2022), the spread of multiple strains of SARS-CoV-2 to wild deer in several different human-to-wildlife transmission events, and recent work on the potential for coronavirus spillovers from bats to humans, all suggest that at least some generalist pathogens must circulate quite widely among wildlife, often crossing into novel host species or populations. We have added these considerations to the text on lines 218 – 231.
  
  We have also added, in order to confront this point more squarely, a new scenario of our model in which the pathogen is introduced just once, and then transmits vertically and horizontally among individuals (lines 519 – 557). This scenario more clearly suggests when evolutionary responses to pathogen introductions are likely to occur, and what their consequences might be for a pathogen becoming endemic in a population. This scenario also serves as a potential starting point for models of host-pathogen trait co-evolution, and we have added this consideration to the text on lines 613 – 623.
  
  References
  
  ● Albery, G. F. et al. 2021. Multiple spatial behaviours govern social network positions in a wild ungulate. - Ecology Letters 24: 676–686.
  
  ● Bastille-Rousseau, G. and Wittemyer, G. 2019. Leveraging multidimensional heterogeneity in resource selection to define movement tactics of animals. - Ecology Letters 22: 1417–1427.
  
  ● Gupte, P. R. et al. 2021. The joint evolution of animal movement and competition strategies. - bioRxiv in press.
  
  ● Lion, S. and Boots, M. 2010. Are parasites “prudent” in space? - Ecology Letters 13: 1245–1255.
  
  ● Lloyd-Smith, J. O. et al. 2005. Superspreading and the effect of individual variation on disease emergence. - Nature 438: 355–359.
  
  ● Nathan, R. et al. 2008. A movement ecology paradigm for unifying organismal movement research. - PNAS 105: 19052–19059.
  
  ● Pusceddu, M. et al. 2021. Honey bees increase social distancing when facing the ectoparasite Varroa destructor. - Science Advances 7: eabj1398.
  
  ● Sánchez, C. A. et al. 2022. A strategy to assess spillover risk of bat SARS-related coronaviruses in Southeast Asia. - Nat Commun 13: 4380.
  
  ● Stroeymeyt, N. et al. 2018. Social network plasticity decreases disease transmission in a eusocial insect. - Science 362: 941–945.
  
  ● Wilber, M. Q. et al. 2022. A model for leveraging animal movement to understand spatio-temporal disease dynamics. - Ecology Letters in press.
  
  ● Wille, M. and Barr, I. G. 2022. Resurgence of avian influenza virus. - Science 376: 459–460.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.09.483239v3
www.biorxiv.org www.biorxiv.org

Distinct protocerebral neuropils associated with attractive and aversive female-produced odorants in the male moth brain

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  In the manuscript by Kymre, Liu and colleagues, the authors investigate how pheromone signals are interpreted by the projection neurons of the male moth brain. While the olfactory neurons and glomerular targets of pheromone signaling are known, the signaling of the projection neurons (output neurons) that carry pheromone signaling to higher regions of the brain remained unknown. The authors utilized a series of technically challenging experiments to identify the anatomy and functional responses of projection neurons responding to pheromone mixtures, primary pheromone, secondary pheromone, and behavioral antagonist odors. By calcium imaging of MGC mALT neurons, the authors identify that odor responses in PNs are broader than in their olfactory neuron counterparts (i.e., the behavioral antagonist activates OSNs innervating the dma glomerulus, whereas in PNs the antagonist activates the dma and dmp glomeruli). The authors then perform a series of elegant experiments in which the odor responses of different mALT PNs are recorded by electrophysiology, and the anatomy of the recorded neurons is identified by dye fill and computer reconstruction. This allowed the temporal response properties of the neurons to be correlated with their axonal processes in different brain regions. The data suggest that attractive pheromone signals activate the SIP and SLP regions, while aversive signals primarily activate regions in the LH. Finally, the authors present a model of pheromone signaling based on these findings.
  
  The work presents the first glimpse at the signaling from mALT PNs. The technical challenges in performing these experiments did limit the number of neurons that could be recorded and imaged. As such, the comprehensiveness of the study was not clear, or if additional experiments might alter the findings. The connection of protocerebrum anatomy with functional signaling (as summarized in Figure 6) could have been more clearly articulated.
  
  The manuscript could benefit by revisions to the text and figure presentations that would make it more accessible to a broader audience.
  
  We thank the reviewer for the comments and suggestions. We understand that the issue regarding completeness of data raised concern. A neuron collection obtained via intracellular recording is always a compromise between a collection that covers absolutely “all” neurons and a collection that includes the majority of neurons, reflecting the activity of the whole neuron population. We considered our neuron collection to be representative for two main reasons: (1) The neurons included in this study were randomly collected from all three MGC units and not targeted at one specific unit. The proportions of identified neurons originating from each MGC unit are highly consistent with the volume of the relevant unit. (2) Up to now, our collection of MGC PNs comprises every previously reported neuron type not only in H. armigera but in all heliothine moths studied. Evidently, our anatomical data provided a solid foundation, making it unlikely that a considerable number of new MGC PN types would be discovered in future studies. However, the principal objection raised by the reviewer is very timely – since we were not able to confirm that our collection included every MGC PN, the possibility of additional neuronal types remains open.
  
  Therefore, we decided to examine the content validity of our framework based on the features of the current neuron collection - that is, whether the presented outline would be fundamentally altered if additional PNs were included. A computational experiment was conducted including the mean firing traces of four neuron groups, each innervating the same protocerebral region. Here, the firing traces of individual PNs were shuffled based on formation of new neuron assemblies by randomly recruiting two-thirds of the PNs in the group. The data shuffling was repeated 5 times, and each time a different assembly of neurons was included. Cross correlations between the mean firing traces of each assembly showed that neuronal response profiles were unchanged in the neuropils associated with distinct behavioral valences (Fig. 7F). This high association contrasted with low correlations between the firing traces of every two PNs (Fig. 7G), indicating the representativeness of the presented data on the 42 MGC-PNs identified here. The issue concerning the completeness of the findings is included in a special paragraph in the discussion and in Fig. 7D-G.
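The resampling procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names, the array layout (one row per PN, one column per time bin), and the use of zero-lag Pearson correlation as a stand-in for the reported cross correlations are all hypothetical choices.

```python
import numpy as np

def subsample_mean_traces(traces, n_repeats=5, fraction=2 / 3, seed=0):
    """Mean firing trace of randomly recruited assemblies (hypothetical helper).

    traces: array of shape (n_neurons, n_timepoints) for one neuron group.
    Each repeat draws `fraction` of the neurons without replacement and
    averages their traces. Returns an array of shape (n_repeats, n_timepoints).
    """
    rng = np.random.default_rng(seed)
    traces = np.asarray(traces, dtype=float)
    n_neurons = traces.shape[0]
    k = max(1, round(n_neurons * fraction))  # assembly size, e.g. 6 of 9
    means = []
    for _ in range(n_repeats):
        idx = rng.choice(n_neurons, size=k, replace=False)
        means.append(traces[idx].mean(axis=0))
    return np.vstack(means)

def pairwise_correlations(rows):
    """Zero-lag Pearson correlation for every distinct pair of rows."""
    corr = np.corrcoef(rows)
    upper = np.triu_indices_from(corr, k=1)  # each pair once, no diagonal
    return corr[upper]

# Synthetic example: 9 neurons sharing one response shape plus individual noise.
rng = np.random.default_rng(1)
shared = np.sin(np.linspace(0.0, 3.0, 50))
group = shared + rng.normal(scale=2.0, size=(9, 50))
assembly_corr = pairwise_correlations(subsample_mean_traces(group))
single_corr = pairwise_correlations(group)
print(assembly_corr.mean(), single_corr.mean())
```

Comparing the correlations between assembly means with the correlations between individual traces mirrors the contrast drawn between Fig. 7F and Fig. 7G; the synthetic data here are only a placeholder for real recordings.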
  
  We also thank the reviewer for pointing out the importance of an expedient data presentation, including text and figure material that clearly communicate the major findings. In line with the editor’s recommendations, we have performed a comprehensive revision of all main parts of the manuscript. We have, for example, included an introductory figure (Fig. 1) providing essential background information. In the results section, we thoroughly reorganized the data presentation by highlighting the major findings both in the text and in the figure material. As suggested by the editor, a new figure has been made, Fig. 3 (substituting the original Fig. 2), visualizing the main neuron types in separate panels as well as in joint plots (confocal data and 3D-models), and presenting descriptive/predictive frameworks reflecting the stimulus-evoked neuronal activity within the relevant output regions of the PNs. The discussion has also been reshaped, for instance, by including the issue of parallel olfactory processing in the current species as well as across different species. Altogether, we believe the revision has made the article more relevant to a broad audience. We hope our study dealing with one of the severe pest insect species that inhabit our planet will be of interest.
  
  Reviewer #2:
  
  Using calcium imaging of mALT PNs in the AL as well as intracellular recordings and subsequent stainings of individual PNs, the authors evaluate the response properties of different PNs to the three pheromone components, including the primary pheromone Z11-16:AL, the secondary component Z9-16:AL and a minor component Z9-14:AL which functions as an antagonist at higher concentrations. The authors conclude from their data that PNs have widespread arborizations in higher brain centers that are organized according to behavioral significance, i.e. with regard to attraction versus repulsion. Although the authors characterize morphologically and functionally a considerable number of neurons, the data are highly descriptive and exhibit a rather large level of variability which impedes, in my opinion, a generalization of response properties for different neuron types. The conclusion that the projection patterns in the higher brain centers, such as the LH, VLP and SIP reflect behavioral significance proves rather difficult from the data presented in this study. Additional data, such as e.g. calcium imaging of pheromone responses in the higher brain areas would support the notion of a valence-based map in these regions.
  
  The intracellular recordings are certainly elaborate, but do not allow drawing a general picture about how coding of pheromones in the individual MGC compartments of the AL is transformed into a representation in higher brain centers. In my opinion the authors could not sufficiently address their major goal which is to understand how the neuronal circuitry underlying pheromone processing is encoding the individual pheromone components that induce opposite valences. The study would highly benefit if the authors would reconstruct their individual PN staining and register them into a standard moth brain (as done in other insect species, such as honeybees and flies) to allow a categorization and matching of morphological properties. Then the different PNs could be compared based on morphological parameters and subsequently be assigned to specific neuron classes, while response properties could be assessed for the different types.
  
  First, we would like to thank the reviewer for the suggestions. The reviewer points out that additional experiments, “such as calcium imaging of pheromone responses in the higher brain areas”, might support the notion of valence-based maps in these regions. Unfortunately, these kinds of experiments are currently not feasible for the neuron groups we are interested in. Fura-labeled calcium imaging has its restrictions, since this method can only be used to examine a brain region based on retrograde labeling of the neurons of interest, such as applying dye into the calyx for examining the responses of medial-tract PN dendrites in the antennal lobe (see Fig. A1 below). Notably, the calcium-imaging measurements from the LH in honeybee, obtained from retrogradely labeled lateral tract PNs, could be performed because of the accessibility of this PN population type for such an experiment (see Fig. B below; Roussel et al., 2014, Current Biology 24, 561-567). The PNs of interest here, confined to the mALT and mlALT, end up in the lateral protocerebrum. Therefore, measuring calcium imaging responses in the lateral protocerebrum from retrogradely labelled neurons confined to these tracts appears to be unfeasible (Fig. A2 below). So far, no study has managed to perform retrograde labeling of the axon terminals of mALT/mlALT PNs in the higher brain centers of moths. As for the bath application technique with a membrane-permeable calcium indicator, this method gives access to calcium signals only in the most superficial brain areas. The neuropil regions innervated by the mALT PNs are located too deep (the only accessible output region would be the calyces). Finally, the moth species used here lacks proper genetic tools that might allow investigation of a specific strain expressing a calcium indicator.
  
  Figure (A1-A2): Fura retrograde labeling of PNs confined to the medial tract (mALT) from two different brain sites in moth. Figure B: Fura retrograde labeling of lateral-tract (lALT) PNs in honeybee brain. Calcium imaging measurements are feasible in the areas marked in green, including the antennal lobe (AL in A-B) and a part of the lateral protocerebrum region (B). The areas marked in red (shown in A1-A2) are not ideal for imaging experiments, as the neuronal signals (black arrows) will be physically blocked by the damaged axons.
  
  In addition, the reviewer has the following objection: “Although the authors characterize morphologically and functionally a considerable number of neurons, the data are highly descriptive and exhibit a rather large level of variability which impedes, in my opinion, a generalization of response properties for different neuron types.” We assume the reviewer refers to the individual neuron data when he/she points out the relatively high variability. Indeed, the high-resolution information obtained by the intracellular recording/staining technique includes descriptive data with a certain extent of variability – particularly regarding the spiking data representing every single action potential at the time scale of a few milliseconds. The main reason for performing both in vivo calcium imaging and intracellular recording experiments is that these two approaches form an optimal combination for illustrating the neuronal activity at different granularities. During calcium imaging, we recorded pheromone responses in distinct groups of MGC PNs, i.e., at a higher population scale. One main restriction of calcium imaging is the low temporal resolution (the sampling interval in this study was 100 ms). For comparison, the intracellular recordings had a sampling interval of less than 1 ms. Altogether, by combining the two techniques we could collect data from the relevant MGC-PNs both at the neuron population level (low temporal resolution) and single neuron level (high spatial and temporal resolution). Comparison of the data obtained from the two experimental approaches demonstrated a high degree of correspondence. We believe that the high-resolution intracellular recording data reflect the peculiar features that precisely characterize individual neurons.
  Otherwise, in case the reviewer has objections against the detailed descriptions in the results part, we have revised the original manuscript (including text and figure material), emphasizing the main findings and minimizing the description of details.
  
  The reviewer also suggests registering the neurons into a standard brain framework to “allow drawing a general picture about how coding of pheromones in the individual MGC compartments of the AL is transformed into a representation in higher brain centers”. To register individual PNs into a standard brain is no doubt an ideal method to compare the neurons’ architecture within the same species as well as across different models – especially if we want to compare the neurons’ projection patterns. Unlike the honeybee and the fruit fly already having an averaged standard brain available (reconstructed and standardized based on morphological data from different individuals), H. armigera has a representative brain (reconstructed from morphological data of one individual), published by Chu et al., (2020a). As we have experienced, errors due to local distortions often occur when registering neurons into a representative brain. The same is to some degree also the case for registration of neurons into an averaged brain framework. How informative the results are, will always depend both on the resolution of the standard and the resolution of the neuron data. Thus, the accuracy and the quality of the registration is based on the richness of details in the raw image data, i.e. how dense the registration grid is. If only a few neuropils are used, the precision of registration will obviously be limited. An ideal reconstruction for registration would include a dense grid of landmarks - or, as in the fruit fly, the actual image data.
  
  Generally, the terminal projections of medial- and mediolateral tract MGC PNs in the moth cover several widespread areas in the protocerebrum and the most important objective of the current study was to map the neuropils innervated by each of the 32 physiologically identified neurons presented here. In line with the suggestion from the reviewer, we have added AMIRA reconstructions in the revised manuscript, including not only the skeleton of individual PNs but also 3D reconstructions of the neuropil regions innervated by each neuron. These data, confirming the neurons’ morphological properties, are presented in the figure supplement. In addition, for visualization purposes, we plotted each traced skeleton onto the representative brain, based on the reconstructed data obtained by using the ‘transform editor’ function in AMIRA (Fig. 3). In the revised version of the manuscript, we have also submitted all morphological data (confocal stacks and 3D-AMIRA reconstructions) of the main MGC-PN types to the newly established Insect brain database (InsectbrainDB, 2021) – a unified and open access platform for archiving and sharing functional data obtained not only from H. armigera but from other insect species as well.
  
  In addition to registering different PNs into a common frame, another reliable source of evidence for such a comparison is raw confocal data including identifiable neurons simultaneously stained in the same brain. In Fig. 3C, we demonstrate overlapping terminal projections in the LH of two uniglomerular MGC-PNs originating from each of the two smaller MGC-units, the dma and dmp. In Fig. 4, we show the terminal projections of MGC-PNs confined to each of the three main tracts, demonstrating overlapping terminal arbors for medial- and mediolateral-tract neurons whereas the lateral-tract neuron projects to a separate area.
  
  Reviewer #3:
  
  Summary of goals:
  
  In the moth Helicoverpa armigera the authors examined whether projection neurons from different antennal lobe tracts encoding sex-pheromone components with different valence occupy distinct projection areas in the protocerebrum of the midbrain.
  
  Strengths and weaknesses of methods and results:
  
  Methods chosen are adequate and state of the art. In vivo calcium imaging allowed for easier imaging of a population of neurons, in search of statistically significant responses to pheromone components of different concentrations, quality, and valence. The main, general drawbacks of calcium imaging are the lower temporal resolution, which does not allow for detection of single action potentials at the scale of a few ms, and the inability to resolve the projection patterns of single neurons at fine spatial resolution. This was compensated for by excellent intracellular recordings of single antennal lobe projection neurons, stainings of single cells, and embedding in the 3D standardized H. armigera brain. The data are very carefully analyzed with adequate analysis software and adequate statistical analysis, and the most relevant results are shown in very good Figures. I also very much appreciate all of the supplementary figures. I do not see any relevant weakness in the methods and the respective results. However, as outlined in detail in the reply to the authors, the wording of the manuscript can be improved, to make it clearer and understandable without the need to read previous publications.
  
  Everybody working with odors knows about the difficulty to precisely control and measure the exact molar concentration of odorants applied. But since the authors showed in previous publications that they take great care to control odor stimuli they should include also in the Material and Methods of this publication more details about concentration of the respective odor stimuli or mixtures employed.
  
  Did they achieve their aims? Do data support conclusions?
  
  Yes, the data support their conclusions as clearly shown in their excellent recordings, their excellent combination of physiological and morphological analysis, as well as their thorough statistical analysis.
  
  Discussion of the likely impact of the work on the field, utility of methods:
  
  This is an excellent, synergistic collaboration of different international experts in insect olfaction. It is still under-estimated how important the combination of single cell analysis in intracellular recordings with neural network analysis via calcium imaging is. Schemes of frequency encoding versus temporal encoding can only be deciphered with a clever combination of these techniques. This manuscript adds important insights into information processing of olfactory stimuli of antagonistic valence. It starts to become clear that in different sensory systems valence of aversive versus attractive sensory stimuli is processed in parallel pathways. Most likely, antagonistic pathways are connected to different neuronal units in premotor areas of the midbrain, connecting to parallel de- and ascending pathways of central pattern generators in the thorax. In addition, the current work provides relevant new information about processing of pheromone information in the different antennal lobe tracts in another important species. Thus, we may be one step closer to the future manipulation of sexual reproduction of specific insect pests.
  
  Context for others for interpretations:
  
  Sympatric heliothine moths use the same sex-pheromone components but at different concentration ratios, allowing for distinction of species that do not inter-mate. Thus, understanding how pheromone components at defined concentrations with opposite valence are processed in the brain to guide aversive or attractive behavioral interactions is relevant not only for determining principles of higher-order olfactory processing, but also to understand evolution of new species.
  
  We thank the reviewer for the comments and suggestions. To improve the part of the manuscript covering background information, we have included a new figure in the introduction section, Fig. 1, providing an overview of the olfactory pathway in male moths. Here, the schematic drawing (A) contains an overview of the uniglomerular medial-tract PNs confined to the plant-odor and pheromone sub-system, respectively, and their distinct paths from the periphery to higher olfactory centers. In the schematic drawing (B), we provide an overview of the three main ALTs in the moth. A detailed description of the system is included in the relevant figure legend. In addition, we have included a section in the discussion that compares morphological and physiological properties of MGC-PNs confined to each of the three parallel tracts. Finally, a consideration implying the distinct roles of the parallel ALTs is added.
  
  As suggested by the reviewer, we have added more precise information about the relevant odor stimuli in the revised version of the manuscript. We have clarified all details regarding pheromone concentrations as well as ratios in the materials and method section. In addition, we included relevant background knowledge on species-specific pheromone blends of sympatric moth species.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.12.11.421289v1
www.biorxiv.org www.biorxiv.org

New submission 25/10/2022, 13:19:02

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This work sheds light on the adverse effects of Bacillus thuringiensis, a strongly pathogenic bacterium used as a microbial pesticide to kill lepidopteran larvae that threaten crops, on gut homeostasis of non-susceptible organisms. By using Drosophila melanogaster as a non-susceptible model organism, this paper reveals the mechanisms by which the bacteria disrupt gut homeostasis. The authors combined the use of different genetic tools and Western blot experiments to successfully demonstrate that bacterial protoxins are released and activated throughout the fly gut after ingestion and influence intestinal stem cell proliferation and intestinal cell differentiation. This phenomenon relies on the interaction of activated protoxins with specific components of adherens junctions within the intestinal epithelium. Due to conserved mechanisms governing intestinal cell differentiation, this work could be the starting point for further studies in mammals.
  
  The conclusions proposed by the authors are in general well supported by the data. However, some improvements in data representation, as well as additional key control experiments, would be needed to further reinforce some key points of the paper.
  
  We thank Reviewer 1 for her appreciation of the work and in-depth analysis of the data. We agree with all her comments and believe the suggestions significantly improved the manuscript.
  
  1) Figure 1 and others: Several graphs in the manuscript show the number of cells/20000 µm². What is the shape of the gut in the different conditions studied in this manuscript? The gut shape (shrunk gut versus normal gut, for example) could influence the number of cells seen in a small area. For example, the total number of cells quantified in a small area (here 20000 µm²) of a shrunk gut can increase while their size decreases. As a result, the quantification of a specific cell type in a small region (here 20000 µm²) can be biased and may not represent the real number of cells present in the whole posterior part of the R4 region. Would it make sense to calculate a ratio "number of X cells/number of DAPI-positive cells per 20000 µm²"?
  
  We provided a suitable answer in the "Essential Revisions point 1" corresponding to this reviewer's concern. To summarize, we have now added whole posterior midgut images in the different conditions to highlight the intestinal morphology (Figure 1-figure supplement 1A). The whole gut morphology was not affected by the different challenges we performed. Indeed, we used low doses of spores and/or toxins in order to mimic "natural" amounts of spores/toxins the fly can eat in the environment and in order to avoid drastic gut lining disturbances.
  
  We have also added the cell type ratio in Figure 1-figure supplement 2.
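  The normalization raised in this exchange can be illustrated with a minimal sketch (not the authors' analysis pipeline). Dividing the count of a given cell type by the count of DAPI-positive nuclei in the same field makes the measure insensitive to how densely cells are packed into the 20000 µm² area. The function name and all counts below are hypothetical.

```python
def cell_type_fraction(marker_positive: int, dapi_positive: int) -> float:
    """Return the fraction of DAPI+ nuclei in one field that are marker-positive."""
    if dapi_positive <= 0:
        raise ValueError("dapi_positive must be a positive count")
    if not 0 <= marker_positive <= dapi_positive:
        raise ValueError("marker_positive must lie between 0 and dapi_positive")
    return marker_positive / dapi_positive


# Hypothetical counts per field (illustrative only, not data from the study).
normal_gut = cell_type_fraction(marker_positive=12, dapi_positive=120)
shrunk_gut = cell_type_fraction(marker_positive=18, dapi_positive=180)

# The raw counts differ (12 vs 18), yet the normalized fraction is the same.
print(normal_gut, shrunk_gut)
```

  In this toy case a denser field yields more marker-positive cells per area without any change in cell-type composition, which is the bias the reviewer describes.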
  
  2) Figure 4: Is it possible that Arm staining is less intense between ISC and progenitors after ingestion of the bacteria due to the fact there is a high rate of stem cell proliferation? Could it be an indirect effect of stem cell proliferation rather than the binding of the toxins to Cadherins?
  
  We thank the reviewer for this pertinent comment. Indeed, for this reason, we compared the intensity of Arm expression at the junction between neighboring progenitors with the Arm intensity around the rest of the cellular membranes and calculated the ratio between both values (see Figure 4-figure supplement 1F-G for an illustration of how we proceeded and the new section in the Material and Methods 736-742). Using this method, even if the whole Arm staining intensity is different (in all the midgut), the ratio reflects the internal cell-cell interaction changes between the two neighboring cells. Moreover, we have observed that Arm staining (using the usual monoclonal antibody N2 7A1 from the DSHB) was very variable from one midgut to another in the same feeding/intoxication condition. So, we do not want to draw conclusions about the whole Arm intensity due to this variability, whatever the intoxication conditions. Finally, the challenged guts always displayed a more disorganized epithelium due to cell proliferation and differentiation. Consequently, Arm staining in ECs and progenitor cells is found in the same focal plane, while in unchallenged and well-organized guts, Arm staining in ECs is above the focal plane of Arm staining in progenitor cells. This likely leads to the impression that Arm staining is more intense in challenged midguts. This method description is now added in the Materials and Methods section (lines 736-742).
  
  Could the authors use the ReDDM system to distinguish between "old" and newly formed cells? This could be a good control to make sure that the signal is quantified in similar cells between the control and the different conditions.
  
  We have analyzed the intensity of Arm expression between pairs of GFP cells. Most of these pairs arose from de novo divisions. Indeed, as shown in control conditions (water) with Dl-ReDDM (for example, see Figure 1-figure supplement 1D), pairs of GFP cells (ISC-ISC) are rare. Most pairs correspond to ISC-EB or ISC-EEP pairs with the progenitor marked by the RFP, meaning that it has just arisen from the GFP+ mother ISC. Therefore, we assume that in the esg>GFP genotype, pairs of GFP+ cells correspond to one ISC and one progenitor (see Figure 4-figure supplement 1A-A'). Consequently, when we analyzed the Arm intensity between pairs of GFP cells after intoxication, these cells were very likely "newborn" cells. Even if we suppose that some ISCs and progenitors remain stuck together for a long time (for instance, several days), Cry1A toxins may also be able to disrupt their cell junctions. In the context of Cry1A toxin activity, it seems important to analyze the whole impact on cell-cell junctions without discriminating between old and new cell-cell interactions.
  
  We tried to use anti-Arm and anti-Pros double staining to mark new EEPs. Unfortunately, the anti-Arm and anti-Prospero antibodies were both raised in mice. Co-staining with both antibodies gives rise to poor labelling for Arm, for Prospero, or for both. Our first author spent a lot of effort trying to set up good conditions, but unfortunately this was unsuccessful.
  
  Here is an example of what we got (this was the best image we obtained) with esg>GFP flies fed with water (control) and labelled for Arm and Pros in red. White arrows point to two EEPs. Red arrows point to the Arm staining between two precursors (ISC/ISC or ISC/EB or EB/EB). It was extremely hard to identify junctions marked by Arm between EEPs and ISCs because the Pros staining was too strong.
  
  Another example with flies fed with spores of SA11 (increasing the number of EEs). In green is the esg>GFP and in red Arm and Prospero. The right panel corresponds to the single red channel (Arm/Prospero).
  
  Nevertheless, we have now performed a similar analysis in an esg>GFP, Shg::RFP background and analyzed Shg::RFP (Tomato::DE-Cadherin) labelling intensity. We found similar results that are presented in the new Figure 4 (data with Arm have been moved to Figure 4-figure supplement 1). This last analysis has been included in the text (lines 285-299).
  
  Figure 4E' and 4G': Arm staining seems more intense when looking at the whole membrane levels of cells compared to control. Is it possible that the measured ratio contact intensity/membrane intensity presented in Figure 4I could be impacted and not reflect the real contact intensity between ISC and progenitor cells?
  
  Please check our answer just above: "…//… we have observed that Arm staining (using the usual monoclonal antibody N2 7A1 from the DSHB) was very variable from one midgut to another in the same feeding/intoxication condition. So, we do not want to draw conclusions about the whole Arm intensity due to this variability, whatever the intoxication conditions".
  
  See also our intensity measurement method described above to avoid bias: "…//… we compared the intensity of Arm expression at the junction between neighboring progenitors with the Arm intensity around the rest of the cellular membranes and calculated the ratio between both values (see Figure 4-figure supplement 1F-G for an illustration of how we proceeded and the new section in the Material and Methods 736-742). Using this method, even if the whole Arm staining intensity is different (in all the midgut), the ratio reflects the internal cell-cell interaction changes between the two neighboring cells."
  
  What is the hypothesis of the authors about the decrease of Arm or DE-Cad seen after bacterial/crystal ingestion? Does the interaction between the toxins and DE-Cad induce a relocation of DE-Cad?
  
  It has been shown that E-Cadherin can be recycled when adherens junctions are destabilized, both in Drosophila and mammals (Buchon et al., 2010; O'Keefe et al., 2007; Tiwari et al., 2018). To investigate this possibility, we tried to analyze DE-Cad cytoplasmic relocalization using anti-DE-Cad immunostaining (DCAD2 antibody from DSHB) as well as the Shg::RFP (Bloomington stock #58789) or Shg::GFP (Bloomington stock #60584) endogenous fusions. Unfortunately, we did not see obvious differences. We have nevertheless added the split channels of the Shg::RFP labelling in the different conditions in Figure 4A-D'. We remain interested in the behavior of DE-cadherin (and its signaling; see Liang et al., 2017) upon binding of the Cry1A toxin. N. Zucchini-Pascal (an author of this article) is currently investigating this question.
  
  The authors should add more details about the quantification procedure in the Materials and Methods section. How many cells have been quantified per intestine? How did they choose the cells where they quantified the contact intensity? Etc.
  
  These details were missing in the methods and we thank the reviewer for highlighting this issue. We added this information to the methods (lines 725-742). The number of cell pairs analyzed was present in the raw data related to Figure 4 but absent from the main figure and legend. This is now rectified. We only measured the intensity in isolated pairs of cells.
  
  Figure 4B, D, F and H: How did the authors recognize the ISCs?
  
  We agree with the reviewer's comment. We cannot recognize ISCs per se. Green cells correspond either to ISCs or to EBs. We modified the text accordingly (lines 285-287).
  
  Could the authors do quantifications of DE-Cad signal?
  
  This has been done. It is now shown in Figure 4E and in Table 1. We also adapted the text (lines 289-299) to fine-tune our interpretation in light of this new analysis. Indeed, what we have now defined as "mild" adherens junction intensity lies between the ratios 1.4 and 1.6 instead of the previous range (1.3 to 1.6), because we observed that most of the EEP progenitors arise from cells displaying a junction intensity with their mother cells below the 1.4 ratio (see Table 1).
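  The ratio-based measurement described in these responses can be sketched as follows (a minimal illustration, not the authors' image-analysis code). The contact intensity of a cell pair is divided by the intensity of the rest of the membrane, and the ratio is then binned around the "mild" range of 1.4 to 1.6. The function names, the labels for the bins outside that range, the inclusive treatment of the boundaries, and the intensity values are all assumptions made for the example.

```python
def junction_ratio(contact_intensity: float, membrane_intensity: float) -> float:
    """Ratio of staining intensity at the cell-cell contact to the rest of the membrane."""
    if membrane_intensity <= 0:
        raise ValueError("membrane_intensity must be positive")
    return contact_intensity / membrane_intensity


def classify_junction(ratio: float, mild_low: float = 1.4, mild_high: float = 1.6) -> str:
    """Bin a ratio around the 'mild' range; boundaries are treated as inclusive (assumption)."""
    if ratio < mild_low:
        return "below mild"
    if ratio <= mild_high:
        return "mild"
    return "above mild"


# Hypothetical (contact, membrane) intensities in arbitrary units.
pairs = [(130.0, 100.0), (150.0, 100.0), (180.0, 100.0)]
for contact, membrane in pairs:
    r = junction_ratio(contact, membrane)
    print(f"ratio={r:.2f} -> {classify_junction(r)}")
```

  Because both intensities come from the same cell pair, a global change in staining brightness between midguts scales numerator and denominator together and leaves the ratio unchanged.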
  
  Like Arm staining, the staining seems stronger at the whole membrane level in F and H compared to the control.
  
  As we described above for Arm staining, the intensity of Tomato::DE-Cad labelling can differ from one posterior midgut to another. One simple explanation relates to changes in the structure of the midgut epithelium, which is well organized in unchallenged conditions, while in challenged midguts the epithelial cells are no longer well arranged due to rapid cell proliferation and differentiation. Consequently, DE-Cad labelling in ECs is at the same level as that in ISCs/progenitor cells, giving the impression that the labelling is stronger.
  
  3) Figure 5: How is the stem cell proliferation upon overexpression of DE-Cad in control or upon bacteria/crystals ingestion? Do the authors think that the decrease of Pros+RFP+ new cells upon overexpression of DE-Cad could result from a decrease of stem cell proliferation?
  
  Great suggestion. We chose to count the progenitor cells (GFP+ cells), which reflect the ISC divisions during the last 3 days. This also has the advantage of working on the same pictures (samples) used for all the analyses shown in Figure 5 and Figure 5-figure supplement 1. If we consider the number of GFP+ cells (esg-expressing cells corresponding to ISCs, EBs or EEPs) in challenged midguts, the overexpression of DE-Cad did not seem to alter ISC division. In addition, we still observed more GFP+ cells when the midguts were challenged with SA11 or crystals than with BtkCry, in agreement with the rate of ISC division observed in the WT genetic background shown in Figure 1B.
  
  We have now added the counting of GFP+ cells in Figure 5-figure supplement 1E. The text has been modified to integrate these results (lines 306-308).
  
  Did the authors quantify the % of new ECs in the context of overexpression of DE-Cad?
  
  The data have been added in Figure 5F. The text has been modified to integrate this result (lines 312-313).
  
  Figure 5F: As asked before, did the authors distinguish the signal between newly born cells and the signal between older cells?
  
  In the new Figure 5G, we used the esg-ReDDM system, which is very efficient. Almost all ISCs and progenitors express the GFP. The counting has been done between cell pairs that express both the GFP and RFP. This is specified in the text (lines 310-311). Nevertheless, we cannot distinguish between new and old cells here. Indeed, the esg-ReDDM system induces both the GFP and the RFP in all esg+ cells (the old ones and the new ones). Hence, if a division has occurred just before the induction of the system to give birth, for instance, to an ISC and an EB, both cells will express the GFP and the RFP. But should we consider those pairs of cells as old cells or new cells? Notably, as we analyzed the intensity of junctions 3 days after intoxication and induction of the ReDDM system, we assume that the pairs of GFP+/RFP+ cells arose after the induction of the system. Indeed, to our knowledge, nobody has shown in the posterior midgut that a progenitor remains stuck to its mother ISC for as long as 3 days. Even if we assume that this event can occur, Cry1A toxins may also be able to disrupt their cell junctions.
  
  We now have removed the DAPI channel and added the RFP+ channel in Figure 5-figure supplement 1A-D' (previously the Figure S4A-D) to illustrate this explanation and to facilitate the interpretation by the reader.
  
  It would be interesting to compare the junction intensity between mother ISCs and their daughter progenitors before and after intoxication in the same intestine. But we think that this event is quite rare because of the experimental conditions we used (i.e. analyses 3 days after the induction of the ReDDM/intoxication).
  
  The same experiments (stem cell proliferation + quantification of the % of new ECs) could also be done when the authors overexpress Connectin (supplemental figure 5). This would be another control to conclude that the effects on cell differentiation are specifically due to the interaction between DE-Cad and the toxins.
  
  We have added the analyses in Figure 5 - figure supplement 2J and K.
  
  The text has been completed (lines 317-320).
  
  In the "crystals" condition, the overexpression of Connectin seems to partially rescue the increased % of new Pros+RFP+ cells observed in Figure 3F (Figure S5I compared to Figure 3F).
  
  Yes, we agree with the reviewer's comment. In an esg-ReDDM background (Figure 3F), crystals induced a much greater increase in EE numbers than did SA11 spores. However, in a WT or esg>GFP background, crystals induced an increase in EE/EEP similar to that induced by SA11 spores. So we do not yet have an explanation other than the genetic background of the esg-ReDDM line.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.13.488147v2
www.biorxiv.org www.biorxiv.org

New submission 26/09/2022, 10:22:49

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors start the study with an interesting clinical observation, found in a small subset of prostate cancers: FOXP2-CPED1 fusion. They describe how this fusion results in enhanced FOXP2 protein levels, and further describe how FOXP2 increases anchorage-independent growth in vitro, and results in pre-malignant lesions in vivo. Intrinsically, this is an interesting observation. However, the mechanistic insights are relatively limited as it stands, and the main issues are described below.
  
  Main issues:
  
  1) While the study starts off with the FOXP2 fusion, the vast majority of the paper is actually about enhanced FOXP2 expression in tumorigenesis. Wouldn't it be more logical to remove the FOXP2 fusion data? These data seem quite interesting and novel but they are underdeveloped within the current manuscript design, which is a shame for such an exciting novel finding. Along the same lines, for a study that centres on the prostate lineage, it's not clear why the oncogenic potential of FOXP2 in mouse 3T3 fibroblasts was tested.
  
  We thank the reviewer very much for the comment. We followed the suggestion and added a set of data regarding the newly identified FOXP2 fusion in Figure 1 to make our manuscript more informative. We tested the oncogenic potential of FOXP2 in NIH3T3 fibroblasts because NIH3T3 cells are a widely used model to demonstrate the presence of transforming oncogenes (refs. 2, 3). In our study, we observed that when NIH3T3 cells acquired the exogenous FOXP2 gene, the cells lost the characteristic contact inhibition response, continued to proliferate and eventually formed clonal colonies. Please refer to "Answer to Essential Revisions #1 from the Editors" for details.
  
  2) While the FOXP2 data are compelling and convincing, it is not clear yet whether this effect is specific, or if FOXP2 is e.g. universally relevant for cell viability. Targeting FOXP2 by siRNA/shRNA in a non-transformed cell line would address this issue.
  
  We appreciate these helpful comments. Please refer to the "Answer to Essential Revisions #1 from the Editors" for details.
  
  3) Unfortunately, not a single chemical inhibitor is truly 100% specific. Therefore, the Foretinib and MK2206 experiments should be confirmed using shRNAs/KOs targeting MEK and AKT. With the inclusion of such data, the authors would make a very compelling argument that indeed MEK/AKT signalling is driving the phenotype.
  
  We thank the reviewer for highlighting this point and we agree that no chemical inhibitor is 100% specific. In this study, we used chemical inhibitors to provide further supportive data indicating that FOXP2 confers oncogenic effects by activating MET signaling. We characterized a FOXP2-binding fragment located in MET and HGF in LNCaP prostate cancer cells by utilizing the CUT&Tag method. We also found that MET restoration partially reversed oncogenic phenotypes in FOXP2-KD prostate cancer cells. All these data consistently support the conclusion that FOXP2 activates MET signaling in prostate cancer. Please refer to the "Answer to Essential Revisions #2 from the Editors" and to the "Answer to Essential Revisions #7 from the Editors" for details.
  
  4) With the FOXP2-CPED1 fusion being more stable as compared to wild-type transcripts, wouldn't one expect the fusion to have a more severe phenotype? This is a very exciting aspect of the start of the study, but it is not explored further in the manuscript. The authors would ideally elaborate on why the effects of the FOXP2-CPED1 fusion seem comparable to the FOXP2 wildtype, in their studies.
  
  We thank the reviewer very much for the comment. We quantified the number of colonies of FOXP2- and FOXP2-CPED1-overexpressing cells, and we found that wild-type FOXP2 and FOXP2-CPED1 had a comparable putative functional influence on the transformation of human prostate epithelial cells RWPE-1 and mouse primary fibroblasts NIH3T3 (P = 0.69, by Fisher's exact test for RWPE-1; P = 0.23, by Fisher's exact test for NIH3T3). We added the corresponding description to the Results section in Line 487 on Page 22 in the tracked changes version of the revised manuscript. Please refer to the "Answer to Essential Revisions #5 from the Editors" for details.
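  For readers unfamiliar with the test named above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table, written with the Python standard library only. The response does not give the underlying contingency tables, so the layout of the table and the counts used here are hypothetical and do not reproduce the reported P values.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table at least as unlikely as the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical table: rows = two constructs, columns = two outcome categories.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

  A large p-value from such a test indicates only that the data do not show a difference between the two groups; it does not by itself demonstrate that the groups are equivalent.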
  
  5) The authors claim that FOXP2 functions as an oncogene, but the most-severe phenotype that is observed in vivo, is PIN lesions, not tumors. While this is an exciting observation, it is not the full story of an oncogene. Can the authors justifiably claim that FOXP2 is an oncogene, based on these results?
  
  We appreciate the comment, and we made the corresponding revision in the revised manuscript. Please refer to the "Answer to Essential Revisions #3 from the Editors" for details.
  
  6) The clinical and phenotypic observations are exciting and relevant. The mechanistic insights of the study are quite limited in the current stage. How does FOXP2 give its phenotype, and result in increased MET phosphorylation? The association is there, but it is unclear how this happens.
  
  We appreciate this valuable suggestion. In the current study, we used the CUT&Tag method to explore how FOXP2 activated MET signaling in LNCaP prostate cancer cells, and we identified potential FOXP2-binding fragments in MET and HGF. Therefore, we proposed that FOXP2 activates MET signaling in prostate cancer through its binding to MET and MET-associated genes. Please refer to the "Answer to Essential Revisions #2 from the Editors" for details.
  
  Reviewer #2 (Public Review):
  
  1) The manuscript entitled "FOXP2 confers oncogenic effects in prostate cancer through activating MET signalling" by Zhu et al describes the identification of a novel FOXP2-CPED1 gene fusion in 2 out of 100 primary prostate cancers. A byproduct of this gene fusion is the increased expression of FOXP2, which has been shown to be increased in prostate cancer relative to benign tissue. These data nominated FOXP2 as a potential oncogene. Accordingly, overexpression of FOXP2 in nontransformed mouse fibroblast NIH-3T3 and human prostate RWPE-1 cells induced transforming capabilities in both cell models. Mechanistically, convincing data were provided that indicate that FOXP2 promotes the expression and/or activity of the receptor tyrosine kinase MET, which has previously been shown to have oncogenic functions in prostate cancer. Notably, the authors create a new genetically engineered mouse model in which FOXP2 is overexpressed in the prostatic luminal epithelial cells. Overexpression of FOXP2 was sufficient to promote the development of prostatic intraepithelial neoplasia (PIN), a suspected precursor to prostate adenocarcinoma, and activate MET signaling.
  
  Strengths:
  
  This study makes a convincing case for FOXP2 as 1) a promoter of prostate cancer initiation and 2) an upstream regulator of pro-cancer MET signaling. This was done using both overexpression and knockdown models in cell lines and corroborated in new genetically engineered mouse models (GEMMs) of FOXP2 or FOXP2-CPED1 overexpression in prostate luminal epithelial cells as well as publicly available clinical cohort data.
  
  Major strengths of the study are the demonstration that FOXP2 or FOXP2-CPED1 overexpression transforms RWPE-1 cells to now grow in soft agar (hallmark of malignant transformation) and the creation of new genetically engineered mouse models (GEMMs) of FOXP2 or FOXP2-CPED1 overexpression in prostate luminal epithelial cells. In both mouse models, FOXP2 overexpression increased the incidence of PIN lesions, which are thought to be a precursor to prostate cancer. While FOXP2 alone was not sufficient to cause prostate cancer in mice, it is acknowledged that single gene alterations causing prostate cancer in mice are rare. Future studies will undoubtedly want to cross these GEMMs with established, relatively benign models of prostate cancer such as Hi-Myc or Pb-Pten mice to see if FOXP2 accelerates cancer progression (beyond the scope of this study).
  
  We appreciate these positive comments from the reviewer. We agree with the suggestion from the reviewer that it is worth exploring whether FOXP2 is able to cooperate with a known disease driver to accelerate the progression of prostate cancer. Therefore, we are going to cross Pb-FOXP2 transgenic mice with Pb-Pten KO mice to assess if FOXP2 is able to accelerate malignant progression.
  
  2) Weaknesses: It is unclear why the authors decided to use mouse fibroblast NIH3T3 cells for their transformation studies. In this regard, it appears likely that FOXP2 could function as an oncogene across diverse cell types. Given the focus on prostate cancer, it would have been preferable to corroborate the RWPE-1 data with another prostate cell model and test FOXP2's transforming ability in RWPE-1 xenograft models. To that end, there is no direct evidence that FOXP2 can cause cancer in vivo. The GEMM data, while compelling, only shows that FOXP2 can promote PIN in mice and the lone xenograft model chosen was for fibroblast NIH-3T3 cells.
  
  To determine the oncogenic activity of FOXP2 and the FOXP2-CPED1 fusion, we initially used mouse primary fibroblast NIH3T3 for transformation experiments, because NIH3T3 cells are a widely used cell model to discover novel oncogenes (refs. 2, 3, 10, 11). Subsequently, we observed that overexpression of FOXP2 and its fusion variant drove RWPE-1 cells to lose the characteristic contact inhibition response, led to their anchorage-independent growth in vitro, and promoted PIN in the transgenic mice. During preparation of the revised manuscript, we tested the transformation ability of FOXP2 and FOXP2-CPED1 in RWPE-1 xenograft models. We subcutaneously injected 2 × 10⁶ RWPE-1 cells into the flanks of NOD-SCID mice. The NOD-SCID mice were divided into five groups (n = 5 mice in each group): control, FOXP2-overexpressing (two stable cell lines) and FOXP2-CPED1-overexpressing (two cell lines) groups. The experiment lasted for 4 months. We observed that no RWPE-1 cell-injected mice developed tumor masses. We propose that FOXP2 and its fusion alone are not sufficient to generate the microenvironment suitable for RWPE-1-xenograft growth. Collectively, our data suggest that FOXP2 has oncogenic potential in prostate cancer, but is not sufficient to act alone as an oncogene.
  
  3) There is a limited mechanism of action. While the authors provide correlative data suggesting that FOXP2 could increase the expression of MET signaling components, it is not clear how FOXP2 controls MET levels. It would be of interest to search for and validate the importance of potential FOXP2 binding sites in or around MET and the genes of MET-associated proteins. At a minimum, it should be confirmed whether MET is a primary or secondary target of FOXP2. The authors should also report on what happened to the 4-gene MET signature in the FOXP2 knockdown cell models. It would be equally significant to test if overexpression of MET can rescue the anti-growth effects of FOXP2 knockdown in prostate cancer cells (positive or negative results would be informative).
  
  We appreciate all the valuable comments. As suggested, we performed the corresponding experiments; please refer to the "Answer to Essential Revisions #2 from the Editors", to the "Answer to Essential Revisions #6 from the Editors", and to the "Answer to Essential Revisions #7 from the Editors" for details.
  
  Reviewer #3 (Public Review):
  
  1) In this manuscript, the authors present data supporting FOXP2 as an oncogene in PCa. They show that FOXP2 is overexpressed in PCa patient tissue and is necessary and sufficient for PCa transformation/tumorigenesis depending on the model system. Overexpression and knock-down of FOXP2 lead to an increase/decrease in MET/PI3K/AKT transcripts and signaling and sensitizes cells to PI3K/AKT inhibition.
  
  Key strengths of the paper include multiple endpoints and model systems, an over-expression and knock-down approach to address sufficiency and necessity, a new mouse knock-in model, analysis of primary PCa patient tumors, and benchmarking findings against publicly available data. The central discovery that FOXP2 is an oncogene in PCa will be of interest to the field. However, there are several critically unanswered questions.
  
  1) No data are presented for how FOXP2 regulates MET signaling. ChIP would easily address if it is direct regulation of MET and analysis of FOXP2 ChIP-seq could provide insights.
  
  2) Beyond the 2 fusions in the 100 PCa patient cohort it is unclear how FOXP2 is overexpressed in PCa. In the discussion and in FS5 some data are presented indicating amplification and CNAs, however, these are not directly linked to FOXP2 expression.
  
  3) There are some hints that full-length FOXP2 and the FOXP2-CPED1 function differently. In SF2E the size/number of colonies between full-length FOXP2 and fusion are different. If the assay was run for the same length of time, then it indicates different biologies of the overexpressed FOXP2 and FOXP2-CPED1 fusion. Additionally, in F3E the sensitization is different depending on the transgene.
  
  We appreciate these valuable comments and constructive remarks. As suggested, we performed the CUT&Tag experiments to detect the binding of FOXP2 to MET, and to examine the association of CNAs of FOXP2 with its expression. Please refer to the "Answer to Essential Revisions #2 from the Editors" and the "Answer to Essential Revisions #4 from the Editors" for details. We also added detailed information to show the resemblance observed between FOXP2 fusion- and wild-type FOXP2-overexpressing cells. We added the corresponding description to the Results section in Line 487 on Page 22 in the tracked changes version of the revised manuscript. Please refer to the "Answer to Essential Revisions #5 from the Editors" for details.
  
  2) The relationship between FOXP2 and AR is not explored, which is important given 1) the critical role of the AR in PCa; and 2) the existing relationship between the AR and FOXP2 and other FOX gene members.
  
  We thank the reviewer very much for highlighting this point. We agree that it is important to examine the relationship between FOXP2 and AR. We therefore analyzed the expression dataset of 255 primary prostate tumors from TCGA and observed that the expression of FOXP2 was significantly correlated with the expression of AR (Spearman's ρ = 0.48, P < 0.001) (Figure 1. a). Next, we observed that both FOXP2- and FOXP2-CPED1-overexpressing 293T cells had a higher AR protein abundance than control cells (Figure 1. b). In addition, shRNA-mediated FOXP2 knockdown in LNCaP cells resulted in a decreased AR protein level compared to that in control cells (Figure 1. c). However, we analyzed our CUT&Tag data and observed no binding of FOXP2 to AR (Figure 1. d). Our data suggest that FOXP2 might be associated with AR expression.
  
  Figure 1. a. AR expression in a human prostate cancer dataset (TCGA, Prostate Adenocarcinoma, Provisional; n = 493) classified by FOXP2 expression level (bottom 25%, low expression, n = 120; top 25%, high expression, n = 120; negative expression, n = 15). P values were calculated by the Mann-Whitney U test. The correlation between FOXP2 and AR expression was evaluated by determining the Spearman's rank correlation coefficient. b. Immunoblot analysis of the expression levels of AR in 293T cells with overexpression of FOXP2 or FOXP2-CPED1. c. Immunoblot analysis of the expression levels of AR in LNCaP cells with stable expression of the scrambled vector or FOXP2 shRNA. d. CUT&Tag analysis of FOXP2 association with the promoter of AR. Representative track of FOXP2 at the AR gene locus is shown.
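  The Spearman's rank correlation coefficient mentioned above can be sketched as the Pearson correlation of the ranks, with tied values receiving the average of their ranks. This is a minimal standard-library illustration, not the authors' analysis; the function names and the expression values are hypothetical and unrelated to the TCGA data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical expression values for two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 1.0, 4.0, 3.0, 5.0]
print(spearman_rho(gene_a, gene_b))
```

  Because only ranks enter the calculation, the coefficient captures monotonic association and is unaffected by monotonic transformations of the expression values such as a log transform.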
  
  References

  1. Mayr C, Bartel DP. Widespread shortening of 3'UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell. 2009 Aug 21;138(4):673-84.

  2. Gara SK, Jia L, Merino MJ, Agarwal SK, Zhang L, Cam M et al. Germline HABP2 Mutation Causing Familial Nonmedullary Thyroid Cancer. N Engl J Med. 2015 Jul 30;373(5):448-55.

  3. Kohno T, Ichikawa H, Totoki Y, Yasuda K, Hiramoto M, Nammo T et al. KIF5B-RET fusions in lung adenocarcinoma. Nat Med. 2012 Feb 12;18(3):375-7.

  4. Chen F, Byrd AL, Liu J, Flight RM, DuCote TJ, Naughton KJ et al. Polycomb deficiency drives a FOXP2-high aggressive state targetable by epigenetic inhibitors. Nat Commun. 2023 Jan 20;14(1):336.

  5. Kaya-Okur HS, Wu SJ, Codomo CA, Pledger ES, Bryson TD, Henikoff JG et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat Commun. 2019 Apr 29;10(1):1930.

  6. Spiteri E, Konopka G, Coppola G, Bomar J, Oldham M, Ou J et al. Identification of the transcriptional targets of FOXP2, a gene linked to speech and language, in developing human brain. Am J Hum Genet. 2007 Dec;81(6):1144-57.

  7. Lai CS, Fisher SE, Hurst JA, Vargha-Khadem F, Monaco AP. A forkhead-domain gene is mutated in a severe speech and language disorder. Nature. 2001 Oct 4;413(6855):519-23.

  8. Hannenhalli S, Kaestner KH. The evolution of Fox genes and their role in development and disease. Nat Rev Genet. 2009 Apr;10(4):233-40.

  9. Shu W, Yang H, Zhang L, Lu MM, Morrisey EE. Characterization of a new subfamily of winged-helix/forkhead (Fox) genes that are expressed in the lung and act as transcriptional repressors. J Biol Chem. 2001 Jul 20;276(29):27488-97.

  10. Wang C, Liu H, Qiu Q, Zhang Z, Gu Y, He Z. TCRP1 promotes NIH/3T3 cell transformation by over-activating PDK1 and AKT1. Oncogenesis. 2017 Apr 24;6(4):e323.

  11. Suh YA, Arnold RS, Lassegue B, Shi J, Xu X, Sorescu D et al. Cell transformation by the superoxide-generating oxidase Mox1. Nature. 1999 Sep 2;401(6748):79-82.
  

URL

biorxiv.org/content/10.1101/2022.07.06.498943v1
www.biorxiv.org www.biorxiv.org

The Oncoprotein BCL6 Enables Cancer Cells to Evade Genotoxic Stress

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #2 (Public Review):
  
  In the paper entitled "The Oncoprotein BCL6 Enables Cancer Cells to Evade Genotoxic Stress", through comparing transcriptional profilings of ETO sensitive versus resistant tumor cell lines, the authors found that BCL6 was selectively upregulated in ETO-resistant tumor cells, and their further in vitro and in vivo data suggest that Bcl6 upregulation via the IFN-STAT1-Bcl6 axis conferred tumor resistance to genotoxic stress, and targeting Bcl6 significantly improved therapeutic efficacy of ETO/ADR in mouse tumor models.
  
  Their findings are interesting and may inspire new combinational therapeutic strategy in treating chemotherapy resistant cancers, although a number of issues remain to be further clarified.
  
  Major concerns:
  
  Using in vitro assays, the authors defined a panel of tumor cell lines that are resistant or sensitive to genotoxic agents (ETO, ADR, etc.), and indicated that the resistance was caused by BCL6 upregulation. It was expected that, in the follow-on animal studies, the authors would choose tumor cell lines with clearly defined phenotypes characterized in their study. But this was not the case. For example, in Fig S2C and Fig 7B, the authors used an ambiguous tumor cell line, HCT116, to test ETO resistance, which had only a borderline level of resistance to ETO (Fig 1A) yet was sensitive to ADR (Fig S1A), whereas in Fig 2H, the authors chose a tumor cell line (MCF7) not examined in their study, instead of the highly ETO-resistant tumor cell lines H661/Capan-2 or the highly ADR-resistant cell lines DLD-1/H836.
  
  We thank the reviewer very much for these insightful comments.
  
  (1) We sincerely agree with the reviewer that our experiments should be carried out using cell lines that possess a clear and potent resistance phenotype. However, some resistant cell lines (e.g., H661 and Capan-2) form tumors poorly in mice, according to the published literature and our own experience. That is why we initially chose the resistant cell line HCT116 for animal studies. To follow the reviewer’s suggestion and further validate our findings, in our revised manuscript we additionally set up a tumor xenograft mouse model using PANC28 cells, which are more resistant to etoposide than HCT116 cells. Our new data consistently showed that BCL6 abundance in PANC28 xenografts was clearly increased by etoposide treatment and, as expected, BCL6 knockdown significantly sensitized the xenografts to etoposide. We have added these new data in Figure 2D, Figure 2-figure supplement 1C and Figure 7C of our revised manuscript.
  
  (2) Moreover, we also tested the in vitro sensitizing effects of BCL6 knockdown to etoposide and doxorubicin using Capan-2 and H838 cells that are much more resistant to genotoxic agents. As expected, our results showed that BCL6 genetic knockdown attenuated the clonogenic growth of these cells in the presence of etoposide or doxorubicin. We are sorry that we can't supplement all these figures in our revised manuscript due to limited space. We have added the Capan-2 data in our revised manuscript (Figure 2E).
  
  (3) In the previous version of our manuscript, we analyzed published datasets (Biomed Pharmacother. 2014 May;68(4):447-53; PLoS One. 2012;7(9):e45268), and found that BCL6 upregulation was also observed in cells with acquired chemoresistance (MCF7/ETO and A2780/ADR; Figure 1E). We further examined the inhibitory action of BCL6 silencing in the acquired chemo-resistant MCF7/ADR cells that we generated previously in our laboratory. Our results showed that BCL6 interference was sufficient to suppress the growth of MCF7/ADR cells. However, to keep the cell lines consistent across the experimental panels in our study, we decided to remove the MCF7/ADR proliferation data from our revised manuscript.
  
  Fig 3, the concept of tumor cell expressing IFNa/IFNg conferring genotoxic resistance sounds very interesting and novel, but the authors only tested IFNa/g expression at transcriptional level, protein expression data should be also provided.
  
  We appreciate the reviewer’s comments.
  
  In our study, we have examined the protein contents of IFN-α and IFN-γ using an ELISA assay. Our results showed that etoposide treatment led to a significant increase in IFN-α and IFN-γ contents in resistant cells. The results were expressed as fold change over the untreated control (Figure 3, H-I). We have revised the related figure legends to make it clearer to readers.
  
  Fig 3F-3I, ETO-induced interferon response should be examined comprehensively in different tumor cell lines as listed in Fig 1A/2A. Similarly, effect of exogenous IFNa/IFNg on ETO-resistance should be also examined comprehensively in both sensitive or resistant tumor cell lines. In addition, the effect of blocking IFNg/IFNa on ETO-resistance should be also tested in different tumor cell lines. These data are extremely useful for extending or strengthening the broad impact or influence of their findings.
  
  We appreciate the reviewer’s suggestion.
  
  We agree that more cell lines should be examined in the context of exogenous addition of IFNα/IFNγ or IFNα/IFNγ blockade. However, it is hard for us to test all the cell lines listed in Figure 1A/2A. In our revised manuscript, we expanded the cell line panel for this part and added new data, as listed below.
  
  (1) In addition to the sensitive cell line H522 that has been already shown in our previous manuscript, we further tested PC9 cells and consistently found that exogenous addition of IFN-α and IFN-γ also protected PC9 cells from etoposide-induced cell death.
  
  (2) In addition to the resistant cell line Capan-2 that has been already shown in our previous manuscript, we further tested H838 cells and consistently found that knockdown of the IFN-α receptor IFNAR1 led to an enhanced sensitivity of H838 cells to etoposide, as indicated by decreased IC50 values of etoposide and impaired clonogenic growth of H838 cells compared with the control group.
  
  (3) In addition to the resistant cell line PANC28 that has been already shown in our previous manuscript, we further employed Capan-2 and H838 cells and consistently found that antibodies against IFN-γ also increased the killing ability of etoposide towards these resistant cells.
  
  We are sorry that we can't supplement all these figures in our revised manuscript due to limited space. We have added the Capan-2 data in our revised manuscript (Figure 3O and Figure 3-figure supplement 1F).
  
  Fig 4A-L, the authors examined activation of IFN-STAT1-Bcl6 axis in tumor cells in different angles via different approaches, but using different tumor cell lines in different panels of experiments, making it quite annoying and difficult to judge their findings across different tumor cell lines. At least, ETO or IFNa/IFNg induced STAT1 upregulation and its phosphorylation should be examined comprehensively in both resistant and sensitive tumor cell lines.
  
  We thank the reviewer very much for this helpful comment.
  
  We apologize for the inconsistent cell lines used in our previous manuscript. We have employed consistent cell lines across the experimental panels and supplemented additional data in our revised manuscript. We chose the chemo-resistant cell lines Capan-2, PANC28, H838 and HCT116 for mechanistic studies and, correspondingly, employed the chemo-sensitive cell lines H522, PC9 and PANC-1 for comparison in certain assays.
  
  As suggested by the reviewer, we tested more cell lines to further elucidate the IFN-STAT1-Bcl6 axis. Our results showed that etoposide treatment increased STAT1 abundance and its phosphorylated levels in etoposide-resistant Capan-2, PANC28 and H838 cells, but not in sensitive H522, PC9 and PANC-1 cells. Additionally, treatment with IFN-α and IFN-γ led to a significant simultaneous increase in STAT1, phosphorylated STAT1 and BCL6 expression in the same resistant cell panel.
  
  We have supplemented the new data in Figure 4A and Figure 4, C-F of our revised manuscript.
  

URL

biorxiv.org/content/10.1101/2021.06.15.448559v1
www.biorxiv.org www.biorxiv.org

Motor planning under uncertainty

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  In this paper, Alhussein and Smith set out to determine whether motor planning under uncertainty (when the exact goal is unknown before the start of the movement) results in motor averaging (average between the two possible motor plans) or in performance optimization (one movement that maximizes the probability of successfully reaching to one of the two targets). Extending previous work by Haith et al. with two new, cleanly designed experiments, they show that performance optimization provides a better explanation of motor behaviour under uncertainty than the motor averaging hypothesis.
  
  We thank the reviewer for the kind words.
  
  1) The main caveat of experiment 1 is that it rules out one particular extreme version of the movement averaging idea- namely that the motor programs are averaged at the level of muscle commands or dynamics. It is still consistent with the idea that the participant first average the kinematic motor plans - and then retrieve the associated force field for this motor plan. This idea is ruled out in Experiment 2, but nonetheless I think this is worth adding to the discussion.
  
  This is a good point, and we have now included it in the paper as suggested – both in motivating the need for Expt 2 in the Results section and when interpreting the results of Expt 1 in the Discussion section.
  
  2) The logic of the correction for variability between the one-target and two-target trials in Formula 2 is not clear to me. It is likely that some of the variability in the two-target trials arises from the uncertainty in the decision - i.e. based on recent history one target may internally be assigned a higher probability than the other. This is variability the optimal controller should know about and therefore discard in the planning of the safety margin. How big was this correction factor? What is the impact when the correction is dropped ?
  
  Short Answer:
  
  (1) If decision uncertainty contributed to motor variability on 2-target trials as suggested, 2-target trials should display greater motor variability than 1-target trials. However, 1-target and 2-target trials display levels of motor variability that are essentially equal – with a difference of less than 1% overall, as illustrated in Fig R2, indicating that decision uncertainty, if present, has no clear effect on motor variability in our data.
  
  (2) The sigma2/sigma1 correction factor is, therefore, very close to 1, with an average value of 1.00 or 1.04 depending on how it’s computed. Thus, dropping it has little impact on the main result as shown in Fig R1.
  
  Longer, more detailed, answer:
  
  We agree that it could be reasonable to think that if motor variability on 2-target trials were consistently higher than that on 1-target trials, then the additional variability seen on 2-target trials might result from uncertainty in the decision, which should not affect safety margins if the optimal controller knew about this variability. However, detailed analysis of our data suggests that this is not the case. We present several analyses below that flesh this out.
  
  We apologize in advance that the response we provide to this seemingly straightforward comment is so lengthy (4+ pages!), especially since capitulating to the reviewer’s assertion that “correction” for the motor variability differences between 1 & 2-target trials should be removed from our analysis would make essentially no difference in the main result, as shown in Fig R1 above. Note that the error bars on the data show 95% confidence intervals. However, taking the difference in motor variability (or more specifically, its ratio) between 1-target and 2-target trials into account is crucial for understanding inter-individual differences in motor responses in uncertain conditions. As this reviewer (and reviewer 2) points out below, we did a poor job of presenting the inter-individual differences analysis in the original version of this paper, but we have improved both the approach and the presentation in the current revision, and we think that this analysis is important, despite being secondary to the main result about the group-averaged findings.
  
  Therefore, we present analyses here showing that it is unlikely that decision uncertainty accounts for the individual-participant variability differences we observe between 1-target and 2-target trials in our experiments (Fig R2). Instead, we show that the variability differences we observe in different conditions for individual participants are due to (largely idiosyncratic) spatial differences in movement direction (Fig R3), which when taken into account, afford a clearly improved ability to predict the size of the safety margins around the obstacles, both in 1-target trials where there is no ‘decision’ to be made (Figs R4-R6) and in 2-target trials (Figs R5-R6).
  
  Variability is, on average, nearly identical on 1-target & 2-target trials, indicating no measurable decision-related increase in variability on 2-target trials
  
  At odds with the idea that decision uncertainty is responsible for a meaningful fraction of the 2-target trial variability that we measure, we find that motor variability on 2-target trials is essentially unchanged from that on one-target trials overall as shown in Fig R2 (error bars show 95% confidence intervals). This is the case for both the data from Expt 2a (6.59±0.42° vs 6.70±0.96°, p > 0.8), and for the critical data from Expt 2b that was designed to dissociate the MA hypothesis from the PO hypothesis (4.23 ±0.17° vs 4.23±0.27°, p > 0.8 for the data from Expt 2b), as well as when the data from Expts 2a-b are pooled (4.78±0.24° vs 4.81±0.35°, p > 0.8). Note that the nominal difference in motor variability between 1-target and 2-target trials was just 1.7% in the Expt 2a data, 0.1% in the Expt 2b data, and 0.6% in the pooled data. This suggests little to no overall contribution of decision uncertainty to the motor variability levels we measured in Expt 2.
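  
  As a quick check on the percentages quoted above, the following minimal sketch recomputes the nominal differences from the rounded standard deviations given in this response. The helper name and the choice of the first-listed value as the baseline are assumptions made for illustration; the response does not state which value was used as the denominator.

```python
def percent_difference(baseline, other):
    """Absolute difference between two values, as a percentage of the baseline."""
    return abs(other - baseline) / baseline * 100.0

# Rounded standard deviations (degrees) quoted in the response.
# Assumption: the first value of each pair is treated as the baseline.
reported = {
    "Expt 2a": (6.59, 6.70),
    "Expt 2b": (4.23, 4.23),
    "pooled": (4.78, 4.81),
}

differences = {name: percent_difference(a, b) for name, (a, b) in reported.items()}
for name, pct in differences.items():
    print(f"{name}: {pct:.1f}%")
```

  With these rounded inputs the sketch gives 1.7% for Expt 2a and 0.6% for the pooled data, matching the quoted figures. It gives 0.0% for Expt 2b, whereas the response quotes 0.1%; that value presumably comes from the unrounded data, which are not available here.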
  
  Correspondingly, the sigma2/sigma1 ‘correction factor’ (which serves to scale the safety margin observed on 1-target trials up or down based on increased or decreased motor variability on 2-target trials) is close to 1. Specifically, this factor is 1.01±0.13 (mean±SEM) for Expt 2a and 1.04±0.09 for Expt 2b, if measured as mean(sigma2i/sigma1i), where sigma1i and sigma2i are the SDs of the initial movement directions on 1-target and 2-target trials. This factor is 1.02 for Expt 2a and 1.00 for Expt 2b, if instead measured as mean(sigma2i)/mean(sigma1i), and thus in either case, dropping it has little effect on the main population-averaged results for Expt 2 presented in Fig 4b in the main paper. Fig R1 shows versions of the PO model predictions in Fig 4b computed with or without dropping the sigma2/sigma1 ‘correction factor’ that the reviewer asks about. These with vs without versions are quite similar for the results from both Expt 2a and Expt 2b. In particular, the comparison between our experimental data and the population-average-based model predictions for the MA vs the PO hypotheses shows highly significant differences between the abilities of the MA and PO models to explain the experimental data in Expt 2b (Fig R1, right panel), whether or not the sigma2/sigma1 correction is included for the comparison between MA and PO predictions (p<10^-13 whether or not the sigma2/sigma1 term is included, p=4.31×10^-14 with it vs p=4.29×10^-14 without it). Analogously, for Expt 2a (where we did not expect to show meaningful differences between the MA and PO model predictions), we also find highly consistent results when the sigma2/sigma1 term is included vs not (Fig R1, left panel) (p=0.37 for the comparison between PO and MA predictions with the sigma2/sigma1 term included vs 0.38 without it).
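  
  The two ways of computing the correction factor described above can be illustrated with a short sketch. The data here are synthetic, and the participant count, the SD range, and the helper `scale_margin` are all hypothetical choices for illustration; none of this reproduces the authors' analysis code or data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_participants = 26  # hypothetical sample size, chosen only for illustration

# Synthetic per-participant SDs of initial movement direction (degrees).
sigma1 = rng.uniform(3.0, 6.0, size=n_participants)  # 1-target trials
sigma2 = rng.uniform(3.0, 6.0, size=n_participants)  # 2-target trials

# Variant 1: average the per-participant ratios, mean(sigma2i / sigma1i).
mean_of_ratios = np.mean(sigma2 / sigma1)

# Variant 2: ratio of the group means, mean(sigma2i) / mean(sigma1i).
ratio_of_means = np.mean(sigma2) / np.mean(sigma1)

def scale_margin(margin_1target, factor):
    """Scale a 1-target safety margin by a variability ratio (hypothetical helper)."""
    return margin_1target * factor

print(f"mean of ratios: {mean_of_ratios:.3f}")
print(f"ratio of means: {ratio_of_means:.3f}")
print(f"10 deg margin scaled by ratio of means: {scale_margin(10.0, ratio_of_means):.2f} deg")
```

  The two variants generally give slightly different numbers because averaging ratios weights each participant equally, while the ratio of means is dominated by participants with larger variability. When both are close to 1, as reported above, scaling the margin by either one changes it very little.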
  
  Analysis of left-side vs right-side 1-target trial data indicates the existence of participant-specific spatial patterns of variability.
  
  With the participant-averaged data showing almost identical levels of motor variability on 1-target and 2-target trials, it is not surprising that about half of participants showed nominally greater variability on 1-target trials and about half showed nominally greater variability on 2-target trials. What was somewhat surprising, however, was that 16 of the 26 individual participants in Expt 2b displayed significantly higher variability in one condition or the other at α=0.05 (and 12/26 at α=0.01). Why might this be the case? We found an analogous result when breaking down the 1-target trial data into +30° (right-target) and -30° (left-target) trials that could offer an explanation. Note that the 2-target trial data come from intermediate movements toward the middle of the workspace, whereas the 1-target trial data come from right-side or left-side movements that are directed even more laterally than the +30° or -30° targets themselves (the average movement directions to these obstacle-obstructed lateral targets were +52.8° and -49.0°, respectively, in the Expt 2b data, see Fig 4a in the main paper for an illustration). Given the large separation between 1 & 2-target trials (~50°) and between left and right 1-target trials (~100°), differences in motor variability would not be surprising. The analyses illustrated in Figs R3-R6 show that these spatial differences indeed have large intra-individual effects on movement variability (Fig R3) and, critically, a large subsequent effect on the ability to predict the safety margin observed in one movement direction from motor variability observed at another (Figs R4-R6).
  
  Fig R3 shows evidence for intra-individual direction-dependent differences in motor variability, obtained by looking at the similarity between within-participant spatially-matched (e.g. left vs left or right vs right, Fig R3a) compared to spatially-mismatched (left vs right, Fig R3b) motor variability across individuals. To perform this analysis fairly, we separated the 60 left-side obstacle 1-target trial movements for each participant into those from odd-numbered vs even-numbered trials (30 each) to be compared. And we did the same thing for the 60 right-side obstacle 1-target trial movements. Fig R3a shows that there is a large (r=+0.70) and highly significant (p<10^-6) across-participant correlation between the variability measured in the spatially-matched case, i.e. for the even vs odd trials from same-side movements, indicating that the measurement noise for measuring movement variability using n=30 movements (movement variability was measured by standard deviation) did not overwhelm inter-individual differences in movement variability.
  
  The strength of this correlation would increase/decrease if we had more/less data from each individual because that would decrease/increase the noise in measuring each individual’s variability. Therefore, to be fair, we maintained the same number of data points for each variability measurement (n=30) for the spatially-mismatched cases shown in Fig R3b and R3c. The strong positive relationship between odd-trial and even-trial variability across individuals that we observed in the spatially-matched case is completely obscured when the target direction is not controlled for (i.e. not maintained) within participants, even though left-target and right-target movements are randomly interspersed. In particular, Fig R3b shows that there remains only a small (r=+0.09) and non-significant (p>0.5) across-participant correlation between the variability measured for the even vs odd trials from opposite-side movements that have movement directions separated by ~100°. This indicates that idiosyncratic intra-individual spatial differences in motor variability are large and can even outweigh inter-individual differences in motor variability seen in Fig R3a. Fig R3c shows that an analogous effect holds between the laterally-directed 1-target trials and the more center-directed 2-target trials that have movement directions separated by ~50°. In this case, the correlation that remains when the target direction is not maintained within participants is also near zero (r=-0.13) and non-significant (p>0.3). It is possible that some other difference between 1-target & 2-target trials might also be at play here, but there is unlikely to be a meaningful effect from decision variability given the essentially equal group-average variability levels (Fig R2).
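  
  The odd/even split-half comparison described above can be sketched on synthetic data as follows. This is only an illustration of the logic, not the authors' analysis: the participant count, the SD ranges, and the assumption that each simulated participant's left-side and right-side variability are drawn independently are all hypothetical.

```python
import numpy as np

def split_half_sd(trials):
    """Return the sample SDs of the odd- and even-numbered trials (1-based numbering)."""
    odd = trials[0::2]   # trials 1, 3, 5, ...
    even = trials[1::2]  # trials 2, 4, 6, ...
    return np.std(odd, ddof=1), np.std(even, ddof=1)

rng = np.random.default_rng(1)
n_participants, n_trials = 200, 60  # 60 trials per side gives 30 per half

# Assumption: left- and right-side variability are independent within a participant.
true_sd_left = rng.uniform(2.0, 8.0, n_participants)
true_sd_right = rng.uniform(2.0, 8.0, n_participants)

left_odd, left_even, right_even = [], [], []
for sd_l, sd_r in zip(true_sd_left, true_sd_right):
    left_trials = rng.normal(0.0, sd_l, n_trials)
    right_trials = rng.normal(0.0, sd_r, n_trials)
    sd_left_odd, sd_left_even = split_half_sd(left_trials)
    _, sd_right_even = split_half_sd(right_trials)
    left_odd.append(sd_left_odd)
    left_even.append(sd_left_even)
    right_even.append(sd_right_even)

# Spatially matched: same side, odd vs even trials.
r_matched = np.corrcoef(left_odd, left_even)[0, 1]
# Spatially mismatched: opposite sides, same number of trials per estimate.
r_mismatched = np.corrcoef(left_odd, right_even)[0, 1]

print(f"matched r = {r_matched:.2f}, mismatched r = {r_mismatched:.2f}")
```

  Under these assumptions the matched correlation is high and the mismatched correlation is near zero, which mirrors the qualitative pattern described for Fig R3a and Fig R3b. Keeping 30 trials per estimate in both cases means the two correlations are affected equally by measurement noise.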
  
  Analysis of left-side vs right-side 1-target trial data indicates that participant-specific spatial patterns of variability correspond to participant-specific spatial differences in safety margins.
  
  Critically, dissection of the 1-target trial data also shows that the direction-dependent differences in motor variability discussed above for right-side vs left-side movements predict direction-dependent differences in the safety margins. In particular, comparison of panels a & b in Fig R4 shows that motor variability, if measured on the same side (e.g. the right-side motor variability for the right-side safety margin), strongly predicts interindividual differences in safety margin (r=0.60, p<0.00001, see Fig R4b). However, motor variability, if measured on the other side (e.g. the right-side motor variability for the left-side safety margin), fails to predict interindividual differences in safety margin (r=0.15, p=0.29, see Fig R4a). These data show that taking the direction-specific motor variability into account, allows considerably more accurate individual predictions of the safety margins used for these movements. In line with that idea, we also find that interindividual differences in the % difference between the motor variability measured on the left-side vs the right-side predicts inter-individual differences in the % difference between the safety margin measured on the left-side vs the right-side as shown in Fig R4c (r=0.52, p=0.006).
  
  Analyses of both 1-target trial and 2-target trial data indicate that participant-specific spatial patterns of variability correspond to participant-specific spatial differences in safety margins.
  
  Not surprisingly, the spatial/directional specificity of the ability to predict safety margins from measurements of motor variability observed in the 1-target trial data in Fig R4 is present in the 2-target data as well. Comparison of panels a-d in Fig R5 shows that motor variability from 1-target and 2-target trial data in Expt 2b strongly predicts interindividual differences in 1-target and 2-target trial safety margins (r=0.72, p=3×10^-5 for the 2-target trial data (see Fig R5d), r=0.59, p=1×10^-3 for the 1-target trial data (see Fig R5a)).
  
  This is the case even though the 1-target and 2-target trial data display essentially equal population-averaged levels of motor variability. However, in Expt 2b, motor variability measured on 1-target trials fails to predict inter-individual differences in the safety margin on 2-target trials (r=0.18, p=0.39, see Fig R5c), and motor variability measured on 2-target trials fails to predict inter-individual differences in the safety margin on 1-target trials (r=-0.12, p=0.55, see Fig R5b). As an aside, note that Fig R5a is similar to Fig R4b in content, in that 1-target trial safety margins are plotted against motor variability levels in both cases. But in Fig R5a, the left- and right-target data are averaged, whereas in Fig R4b the left- and right-target data are both plotted, resulting in 2N data points. Also note that the correlations are similar, r=+0.59 vs r=+0.60, indicating that in both cases the amount of motor variability predicts the size of the safety margin.
  
  A final analysis indicating that the spatial specificity of motor variability rather than the presence of decision variability accounts for the ability to predict safety margins is shown in Fig R6. This analysis makes use of the contrast between Expt 2b (where there is a wide spatial separation (51° on average) between 1-target trials and 2-target trials because participants steer laterally around the Expt 2b 1-target trial obstacles, i.e. away from the center), and Expt 2a (where there is only a narrow spatial separation (10.4° on average) between the movement directions of 1-target trials and 2-target trials because participants steer medially around the Expt 2a 1-target trial obstacles, i.e. toward the center). If the spatial specificity of motor variability drove the ability to predict safety margins (and thus movement direction) on 2-target trials, then such predictions should be noticeably improved in Expt 2a compared to Expt 2b, because the spatial match between 1-target trials and 2-target trials is five-fold better in Expt 2a than in Expt 2b. Fig R6 shows that this is indeed the case. Specifically, comparison of the 3rd and 4th clusters of bars (i.e. the data on the right side of the plot), shows that the ability to predict 2-target trial safety margins from 1-target trial variability and conversely the ability to predict 1-target trial safety margins from 2-target trial variability are both substantially improved in Expt 2a compared to Expt 2b (compare the grey bars in the 4th vs the 3rd clusters of bars).
  
  Moreover, comparison of the 1st and 2nd clusters of bars (i.e. the data on the left side of the plot), shows that the ability to predict left 1-target trial safety margins from right 1-target trial variability and conversely the ability to predict right 1-target trial safety margins from left 1-target trial variability are also both substantially improved in Expt 2a compared to Expt 2b (compare the grey bars in the 1st vs the 2nd clusters of bars). This corresponds to a spatial separation between the movement directions on left vs right 1-target trials of 20.7° on average in Expt 2a in contrast to a much greater 102° in Expt 2b.
  
  The analyses illustrated in Figs R4-R6 make it clear that accurate prediction of interindividual differences in safety margins critically depends on spatially-specific information about motor variability, and we have, therefore, included this information for the analyses in the main paper, as it is especially important for the analysis of inter-individual differences in motor planning presented in Fig 5 of the manuscript.
  
  3) Equation 3 then becomes even more involved and I believe it constitutes somewhat of a distraction from the main story - namely that individual variations in the safety margin in the 1-target obstacle-obstructed movements should lead to opposite correlations under the PO and MA hypotheses with the safety margin observed in the uncertain 2-target movements (see Fig 5e). Given that the logic of the variance-correction factor (pt 2) remains shaky to me, these analyses seem to be quite removed from the main question and of minor interest to the main paper.
  
  The reviewer makes a good point. We agree that the original presentation made Equation 3 seem overly complex and possibly like a distraction as well. Based on the comment above and a number of comments and suggestions from Reviewer 2, we have now overhauled this content – streamlining it and making it clearer, in both motivation and presentation. Please see section 2.2 in the point-by-point response to reviewer 2 for details.
  
  Reviewer #2:
  
  The authors should be commended on the sharing of their data, the extensive experimental work, the experimental design that allows them to get opposite predictions for both hypotheses, and the detailed analyses of their results. Yet, the interpretation of the results should be more cautious as some aspects of the experimental design impose limitations. A thorough sensitivity analysis is missing from experiment 2 as the safety margin seems to be critical to distinguish between both hypotheses. Finally, the readability of the paper could also be improved by limiting the use of abbreviations and motivating some of the analyses further.
  
  We thank the reviewer for the kind words and for their help with this manuscript.
  
  1) The text is difficult to read. This is partially due to the fact that the authors used many abbreviations (MA, PO, IMD). I would get rid of those as much as possible. Sometimes, having informative labels could also help: FFcentral and FFlateral would be better than FFA and FFB.
  
  We have reduced the number of abbreviations used in the paper from 11 to 4 (Expt, FF, MA, PO), and we thank the reviewer for the nice suggestion about changing FFA and FFB to FFLATERAL and FFCENTER. We agree that the suggested terms are more informative and have incorporated them.
  
  2) The most difficult section to follow is the one at the end of the result sections where Fig.5 is discussed. This section consists of a series of complicated analyses that are weakly motivated and explained. This section (starting on line 506) appears important to me but is extremely difficult to follow. I believe that it is important as it shows that, at the individual level, PO is also superior to MA to predict the behavior but it is poorly written and even the corresponding panels are difficult to understand as points are superimposed on each other (5b and e). In this section, the authors mention correcting for Mu1b and correcting for Sig2i/Sig1Ai but I don't know what such correction means. Furthermore, the authors used some further analyses (Eq. 3 and 4) without providing any graphical support to follow their arguments. The link between these two equations is also unclear. Why did the authors used these equations on the pooled datasets from 2a and 2b ? Is this really valid ? It is also unclear why Mu1Ai can be written as the product of R1Ai and Sig1Ai. Where does this come from ?
  
  We agree with the reviewer that this analysis is important, and the previous explanation was not nearly as clear as it could have been. To address this, we have now overhauled the specifics of the content in Figure 5 and the corresponding text – streamlining the text and making it clearer, in both motivation and presentation (see lines 473-545 in the revised manuscript). In addition to the improved text, we have clarified and improved the equations presented for analysis of the ability of the performance optimization (PO) model to explain inter-individual differences in motor planning in uncertain conditions (i.e. on 2-target trials) and have provided more direct graphical support for them. Eq 4 from the original manuscript has been removed, and instead we have expanded our analyses on what was previously Eq 3 (now Eq 5 in the revised manuscript). We have more clearly introduced this equation as a hybrid between using group-averaged predictions and participant-individualized predictions, where the degree of individualization for all parameters is specified with the individuation index 𝑘. For example, a value of 1 for 𝑘 would indicate complete weighting of the individuated model predictors. The equation that follows in the revised manuscript, Eq 6, is a straightforward extension of Eq 5 where each model parameter was instead multiplied by a different individuation index. With this, we now present the partial-R2 statistic associated with each model predictor (see revised Figs 5a and 5e) to elucidate the effect of each. We have, additionally, now plotted the relationships between each of the 3 model predictors and the inter-individual differences that remain when the other two predictors are controlled (see revised Figs 5b-d and Fig 5f-h). These analyses are all shown separately for each experiment, as per the reviewer’s suggestion, in the revised version of Fig 5.
  
  Overall, this section is now motivated and discussed in a more straightforward manner, and now provides better graphical support for the analyses reported in the manuscript. We feel that the revised analysis and presentation (1) more clearly shows the extent to which inter-individual differences in motor planning can be explained by the PO model, and (2) does a better job of breaking down how the individual factors in the model contribute to this. We sincerely thank the reviewer for helping us to make the paper easier to follow and better illustrated here.
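A minimal sketch of the hybrid idea described above, assuming that the individuation index 𝑘 blends each participant's predictor value linearly with the group average. The linear form, the function name `hybrid_predictor`, and the example values are illustrative assumptions; the actual Eq 5 is given only in the revised manuscript.

```python
def hybrid_predictor(individual_values, k):
    """Blend each participant's value with the group mean.

    Assumed form (not taken from the manuscript):
        hybrid = (1 - k) * group_mean + k * individual
    k = 0 gives the purely group-averaged predictor,
    k = 1 gives the fully individuated predictor.
    """
    group_mean = sum(individual_values) / len(individual_values)
    return [(1.0 - k) * group_mean + k * v for v in individual_values]


# Hypothetical per-participant predictor values (made up for illustration).
values = [2.0, 4.0, 9.0]

print(hybrid_predictor(values, 0.0))  # every entry is the group mean
print(hybrid_predictor(values, 0.5))  # halfway between group mean and individual
print(hybrid_predictor(values, 1.0))  # every entry is the individual value
```

Under this assumed form, an extension in the spirit of Eq 6 would simply call the function once per model predictor with a separate index for each.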
  
  3) In experiment 1, does the presence of a central target not cue the participants to plan a first movement towards the center while such a central target was never present in other motor averaging experiment.
  
  Unfortunately, the reviewer is mistaken here, as central target locations were present in several other experiments that advocated for motor averaging which we cite in the paper. The central target was not present on any 2-target trials in our experiments, in line with previous work. It was only present on 1-target center-target trials.
  
  In the adaptation domain, people complain that asking where people are aiming would induce a larger explicit component. Similarly, one could wonder whether training the participants to a middle target would not induce a bias towards that target under uncertainty.
  
  Any “bias” of motor output towards the center target would predict an intermediate motor output which would favor neither model because our experiment designs result in predictions for motor output on different sides of center for 2-target trials in both Expt 1 and Expt 2b. Thus we think any such effect, if it were to occur, would simply reduce the amplitude of the result. However, we found an approximately full-sized effect, suggesting that this is not a key issue.
  
  4) The predictions linked to experiment 2 are highly dependent on the amount of safety margin that is considered. While the authors mention these limitations in their paper, I think that it is not presented with enough details. For instance, I would like to see a figure similar to Fig.4B when the safety margin is varied.
  
  We apologize for any confusion here. The reviewer seems to be under the impression that we can specifically manipulate safety margins around the obstacle in making model predictions for experiment 2. This is, however, not the case for either of the two safety margins in the performance-optimization (PO) modelling. Let us clarify. First, the safety margin on 1-target trials, which serves as input to the PO model, is experimentally measured on obstacle-present 1-target trials, and thus cannot be manipulated. Second, the predicted safety margin on 2-target trials is the output of the PO model and thus cannot be manipulated. There is only one parameter in the main PO model (the one for making the PO prediction for the group-average data presented in Fig 4b, see Eq 4), and that is the motor cost weighting coefficient (𝛽). 𝛽 is implicitly present in Eq 2 as well, fixed at 1/2 in this baseline version of the PO model. It is of course true that changing the motor cost weighting will affect the model output (the predicted 2-trial safety margin), but we do not think that the reviewer is referring to that here, since he or she asks about that directly in section 2.4.4 and in section 2.4.6 below, where we provide the additional analysis requested.
  
  For exp1, it would be good to demonstrate that, even when varying the weight of the two one-target profiles for motor averaging, one never gets a prediction that is close to what is observed.
  
  Here the reviewer is referring to an apparent inconsistency between our analysis of Expts 1 and 2, because in Expt 2 (but not in Expt 1) we examine the effect of varying the relative weight of the two 1-target trials for motor averaging. However, we only withheld this analysis in Expt 1 because it would have little effect. Unlike Expt 2, the measured motor output on left and right 1-target trials in Expt 1 is remarkably similar (see the left panel in Fig R7a below, which is based on Fig 2b from the manuscript). This is because left and right 1-target trials in Expt 1 were adapted to the same FF perturbation (FFLATERAL in both cases), whereas left and right 1-target trials in Expt 2 received very different perturbation levels, because one of these targets was obstacle-obstructed and the other was not. Therefore, varying the relative weightings in Expt 1 would have little effect on the MA prediction, as shown in Fig R7b at right. We now realize that this point was not explained to readers, and we have now modified the text in the results section where the analysis of Expt 1 is discussed in order to include a summary of the explanation offered above. We thank the reviewer for surfacing this.
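The argument above can be illustrated with a short sketch, assuming the MA prediction is a pointwise weighted average of the left and right 1-target force profiles. The function names and all profile values below are hypothetical and are not data from the study.

```python
def ma_prediction(left_profile, right_profile, w_left=0.5):
    """Weighted motor-averaging prediction from two 1-target profiles."""
    if not 0.0 <= w_left <= 1.0:
        raise ValueError("w_left must lie in [0, 1]")
    if len(left_profile) != len(right_profile):
        raise ValueError("profiles must have the same length")
    return [w_left * l + (1.0 - w_left) * r
            for l, r in zip(left_profile, right_profile)]


def max_shift_across_weights(left_profile, right_profile, weights):
    """Largest pointwise change in the prediction relative to equal weighting."""
    baseline = ma_prediction(left_profile, right_profile, 0.5)
    return max(
        abs(p - b)
        for w in weights
        for p, b in zip(ma_prediction(left_profile, right_profile, w), baseline)
    )


weights = [0.0, 0.25, 0.5, 0.75, 1.0]

# Similar left/right profiles (the Expt 1 situation, values made up).
similar = max_shift_across_weights([1.0, 2.0, 1.0], [1.1, 2.1, 0.9], weights)

# Very different left/right profiles (the Expt 2 situation, values made up).
dissimilar = max_shift_across_weights([1.0, 2.0, 1.0], [-1.0, -2.0, -1.0], weights)

print(similar, dissimilar)
```

When the two profiles nearly coincide, the weighted average can move by at most half their pointwise difference, so reweighting barely changes the prediction. When they differ strongly, the prediction depends heavily on the weight.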
  
  It is unclear in the text that the performance optimization prediction simply consists of the force-profile for the center target. The authors should motivate this choice.
  
  We’re a bit unclear about this comment. This specific point is addressed in the first paragraph under the Results section, the second paragraph under the subsection titled “Adaptation to novel physical dynamics can elucidate the mechanisms for motor planning under uncertainty”, the Figure 2 captions, and in the second paragraph under the subsection titled “Adaptation to a multi-FF environment reveals that motor planning during uncertainty occurs via performance-optimization rather than motor averaging”. Direct quotes from the original manuscript are below:
  
  Line 143: “However, PO predicts that these intermediate movements should be planned so that they travel towards the midpoint of the potential targets in order to maximize the probability of final target acquisition. This would, in contrast to MA, predict that intermediate movements incorporate the learned adaptive response to FFB, appropriate for center-directed movements, allowing us to decisively dissociate PO from MA.”
  
  Line 200: “In contrast, PO would predict that participants produce the force pattern (FFB) appropriate for optimizing the planned intermediate movement since this movement maximizes the probability of successful target acquisition5,34 (Fig 1d, right).”
  
  Line 274: “The 2-target trial MA prediction corresponds to the average of the force profiles (adaptive responses) associated with the left and right 1-target EC trials plotted in Fig 2b, whereas the 2-target trial PO prediction corresponds to the force profile associated with the center target plotted in Fig 2b, as this is appropriate for optimizing a planned intermediate movement.”
  
  For the second experiment 2, the authors do not present a systematic sensitivity analysis. Fig. 5a and d is a good first step but they should also fit the data on exp2b and see how this could explain the behavior in exp 2a. Second, the authors should present the results of the sensitivity analysis like they did for the main predictions in Fig.4b.
  
  We thank the reviewer for these suggestions. We have now included a more complete analysis in Fig R8 below, and presented it in the format of Fig 4b as suggested. Please note that we have included the analysis requested above in a revised version of Fig 4b in the manuscript, and a related analysis requested in section 2.4.6 in the supplementary materials.
  
  Specifically, the partial version of the analysis that had been presented (where the cost weighting for PO as well as the target weighting for MA were fit on Expt 2a and cross-validated using the Expt 2b data, but not conversely fit on Expt 2b and tested on Expt 2a) was expanded to include cross-validation of the Expt 2b fit using the Expt 2a data. As expected, the results from the converse analysis (Expt 2b → Expt 2a) mirror the results from the original analysis (Expt 2a → Expt 2b) for the cost weighting in the PO model, where the self-fit mean squared prediction errors increased modestly, by 11% for the Expt 2a data and by 29% for the Expt 2b data. In contrast, for the target weighting in the MA model, the cross-validated predictions did not explain the data well, increasing the self-fit mean squared prediction errors by 115% for the Expt 2a data, and by 750% for the Expt 2b data. Please see lines 411-470 in the main paper for a full analysis.
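The two-way cross-validation described above can be sketched as follows. This is only an illustration under stated assumptions: a hypothetical one-parameter model (a slope through the origin) stands in for the PO cost weighting or the MA target weighting, and the data are invented. It shows how the percent increase over the self-fit mean squared error is obtained in each direction.

```python
def fit_beta(xs, ys):
    """Least-squares slope through the origin (stand-in one-parameter model)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mse(beta, xs, ys):
    """Mean squared prediction error of the model y = beta * x."""
    return sum((beta * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def percent_increase_in_error(train, test):
    """Percent by which the cross-validated error exceeds the self-fit error.

    The parameter is fit on `train` and evaluated on `test`; the reference is
    the error obtained when the parameter is fit on `test` itself.
    """
    self_err = mse(fit_beta(*test), *test)
    cross_err = mse(fit_beta(*train), *test)
    return 100.0 * (cross_err - self_err) / self_err


# Hypothetical (x, y) data sets standing in for Expt 2a and Expt 2b.
expt_a = ([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
expt_b = ([1.0, 2.0, 3.0], [1.0, 2.2, 2.9])

print(percent_increase_in_error(expt_a, expt_b))  # fit on a, tested on b
print(percent_increase_in_error(expt_b, expt_a))  # fit on b, tested on a
```

Because the self-fit parameter minimizes the error on its own data set, the percent increase cannot be negative for this stand-in model. A small value indicates that the parameter generalizes across experiments.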
  
  While I understand where the computation of the safety margin in eq.2 comes from, reducing the safety margin would make the predictions linked to the performance optimization look more and more towards the motor averaging predictions. How bad becomes the fit of the data then ?
  
  We think that this is essentially the same question as that asked above in section 2.4.1. Please see our response in that section. If that response doesn’t adequately answer this question, please let us know!
  
  How does the predictions look like if the motor costs are unbalanced (66 vs. 33%, 50 vs. 50% (current prediction), 33 vs. 66% ). What if, in Eq.2 the slope of the relationship was twice larger, twice smaller, etc.
  
  Fig R8 above shows how PO prediction would change using the 2:1 (66:33) and 1:2 (33:66) weightings suggested by the reviewer here, in comparison to the 1:1 weighting present in the original manuscript, the Expt 2a best fit weighting present in the original manuscript, and the Expt 2b best fit weighting that the reviewer suggested we include in section 2.4.2. Please note that this figure is now included as a supplementary figure to accompany the revised manuscript.
  
  The safety margin is the crucial element here. If it gets smaller and smaller, the PO prediction would look more and more like the MA predictions. This needs to be discussed in details. I also have the impression that the safety margin measured in exp 2a (single target trials) could be used for the PO predictions as they are both on the right side of the obstacle.
  
  We again apologize for the confusion. We are already using safety margin measurements to make PO predictions. Specifically, within Expt 2a, we use safety margin measurements from 1-target trials (in conjunction with variability measurements on 1 & 2 target trials) to estimate safety margins on 2-target trials. And analogously within Expt 2b, we use safety margin measurements from 1-target trials (in conjunction with variability measurements on 1 & 2 target trials) to estimate safety margins on 2-target trials. Fig 4b in the main paper shows the results of this prediction (and it now also includes the cross-validated predictions of the refined models as requested in Section 2.4.4 above). Relatedly, Fig R1 in this letter shows that, at the group-average level, these predictions for 2-target trial behavior in both Expt 2a and Expt 2b are essentially identical whether they are based solely on the safety margins observed on 1-target trials or on these safety margins corrected for the relative motor variabilities on 1-target and 2-target trials.
  
  5) On several occasions (e.g. line 131), the authors mention that their result prove that humans form a single motor plan. They don't have any evidence for this specific aspect as they can only see the plan that is expressed. They can prove that the latter is linked to performance optimization and not to the motor averaging one. But the absence of motor averaging does not preclude the existence of other motor plans…. Line 325 is the right interpretation.
  
  Thanks for catching this. We agree and have now revised the text accordingly (see for example, lines 53, 134, and 693-695 in the revised manuscript).
  
  6) Line 228: the authors mention that there is no difference in adaptation between training and test periods but this does not seem to be true for the central target. How does that affect the interpretation of the 2-target trials data ? Would that explain the remaining small discrepancy between the refined PO prediction and the data (Fig.2f) ?
  
  There must be some confusion here. The adaptation levels in the training period and the test period data from the central target are indeed quite similar, with only a <10% nominal difference in adaptation between them that is not close to statistically significant (p=0.14). We also found similar adaptation levels between the training and test epochs for the lateral targets (p=0.65 for the left target and p=0.20 for the right target). We further note that the PO predictions are based on test period data. And so, even if there were a clear decrease in adaptation between training and test periods, it would not affect the fidelity of the predictions or present a problem, except in the extreme hypothetical case where the reduction was so great that the test period adaptation was not clearly different from zero (as that would infringe on the ability of the paradigm to make clearly opposite predictions for the MA and PO model) – but that is certainly not the case in our data.
  
  Reviewer #3:
  
  In this study, Alhussein and Smith provide two strong tests of competing hypotheses about motor planning under uncertainty: Averaging of multiple alternative plans (MA) versus optimization of motor performance (PO). In the first experiment, they used a force field adaptation paradigm to test this question, asking if observed intermediate movements between competing reach goals reflected the average of adapted plans to each goal, or a deliberate plan toward the middle direction. In the second experiment, they tested an obstacle avoidance task, asking if obstacle avoidance behaviors were averaged with respect to movements to non-obstructed targets, or modulated to afford optimal intermediate movements based on a computed "safety margin." In both experiments the authors observed data consistent with the PO hypothesis, and contradictory of the MA hypothesis. The authors thus conclude that MA is not a feasible hypothesis concerning motor planning under uncertainty; rather, people appear to generate a single plan that is optimized for the task at hand.
  
  I am of two minds about this (very nice) study. On the one hand, I think it is probably the most elegant examination of the MA idea to date, and presents perhaps the strongest behavioral evidence (within a single study) against it. The methods are sound, the analysis is rigorous, and it is clearly written/presented. Moreover, it seems to stress-test the PO idea more than previous work. On the other hand, it is hard for me to see a high degree of novelty here, given recent studies on the same topic (e.g. Haith et al., 2015; Wong & Haith, 2017; Dekleva et al., 2018). That is, I think these would be more novel findings if the motor-averaging concept had not been very recently "wounded" multiple times.
  
  We thank the reviewer for the kind words and for their help with this manuscript.
  
  The authors dutifully cite these papers, and offer the following reasons that one of those particular studies fell short (I acknowledge that there may be other reasons that are not as explicitly stated): On line 628, it is argued that Wong & Haith (2017) allowed for across-condition (i.e., timing/spacing constraints) strategic adjustments, such as guessing the cued target location at the start of the trial. It is then stated that, "While this would indeed improve performance and could therefore be considered a type of performance-optimization, such strategic decision making does not provide information about the implicit neural processing involved in programming the motor output for the intermediate movements that are normally planned under uncertain conditions." I'm not quite sure the current paper does this either? For example, in Exp 1, if people deliberately strategize to simply plan towards the middle on 2-target trials and feedback-correct after the cue is revealed (there is no clear evidence against them doing this), what do the results necessarily say about "implicit neural processing?" If I deliberately plan to the intermediate direction, is it surprising that my responses would inherit the implicit FF adaptation responses from the associated intermediate learning trials, especially in light of evidence for movement- and/or plan-based representations in motor adaptation (Castro et al., 2011; Hirashima & Nozaki, 2012; Day et al., 2016; Sheahan et al., 2016)?
  
  The reviewer has a completely fair point here, and we agree that the experiments in the current study are amenable to explicit strategization. Thus, without further work, we cannot claim that the current results are exclusively driven by implicit neural processing.
  
  As the reviewer alludes to below, the possibility that the current results are driven by explicit processes in addition to or instead of implicit ones does not directly impact any of the analyses we present – or the general finding that performance-optimization, not motor averaging, underlies motor planning during uncertainty. Nonetheless, we have added a section in the discussion section to acknowledge this limitation. Furthermore, we highlight previous work demonstrating that restriction of movement preparation time suppresses explicit strategization (as the reviewer hints at below), and we suggest leveraging this finding in future work to investigate how motor output during goal uncertainty might be influenced under such constraints. This portion of the discussion section is quoted below:
  
  “An important consideration for the present results is that sensorimotor control engages both implicit and explicit adaptive processes to generate motor output47. Because motor output reflects combined contributions of these processes, determining their individual contributions can be difficult. In particular, the experiments in the present study used environmental perturbations to induce adaptive changes in motor output, but these changes may have been partially driven by explicit strategies, and thus the extent to which the motor output measured on 2-target trials reflects implicit vs explicit feedforward motor planning requires further investigation. One method for examining implicit motor planning during goal uncertainty might take inspiration from recent work showing that in visuomotor rotation tasks, restricting the amount of time available to prepare a movement appears to limit explicit strategization from contributing to the motor response48–51. Future work could dissociate the effects of MA and PO on intermediate movements in uncertain conditions at movement preparation times short enough to isolate implicit motor planning.”
  
  In that same vein, the Gallivan et al 2017 study is cited as evidence that intermediate movements are by nature implicit. First, it seems that this consideration would be necessarily task/design-dependent. Second, that original assumption rests on the idea that a 30˚ gradual visuomotor rotation would never reach explicit awareness or alter deliberate planning, an assumption which I'm not convinced is solid.
  
  We generally agree with the reviewer here. We might add that in addition to introducing the perturbation gradually, Gallivan and colleagues enforced a short movement preparation time (325ms). However, we agree that the extent to which explicit strategies contribute to motor output should clearly vary from one motor task to another, and on this basis alone, the Gallivan et al 2017 study should not be cited as evidence that intermediate movements must universally reflect implicit motor planning. We have explained this limitation in the discussion section (see quote below) and have revised the manuscript accordingly.
  
  “We note that Gallivan et al. 2017 attempted to control for the effects of explicit strategies by (1) applying the perturbation gradually, so that it might escape conscious awareness, and (2) enforcing a 325ms preparation time. Intermediate movements persisted under these conditions, suggesting that intermediate movements during goal uncertainty may indeed be driven by implicit processes. However, it is difficult to be certain whether explicit strategy use was, in fact, effectively suppressed, as the study did not assess whether participants were indeed unaware of the perturbation, and the preparation times used were considerably larger than the 222ms threshold shown to effectively eliminate explicit contributions to motor output."
  
  The Haith et al., 2015 study does not receive the same attention as the 2017 study, though I imagine the critique would be similar. However, that study uses unpredictable target jumps and short preparation times which, in theory, should limit explicit planning while also getting at uncertainty. I think the authors could describe further reasons that that paper does not convince them about a PO mechanism.
  
  We had omitted a detailed discussion of the Haith et al 2015 study as we think that the key findings, while interesting, have little to do with motor planning under uncertainty. But we now realize that we owe readers an explanation of our thoughts about it, which we have now included in the Discussion. This paragraph is quoted below, and we believe it provides a compelling reason why the Haith et al. 2015 study could be more convincing about PO for motor planning during uncertainty.
  
  “Haith and colleagues (2015) examined motor planning under uncertainty using a timed-response reaching task where the target suddenly shifted on a fraction (30%) of trials 150-550ms before movement initiation. The authors observed intermediate movements when the target shift was modest (±45°), but direct movements towards either the original or shifted target position when the shift was large (±135°). The authors argued that because intermediate movements were not observed under conditions in which they would impair task performance, that motor planning under uncertainty generally reflects performance-optimization. This interpretation is somewhat problematic, however. In this task, like in the current study, the goal location was uncertain when initially presented; however, the final target was presented far enough before movement onset that this uncertainty was no longer present during the movement itself, as evidenced by the direct-to-target motion observed when the target location was shifted by ±135°. Therefore the intermediate movements observed when the target location shifted by ±45° are unlikely to reflect motor planning under uncertain conditions. Instead, these intermediate movements likely arose from a motor decision to supplement the plan elicited by the initial target presentation with a corrective augmentation when the plan for this augmentation was certain. The results thus provide beautiful evidence for the ability of the motor system to flexibly modulate the correction of existing motor plans, ranging from complete inhibition to conservative augmentation, when new information becomes available, but provide little information about the mechanisms for motor planning under uncertain conditions.”
  
  If the participants in Exp 2 were asked both "did you switch which side of the obstacle you went around" and "why did you do that [if yes to question 1]", what do the authors suppose they would say? It's possible that they would typically be aware of their decision to alter their plan (i.e., swoop around the other way) to optimize success. This is of course an empirical question. If true, it wouldn't hurt the authors' analysis in any way. However, I think it might de-tooth the complaint that e.g. the Wong & Haith study is too "explicit."
  
  The participants in Expts 1, 2a, and 2b were all distinct, so there was no side-switching between experiments per se. However, the reviewer’s point is well taken. Although we didn’t survey participants, it’s hard to imagine that any were unaware of which side they traveled around the obstacle in Expt 2. Certainly, there was some level of awareness in our experiments, and while we would like to believe that the main findings arose from low-level, implicit motor planning, we frankly do not know the extent to which our findings may have depended on explicit planning. We have now clarified this key point and discussed its implications in the discussion section of the revised paper. That said, we do still think that the direct-to-target movements in the Wong and Haith study were likely the result of a strategic approach to salvaging some reward in their task. Please see the new section in the discussion titled: “Implicit and explicit contributions to motor planning under uncertainty” which for convenience is copied below:
  
  Implicit and explicit contributions to motor planning under uncertainty
  
  An important consideration for the present results is that sensorimotor control engages both implicit and explicit adaptive processes to generate motor output. Because motor output reflects combined contributions of these processes, determining their individual contributions can be difficult. In particular, the experiments in the present study used environmental perturbations to induce adaptive changes in motor output, but these changes may have been partially driven by explicit strategies, and thus the extent to which the motor output measured on 2-target trials reflects implicit vs explicit feedforward motor planning requires further investigation. One method for examining implicit motor planning during goal uncertainty might take inspiration from recent work showing that in visuomotor rotation tasks, restricting the amount of time available to prepare a movement appears to limit explicit strategization from contributing to the motor response. Future work could dissociate the effects of MA and PO on intermediate movements in uncertain conditions at movement preparation times short enough to isolate implicit motor planning.
  
  We note that Gallivan et al. 2017 attempted to control for the effects of explicit strategies by (1) applying the perturbation gradually, so that it might escape conscious awareness, and (2) enforcing a 325ms preparation time. Intermediate movements persisted under these conditions, suggesting that intermediate movements during goal uncertainty may indeed be driven by implicit processes. However, it is difficult to be certain whether explicit strategy use was, in fact, effectively suppressed, as the study did not assess whether participants were indeed unaware of the perturbation, and the preparation times used were considerably larger than the 222ms threshold shown to effectively eliminate explicit contributions to motor output.
  
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.02.11.430753v1
www.biorxiv.org www.biorxiv.org

Protective mitochondrial fission induced by stress responsive protein GJA1-20k

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #3:
  
  Weaknesses:
  
  Previously it was suggested that mitochondrial biogenesis was increased with increased levels of GJA1-20k. Is this a difference in the cellular model (HEK) and do the changes in cell culture accurately recapitulate the changes seen in animals?
  
  The Reviewer is correct that GJA1-20k did not alter the mitochondrial biogenesis in HEK293 cells (Figure 1–figure supplement 2) whereas AAV9-transduced adult cardiomyocytes showed increased mitochondrial DNA copy number (Figure 1–figure supplement 2C), consistent with our previous study (Basheer et al., JCI insight, 2018). We expect that increased mitochondrial biogenesis is a function of chronic GJA1-20k overexpression in vivo, and thus a separate phenomenon from the acute mitochondrial fission which occurs within one minute of GJA1-20k accumulation around a mitochondrion (Figure 4). The HEK cell line, in which overexpressed GJA1-20k is present for a much shorter time, does not induce mitochondrial biogenesis (Figure 1–figure supplement 2), and thus is an excellent cellular model in which we can study GJA1-20k induced fission.
  
  The revised manuscript has been modified to include the above new data (Figure 1–figure supplement 2) and discussion:
  
  —Results section (lines 121 – 129): Previously we reported that GJA1-20k is involved in mitochondrial biogenesis (Basheer, Fu et al. 2018). Consistent with our previous study, AAV9-transduced adult cardiomyocytes showed increased mitochondrial DNA copy number and GJA1-20k deficient mice (Gja1M213L/M213L) had decreased copy number. However, exogenous GJA1-20k did not alter the mitochondrial biogenesis in HEK293 cells. Nor did exogenous GJA1-20k affect membrane potential or baseline ATP production (Figure 1–figure supplement 2A–C). In addition to mitochondrial DNA copy number, neither biogenesis nor mitophagy protein markers were altered in either GJA1-20k transfected HEK293 cells or Gja1M213L/M213L mouse hearts (Figure 1–figure supplement 2D – G).
  
  —Discussion section (lines 289 – 292): Yet the presence of GJA1-20k, while inducing mitochondrial fission and smaller mitochondria (Figure 1, 3 and 4), does not either reduce MFN1 or MFN2, activate DRP1, change membrane potential, ATP production, mitochondrial biogenesis, or mitophagy (Figure 2; Figure 1 – figure supplement 2).
  
  Mdivi-1 is not a selective Drp1 inhibitor. It is a Complex I inhibitor, leading to unintended changes in mitochondrial dynamics in response to ETC stress. Rather than Mdivi-1, a dominant negative Drp1 mutant K38A could be overexpressed to see whether this prevents GJA1-20k-mediated fission. If it still goes through, then I agree that Drp1 is not involved at all.
  
  We appreciate Reviewer #3’s thoughtful suggestion and, in this revised manuscript, we studied mitochondrial morphology in the presence of K38A. As seen in Figure 2C and D of the revised manuscript, K38A elongated mitochondria, as expected from inhibited Drp1 mediated fission. However, despite Drp1 inhibition by K38A, in the presence of GJA1-20k, mitochondria remain small, further supporting that GJA1-20k-mediated fission is DRP1-independent.
  
  —Results section (lines 140 – 150): To further investigate whether GJA1-20k induced reduction in mitochondrial size is dependent on DRP1, we analyzed mitochondrial morphology after inhibiting DRP1 by performing siRNA-mediated DRP1 knock-down (Figure 2—figure supplement 1A–C) or transfecting DRP1 dominant negative mutant (K38A), all with or without GJA1-20k transfection. With either method of DRP1 inhibition, the average area of individual mitochondria increased, consistent with inhibiting canonical fission (Figure 2C, D). In addition, K38A has more pronounced DRP1 inhibition which resulted in greater mitochondrial enlargement than siDRP1 (Figure 2C, D; Figure 2—figure supplement 1F). However, GJA1-20k acts epistatically to DRP1 loss or interference and prevents DRP1-mediated mitochondrial enlargement (Figure 2C–F; Figure 2—figure supplement 1B, C), indicating GJA1-20k can act at or downstream of DRP1.
  
  For the kinetics studies (see Fig 4), I think it is important to measure the timing of the actin recruitment and eventual fission when Drp1 is knocked down and/or when a DN mutant (K38A) is involved. Again, I do not trust the chemical inhibitor (Mdivi-1) data since this does not inhibit Drp1 activity.
  
  We would like to thank Reviewer #3 for suggesting we use an additional method of inhibiting Drp1. We analyzed real time actin dynamics under direct DRP1 knock-down. As seen in Mdivi-1 treatment, GJA1-20k accumulated and then actin assembled around mitochondria and induced fission under DRP1 knockdown (Figure 4 and Video 1 of revised manuscript). The kinetic parameters of fission were also similar between Drp1 knockdown and Mdivi-1 treatment. The original Figure 4 and Video 1 and 2 have been moved to Figure 4–figure supplement 1 and Video 2 and 3, respectively, in order to accommodate the new Drp1 knockdown data (Figure 4 and Video 1).
  
  The revised manuscript has been modified to include the above new data (Figure 4; Video 1):
  
  —Results section (lines 198 – 219): Simultaneous use of fluorescently labelled actin, GJA1-20k, and mitochondria in live cells permits real time imaging of mitochondrial fission events at actin assembly sites. As seen in Video 1 and Figure 4B, GJA1-20k recruits actin to mitochondria, which results in fission. In Video 1, the actin network can be seen to develop around mitochondria and, coinciding with GJA1-20k intensity, forms an increasingly tight band across a mitochondrion which, within one minute, results in mitochondrial fission. The images in the bottom row of Figure 4B and in the right column of Video 1 were obtained by multiplying GJA1-20k signal with actin signal, highlighting the locations at which GJA1-20k and actin are coincident. The respective line-scan profiles in Figure 4C indicate that mitochondrial fission occurs at points where the product of GJA1-20k and actin is the highest. Following accumulation of GJA1-20k and actin (red lines) at these points, a drop in mitochondrial signal (blue lines) is apparent when fission occurs. Fission (low point of blue lines) occurs approximately 45 seconds after co-accumulation of GJA1-20k and actin (high point of red lines, Figure 4C). Time to fission was computed from the time of peak GJA1-20k and actin intensity product to the time of mitochondrial signal being reduced to background (Figure 4D–F). Statistically, this time to fission occurred at a median of 45 seconds, with a standard deviation of 11 seconds (Figure 4G). Note, the real time imaging shown in Video 1 and Figure 4 was performed under siDRP1. Therefore, the mitochondrial fission induced by cooperation between GJA1-20k and actin can be independent of canonical DRP1-mediated fission.
  To rule out inadvertent bias by siRNA, we used pharmacologic Mdivi-1 to inhibit DRP1 and, similar to the use of DRP1 siRNA, actin formed around mitochondria at GJA1-20k sites (Figure 4—figure supplement 1A–D) and fission occurred within a similar timescale (Video 2 and 3; Figure 4—figure supplement 1E–H).
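
  The time-to-fission measure described above (peak of the GJA1-20k and actin intensity product to the point where mitochondrial signal reaches background) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the traces, and the background threshold are hypothetical, and it is not the authors' analysis code.

```python
def time_to_fission(times, gja1, actin, mito, background):
    """Return the delay between peak GJA1-20k x actin product and fission.

    All inputs are hypothetical per-frame intensities sampled at one
    line-scan position; `background` is an assumed threshold at or below
    which the mitochondrial signal counts as lost (fission).
    """
    # Pointwise product highlights frames where both signals are high.
    product = [g * a for g, a in zip(gja1, actin)]
    peak_idx = max(range(len(product)), key=product.__getitem__)

    # First frame at or after the peak where mitochondrial signal
    # has dropped to background.
    for i in range(peak_idx, len(mito)):
        if mito[i] <= background:
            return times[i] - times[peak_idx]
    return None  # no fission detected in this trace


# Synthetic example traces (seconds and arbitrary intensity units).
times = [0, 15, 30, 45, 60, 75, 90]
gja1 = [1, 2, 5, 3, 2, 1, 1]
actin = [1, 2, 4, 3, 2, 1, 1]
mito = [10, 10, 9, 7, 4, 1, 1]

delay = time_to_fission(times, gja1, actin, mito, background=1)
```

  With these made-up traces the product peaks at 30 s and the mitochondrial signal reaches background at 75 s, so the delay is 45 s. Real traces would need smoothing and an empirically chosen background level.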
  
  The assessment of the impact of ischemic stress with the heterozygous animal (M213L/WT) is hard to interpret. How reduced is the expression of GJA1-20k in these animals and how is mitochondrial function impacted based on Seahorse analysis? The mitochondrial morphology is not altered in these animals, so would mitochondrial function be largely unchanged as well? It is not clear how much GJA1-20k is needed to observe changes in mitochondrial shape and function, and comparisons with the homozygous mutant (M213L/M213L) are not the same, making it difficult to resolve the interpretation of these data.
  
  We appreciate Reviewer #3’s thoughtful and valuable comments. We previously reported that the heterozygous mutant (M213L/WT) expresses approximately half as much GJA1-20k as WT (Figure 1 in Xiao and Shimura et al., J Clin Invest, 2020). Unfortunately, homozygous mutants die before adulthood, preventing an effective comparison of the effect of GJA1-20k content on mitochondrial function in adult cardiomyocytes. To compare the impact of the amount of endogenous GJA1-20k on mitochondrial function, we added Seahorse data from heterozygous neonatal CMs (Figure 5 C, D) and compared these data to Seahorse data from neonatal cardiomyocytes from both wildtype and homozygous mutants. Even though there was no significant difference in mitochondrial size between WT and M213L/WT (Figure 5I, J; Figure 5–figure supplement 1A, B) under basal conditions, the Seahorse OCR levels of M213L/WT myocytes lie between those of WT and homozygous (M213L/M213L) cardiomyocytes (Figure 5 C, D; Figure 5–figure supplement 1C). Since GJA1-20k is a stress-responsive peptide which increases under ischemic stress, in the present manuscript we would like to emphasize that even a partial (50%) decrease in GJA1-20k expression induces mitochondrial fragility to oxidative stress. As shown in new Figure 5 I – L of the revised manuscript, the heterozygous mutant (M213L/WT) has more elongated mitochondria and a higher proportion of damaged mitochondria post-I/R compared to WT, consistent with TTC staining, even with no change in mitochondrial size under basal conditions.
  
  The revised manuscript has been modified to include the above new data (Figure 5; Figure 5–figure supplement 1) and discussion:
  
  —Results section (lines 227 – 233) Similarly, maximal respiration is increased in neonatal CMs derived from GJA1-20k deficient Gja1M213L/M213L mice and maximal respiration for heterozygous Gja1M213L/WT mice is between that of WT and Gja1M213L/M213L (Figure 5C, D; Figure 5—figure supplement 1A, B). In addition, observing other OCR parameters, we found a decrease in ATP-linked respiration and reserve capacity in Gja1M213L/WT cardiomyocytes, and an increase in proton leak and non-mitochondrial respiration in Gja1M213L/M213L suggesting that there can be compensatory long-term effects of the Gja1 mutation (Figure 5—figure supplement 1C).
  
  —Results section (lines 241 – 250) However, remarkably, reduced GJA1-20k expression results in an almost complete cardiac infarction after I/R injury (Figure 5E, F). Moreover, ROS production after I/R injury was increased in Gja1M213L/WT mice compared to WT post-I/R (Figure 5G, H). There was no significant difference in mitochondria size at the basal condition between WT and Gja1M213L/WT mice adult CMs as with neonatal CMs (Figure 5I, J), whereas the mitochondria size was significantly increased after I/R injury and the heterozygous Gja1M213L/WT mice had larger mitochondria compared to WT mice post-I/R (Figure 5I, J). Interestingly, the area of mitochondrial matrix was also increased, suggesting loss of cristae in Gja1M213L/WT mice heart (Figure 5K, L). These data indicate that even partial deletion of GJA1-20k results in a profoundly impaired response to ischemic stress.
  
  —Discussion section (lines 350 – 357) Because GJA1-20k-induced fission is associated with less ROS production with oxidative stress (Figure 5 – figure supplement 1D, E), the endogenous generation of GJA1-20k and subsequent decreased ROS production could explain a major benefit of pre-conditioning. Of note, genetic GJA1-20k reduction increases infarct size and ROS production post-I/R injury (Figure 5E–H). In addition, the population of damaged mitochondria is significantly increased in heterozygous Gja1M213L/WT mouse heart post-I/R (Figure 5I–L). Therefore, GJA1-20k induced decreases in ROS production could limit the amount of I/R injury induced by myocardial infarction.
  
  It is still unclear to me how GJA1-20k is affecting mitochondrial size and function. Based on previous papers, this peptide localizes to the surface of mitochondria, but it is not clear how, or whether, it directly facilitates actin recruitment. The interplay with the endoplasmic reticulum (ER), which can nucleate actin at sites of mitochondrial fission, was not examined. If actin is driving membrane remodeling, is it mediated by ER crossover at these sites?
  
  We appreciate Reviewer #3’s thoughtful comment and suggestion. Our unpublished data indicate that GJA1-20k has an actin-binding domain, suggesting direct binding and regulation of actin dynamics. As shown in Figure 3 of the present study, GJA1-20k recruits actin around the mitochondrial membrane, and their interaction results in fission. In addition, as the Reviewer suggested, our preliminary data showed a significant increase in ER network in GJA1-20k-transfected cells (Figure below). Therefore, it is possible that the ER is also involved in GJA1-20k-mediated mitochondrial fission, although further research will be required to reveal the detailed mechanisms. In the present manuscript, we would like to focus on the finding that actin, but not DRP1, is necessary for GJA1-20k-mediated mitochondrial fission.
  
  ER network association with mitochondria is increased in GJA1-20k-transfected cells. Left: Representative fixed cell images of HEK293 cells with GFP-tagged GST or GJA1-20k. ER and mitochondria were labeled by Protein disulfide-isomerase (PDI) and Tom20, respectively. Right: The quantification of Pearson’s correlation between PDI and mitochondria. The graph is expressed as mean ± SD. p values were determined by two-tailed Mann-Whitney U-test. ***p < 0.001.
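
  The colocalization metric named in this legend, Pearson's correlation between the PDI (ER) and mitochondrial channels, can be illustrated with a minimal sketch. It assumes the two channels have already been flattened to equal-length lists of pixel intensities within a cell mask; the function name and the pixel values are hypothetical, and the Mann-Whitney comparison between groups is not shown.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length intensity lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant channel")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pixel intensities for one cell (arbitrary units).
pdi_pixels = [12, 40, 35, 8, 50, 22]
tom20_pixels = [10, 38, 30, 9, 47, 25]
r_cell = pearson_r(pdi_pixels, tom20_pixels)
```

  One coefficient per cell would then be collected for each condition (GST versus GJA1-20k) before any group comparison.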
  
  We have updated the Discussion section to note this excellent point as a direction for future work.
  
  —Discussion section (lines 299 – 302) In addition to actin, the endoplasmic reticulum (ER) membrane can be involved in mitochondrial scission (Friedman, Lackner et al. 2011, Tandler, Hoppel et al. 2018). Future studies should consider whether GJA1-20k induced actin cytoskeleton arrangements involve the ER membrane as well.
  
  scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.05.05.442750v1
www.biorxiv.org

New submission 18/07/2022, 13:22:16

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The present study by Zander et al. aims at improving our understanding of CD4+ T cell heterogeneity in response to chronic viral infections. The authors utilize the murine LCMV c13 infection model and perform single cell RNA seq analysis on day 10 post infection to identify multiple, previously unappreciated, T cell subsets. The authors then go on and verify these analyses using multi-color flow cytometry before comparing the transcriptome of CD4 T cells from chronic infection to a previously generated data set of CD4 T cells obtained from acutely-resolved LCMV infection.
  
  The analyses are very well done and provide some interesting novel insights. In particular, the comparison of CD4 T cell subsets across acute and chronic infections is very exciting as they provide a very valuable platform that can answer a long-standing question: do CD4 T cells in chronic infection undergo exhaustion similar to CD8 T cells. While this has been proposed for an extended period, this new dataset by Zander et al. can provide some novel insights by comparing individual cell subsets cross-infection. The manuscript would, however, benefit from a more extensive analysis and focus on this interesting point.
  
  We thank the reviewer for their time and careful assessment of our manuscript. We were happy to hear that the reviewer found our work interesting.
  
  On that note, the authors should take advantage of more accurate and present gene datasets to compare the 'dysfunctional' state of CD4 T cells in chronic infection vs acute infection. Also, a different illustration to demonstrate the module score analyses would be more intuitive.
  
  We have now included T cell “exhaustion” gene sets from recently published data (Zander et al., 2019, Immunity), and we have also displayed the relative expression of select signature genes from these gene sets in an updated supplemental figure 3.
  
  Also, at multiple sections in the manuscript, the authors are missing the accurate citations as they are still mentioned as '(Ref)'.
  
  We apologize for this oversight and have corrected these citations.
  
  Nevertheless, this study does not require major revisions.
  
  Reviewer #2 (Public Review):
  
  In their study "Delineating the transcriptional landscape and clonal diversity of virus-specific CD4+ T cells during chronic viral infection" Zander and co-workers analyze the phenotypic and clonotypic distributions of T cells specific to a LCMV epitope following infection with a chronic LCMV strain in mice. The paper largely follows an earlier study from the same group (Khatun JEM 2021) that has used a similar experimental strategy to analyze T cells responding to an LCMV strain establishing acute infection, and it adds a scTCRseq component to another earlier study of chronic LCMV (Zander Immunity 2022). The main contributions of the paper are to demonstrate that interesting differences between gene expression profiles between chronic and acute LCMV exist, and to identify a new T cell subset (of unknown functional significance).
  
  While the paper is framed around differences between T cell responses to acute and chronic infections, all analysis is done on T cells at day 10 post primary infection. At such an early time point even the acute LCMV strain virus is likely not completely cleared, or at the very least viral antigens are still presented. The relevance of the presented phenotypic differences to other settings with long-term chronic infection is thus questionable. Additionally, there are a number of methodological concerns regarding the robustness of the statistical and bioinformatic analyses that put in doubt some of the conclusions. Most notably, the analysis of fate biases needs to be substantiated by tests against baseline expectations from random assortment to test for statistical significance.
  
  We thank the reviewer for their careful review of our manuscript as well as their helpful comments.
  
  Regarding the day 10 time point after LCMV Armstrong infection, several groups have previously reported that LCMV viral load is undetectable by day 10 post-infection (see one published example below). We completely agree with the reviewer that viral antigens are still likely to be presented at this time point and that inflammation is ongoing. As discussed further below, we believe this is actually a strength of the study, as it allows a fairer comparison of the transcriptional state of recently stimulated virus-specific CD4 T cells under different contexts (acute vs chronic LCMV infection). We chose day 10 post LCMV Cl13 and LCMV Armstrong infection as the time point for analysis because this is approximately the peak of the endogenous Gp66-77 CD4+ T cell response (see previously published data below), and because the distribution of Th1, Tfh, and T central memory precursor (Tcmp) or memory-like cells is more balanced in these settings at this point, thereby allowing for sufficient numbers of cells per cluster to conduct an in-depth analysis and high-resolution comparison of these subsets between the two infections. Further, as some degree of TCR stimulation is still likely being experienced at this time point during LCMV Armstrong infection, we believe that this is a more useful comparison than one at a memory time point (when CD4 T cells are in a quiescent state): it gives us a better picture of the differentially expressed genes at the peak of the CD4 T cell response, and also provides insight into how chronic viral infection perturbs the transcriptional program of CD4 T cells.
  
  AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.12.491625v1
www.biorxiv.org

Cytidine triphosphate promotes efficient ParB-dependent DNA condensation by facilitating one-dimensional spreading from parS

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #4 (Public Review):
  
  Francisco et al. investigate the role of CTP and hydrolysis in the binding of ParB to parS sequence and non-specific DNA at the single-molecule level. Using optical tweezers, they show the specific binding of ParB to parS sites, and demonstrate that this process is enhanced by the presence of CTP or CTPS. They find that lower density ParB proteins are also detected in distal non-specific DNA in the presence of parS, and that ParB spreading is restricted by protein roadblocks. Furthermore, using magnetic tweezers, they show that parS-containing DNA molecules are condensed by ParB at nanomolar protein concentration, which requires CTP binding but not hydrolysis. These findings show the significance of CTP-dependent ParB spreading and impact the understanding of the mechanism of DNA bridging and condensation by ParB networks.
  
  Based on these results, the authors propose a model for ParB-mediated DNA condensation, which requires one-dimensional ParB sliding along DNA from parS sites. Overall, the experiments were carefully done and thoroughly controlled. The manuscript provides critical insights that can be strengthened by addressing the following minor concerns:
  
  1) Did the authors observe the diffusion of isolated ParB foci along DNA? This will provide strong evidence for the proposed diffusion/sliding model.
  
  2) Based on the sliding clamp model, ParB spreading and diffusion result in DNA condensation by forming large DNA loops. Is it possible to show the dynamic spreading of ParB while keeping the number of ParB molecules on the DNA constant? For example, could the authors incubate ParB-containing DNA in channel 4 (ParB channel) for a set time to load ParB onto parS sites, and then move it to a buffer channel without free ParB but with CTP or CTPS, where images are acquired at long intervals to minimize photobleaching? The fluorescence intensity of ParB during the spreading process could then be analyzed. If the intensity remains constant throughout spreading in the presence of CTPS but decreases significantly in the presence of CTP, these data would strongly support the proposed spreading and CTP hydrolysis-dependent dissociation mechanism.
  
  We thank the reviewer for these suggestions to prove spreading. However, we decided to follow an alternative strategy based on the direct imaging of QD-labelled ParB. As described above, this strategy worked well and we have directly visualized ParB diffusion from parS sites.
  
  3) In Figure 2, the authors show that the spreading of ParB can be blocked by EcoRI. Can the authors show that EcoRI is bound at the specific positions? The blockage of spreading by protein roadblocks shown in the optical tweezers experiments hints that roadblocks may also affect DNA condensation. Can the authors use magnetic tweezers to show the effect of protein roadblocks on DNA condensation in vitro?
  
  It is well established that EcoRI has extremely high affinity and specificity for its site (Terry et al., 1983) and so, since we do not have a labelled EcoRI mutant, our experiments assume the sites are occupied. This is one reason we have used multiple sites in our experiments. Nevertheless, we have tested the effect of protein roadblocks on condensation in MT experiments. We found partial condensation, consistent with the blocking of ParB spreading from parS (Fig. R5).
  
  Figure R5. ParB diffusion is required for DNA condensation by ParB. (A) Schematic representation of DNA substrate employed in these MT experiments. It contains a set of 5x EcoRI sites located at 3835 bp from the DIG labelled end, and 7x parS. The positions of the EcoRI and parS sites in the DNA cartoon are represented to scale. (B) Condensation assay using the EcoRI 7x parS DNA substrate under different experimental conditions. ParB partially condenses the DNA molecule when EcoRIE111G is present. (C) Quantification of the extension in base pairs of the non-condensed region under different experimental conditions. In the presence of EcoRIE111G, the length of the non-condensed region agrees well with the length of the region flanked by the DIG end and the EcoRI sites.
  
  scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.02.11.430778v1
www.medrxiv.org

New submission 22/09/2022, 10:19:39

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  This study evaluates the causal relationship between childhood obesity on the one hand, and childhood emotional and behavioral problems on the other. It applies Mendelian Randomization (MR), a family of methods in statistical genetics that uses genetic markers to break the symmetry between correlated traits, allowing inference of causation rather than mere correlation. The authors argue convincingly that previous studies of these traits, both those using non-genetic observational epidemiology methods and those using standard MR methods, may be confounded by demographic effects and familial effects. One possible example of this kind of confounding is the idea that obesity in parents may contribute to emotional and behavioral problems in children; another is the idea that adults with emotional and behavioral issues may be more likely to have children with partners who are obese, and vice-versa. They then make use of a recently proposed "within-family" MR method, which should effectively control for these confounders, at the cost of higher uncertainty in the estimated effect size, and therefore lower power to detect small effects. They report that none of the previously reported associations of childhood BMI with anxiety, depression, or ADHD are replicated using the within-family MR method, and that in the case of depression the primary association appears to be with maternal BMI rather than the child's own BMI.
  
  This argument that these confounders may affect these phenotypes is fairly sound, and within-family MR should indeed do a good job of controlling for them. I do not see any major issues with the cohort itself or the choice of genetic instruments. I also do not see any major issues with the definitions or ascertainment of the phenotypes studied, though I am not an expert on any of these phenotypes in particular. I am especially satisfied with the series of analyses demonstrating that the results are robust to many variations of MR methodology. Overall, I think the positive result this study reports is very credible: that the known association between childhood BMI and depression is likely primarily due to an effect of maternal BMI rather than the child's own BMI (though given that paternal BMI has a similar effect size with only a slightly wider confidence interval, I would instead say that the effect is from parental BMI generally, not specifically maternal.)
  
  In the updated results based on the larger genetic data release, the estimates for the association of maternal BMI and paternal BMI with the child’s depressive symptoms are more clearly different than they were in the smaller dataset (for maternal BMI, beta= 0.11, CI:0.02,0.19, p=0.01; for paternal BMI, beta=0.02, CI:-0.09,0.12, p=0.71). Therefore, in this version, it makes sense to note an association with maternal BMI specifically.
  
  The main weakness of the study comes from its negative results, which the authors emphasize as their primary conclusion: that previously reported associations of childhood BMI with anxiety, depression, and ADHD are not replicated using within-family MR methods. These claims do not seem justified by the evidence presented in this study. In fact, in every panel of figures 2 and 3, the error bars for the within-family MR analysis encompass the estimates for both the regression analysis and the traditional MR analysis, suggesting that the within-family analysis provides no evidence one way or another about which of these analyses is more accurate. More generally, in order to convincingly claim that there is no causal relationship between two traits, an MR study must argue that the study would be powered to detect a relationship if one existed. Within-family MR methods are known to have less power to detect associations and less precision to estimate effect sizes than traditional MR methods or traditional observational epidemiology methods, so it is not sufficient to show that these other methods have power to detect the association. To make this kind of claim, it is necessary to include some kind of power analysis, such as a simulation study or analytic power calculations, and likely also a positive control to show that this method does have power to detect known effects in this cohort.
  
  We agree that it is imperative that negative (i.e. “non-significant”) results are correctly interpreted - it is just as important to discover what is unlikely to affect emotional and behavioural outcomes as what does affect them. Negative results (non-significant estimates) are neither a weakness nor strength of the study, but simply reflect the estimation error in our analysis of the data. The key question is whether our within-family MR estimates are sufficiently powered to detect effect sizes of interest or rule out clinically meaningful effect sizes – or are they simply too imprecise to draw any conclusions? As the reviewer suggests, one way to address this is via a post-hoc power calculation. We consider post-hoc power calculations redundant, since all the information about the power of our analysis is reflected in the standard errors and reported confidence intervals. Moreover, any post-hoc power calculation will be necessarily approximate compared to using the standard errors and confidence intervals which we report.
  
  Despite these methodological reservations, we have conducted simulations to estimate the power of our within-family models (the R code is included at the end of this document). These simulations indicate that we do have sufficient power to detect the size of effects seen for depressive symptoms and ADHD in models using the adult BMI PGS. They also indicate that we cannot rule out smaller effects for non-significant associations (e.g., for the impact of the child’s BMI on anxiety). Naturally, this is entirely consistent with the width of the confidence intervals reported in results tables and in Figures 1 and 2. However, although power calculations are important when planning a study, they make little contribution to interpretation once a study has been conducted and confidence intervals are available (e.g., https://psyarxiv.com/tcqrn/). For this reason, we comment on these simulations in this response to reviewers but do not include them in the manuscript or supplementary materials. At the same time, we have changed the language used in the manuscript to be clearer that the results were imprecise and that values contained within the confidence limits cannot be ruled out.
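
  To illustrate what a power simulation for a within-family design can look like, here is a minimal sketch in Python. It is not the authors' R code, and every parameter value (instrument strength, confounding, parental indirect effect, sample size) is an assumption chosen for illustration. The instrument is the child's polygenic score minus the mid-parent score, which approximates adjusting the child's score for both parents' scores, and significance is tested on the reduced-form regression of the outcome on that instrument.

```python
import math
import random


def ols_slope(xs, ys):
    """Slope and standard error from a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    rss = sum((y - mean_y - slope * (x - mean_x)) ** 2 for x, y in zip(xs, ys))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def simulate_power(beta, n_families=2000, n_sims=100, seed=1):
    """Share of simulated trio datasets in which the within-family test rejects.

    beta is the assumed causal effect of child BMI on the outcome
    (both in arbitrary standardized units).
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        instrument, outcome = [], []
        for _ in range(n_families):
            pgs_mother = rng.gauss(0, 1)
            pgs_father = rng.gauss(0, 1)
            # Child score: mid-parent value plus random segregation.
            pgs_child = 0.5 * (pgs_mother + pgs_father) + rng.gauss(0, math.sqrt(0.5))
            confounder = rng.gauss(0, 1)
            bmi = 0.5 * pgs_child + confounder + rng.gauss(0, 1)
            # Outcome includes an assumed indirect effect of the mother's score.
            y = beta * bmi + confounder + 0.1 * pgs_mother + rng.gauss(0, 1)
            instrument.append(pgs_child - 0.5 * (pgs_mother + pgs_father))
            outcome.append(y)
        slope, se = ols_slope(instrument, outcome)
        if abs(slope / se) > 1.96:  # two-sided test at the 5% level
            rejections += 1
    return rejections / n_sims
```

  Calling `simulate_power` over a grid of `beta` values gives a power curve; under the null (`beta=0`) the rejection rate should sit near the nominal 5%.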
  
  For example, the discussion now includes the following:
  
  ‘However, within-family MR estimates using the childhood body size PGS are still consistent with small effects of the child’s BMI on all outcomes, with upper confidence limits around a 0.2 standard-deviation increase in the outcome per 5 kg/m² increase in BMI.’
  
  And the conclusion of the paper now reads:
  
  ‘Our results suggest that genetic variation associated with BMI in adulthood affects a child’s depressive and ADHD symptoms, but genetic variation associated with recalled childhood body size does not substantially affect these outcomes. There was little evidence that BMI affects anxiety. However, our estimates were imprecise, and these differences may be due to estimation error. There was little evidence that parental BMI affects a child’s ADHD or anxiety symptoms, but factors associated with maternal BMI may independently influence a child’s depressive symptoms. Genetic studies using unrelated individuals, or polygenic scores for adult BMI, may have overestimated the causal effects of a child’s own BMI.’
  
  Regarding a positive control: for analyses of BMI in adults, suitable positive controls would include directly measured biomarkers such as fat mass or blood pressure or reported medical outcomes like type 2 diabetes. In adolescents and younger adults, age at menarche or other measures of puberty can be used, as these are reliably influenced by BMI. However, the age of the participants for whom within-family effects are being estimated (8 years), together with the lack of any biomarkers such as fat mass (due to the questionnaire-based survey design) mean no suitable measures are available.
  
  Reviewer #3 (Public Review):
  
  Higher BMI in childhood is correlated with behavioral problems (e.g. depression and ADHD) and some studies have shown that this relationship may be causal using Mendelian Randomization (MR). However, traditional MR is susceptible to bias due to population stratification, assortative mating, and indirect effects (dynastic effects). To address this issue, Hughes et al. use within-family MR, which should be immune to the above-listed problems. They were unable to find a causal relationship between children's BMI and depression, anxiety, or ADHD. They do, however, report a causal effect of mother's BMI on depression in their children. They conclude that the causal effect of children's BMI on behavioral phenotypes such as depression and anxiety, if present, is very small, and may have been overestimated in previous studies. The analyses have been carried out carefully in a large sample and the paper is presented clearly. Overall, their assertions are justified but given that the conclusions mostly rest on an absence of an effect, I would like to see more discussion on statistical power.
  
  1) The authors show that the estimates of within-family MR are imprecise. It would be helpful to know how much power they have for estimating effect sizes reported previously given their sample size.
  
  As discussed in response to a comment from reviewer 2, the power of our results is already indicated by our standard errors and confidence intervals. Nevertheless, we conducted simulations to estimate the size of effects which we had 80% power to detect. Results, presented below, are consistent with our main results. As discussed in response to a comment from reviewer 2, we consider post-hoc power calculations redundant when standard errors and confidence intervals are reported; for this reason, we include this information in the response to reviewers but not the manuscript itself.
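
  The link between a reported confidence interval and the smallest effect detectable with 80% power can be made explicit with a short calculation. The sketch below uses a normal approximation and assumes the reported intervals are 95% intervals; the function names are hypothetical and this is not the authors' simulation.

```python
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Back out a standard error from a symmetric normal-theory interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def minimum_detectable_effect(se, alpha=0.05, power=0.80):
    """Smallest true effect detected with the given power (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * se


# Example with the maternal BMI interval quoted earlier (0.02 to 0.19),
# treated here as a 95% interval, which is an assumption.
se_maternal = se_from_ci(0.02, 0.19)
mde_maternal = minimum_detectable_effect(se_maternal)
```

  Under these assumptions the standard error is about 0.043 and the minimum detectable effect is roughly 0.12, i.e. about 2.8 standard errors. This is an illustrative back-of-envelope figure, not a result reported by the authors.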
  
  2) They used the correlation between PGS and BMI to support the assertion that the former is a strong instrument. Were the reported correlations calculated across all individuals? Since we know that stratification, assortative mating, and indirect effects can inflate these correlations, perhaps a more unbiased estimate would be the proportion of children's BMI variance explained by their PGS conditioned on the parents' PGS. This should also be the estimate used in power calculations.
  
  The manuscript has been updated to quote Sanderson-Windmeijer conditional R² values: the proportion of BMI variance explained by the BMI PGS for each member of a trio, conditional on the PGS of the other members of the trio, and all genetic covariates included in within-family models. Similarly, we now show Sanderson-Windmeijer conditional F-statistics for a model including the child, mother, and father’s BMI instrumented by the child, mother, and father’s PGS.
  
  3) In testing the association of mothers' and fathers' BMI with children's symptoms, the authors used a multivariable linear regression conditioning on the child's own BMI. Was the other parent's BMI (either by itself or using the polygenic score) included as a covariate in the multivariable and MR models? This was not entirely clear from the text or from Fig. 2. I suspect that if there were assortative mating on BMI in the parent's generation, the effect of any one parent's BMI on the child's symptoms might be inflated unless the other parent's BMI was included as a covariate (assuming both mother's and father's BMI affect the child's symptoms).
  
  Non-genetic models include both the mother and father’s phenotypic BMI as well as the child’s, allowing estimation of conditional effects of all three. This controls for assortative mating as noted by the reviewer. This was not previously clear - all relevant text and figure captions have been updated to clarify this.
  
  4) They report no evidence of cross-trait assortative mating in the parents' generation. The power to detect cross-trait assortative mating in the parents' generation using PGS would depend on the actual strength of assortative mating and the respective proportions of trait variance explained by PGS. Could the authors provide an estimate of the power for this test in their sample?
  
  We have updated the discussion of assortative mating (in both the results and the discussion section) to note possible limitations of power and clarify that this approach to examining assortment may not capture its full extent.
  
  The relevant part of the results section now reads:
  
  “In the parents’ generation, phenotypes were associated within parental pairs, consistent with assortative mating on these traits (Appendix 1 – Table 5). Adjusted for ancestry and other genetic covariates, maternal and paternal BMI were positively associated (beta: 0.23, 95%CI: 0.22,0.25, p<0.001), as were maternal and paternal depressive symptoms (beta: 0.18, 95%CI: 0.16,0.20, p<0.001), and maternal and paternal ADHD symptoms (beta: 0.11, 95%CI: 0.09,0.13, p<0.001). Consistent with cross-trait assortative mating, there was an association of mother’s BMI with father’s ADHD symptoms (beta: 0.03, 95%CI: 0.02,0.05, p<0.001) and mother’s ADHD symptoms with father’s depressive symptoms (beta: 0.05, 95%CI: 0.05,0.06, p<0.001). Phenotypic associations can reflect the influence of one partner on another as well as selection into partnerships, but regression models of paternal polygenic scores on maternal polygenic scores also pointed to a degree of assortative mating. Adjusted for ancestry and genotyping covariates, there were small associations between parents’ BMI polygenic scores (beta: 0.01, 95%CI: 0.00,0.02, p=0.02 for the adult BMI PGS, and beta: 0.01, 95%CI: 0.00,0.02, p=0.008 for the childhood body size PGS), and of the mother’s childhood body size PGS with the father’s ADHD PGS (beta: 0.01, 95%CI: 0.00,0.02, p=0.03). We did not detect associations with pairs of other polygenic scores, which may be due to insufficient statistical power.”
  
  And the relevant part of the discussion section now reads:
  
  “We found some genomic evidence of assortative mating for BMI, and cross-trait assortative mating between BMI and ADHD, but not between other traits. However, associations between polygenic scores, which only capture some of the genetic variation associated with these phenotypes, may not capture the full extent of genetic assortment on these traits.”
  
  5) Are the actual phenotypes (BMI, depression or ADHD) correlated between the parents? If so, would this not suffice as evidence of cross-trait assortative mating? It is known that the genetic correlation between parents as a result of assortative mating is a function of the correlation in their phenotypes and the heritabilities underlying the two traits (e.g., see Yengo and Visscher 2018). An alternative way to estimate the genetic correlation between parents without using PGS (which is noisy and therefore underpowered) would be to use the phenotypic correlation and heritability estimated using GREML or LDSC. Perhaps this is outside the scope of the paper but I would like to hear the author's thoughts on this.
  
  Associations between maternal and paternal phenotypes are consistent with a degree of assortative mating (shown below). These results have been added to Appendix 1 - Table 5, which also shows associations between maternal and paternal polygenic scores, and the methods and results have been updated accordingly (see quoted text in response to the comment above). For comparability, both sets of results are based on regression models adjusting for the mother’s and father’s ancestry PCs and genotyping covariates. We agree that analysis of assortative mating using GREML or LDSC is out of scope for this paper. As noted above, we have updated the discussion to acknowledge the limitations of the approach taken:
  
  ‘We found some genomic evidence of assortative mating for BMI, and cross-trait assortative mating between BMI and ADHD, but not between other traits. However, associations between polygenic scores, which only capture some of the genetic variation associated with these phenotypes, may not capture the full extent of genetic assortment on these traits.’
  
  6) It would be helpful to include power calculations for the MR-Egger intercept estimates.
  
  As with our response to the comments above, post-hoc power calculations are redundant: all the information about the power of our analyses, including the MR-Egger models, is conveyed by the standard errors and confidence intervals. MR-Egger is less precise than other estimators, as is made clear by the wide confidence intervals reported in the relevant tables (Appendix 1 - Tables 8 and 9). However, we have now updated the discussion to give more weight to this as a limitation. The discussion of pleiotropy in the final paragraph of the discussion now reads:
  
  ‘While robustness checks found little evidence of pleiotropy, these methods rely on assumptions. Moreover, MR-Egger is known to give imprecise estimates (Burgess and Thompson 2017), and confidence intervals from MR-Egger models were wide. Thus, pleiotropy cannot be ruled out.’
  
  Similarly, we have updated the relevant line of the results section, which now reads:
  
  ‘MR-Egger models found little evidence of horizontal pleiotropy, although MR-Egger estimates were imprecise (Appendix 1 - Tables 8 and 9).’
  
  7) Finally, what is the correlation between PGS and genetic PCs/geography in their sample? A correlation might provide evidence to support the point that classic MR effects are inflated due to stratification.
  
  Figures presenting the association of the child’s BMI polygenic scores and their PCs have been added to the supplementary information as Appendix 1 - Figure 2 and Appendix 1 - Figure 3. Consistent with an influence of residual stratification, a regression of the child’s BMI polygenic scores against their ancestry PCs (adjusting for genotyping centre and chip) found that 7 of the 20 PCs were associated at p<0.05 with the adult BMI PGS, and 8 of 20 with the childhood body size PGS (under the null hypothesis, we would expect one association in each case). When parental polygenic scores were added to the models, these associations attenuated towards the null.
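The "one association in each case" figure follows from 20 tests at a 0.05 threshold. A minimal sketch of that arithmetic is given here, together with the binomial probability of seeing at least 7 or 8 hits by chance. It assumes the 20 tests are independent, which the response does not state; the function and variable names are illustrative only.

```python
from math import comb

def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_pcs = 20      # number of ancestry PCs tested (from the response)
alpha = 0.05    # significance threshold (from the response)

# Expected number of nominally significant PCs under the null.
expected_hits = n_pcs * alpha

# Chance of observing at least 7 or 8 hits if every test were null
# and the tests were independent (an assumption of this sketch).
p_at_least_7 = binomial_tail(n_pcs, 7, alpha)
p_at_least_8 = binomial_tail(n_pcs, 8, alpha)

print(expected_hits, p_at_least_7, p_at_least_8)
```

Under these assumptions both tail probabilities are very small, which is in line with the authors' reading that the observed counts point to residual stratification rather than chance.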
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2021.09.17.21263612v1
www.biorxiv.org www.biorxiv.org

New submission 22/07/2022, 09:10:10

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This manuscript seeks to identify the mechanism underlying priority effects in a plant-microbe-pollinator model system and to explore its evolutionary and functional consequences. The manuscript first documents alternative community states in the wild: flowers tend to be strongly dominated by either bacteria or yeast but not both. Then lab experiments are used to show that bacteria lower the nectar pH, which inhibits yeast - thereby identifying a mechanism for the observed priority effect. The authors then perform an experimental evolution experiment which shows that yeast can evolve tolerance to a lower pH. Finally, the authors show that low-pH nectar reduces pollinator consumption, suggesting a functional impact on the plant-pollinator system. Together, these multiple lines of evidence build a strong case that pH has far-reaching effects on the microbial community and beyond.
  
  The paper is notable for the diverse approaches taken, including field observations, lab microbial competition and evolution experiments, genome resequencing of evolved strains, and field experiments with artificial flowers and nectar. This breadth can sometimes seem a bit overwhelming. The model system has been well developed by this group and is simple enough to dissect but also relevant and realistic. Whether the mechanism and interactions observed in this system can be extrapolated to other systems remains to be seen. The experimental design is generally sound. In terms of methods, the abundance of bacteria and yeast is measured using colony counts, and given that most microbes are uncultivable, it is important to show that these colony counts reflect true cell abundance in the nectar.
  
  We have revised the text to address the relationship between cell counts and colony counts with nectar microbes. Specifically, we point out that our previous work (Peay et al. 2012) established a close correlation between CFUs and cell densities (r2 = 0.76) for six species of nectar yeasts isolated from D. aurantiacus nectar at Jasper Ridge, including M. reukaufii.
  
  As for A. nectaris, we used a flow cytometric sorting technique to examine the relationship between cell density and CFU (figure supplement 1). This result should be viewed as preliminary given the low level of replication, but this relationship also appears to be linear, as shown below, indicating that colony counts likely reflect true cell abundance of this species in nectar.
  
  It remains uncertain how closely CFU reflects total cell abundance of the entire bacterial and fungal community in nectar. However, a close association is possible and may even be likely given the data above, which show a close correlation between CFU and total cell count for several yeast species and A. nectaris, the species our data indicate to be dominant in nectar.
  
  We have added the above points in the manuscript (lines 263-264, 938-932).
  
  The genome resequencing to identify pH-driven mutations is, in my mind, the least connected and developed part of the manuscript, and could be removed to sharpen and shorten the manuscript.
  
  We appreciate this perspective. However, given the disagreement between this perspective and reviewer 2’s, which asks for a more expanded section, we have decided to add a few additional lines (lines 628-637), briefly expanding on the genomic differences between strains evolved in bacteria-conditioned nectar and those evolved in low-pH nectar.
  
  Overall, I think the authors achieve their aims of identifying a mechanism (pH) for the priority effect of early-colonizing bacteria on later-arriving yeast. The evolution and pollinator experiments show that pH has the potential for broader effects too. It is surprising that the authors do not discuss the inverse priority effect of early-arriving yeast on later-arriving bacteria, beyond a supplemental figure. Understandably this part of the story may warrant a separate manuscript.
  
  We would like to point out that, in our original manuscript, we did discuss the inverse priority effects, referring to relevant findings that we previously reported (Tucker and Fukami 2014, Dhami et al. 2016 and 2018, Vannette and Fukami 2018). Specifically, we wrote that: “when yeast arrive first to nectar, they deplete nutrients such as amino acids and limit subsequent bacterial growth, thereby avoiding pH-driven suppression that would happen if bacteria were initially more abundant (Tucker and Fukami 2014; Vannette and Fukami 2018)” (lines 385-388). However, we now realize that this brief mention of the inverse priority effects was not sufficiently linked to our motivation for focusing mainly on the priority effects of bacteria on yeast in the present paper. Accordingly, we added the following sentences: “Since our previous papers sought to elucidate priority effects of early-arriving yeast, here we focus primarily on the other side of the priority effects, where initial dominance of bacteria inhibits yeast growth.” (lines 398-401).
  
  I anticipate this paper will have a significant impact because it is a nice model for how one might identify and validate a mechanism for community-level interactions. I suspect it will be cited as a rare example of the mechanistic basis of priority effects, even across many systems (not just pollinator-microbe systems). It illustrates nicely a more general ecological phenomenon and is presented in a way that is accessible to a broader audience.
  
  Thank you for this positive assessment.
  
  Reviewer #2 (Public Review):
  
  The manuscript "pH as an eco-evolutionary driver of priority effects" by Chappell et al illustrates how a single driver-microbial-induced pH change can affect multiple levels of species interactions including microbial community structure, microbial evolutionary change, and hummingbird nectar consumption (potentially influencing both microbial dispersal and plant reproduction). It is an elegant study with different interacting parts: from laboratory to field experiments addressing mechanism, condition, evolution, and functional consequences. It will likely be of interest to a wide audience and has implications for microbial, plant, and animal ecology and evolution.
  
  This is a well-written manuscript, with generally clear and informative figures. It represents a large body and variety of work that is novel and relevant (all major strengths).
  
  We appreciate this positive assessment.
  
  Overall, the authors' claims and conclusions are justified by the data. There are a few things that could be addressed in more detail in the manuscript. The most important weakness in terms of lack of information/discussion is that it looks like there are just as many or more genomic differences between the bacterial-conditioned evolved strains and the low-pH evolved strains than there are between these and the normal nectar media evolved strains. I don't think this negates the main conclusion that pH is the primary driver of priority effects in this system, but it does open the question of what you are missing when you focus only on pH. I would like to see a discussion of the differences between bacteria-conditioned vs. low-pH evolved strains.
  
  We agree with the reviewer and have included an expanded discussion in the revised manuscript [lines 628-637]. Specifically, to show overall genomic variation between treatments, we calculated genome-wide Fst comparing the various nectar conditions. We found that Fst was 0.0013, 0.0014, and 0.0015 for the low-pH vs. normal, low-pH vs. bacteria-conditioned, and bacteria-conditioned vs. normal comparisons, respectively. The similarity across all comparisons suggests that the differences between the bacteria-conditioned and low-pH treatments are comparable to the differences between each treatment and normal nectar. This result highlights that, although our phenotypic data suggest alterations to pH as the most important factor for this priority effect, pH may still be one of many factors affecting the coevolutionary dynamics of wild yeast in the microbial communities they are part of. In the full community context in which these microbes grow in the field, multi-species interactions, environmental microclimates, etc. likely also play a role in the rapid adaptation of these microbes, which was not investigated in the current study.
  
  Based on this overall picture, we have included additional discussion focusing on the effect of pH on evolution of stronger resistance to priority effects. We compared genomic differences between bacteria-conditioned and low-pH evolved strains, drawing the reader’s attention to specific differences in source data 14-15. Loci that varied between the low pH and bacteria-conditioned treatments occurred in genes associated with protein folding, amino acid biosynthesis, and metabolism.
  
  Reviewer #3 (Public Review):
  
  This work seeks to identify a common factor governing priority effects, including mechanism, condition, evolution, and functional consequences. It is suggested that environmental pH is the main factor that explains various aspects of priority effects across levels of biological organization. Building upon this well-studied nectar microbiome system, it is suggested that pH-mediated priority effects give rise to bacterial and yeast dominance as alternative community states. Furthermore, pH determines both the strengths and limits of priority effects through rapid evolution, with functional consequences for the host plant's reproduction. These data contribute to ongoing discussions of deterministic and stochastic drivers of community assembly processes.
  
  Strengths:
  
  Provides multiple lines of field and laboratory evidence to show that pH is the main factor shaping priority effects in the nectar microbiome. Field surveys characterize the distribution of microbial communities with flowers frequently dominated by either bacteria or yeast, suggesting that inhibitory priority effects explain these patterns. Microcosm experiments showed that A. nectaris (bacteria) showed negative inhibitory priority effects against M. reukaufii (yeast). Furthermore, high densities of bacteria were correlated with lower pH potentially due to bacteria-induced reduction in nectar pH. Experimental evolution showed that yeast evolved in low-pH and bacteria-conditioned treatments were less affected by priority effects as compared to ancestral yeast populations. This potentially explains the variation of bacteria-dominated flowers observed in the field, as yeast rapidly evolves resistance to bacterial priority effects. Genome sequencing further reveals that phenotypic changes in low-pH and bacteria-conditioned nectar treatments corresponded to genomic variation. Lastly, a field experiment showed that low nectar pH reduced flower visitation by hummingbirds. pH not only affected microbial priority effects but also has functional consequences for host plants.
  
  We appreciate this positive assessment.
  
  Weaknesses:
  
  The conclusions of this paper are generally well-supported by the data, but some aspects of the experiments and analysis need to be clarified and expanded.
  
  The authors imply that in their field surveys flowers were frequently dominated by bacteria or yeast, but rarely together. The authors argue that the distributional patterns of bacteria and yeast are therefore indicative of alternative states. In each of the 12 sites, 96 flowers were sampled for nectar microbes. However, it's unclear to what degree the spatial proximity of flowers within each of the sampled sites biased the observed distribution patterns. Furthermore, seasonal patterns may also influence microbial distribution patterns, especially in the case of co-dominated flowers. Temperature and moisture might influence the dominance patterns of bacteria and yeast.
  
  We agree that these factors could potentially explain the presented results. Accordingly, we conducted spatial and seasonal analyses of the data, which we detail below and include in two new paragraphs in the manuscript [lines 290-309].
  
  First, to determine whether spatial proximity influenced yeast and bacterial CFUs, we regressed the geographic distance between all possible pairs of plants to the difference in bacterial or fungal abundance between the paired plants. If plant location affected microbial abundance, one should see a positive relationship between distance and the difference in microbial abundance between a given pair of plants: a pair of plants that were more distantly located from each other should be, on average, more different in microbial abundance. Contrary to this expectation, we found no significant relationship between distance and the difference in bacterial colonization (A, p=0.07, R2=0.0003) and a small negative association between distance and the difference in fungal colonization (B, p<0.05, R2=0.004). Thus, there was no obvious overall spatial pattern in whether flowers were dominated by yeast or bacteria.
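The pairwise analysis described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes planar coordinates with Euclidean distance, the absolute difference in abundance as the response, and an ordinary least-squares fit. The function name and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist

def pairwise_distance_regression(coords, abundance):
    """Regress |abundance_i - abundance_j| on the distance between plants i and j.

    Returns (slope, intercept, r_squared) from an ordinary least-squares fit
    over all unordered pairs of plants.
    """
    xs, ys = [], []
    for i, j in combinations(range(len(coords)), 2):
        xs.append(dist(coords[i], coords[j]))          # geographic distance
        ys.append(abs(abundance[i] - abundance[j]))    # difference in abundance
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared

# Toy example: abundance increases with position along a transect,
# so more distant plants differ more (positive slope).
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
abundance = [0.0, 1.0, 3.0, 6.0]
slope, intercept, r_squared = pairwise_distance_regression(coords, abundance)
```

Because every plant contributes to many pairs, the pairs are not independent observations, so a p-value from a naive fit of this kind would be optimistic; permutation-based approaches such as a Mantel test are a common alternative.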
  
  Next, to determine whether climatic factors or seasonality affected the colonization of bacteria and yeast per plant, we used a linear mixed model predicting the average bacteria and yeast density per plant from average annual temperature, temperature seasonality, and annual precipitation at each site, the date the site was sampled, and the site location and plant as nested random effects. We found that none of these variables were significantly associated with the density of bacteria and yeast in each plant.
  
  To look at seasonality, we also re-ordered Fig 2C, which shows the abundance of bacteria- and yeast-dominated flowers at each site, so that the sites are now listed in order of sampling dates. In this re-ordered figure, there is no obvious trend in the number of flowers dominated by yeast throughout the period sampled (6/23 to 7/9), giving additional indication that seasonality was unlikely to affect the results.
  
  Additionally, sampling date does not seem to strongly predict bacterial or fungal density within each flower when plotted.
  
  These additional analyses, now included (figure supplements 2-4) and described (lines 290-309) in the manuscript, indicate that the observed microbial distribution patterns are unlikely to have been strongly influenced by spatial proximity, temperature, moisture, or seasonality, reinforcing the possibility that the distribution patterns instead indicate bacterial and yeast dominance as alternative stable states.
  
  The authors exposed yeast to nectar treatments varying in pH levels. Using experimental evolution approaches, the authors determined that yeast grown in low pH nectar treatments were more resistant to priority effects by bacteria. The metric used to determine the bacteria's priority effect strength on yeast does not seem to take into account factors that limit growth, such as the environmental carrying capacity. In addition, yeast evolves in normal (pH =6) and low pH (3) nectar treatments, but it's unclear how resistance differs across a range of pH levels (ranging from low to high pH) and affects the cost of yeast resistance to bacteria priority effects. The cost of resistance may influence yeast life-history traits.
  
  The strength of bacterial priority effects on yeast was calculated using the metric we previously published in Vannette and Fukami (2014): PE = log(BY/(-Y)) - log(YB/(Y-)), where BY and YB represent the final yeast density when early arrival (day 0 of the experiment) was by bacteria or yeast, followed by late arrival by yeast or bacteria (day 2), respectively, and -Y and Y- represent the final density of yeast in monoculture when they were introduced late or early, respectively. This metric does not incorporate carrying capacity. However, it does compare how each microbial species grows alone, relative to growth before or after a competitor. In this way, our metric compares environmental differences between treatments while also taking into account growth differences between strains.
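A minimal sketch of the metric as written above, PE = log(BY/(-Y)) - log(YB/(Y-)). The response does not specify the base of the logarithm, so the natural log is assumed here; the function name, argument names, and example densities are hypothetical.

```python
from math import log

def priority_effect_strength(by, late_mono, yb, early_mono):
    """PE = log(BY / -Y) - log(YB / Y-).

    by         -- final yeast density when bacteria arrive first (BY)
    late_mono  -- final yeast density in monoculture, introduced late (-Y)
    yb         -- final yeast density when yeast arrive first (YB)
    early_mono -- final yeast density in monoculture, introduced early (Y-)
    All densities must be positive. Natural log is an assumption of this sketch.
    """
    return log(by / late_mono) - log(yb / early_mono)

# Hypothetical densities: yeast are strongly suppressed when bacteria arrive
# first, and only mildly affected when yeast arrive first.
pe = priority_effect_strength(by=10.0, late_mono=1000.0, yb=900.0, early_mono=1000.0)
```

Each term normalizes the competition outcome by the matching monoculture, which is how the metric accounts for growth differences between strains. On this reading, a value below zero means yeast lose more from arriving after bacteria than from arriving before them.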
  
  Here we also present additional growth data to address the reviewer’s point about carrying capacity. Our experiments that compared ancestral and evolved yeast were conducted over the course of two days of growth. In preliminary monoculture growth experiments of each evolved strain, we found that yeast populations did reach carrying capacity over the course of the two-day experiment and population size declined or stayed constant after three and four days of growth.
  
  However, we found no significant difference in monoculture growth between the ancestral strains and any of the evolved strains, as shown in Figure supplement 12B. This lack of significant difference in monoculture suggests that differences in intrinsic growth rate do not fully explain the priority effects results we present. Instead, differences in growth were specific to yeast’s response to early arrival by bacteria.
  
  We also appreciate the reviewer’s comment about how yeast evolves resistance across a range of pH levels, as well as the effect of pH on yeast life-history traits. In fact, reviewer #2 pointed out an interesting trade-off in life history traits between growth and resistance to priority effects that we now include in the discussion (lines 535-551) as well as a figure in the manuscript (Figure 8).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.19.487947v1
www.biorxiv.org www.biorxiv.org

Multistep loading of PCNA onto DNA by RFC

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Schrecker, Castaneda and colleagues present cryo-EM structures of RFC-PCNA bound to a 3'ss/dsDNA junction or nicked DNA, stabilized by the slowly hydrolyzable ATP analogue ATPγS. They discover that PCNA can adopt an open form that is planar, different from previous models for the loading of a sliding clamp. The authors also report a structure with closed PCNA, supporting the notion that closure of the sliding clamp does not require ATP hydrolysis. The structures explain how DNA can be threaded laterally through a gap in the PCNA trimer, as this process is supported by partial melting of the DNA prior to insertion. The authors also visualise and assign a function to the N-terminal domain in the Rfc1 subunit of the clamp loader, which they find modulates PCNA loading at the replication forks, in turn required for processive synthesis and ligation of Okazaki fragments.
  
  This work is extremely well done, with several structures with resolutions better than 3Å, which is a significant achievement given the dynamic nature of the PCNA ring loading process. To investigate the role of the N-terminal domain of Rfc1 in PCNA loading, the authors use in vitro reconstitution of the entire DNA replication reaction, which is a powerful method to identify specific defects in Okazaki fragment synthesis and ligation.
  
  Important issues
  
  Figure 3B,D,F. I would find them much more informative if the authors showed the overlay between atomic model and cryo-EM density in the main figure. If the figure becomes too busy, the authors could decide to just add additional panels with the overlay as well as the atomic models alone. I do not think that showing segmented density for the DNA alone, as done in Figure 6C, is sufficient. Also including the density for e.g. residues Trp638 and Phe582 seems important.
  
  We thank the reviewer for the suggestion. However, we have been unable to establish a way to show the density for both the protein and DNA in a meaningful manner due to the large number of atoms in the fields of view. For an example, please see Figure 1, which corresponds to Figure 3H. To aid the reader, we have revised several of the Figures and Figure Supplements to include density for the DNA.
  
  Consistent with our structures, recent work from the Kelch group has identified Trp638 and Phe582 as facilitating DNA base flipping (Gaubitz et al., 2022a). Despite the role in base flipping, no growth defects were observed in cells in which either of these residues were mutated and thus their functional role and the role of DNA base-flipping remains unclear.
  
  Cryo-EM samples preparation included substoichiometric RPA, which has been shown to promote DNA loading of PCNA by RFC. Would the authors expect a subset of PCNA-RFC-DNA particles to contain RPA as well? The glycerol gradient gel indicates that, at least in fraction 5, a complex might exist. If the authors think that the particles analyzed cannot contain RPA, it would be useful to mention this.
  
  We have no evidence to suggest that RPA cannot be present in the imaged particles. We have revised the text (lines 150 - 152) to clarify that while RPA was present in the sample, we did not observe any density that could not be assigned to either DNA, RFC or PCNA. We therefore suggest that RPA does not interact with the complex in a stable manner.
  
  Published kinetic data indicate that ATP hydrolysis occurs before clamp closure. To incorporate this notion in their model, the authors suggest that ATP hydrolysis might promote PCNA closure by disrupting the planar RFC:PCNA interaction surface and hence the dynamic interaction of PCNA with Rfc2 and -5 in the open state. In addition, ATP hydrolysis promotes RFC disengagement from PCNA-DNA by reverting from a planar to an out-of-plane state. This model appears reasonable and nicely combines published data with the new findings reported by the authors. However, the model is oversimplified in Figure 6, where the only depicted effect of ATP hydrolysis is RFC release. Perhaps the authors could use the figure caption to acknowledge that ATP hydrolysis likely still has a role in facilitating PCNA closure.
  
  We have revised Figure 6 to show that ATP hydrolysis may occur either before or after ring closure.
  
  Can the authors explain what steps should be taken to describe PCNA loading by RFC in conditions where ATP hydrolysis is permitted? How would such experiments further inform the molecular mechanism for the loading of the PCNA clamp?
  
  As highlighted in point 3 above and by the other reviewers, ATP and ATPγS may alter the behavior and energetic landscape of RFC. In our studies, ATPγS was added to trap the complex in a pre-hydrolysis state in which all components are assembled. We have added a section to the discussion noting the potential differences and highlighting the need for future studies to better elucidate the role of nucleotide hydrolysis. To achieve a hydrolysis-competent complex, one could apply time-resolved cryo-EM approaches where the complex is formed on the grids and quickly vitrified. Such an approach, particularly if coupled with stopped-flow kinetic analyses, may provide additional insights into the kinetics of loading of PCNA onto DNA by RFC.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.02.09.479782v1
www.biorxiv.org www.biorxiv.org

Using synchronized brain rhythms to bias memory-guided decisions

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  The following is the authors’ response to the original reviews.
  
  Reviewer #1 (Recommendations For The Authors):
  
  The brain-machine interface used in this study differs from typical BMIs in that it's not intended to give subjects voluntary control over their environment. However, it is possible that rats may become aware of their ability to manipulate trial start times using their neural activity. Is there any evidence that the time required to initiate trials on high-coherence or low-coherence trials decreases with experience?
  
  This is a great question. First, we designed the experiment to avoid this possibility. Rats were experienced on the sequence of the automatic maze both pre and post implantation (totaling weeks of pre-training and habituation). As such, the majority of the trials ever experienced by the rat were not controlled by their neural activity. During BMI experimentation, only 10% of trials were triggered during high coherence states and 10% during low coherence states, leaving ~80% of trials not controlled by their neural activity. We also implemented a pseudo-randomized trial sequence. Taken together, these design choices were intended to avoid the possibility that rats would actively use their neural activity to control the maze.
  
  Second, we had a similar question when collecting data for this manuscript and so we conducted a pilot experiment. We took 3 rats from experiment #1 (after its completion) and we required them to perform “forced-runs” over the course of 3-4 days, a task where rats navigate to a reward zone and are rewarded with a chocolate pellet. The trajectory on “forced-runs” is predetermined and rats were always rewarded for navigating along the predetermined route. Every trial was initiated by strong mPFC-hippocampal theta coherence. We were curious as to whether time-to-trial-onset would decrease if we repeatedly paired trial onset to strong mPFC-hippocampal theta coherence. 1 out of 3 rats (rat 21-35) showed a significant correlation between time-to-trial onset and trial number, indicating that our threshold for strong mPFC-hippocampal theta coherence was being met more quickly with experience (Figure R1A). When looking over sessions and rats, there was considerable variability in the magnitude of this correlation and sometimes even the direction (Figure R1B). As such, the degree to which rat 21-35 was aware of controlling the environment by reaching strong mPFC-hippocampal theta coherence is unclear, but this question requires future experimentation.
  
  Author response image 1.
  
  Strong mPFC-hippocampal theta coherence was used to control trial onset for the entirety of forced-navigation sessions. Time-to-trial onset is a measurement of how long it took for strong coherence to be met. A) Time-to-trial onset was averaged across sessions for each rat, then plotted as a function of trial number (within-session experience on the forced-runs task). Rat 21-35 showed a significant negative correlation between time-to-trial onset and trial number, indicating that time-to-coherence reduced with experience. The rest of the rats did not display this effect. B) Correlation between trial-onset and trial number (y-axis; see A) across sessions (x-axis). A majority of sessions showed a negative correlation between time-to-trial onset and trial number, like what was seen in (A), but the magnitude and sometimes direction of this effect varied considerably even within an animal.
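
  The correlation between time-to-trial onset and trial number described in this caption can be illustrated with a minimal sketch. The function name `pearson_r` and the session values are hypothetical and chosen only for illustration; the response does not state which correlation coefficient or software was used, so Pearson's r is an assumption here, and no significance test is included.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical session (not data from the study): the time needed to reach
# the coherence threshold, in seconds, shrinks as trials accumulate.
trial_number = [1, 2, 3, 4, 5, 6]
time_to_onset_s = [30.0, 27.0, 25.0, 20.0, 18.0, 15.0]

r = pearson_r(trial_number, time_to_onset_s)
print(f"r = {r:.3f}")  # a negative r means onset is reached faster with experience
```

  A negative coefficient corresponds to the pattern reported for rat 21-35; values near zero or positive correspond to the sessions in which the effect was absent or reversed.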
  
  Is there any evidence that rats display better performance on trials with random delays in which HPC-PFC coherence was naturally elevated?
  
  This question is now addressed in Extended Figure 5 and discussed in the section titled “strong prefrontal-hippocampal theta coherence leads to correct choices on a spatial working memory task”.
  
  The introduction frames this study as a test of the "communication through coherence" hypothesis. In its strongest form, this hypothesis states that oscillatory synchronization is a pre-requisite for inter-areal communication, i.e. if two areas are not synchronized, they cannot transfer information. Recent experimental evidence shows this relationship is more likely inverted: coherence is a consequence of inter-areal interactions, rather than a cause. See Schneider et al. (DOI: 10.1016/j.neuron.2021.09.037) and Vinck et al. (DOI: 10.1016/j.neuron.2023.03.015) for a more in-depth explanation of this distinction. The authors should expand their treatment of this hypothesis in light of these findings.
  
  Our introduction and discussions have sections dedicated to these studies now.
  
  Figure 6 - It would be much more intuitive to use the labels "Rat 1", "Rat 2", and "Rat 3"; the "21-4X" identifiers are confusing.
  
  This was corrected in the paper.
  
  Figure 6C - The sub-plots within this figure are rather small and difficult to interpret. The figure would be easier to parse if the data were presented as a heatmap of the ratio of theta power during blue vs. red stim, with each pixel corresponding to one channel.
  
  This suggestion was implemented in the paper. See Fig 6C. Extended Fig. 8 now shows the power spectra as a function of recording shank and channel.
  
  Ext. Figure 2B - What happens during an acquisition failure? Instead of "Amount of LFP data," consider using "Buffer size".
  
  Corrected.
  
  Ext. Figure 2D-E - Instead of "Amount of data," consider using "Window size"
  
  Referred to as buffer size.
  
  Ext. Figure 2E - y-axis should extend down to 4 Hz. Are all of the last four values exactly at 8 Hz?
  
  Yes. Values plateau at 8 Hz. These data represent an average over ~50 samples.
  
  Ext. Figure 2F - consider moving this before D/E, since those panels are summaries of panel F
  
  Corrected.
  
  Ext. Figure 4A - ANOVA tells you that accuracy is impacted by delay duration, but not what that impact is. A post-hoc test is required to show that long delays lead to lower accuracy than short ones. Alternatively, one could compute the correlation between delay duration and proportion correct for each mouse, and look for significant negative values.
  
  We included supplemental analyses in Extended Fig. 4
  
  Reviewer #2 (Recommendations For The Authors):
  
  The authors should replace terms that suggest a causal relationship between PFC-HPC synchrony and behavior, such as 'leads to', 'biases', and 'enhances' with more neutral terms.
  
  Causal implications were toned down and wherever “leads” or “led” remains, we specifically mean in the context of coherence being detected prior to a choice being made.
  
  The rationale for the analysis described in the paragraph starting on line 324, and how it fits with the preceding results, was not clear to me. The authors also write at the start of this paragraph "Given that mPFC-hippocampal theta coherence fluctuated in a periodical manner (Extended Fig. 5B)", but this figure only shows example data from 2 trials.
  
  The reviewer is correct. While we point towards 3 examples in the manuscript now, we focused this section on the autocorrelation analysis, which did not support our observation as we noticed a rather linear decay in correlation over time. As such, the periodicity observed was almost certainly a consequence of overlapping data in the epochs used to calculate coherence rather than intrinsic periodicity.
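
  The point about overlapping epochs can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' analysis: it uses a sliding mean of white noise as a stand-in for a coherence estimate computed over overlapping epochs, and all names (`sliding_means`, `autocorr`, `expected_overlap_autocorr`) and parameters are hypothetical. Because the input has no rhythm by construction, any structure in the autocorrelation comes from the overlap alone.

```python
import random

def sliding_means(samples, window):
    """Mean of each length-`window` epoch, advancing one sample at a time."""
    total = sum(samples[:window])
    means = [total / window]
    for i in range(window, len(samples)):
        total += samples[i] - samples[i - window]
        means.append(total / window)
    return means

def autocorr(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def expected_overlap_autocorr(lag, window):
    """Theoretical value for white noise: the fraction of samples two epochs share."""
    return max(0.0, 1.0 - lag / window)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(60000)]  # no rhythm by construction
window = 10
smoothed = sliding_means(noise, window)

# Print lag, empirical autocorrelation, and the overlap-only prediction.
for lag in (0, 2, 5, 8, 10, 15):
    print(lag, round(autocorr(smoothed, lag), 2), expected_overlap_autocorr(lag, window))
```

  In this toy setting the autocorrelation falls roughly linearly until the epochs no longer share samples and then stays near zero, with no secondary peak. A genuinely periodic signal would instead show a second bump at a lag equal to its period.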
  
  Shortly after the start of the results section (line 112), the authors go into a very detailed description of how they validated their BMI without first describing what the BMI actually does. This made this and the subsequent paragraphs difficult to follow. I suggest the authors start with a general description of the BMI (and the general experiment) before going into the details.
  
  Corrected. See first paragraph of “Development of a closed-loop…”.
  
  In Figure 2C, as expected, around the onset of 'high' coherence trials, there is an increase in theta coherence but this appears to be very transient. However, it is unclear what the heatmap represents: is it a single trial, single session, an average across animals, or something else? In Figure 3F, however, the increase appears to be much more sustained.
  
  The sample size was rats for every panel in this figure. This was clarified at the end of Fig. 3.
  
  In Figure 2D, it was not clear to me what units of measurement are used when the averages and error bars are calculated. What is the 'n' here? Animals or sessions? This should be made clear in this figure as well as in other figures.
  
  The sample size is rats. This is now clarified at the end of Fig 2.
  
  Describing the study of Jones and Wilson (2005), the authors write: "While foundational, this study treated the dependent variable (choice accuracy) as independent to test the effect of choice outcome on task performance." (line 83) It was not clear to me what is meant by "dependent" and "independent" here. Explaining this more clearly might clarify how the authors' study goes beyond this and other previous studies.
  
  The reviewer is correct. A discussion on independent/dependent variables in the context of rationale for our experiment was removed.
  
  Reviewer #3 (Recommendations For The Authors):
  
  As explained in the public review, my comments mainly concern the interpretation of the experimental paradigm and its link with previous findings. I think modifying these in order to target the specific advance allowed by the paradigm would really improve the match between the experimental and analytical data, which are very solid, and the authors' conclusions.
  
  Concerning the paradigm, I recommend that the authors focus more on their novel ability to clearly dissociate the functional role of theta coherence prior to the choice as opposed to induced by the choice. Currently, they explain by contrasting previous studies based on dependent variables whereas their approach uses an independent variable. I was a bit confused by this, particularly because the task variable is not really independent given that it's based on a brain-driven loop. Since theta coherence remains correlated with many other neurophysiological variables, the results cannot go beyond showing that leading up to the decision it correlates with good choice accuracy, without providing evidence that it is theta coherence itself that enhances this accuracy as they suggest in lines 93-94.
  
  The reviewer is correct. A discussion on independent/dependent variables in the context of rationale for our experiment was removed.
  
  Regarding previous results with muscimol inactivation, I recommend that the authors expand their discussion on this point. I think that their correlative data is not sufficient to conclude as they do that despite "these structures being deemed unnecessary" (based on causal muscimol experiments), they "can still contribute rather significantly" since their findings do not show a contribution, merely a correlation. This extra discussion could include possible explanations of the apparent, and thought-provoking discrepancies that they uncover such as: theta coherence may be a correlate of good accuracy without an underlying causal relation, theta coherence may always correlate with good accuracy but only be causally important in some tasks related to spatial working memory or, since muscimol experiments leave the brain time to adapt to the inactivation, redundancy between brain areas may mask their implication in the physiological context in certain tasks (see Goshen et al 2011).
  
  The second paragraph of the discussion is now dedicated to this.
  
  Possible further analysis:
  
  In Extended 4A the authors show that performance drops with delay duration. It would be very interesting to see this graph with the high coherence / low coherence / yoked trials to see if the theta coherence is most important for longer trials for example.
  
  This is a great suggestion. Due to 10% of trials being triggered by high coherence states, our sample size precludes a robust analysis as suggested. Given that we found an enhancement effect on a task with minimal spatial working memory requirements (Fig. 4), it seems that coherence may be a general benefit or consequence of choice processes. Nonetheless, this remains an important question to address in a future study.
  
  Figure 6: The authors explain in the text that although the effect of stimulation of VMT is variable, overall VMT activation increased PFC-HPC coherence. I think in the figure the results are only shown for one rat and session per panel. It would be interesting to add a figure including their whole data set to show the overall effect as well as the variability.
  
  The reviewer is correct and this comment prompted a significant addition of detail to the manuscript. We have added an extended figure (Ext. Fig. 9) showing our VMT stimulation recording sessions. We originally did not include these because we were performing a parameter search to understand whether VMT stimulation could increase mPFC-hippocampal theta coherence. The results section was expanded accordingly.
  
  Changes to writing / figures:
  
  The paper by Eliav et al, 2018 is cited to illustrate the universality of coupling between hippocampal rhythms and spikes whereas the main finding of this paper is that spikes lock to non-rhythmic LFP in the bat hippocampus. It seems inappropriate to cite this paper in the sentence on line 65.
  
  We agree with the reviewer and this citation was removed.
  
  Line 180 when explaining the protocol, it would help comprehension if the authors clearly stated that "trial initiation" means opening the door to allow the rat to make its choice. I was initially unfamiliar with the paradigm and didn't figure this out immediately.
  
  We added a description to the second paragraph of our first results section.
  
  Lines 324 and following: the analysis shows that there is a slow decay over around 2s of the theta coherence but not that it is periodical (as in regularly occurring in time), this would require the auto-correlation to show another bump at the timescale corresponding to the period of the signal. I recommend the authors use a different terminology.
  
  This comment is now addressed above in our response to Reviewer #2.
  
  Lines 344: I am not sure why the stable theta coherence levels during the fixed delay phase show that the link with task performance is "through mechanisms specific to choice". Could the authors elaborate on this?
  
  We elaborated on this point further at the end of “Trials initiated by strong prefrontal-hippocampal theta coherence are characterized by prominent prefrontal theta rhythms and heightened pre-choice prefrontal-hippocampal synchrony”
  
  Line 85: "independent to test the effect of choice outcome on task performance." I think there is a typo here and "choice outcome" should be "theta coherence".
  
  The sentence was removed in the updated draft.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.04.02.535279v4
www.biorxiv.org www.biorxiv.org

New submission 01/07/2022, 17:52:03

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer 1
  
  Employing in vitro and Drosophila model, the authors interrogate which domain of Hsp27 binds to which region on Tau, and how these interactions facilitate the proteinaceous aggregation. They utilized various biochemical, biophysical, cellular, and genetic tools to dissect the association, and identified the structural basis for the specific recognition of Hsp27 to pathogenic p-Tau. Conceivably, Hsp27 may play some role in preventing Tau abnormal aggregation and p-Tau pathology in AD. Overall, the data support the main claim, especially, the biophysical data are very impressive. Nevertheless, the manuscript could be strengthened by complementary cellular or biochemical methods for validation. For example, the authors can use a stably transfected Tau cell line to interrogate Hsp27's role in its cellular aggregation or proteinaceous inclusions by immunoblotting. Immunofluorescent and immunohistochemical staining and IB with different antibodies may be conducted to validate the observations.
  
  REPLY: We sincerely thank the reviewer for the positive assessment of our work, and for providing very insightful suggestions. We appreciate that the reviewer considers our biophysical data impressive. We fully agree with the reviewer that the work could be strengthened by complementary cellular methods for validation. In our work, we used the Drosophila tauopathy model, where expression of human TauR406W in the Drosophila nervous system leads to age-dependent neurodegeneration recapitulating some of the salient features of tauopathy in FTDP-17 [1,2], to interrogate the role of Hsp27 in aggregation and proteinaceous inclusions of pTau.
  
  In our Drosophila Tau model study, three different antibodies including a total Tau antibody 5A6 [3], a pTauSer262 specific antibody [4], and a hyper-phosphorylated Tau antibody AT8 that recognizes hyper-phosphorylation of Tau at Ser202 and Thr205 sites [5] were used in western blot analysis to explore the role of Hsp27. As shown in Figure R1-1A and 1B, overexpression of Hsp27 significantly reduced the level of both pTauSer262 and hyper-phosphorylated Tau at both 2 and 10 days after eclosion (DAE). In addition, we further examined the morphology of the fly brain as well as the accumulation of hyper-phosphorylated Tau by immunofluorescence staining. Consistent with previous findings, brains with neuronal expression of TauR406W exhibited an accumulation of filamentous pTau and a reduction of brain neuropil size indicative of neurodegeneration (Figure R1-1C-F). Importantly, overexpression of Hsp27 restored the size of brain neuropil and suppressed the accumulation of filamentous pTau (Figure R1-1C-F), suggesting that Hsp27 protects against mutant TauR406W-induced neurodegeneration. Taken together, our Drosophila results show that Hsp27 protects against synaptic dysfunction in a Drosophila tauopathy model by reducing pTau aggregation, which well supports our biophysical data.
  
  Figure R1-1 Hsp27 reduces pTau level and protects against pTau-induced synaptopathy in Drosophila. (This figure represents Fig. 2A-F in the revised manuscript) (A) Brain lysates of 2 and 10 days after eclosion (DAE) wild-type (WT) flies (lanes 1 and 6), flies expressing human Tau with GFP (lanes 4 and 9), or human Tau with Hsp27 (lanes 5 and 10) in the nervous system were probed with antibodies for disease-associated phospho-tau epitopes S262, Ser202/Thr205 (AT8), and total Tau (5A6). Actin was probed as a loading control. Brain lysates of flies carrying only UAS elements were loaded for control (lanes 2, 3, 7, and 8). (B) Quantification of protein fold changes in (A). The levels of Tau species were normalized to actin. Fold changes were normalized to the Tau+GFP group at 2 DAE. n = 3. (C) Brains of WT flies or flies expressing Tau+GFP or Tau+Hsp27 in the nervous system at 2 DAE were probed for AT8 (heatmap) and Hsp27 (green), and stained with DAPI (blue). Scale bar, 30 μm. (D-F) Quantification of the Hsp27 intensity (D, data normalized to WT), brain optic lobe size (E), and AT8 intensity (F, data normalized to the Tau+GFP group). n = 4.
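
  The two-step normalization described for panel (B), first to the actin loading control and then to the Tau+GFP group at 2 DAE, can be sketched as follows. The function name `fold_changes` and every intensity value are hypothetical and serve only to show the arithmetic; they are not measurements from the study, and the authors' actual quantification procedure is not described here.

```python
def fold_changes(band_intensity, actin_intensity, reference):
    """Normalize each band to its actin loading control, then express the
    result as a fold change relative to the `reference` sample."""
    normalized = {
        name: band_intensity[name] / actin_intensity[name]
        for name in band_intensity
    }
    ref_value = normalized[reference]
    return {name: value / ref_value for name, value in normalized.items()}

# Hypothetical densitometry readings (arbitrary units), not data from the study.
at8 = {"Tau+GFP 2 DAE": 1200.0, "Tau+Hsp27 2 DAE": 500.0}
actin = {"Tau+GFP 2 DAE": 1000.0, "Tau+Hsp27 2 DAE": 1250.0}

result = fold_changes(at8, actin, reference="Tau+GFP 2 DAE")
print(result)
```

  By construction the reference group has a fold change of 1, so every other lane reads directly as a multiple of the Tau+GFP level at 2 DAE.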
  
  Reviewer 2
  
  Abnormal accumulation and aggregation of amyloid-β protein are among the main pathological hallmarks of Alzheimer's disease. It is well known that molecular chaperones play central roles in regulating tau function and amyloid assembly in disease. In this manuscript, Zhang, Zhu, Lu, Liu, et al., report that Hsp27, a member of the small heat shock protein family, specifically binds to phosphorylated Tau, which prevents pTau fibrillation in vitro and in a Drosophila tauopathy model. Using NMR spectroscopy and cross-linking mass spectrometry, the authors found that the N-terminal domain of Hsp27 directly binds to phosphorylation sites of pTau. Overall, the study is important and provides a demonstration of interactions between Hsp27 and pTau.
  
  REPLY: We sincerely thank the reviewer for the positive remarks on this work. We appreciate that the reviewer summarizes the major conclusions of our manuscript and considers our work important for the fundamental biology of the interaction between chaperones and clients, and for its implications in AD pathology.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.08.491088v1
www.biorxiv.org www.biorxiv.org

New submission 09/09/2022, 12:21:55

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Activation of TEAD-dependent transcription by YAP/TAZ has been implicated in the development and progression of a significant number of malignancies. For example, loss of function mutations in NF2 or LATS1/2 (known upstream regulators that promote YAP phosphorylation and its retention and degradation in the cytoplasm) promote YAP nuclear entry and association with TEAD to drive oncogenic gene transcription and occur in >70% of mesothelioma patients. High levels of nuclear YAP have also been reported for a number of other cancer cell types. As such, the YAP-TEAD complex represents a promising target for drug discovery and therapeutic intervention. Based on the recently reported essential functional role for TEAD palmitoylation at a conserved cysteine site, several groups have successfully targeted this site using both reversible binding non-covalent TEAD inhibitors (i.e., flufenamic acid (FA), MGH-CP1, compound 2 and VT101~107), as well as covalent TEAD inhibitors (i.e., TED-347, DC-TEADin02, and K-975), which have been demonstrated to inhibit YAP-TEAD function and display antitumor activity in cells and in vivo.
  
  Here, Fan et al. disclose the development of covalent TEAD inhibitors and report on the therapeutic potential of this class of agents in the treatment of TEAD-YAP-driven cancers (e.g., malignant pleural mesothelioma (MPM)). Optimized derivatives of a previously reported flufenamic acid-based acrylamide electrophilic warhead-containing TEAD inhibitor (MYF-01-37, Kurppa et al. 2020 Cancer Cell), which display improved biochemical- and cell-based potency or mouse pharmacokinetic profiles (MYF-03-69 and MYF-03-176) are described and characterized.
  
  Strengths:
  
  All of the authors' claims and conclusions are very well supported and justified by the data that is provided. Clear improvements in biochemical- and cell-based potencies have been made within the compound series. Cell-based selective activities in the HIPPO pathway defective versus normal/control cell types are established. Transcriptional effects and the regulation of BMF proapoptotic mRNA levels are characterized. A 1.68 Å X-ray co-crystal structure of MYF-03-69 covalently bound to TEAD1 via Cys359 is provided. In vivo efficacy in a relevant xenograft is demonstrated, using a 30 mg/kg, BID PO dose.
  
  We thank the reviewer for appreciating and highlighting the strengths of our study.
  
  Weaknesses:
  
  Beyond the impact on BMF gene regulation, new biological insights reported here for this compound series are moderate. Progress and differentiation with respect to activity and/or ADME PK profiles relative to the very closely related and previously described (Kaneda et al. 2020 Am J Cancer Res 10:4399. PMID 33415007) acrylamide-based covalent TEAD inhibitor K-975 (identical 11 nM cell-based potencies when compared head-to-head and identical reported in vivo efficacy doses of 30 mg/kg) are not entirely clear. Demonstration of on-target in vivo activity is lacking (e.g., impact on BMF gene expression at the evaluated exposure levels).
  
  We thank the reviewer for the question. We have compared mouse liver microsome stability and hepatocyte stability of K-975 and MYF-03-176 and found that K-975 is metabolically less stable.
  
  Consistently, when NCI-H226 cell-derived xenograft mice were dosed with 30 mg/kg K-975 twice daily, the tumors kept growing and reached more than 1.5-fold volume by day 14. In contrast, at the same dosage, MYF-03-176 showed a significant tumor regression. K-975 did not reach such efficacy even with 100 or 300 mg/kg twice daily, either in the NCI-H226 or the MSTO-211H CDX mouse model, according to the paper (Kaneda et al. 2020 Am J Cancer Res 10:4399).
  
  To demonstrate the on-target in vivo activity, we tested expression of the TEAD downstream genes and BMF in tumor samples after 3-day BID treatment (PD study). We observed a reduction of CTGF, CYR61, and ANKRD1 and an increase of BMF, which indicates on-target activity in vivo.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.10.491316v1
www.biorxiv.org www.biorxiv.org

New submission 07/09/2022, 10:42:17

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  This paper by Angueyra, et al., adds to the field’s current understanding of photoreceptor specification and factors regulating opsin expression in vertebrates. Current models of specification of vertebrate photoreceptors are largely based on studies of mammals. However, a great number of animals including teleosts express a wider array of photoreceptor subtypes. Zebrafish for example have 4 distinct cone subtypes and rods. The approach is sound and the data are quite convincing. The only minor weaknesses are that the statistical analyses need to be revisited and the discussion should be a bit more focused.
  
  To identify differentially expressed transcription factors, the authors performed bulk RNA-seq of pooled, hand-sorted photoreceptors. The selection criterion was tightly controlled to limit unhealthy cells and cellular debris from other photoreceptor subtypes. The pooling of cells provided a considerable depth of sequencing, orders of magnitude better than scSeq. The authors identified known transcription factors and several that appear to be novel or whose role has not been determined. The data are made available on the PI's website, as is a program to access and compare the gene expression data.
  
  The authors then used CRISPR/Cas9 gene targeting of two known and several novel factors identified in their analysis for effects on cell fate decisions and opsin expression. Phenotyping performed on the injected larvae is possible, and the target genes were amplified and sequenced to demonstrate the efficiency of the gene targeting. Targeting of 2 genes with known functions in photoreceptor specification in zebrafish, Tbx2b and Foxq2, resulted in the anticipated changes in cell fate, albeit the strength of the alterations in cell fate in the F0 larvae appears to be less than the published phenotypes for the inherited alleles. Interestingly, the authors also identified the expression of an RH2 opsin in SWS2 cones, another cone type. The changes are subtle but important.
  
  The authors then targeted tbx2a, the function of which was not known. The result is quite interesting as it matches the increase of rods and decrease of UV cones observed in tbx2b mutants. However, the injected animals also showed RH2 opsin expression, but now in the LWS cone subtype. These data suggest that Tbx2 transcription factors repress misexpression of opsins in the wrong cell type.
  
  The authors also show that targeting additional differentially expressed factors does not affect photoreceptor fate or survival in the time frame investigated. These are important data to present. For these or any of the other targeted genes above, did the authors test for changes in photoreceptor number or survival?
  
  We have attempted to address this point, but the answer is not clear cut. We used activated caspase-3 immunolabeling as a marker of apoptosis (Lusk and Kwan 2022). At 5 dpf, the age we chose to make quantifications, we don’t see an increase in activated caspase-3 positive cells when we compare control and tbx2a F0 mutants (Reviewer Figure 1A-B). Labeled cells are very rare and located near the ciliary marginal zone irrespective of genotype. This suggests that there is no detectable active death at this late stage of development in tbx2 F0 mutants. Earlier in development, at 3 dpf, when photoreceptor subtypes first appear, there is also a normal wave of apoptosis in the retina (Blume et al. 2020; Biehlmaier, Neuhauss, and Kohler 2001), resulting in many cells positive for activated caspase-3; our preliminary quantifications don’t show a marked increase in the number of labeled cells in tbx2a F0 mutants, but we consider it likely that subtle effects might be obscured by the physiological wave of apoptosis (Reviewer Figure 1C-D).
  
  Reviewer Figure 1 - Assessment of apoptosis in tbx2a F0 mutants. (A-B) Confocal images of 5 dpf larval eyes of control (A and A’) and tbx2a F0 mutants (B and B’) counterstained with DAPI (grey) and immunolabeled against activated Caspase 3 (yellow) show sparse and dim labeling, restricted to cells located in the ciliary marginal zone, without clear differences between groups. (C-D) Confocal images of 3 dpf larval eyes of control (C and C’) and tbx2a F0 mutants (D and D’) immunolabeled against activated Caspase 3 show many positive cells, located in all retinal layers, as expected from physiological apoptosis at this stage of development and without clear differences between groups.
  
  Furthermore, the additional single-cell RNA-seq datasets we have reanalyzed suggest that tbx2a and tbx2b are expressed by other retinal neurons and progenitors and not just photoreceptors (Reviewer Figure 2), further confounding attempts at the quantification of apoptosis specifically in photoreceptor progenitors.
  
  Reviewer Figure 2 – Expression of tbx2 paralogues across retinal cell types. The transcription factors tbx2a and tbx2b are expressed by many retinal cells. Plots show average counts across clusters in RNA-seq data obtained by Hoang et al. (2020).
  
  At this stage, we consider that fully resolving this issue is important and will require considerably more work, which we will pursue in the future using full germline mutants and live-imaging experiments.
  
  Reviewer #3 (Public Review):
  
  Angueyra et al. tried to establish a method to identify key factors regulating fate decisions in the retinal visual photoreceptor cells by combining transcriptomic and fast genome editing approaches. First, they isolated and pooled five subtypes of photoreceptor cells from transgenic lines, in each of which a specific subtype of photoreceptor cells is labeled by a fluorescent protein, and then subjected them to RNA-seq analyses. Second, by comparing the transcriptome data, they extracted the list of the transcription factor genes enriched in the pooled samples. Third, they applied CRISPR-based F0 knockout to functionally identify transcription factor genes involved in cell fate decisions of photoreceptor subtypes. To benchmark this approach, they initially targeted foxq2 and nr2e3 genes, which have been previously shown to regulate S-opsin expression and S-cone cell fate (foxq2) and to regulate rhodopsin expression and rod fate (nr2e3). They then targeted other transcription factor genes in the candidate list and found that tbx2a and tbx2b are independently required for UV-cone specification. They also found that tbx2a expressed in the L-cone subtype and tbx2b expressed in L-cones inhibit M-opsin gene expression in the respective cone subtypes. From these data, the authors concluded that the transcription factors Tbx2a and Tbx2b play a central role in controlling the identity of all photoreceptor subtypes within the retina.
  
  Overall, the contents of this manuscript are well organized and technically sound. The authors presented convincing data, and carefully analyzed and interpreted them. It includes an evaluation of the presented data on cell-type specific transcriptome by comparing it with previously published ones. I think the current transcriptomic data will be a valuable platform to identify the genes regulating cell-type specific functions, especially in combination with the fast CRISPR-based in vivo screening methods provided here. I hope that the following points would be helpful for the authors to improve the manuscript appropriately.
  
  1) The manuscript uses the word “FØ” quite often without any proper definition. I wonder how “Ø” should be pronounced - zero or phi? This word is not common and has not been used in previous publications. I feel the phrase “F0 knockout,” which was used in the paper cited by the authors (Kroll et al 2021), is more straightforward. If it is to be used in the manuscript, please define “FØ” and “CRISPR-FØ screening” appropriately, especially in the abstract.
  
  We have made changes to replace “FØ” with “F0.” In our other citation (Hoshijima et al., 2019), “F0 embryo” was used throughout the paper. Following our references and Dr Kojima’s suggestion, we adopted “F0 mutant larva” as the most straightforward and least confusing term. We have also made changes in the abstract to define our approach more clearly and made appropriate changes throughout the manuscript.
  
  2) Figure 1-supplement 1 shows that opn1mw4 has quite high (normalized) FPKM in one of the S-cone samples in contrast to the least (or no) expression in the M-cone samples, in which opn1mw4 is expected to be detected. The authors should address a possible origin of this inconsistent result for opn1mw4 expression as well as a technical limitation of using the Tg(opn1mw2:egfp) line for detection of opn1mw4 expression in the GFP-positive cells.
  
  In Figure 1 - Supplement 1, we had attempted to provide a summarized figure of all phototransduction genes, but the big differences in expression levels (in particular, the high expression of opsin genes) forced us to use gene-by-gene normalization for display. Without normalization, the expression of opn1mw4 is very low across all samples, and its detection in that sole S-cone sample can likely be attributed to some degree of inherent noise in our methods. We have revised Figure 1 - Supplement 1: we find that we can avoid gene-by-gene normalization and still provide a good summary of the expression of phototransduction genes if the heatmap is broken down by gene families, which have more similar expression levels. In addition, we have added caveats to the use of the Tg(opn1mw2:egfp) line as our sole M-cone marker in the results section describing our RNA-seq approach, including our inability to provide data on Opn1mw4-expressing M cones.
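
  The display artifact described here, where gene-by-gene normalization makes a barely detected gene look strongly expressed in one sample, can be reproduced with a toy example. The gene labels, the FPKM values, and the choice of scaling each gene by its own maximum are all assumptions made for illustration; the response does not specify the exact normalization used in the original figure.

```python
def normalize_per_gene(fpkm_by_gene):
    """Scale each gene's values by that gene's own maximum across samples."""
    scaled = {}
    for gene, values in fpkm_by_gene.items():
        peak = max(values)
        scaled[gene] = [v / peak if peak > 0 else 0.0 for v in values]
    return scaled

# Hypothetical FPKM values for three samples; not taken from the dataset.
fpkm = {
    "highly_expressed_opsin": [9000.0, 12.0, 8.0],
    "barely_detected_opsin": [0.0, 0.6, 0.1],
}

scaled = normalize_per_gene(fpkm)
print(scaled["highly_expressed_opsin"])
print(scaled["barely_detected_opsin"])
```

  After scaling, both genes peak at the same display value even though their raw levels differ by orders of magnitude, which is why a noise-level reading can appear as a bright cell in a per-gene normalized heatmap. Grouping genes with similar expression levels, as in the revised figure, avoids the need for this scaling.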
  
  3) The manuscript lacks a description of the sampling time point. It is well known that many genes are expressed with daily (or circadian) fluctuation (cf. Doherty & Kay, 2010 Annu. Rev. Genet.). For example, the cone-specific gene list in Fig.2C includes a circadian clock gene, per3, whose expression was reported to fluctuate in a circadian manner in many tissues of zebrafish including the retina (Kaneko et al. 2006 PNAS). It appears to be cone-specific at this time point of sample collection as shown in Fig.2, but might be expressed in a different pattern at other time points (eg, rod expression). The authors should add, at least, a clear description of the sampling time points so as to make their data more informative.
  
  We have included this information in the materials and methods. We collected all our samples during the most active peak of the zebrafish circadian rhythm between 11am and 2pm (3h to 6h after light onset) to avoid the influence of circadian fluctuations in our analysis.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.11.26.470161v3
www.biorxiv.org www.biorxiv.org

New submission 21/02/2023, 09:28:39

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors use a newly developed object-space memory task comprising a "Stable" version and an "Overlapping" version, where two objects are presented in two locations per trial in a square open field. Each version consists of 5 training trials of 5-min presentations of an object-space configuration, with both object locations staying constant across training trials in the Stable condition, and only one object location staying fixed in the Overlapping condition. Memory is tested in a test trial 24 hours later where the opposite configuration is presented - overlapping configuration presented for the Stable condition and stable configuration presented for the Overlapping condition - with the thesis that memory in this test trial for the Overlapping condition will depend on the accumulated memory of spatial patterns over the training trials, whereas memory for the test trial in the Stable condition can be due to episodic memory of the last trial or accumulated memory. Memory is quantified using a Discrimination Index (DI), comparing the amount of time animals spend exploring the two object locations.
  
  Here, animals in other groups are also presented with an interference trial equivalent to the test trial, to test if the memory of the Overlapping condition can be disrupted. The behavioral data show that for RGS14 over-expressing animals, memory in the Overlapping condition is diminished compared to controls with no interference or controls where over-expression is inhibited, whereas memory in the Stable condition is enhanced. This is interpreted as interference in semantic-like memory formation, whereas one-shot episodic memory is improved. The authors speculate that increased cortical plasticity should lead to increased and larger delta waves according to the sleep homeostasis hypothesis, and observe that instead increased cortical plasticity leads to less non-REM sleep and smaller delta waves, with more prefrontal neurons with slower firing rates (presumably more plastic neurons). They further report increased hippocampal-cortical theta coherence during task and REM sleep, increased NonREM oscillatory coupling, and changes in hippocampal ripples in RGS14 over-expressing animals.
  
  While these results are interesting, there are several issues that need to be addressed, and the link between physiology and behavioral results is unclear.
  
  1) The behavioral results rely on the interpretation that the Overlapping condition corresponds to semantic-like memory and the Stable condition corresponds to episodic-like memory. While the dissociation in memory performance due to interference seen in these two conditions is intriguing, the Stable condition can correspond not just to the memory of the previous trial but also to accumulated memory of a stable spatial pattern over the 5 training trials, similar to accumulated memory of a changing spatial pattern in the Overlapping condition.
  
  Yes! We completely agree on this. We do not claim that the Stable condition corresponds to episodic-like memory; instead, we refer to it as simple memory, since it can be solved either way (one-trial memory or cumulative memory). We have now expanded on this in the discussion to make it clearer.
  
  Here, it is puzzling that in the behavioral control with no interference (Figure 1D), memory in the Stable and Overlapping condition is unchanged in the test trial, with the DI statistically at 0 in the test trial. In the original description of the Object Space task by the authors in the referenced paper, the measure of memory was a Discrimination Index significantly higher than 0 in both the Stable and Overlapping conditions. This discrepancy needs to be reconciled. Is the DI for the interference trial shown in Fig. S1 significantly different than 0? No statistics or description is provided in the figure legend here.
  
  As mentioned above, we apologize that we oversimplified the description. The 24h interference trial is what corresponds to the original test trial. We have added a clarifying figure for comparison in S1 (a bar graph in addition to the violin plot) together with statistics. Performance was above chance for all groups and conditions, replicating our previous results.
  
  2) The physiology experiments compare Home cage (HC) conditions to the Object Space task (OS) throughout the manuscript. While some differences are seen in the control and RGS14 over-expressing animals, there is no comparison of the Stable vs. Overlapping condition in the physiology experiments. This precludes making explicit links between physiological observations and behavioral effects.
  
  As also mentioned above, we have now added an analysis exploring the detailed OS conditions. We would like to thank the reviewers for giving us the opportunity to do so.
  
  3) The authors speculate that learning will result in larger and more delta waves as per the synaptic homeostasis hypothesis. It should be noted here that an alternative hypothesis is that there should also be a selective increase in synaptic plasticity for learning and consolidation. The authors do observe that control animals show more frequent and higher-amplitude delta waves, but rather than enhancing this process, RGS14 animals with increased plasticity show the opposite effect. How can this be reconciled and linked with the behavioral data in the Stable and Overlapping condition?
  
  In the context of the Object Space Task, we would expect all behavioural conditions (Stable and Overlapping) to induce synaptic changes, since learning also occurs in the Stable condition (see also performance on the 24h trial). Thus, we would expect homeostatic responses in particular, such as an increase in delta amplitude, for all experiences, regardless of whether subtle statistical rules are presented. In contrast, detailed processing that extracts underlying regularities is proposed by the Sleep for Active Systems Consolidation Hypothesis to occur during hippocampal-cortical interactions in the form of delta/ripple/spindle interactions (with different theories emphasising different types of interactions). As mentioned above, we have now added a more specific analysis in this regard, in which we show that the two OS conditions that involve moving objects (where statistical regularities can therefore potentially be extracted) show a higher percentage of ripples occurring after large slow oscillations than home cage or the simple learning condition Stable. In contrast, RGS14 animals already show higher participation in both control conditions, indicating that in these animals the brain treats all experiences as significant learning conditions, which explains the behavioural effect (increased interference due to better memory for the interference). Further, we have expanded the discussion of how in RGS14 animals we sometimes see an enhancement of learning effects but sometimes see a more complex interaction with what we would expect from physiological learning.
  
  Similarly, there is an increase in slower-firing neurons in RGS14 over-expressing animals. Slower-firing neurons have been proposed to be more plastic in the hippocampus based on their participation in learned hippocampal sequences, but appropriate references or data are needed to support the assertion that slower-firing neurons in the prefrontal cortex are more plastic.
  
  As described above, we have expanded the discussion and included other citations that also consider the cortex. We can show that our changes would be expected if one makes the cortex as plastic as the hippocampus.
  
  4) It is noted that changing cortical plasticity influences hippocampal-cortical coupling and hippocampal ripples, suggesting a cortical influence on hippocampal physiological patterns. It has been previously shown that disrupting prefrontal cortical activity does alter hippocampal ripples and hippocampal theta sequences (Schmidt et al., 2019; Schmidt and Redish, 2021). The current results should be discussed in this context.
  
  We would like to thank the reviewer for these suggestions; they are now incorporated in the manuscript.
  
  Reviewer #2 (Public Review):
  
  In this paper, the authors provide evidence to support the longstanding proposition that a dual-learning system/systems-level consolidation (hippocampus attains memories at a fast pace which are eventually transmitted to the slow-learning neocortex) allows rapid acquisition of new memories while protecting pre-existing memories. The authors leverage many techniques (behavior, pharmacology, electrophysiology, modelling) and report a host of behavioral and electrophysiological changes on induction of increased medial prefrontal cortex (mPFC) plasticity which are interesting and will be of significant interest to the broad readership.
  
  The experimental design and analyses are convincing (barring some instances which are discussed below). The following recommendations will bolster the strength/quality of the manuscript:
  
  1) Certain concerns regarding the interpretation and analysis of the behavioral data remain. The authors need to clarify if increased mPFC plasticity leads to only an increase in one-shot memory or 'also' interference of previous information. It seems that the behavioral results could also be explained by the more parsimonious explanation that one-shot memory is improved. Do the current controls tease apart these two scenarios?
  
  We agree that we cannot disentangle whether one memory is just stronger than the other or whether it is an overwriting effect. We have now added this to the discussion. Of note, we do not think it would actually be possible to distinguish these two effects behaviourally in rodents, or at least we cannot think of a fitting study design that would enable the contrast.
  
  Additionally, the authors need to clarify why the 'no trial' and 'anisomycin' controls for the stable task perform at chance levels on exposure to a new object-place association on test day (Fig 1D).
  
  Violin plots are sometimes hard to read. Here we provide simple bar plots, in which one can see that the animals are not at chance at the 72h test in the control conditions.
  
  Finally, further description of how the discrimination index ((exploration time of novel - exploration time of familiar) / sum of both) is computed is recommended, i.e., in the stable condition, which 'object' is chosen as 'novel' (as both are in the same locations) for computing the index (Fig 1). Do negative DI values imply a neophobia to novel objects (and thus are a form of memory)? This is also crucial because the modelling results (Fig 1E) use both neophilia and neophobia, while negative discrimination indexes are considered similar to 0 for interpreting the behavioral results, as stated on page 3, lines 84-86.
  
  We have now added this to the methods (for Overlapping it is moved location – stable location; for Stable it is location-to-be-moved-at-test – stable location; and for Random, which location is assigned as moved and which as stable is random; each is then divided by total time). We agree that neophilia/neophobia (especially changes in the distribution) can be an issue and have discussed it in detail in Schut et al NLM 2020, where we see differences in absolute beta values (thus controlling for philia/phobia differences). There we also discuss in more detail why it is difficult to control for this in the DI. In short, one could use absolute values, but then it is difficult to determine what a group chance level would look like. Fortunately, this is not an issue here, since we did not observe differences in neophilic or neophobic tendencies while running the experiments. Critically, the interference trial (which can also function as a simple test trial) confirms that, as a group, animals show positive DI and neophilia.
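
  The index described above can be sketched as follows. This is only an illustrative sketch, not the analysis code used in the study: the function name and the example exploration times are hypothetical, and "moved" stands for whichever location is assigned as moved (or to-be-moved) in a given condition.

```python
def discrimination_index(time_moved: float, time_stable: float) -> float:
    """Return (moved - stable) / (moved + stable), which lies in [-1, 1].

    Positive values indicate more exploration of the moved location,
    negative values more exploration of the stable location.
    """
    total = time_moved + time_stable
    if total <= 0:
        raise ValueError("total exploration time must be positive")
    return (time_moved - time_stable) / total


# Hypothetical exploration times in seconds (not data from the study).
print(discrimination_index(time_moved=30.0, time_stable=10.0))  # 0.5
print(discrimination_index(time_moved=10.0, time_stable=30.0))  # -0.5
```

  A DI of 0 corresponds to equal exploration of both locations, which is why group-level chance is taken as 0 for the signed index.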
  
  2) The authors report lower firing rates in RGS14414 animals during the task in Fig 2F. It is indeed remarkable how large the reported differences are. The authors need to rule out any differences in the behavioral state of the animals in the two groups during the task, i.e., rest vs. active exploration/movement dynamics. Are only epochs during the task while the animals interact with the objects used for computing the firing rates (same epochs as Fig 1)? If not, doing so will provide a useful comparison with Fig 1. Additionally, although the authors make the case for slow firing rate neurons being important for plasticity (based on Grosmark and Buzsaki, 2016), it is crucial to note that the firing rate dynamic (slow vs. fast) in that study for the hippocampus is defined based on the whole recorded session (predominated by sleep), indeed the firing rates of the two groups (slow vs. fast/plastic vs. rigid) during the task/maze-running do not differ in that study. Therefore, the results here seem incongruent with the Grosmark and Buzsaki paper. Since this finding is central to the main claim of the authors, it either warrants further investigation or a re-interpretation of their results.
  
  As mentioned in the main points, we have now added the firing rate analysis (including new group splits) for wake in the sleep box, NREM and REM separately. Each time the same results are obtained. Currently, we do not yet have the tracking and video synchronization set-up, and therefore we cannot split the task by specific behaviours.
  
  However, we now also cite Buzsaki’s original log-normal brain review, where he first proposed the idea. There he also shows the same effects as we do, in that the general firing rate distribution is the same for task and different sleep stages, just shifted overall. The analysis from Grosmark included a more stringent subselection of neurons in order to also argue that incorporation into run/replay sequences could not have been biased by firing rate per se (instead of plasticity). However, the original proposition from Buzsaki does fit our results. He further presents hippocampus vs cortex firing rates, which also confirm the idea (the hippocampus is more plastic and has slower firing rates). We included this figure above in the general comments. Further, we have now expanded the discussion on this point.
  
  3) A concern remains as to how many of the electrophysiological changes they observe (firing rate differences, LFP differences including coupling, sleep state differences, Figs. 2-4) support their main hypothesis or are a by-product of injection of RGS14414 (for instance, one might argue that an increased 'capability' to learn new information/more plasticity might lead to more NREM sleep for consolidation, etc.). The authors need to carefully interpret all their data in light of their main hypothesis, which will substantially improve the quality/strength of the manuscript.
  
  We have now expanded the discussion, given it more structure, and also state that we cannot disentangle whether the cellular changes, the sleep oscillation changes, or an interaction of both is the cause of the result. Furthermore, we added that we cannot distinguish whether the interference memory is stronger or actually overwrites the original training memory.
  
  Reviewer #3 (Public Review):
  
  The authors set out to test the idea that memories involve a fast process (for the acquisition of new information) and a slow process (where these memories are progressively transferred/integrated into more-long term storage). The former process involves the hippocampus and the latter the cerebral cortex. This 'dual-learning' system theoretically allows for new learning without causing interference in the consolidation of older memories. They test this idea by artificially increasing plasticity in the pre-limbic cortex and measuring changes in different learning/memory tasks. They also examined electrophysiological changes in sleep, as sleep is linked to memory formation and synaptic plasticity.
  
  The strengths of the study include a) meticulous analyses of a variety of electrophysiological measurements b) a combination of neurobiological and computational tools c) a largely comprehensive analysis of sleep-based changes. Some weaknesses include questions about the technique for increasing cortical plasticity (is this physiological?) and the absence of some additional experiments that would strengthen the conclusions. However, overall, the findings appear to support the general idea under examination.
  
  This study is likely to be very impactful as it provides some really new information about these important neural processes, as well as data that challenges popular ideas about sleep and synaptic plasticity.
  
  We would like to thank the reviewer for these positive comments. Answers to the weaknesses are presented below in the recommendations for the authors.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.11.21.517356v1
www.biorxiv.org www.biorxiv.org

New submission 02/09/2022, 12:18:55

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer 1 (Public Review):
  
  To me, the strengths of the paper are predominantly in the experimental work: there's a huge amount of data generated through mutagenesis, screening, and DMS. This is likely to constitute a valuable dataset for future work.
  
  We are grateful to the reviewer for their generous comment.
  
  Scientifically, I think what is perhaps missing, and I don't want this to be misconstrued as a request for additional work, is a deeper analysis of the structural and dynamic molecular basis for the observations. In some ways, the ML is used to replace this and I think it doesn't do as good a job. It is clear for example that there are common mechanisms underpinning the allostery between these proteins, but they are left hanging to some degree. It should be possible to work out what these are with further biophysical analysis…. Actually testing that hypothesis experimentally/computationally would be nice (rather than relying on inference from ML).
  
  We agree with the reviewer that this study should motivate a deeper biophysical analysis of molecular mechanisms. However, in our view, the ML portion of our work was not intended as a replacement for mechanistic analysis, nor could it serve as one. We treated ML as a hypothesis-generating tool. We hypothesized that distant homologs are likely to have similar allosteric mechanisms which may not be evident from visual analysis of DMS maps. We used ML to (a) extract underlying similarities between homologs (b) make cross predictions across homologs. In fact, the chief conclusion of our work is that while common patterns exist across homologs, the molecular details differ. ML provides tantalizing evidence to this effect. The conclusive evidence will require, as the reviewer rightly suggests, detailed experimental or molecular dynamics characterization. Along this line, we note that we have recently reported our atomistic MD analysis of allostery hotspots in TetR (JACS, 2022, 144, 10870). See ref. 41.
  
  Changes to manuscript: “Detailed biophysical or molecular dynamics characterization will be required to further validate our conclusions(38).”
  
  Reviewer 3 (Public Review):
  
  However - at least in the manuscript's present form - the paper suffers from key conceptual difficulties and a lack of rigor in data analysis that substantially limits one's confidence in the authors' interpretations.
  
  We hope the responses below address and allay the reviewer’s concerns.
  
  A key conceptual challenge shaping the interpretation of this work lies in the definition of allostery, and allosteric hotspot. The authors define allosteric mutations as those that abrogate the response of a given aTF to a small molecule effector (inducer). Thus, the results focus on mutations that are "allosterically dead". However, this assay would seem to miss other types of allosteric mutations: for example, mutations that enhance the allosteric response to ligand would not be captured, and neither would mutations that more subtly tune the dynamic range between uninduced ("off") and induced ("on") states (without wholesale breaking the observed allostery). Prior work has even indicated the presence of TetR mutations that reverse the activity of the effector, causing it to act as a co-repressor rather than an inducer (Scholz et al (2004) PMID: 15255892). Because the work focuses only on allosterically dead mutations, it is unclear how the outcome of the experiments would change if a broader (and in our view more complete) definition of allostery were considered.
  
  We agree with the reviewer that mutations that impact allostery manifest in many different ways. Furthermore, the effect size of these mutations runs the full gamut from subtle changes in dynamic range to drastic reversal of function. To unpack allostery further, allostery of aTF can be described, not just by the dynamic range, but by the actual basal and induced expression levels of the reporter, EC50 and Hill coefficient. Given the systemic nature of allostery, a substantial fraction of aTF mutations may have some subtle impact on one or more of these metrics. To take the reviewer’s argument one step further, one would have to accurately quantify the effect size of every single amino acid mutation on all the above properties to have a comprehensive sequence-function landscape of allostery. Needless to say, this is extremely hard! Resolution of small effect sizes is very difficult, even at high sequencing depth. To the best of our knowledge, a heroic effort approaching such comprehensive analysis has been accomplished so far only once (PMID: 3491352).
  
  Our focus, therefore, was to screen for the strongest phenotypic impact on allostery i.e., loss of function. Mutations leading to loss of function can be relatively easily identified by cell-sorting. Because our goal was to compare hotspots across homologs, we surmised that loss of function mutations, given their strong phenotypic impact, are likely to provide the clearest evidence of whether allosteric hotspots are conserved across remote homologs.
  
  The reviewer raised the point of activity-reversing mutations. Yes, there are activity-reversing mutations in TetR. However, they represent an insignificant fraction. In the paper cited by the reviewer, there are 15 activity-reversing mutations among 4000 screened. Furthermore, the paper shows that activity reversal in TetR requires two to four mutations, while our library consists exclusively of single amino acid substitutions. For these reasons, we did not screen for activity-reversing mutations. Nonetheless, we agree with the reviewer that screening for activity-reversing mutations across homologs would be very interesting.
  
  The separation in fluorescence between the uninduced and induced states (the assay dynamic range, or fold induction) varies substantially amongst the four aTF homologs. Most concerningly, the fluorescence distributions for the uninduced and induced populations of the RolR single mutant library overlap almost completely (Figure 1, supplement 1), making it unclear if the authors can truly detect meaningful variation in regulation for this homolog.
  
  Yes, the reviewer is correct that the fold induction ratio varies among the four aTF homologs. However, we note that such differences are common among natural aTFs. Depending on the native downstream gene regulated by the aTF, some aTFs show higher ligand-induced activation and others lower. While this is not a hard and fast rule, aTFs that regulate efflux pumps tend to have higher fold induction than those that regulate metabolic enzymes. In summary, the variation in fold induction among the four aTFs is neither a flaw in experimental design nor an indication of experimental inconsistency, but is instead an inherent property of the protein-DNA interaction strength and the allosteric response of each aTF.
  
  Among the four aTFs, wildtype RolR has the weakest fold induction (15-fold), which makes sorting the RolR library particularly challenging. To minimize false positives as much as possible, we require that a dead mutant be present in (a) non-fluorescent cells after ligand induction, (b) non-fluorescent cells before ligand induction, and (c) at least two out of the three replicates for both sorts. Additionally, for RolR specifically, we adjusted the non-fluorescent gate to be far more stringent than for the other three aTFs (Fig. 1 – figure supplement 1). Furthermore, we assign residues as allosteric hotspots, not individual dead mutations. This buffers against false strong signals from stray individual dead mutations. Finally, the top interquartile range winnows them to residues showing a strong, consistent dead phenotype. As a result of these “safeguards” we have built in, the number of allosteric hotspots of RolR (57) is comparable to the other three aTFs (51, 53 and 48). This suggests that we are not overestimating the number of hotspots despite the weaker fold induction of RolR. We highlight in a new supplementary figure (Figure 1 – figure supplement 4) that changing the read count threshold from 5X to 10X produces near-identical patterns of mutations, suggesting that our results are also robust to changes in read depth stringency.
  
  Changes to manuscript: In response to the reviewer's comment, we have added the following sentence.
  
  “We note that the lower fold induction (dynamic range) of RolR makes it particularly challenging to separate the dead variants from the rest.”
  
  The methods state that "variants with at least 5 reads in both the presence and absence of ligand in at least two replicates were identified as dead". However, the use of a single threshold (5 reads) to define allosterically dead mutations across all mutations in all four homologs overlooks several important factors:
  
  Depending on the starting number of reads for a given mutation in the population (which may differ in orders of magnitude), the observation of 5 reads in the gated nonfluorescent region might be highly significant, or not significant at all. Often this is handled by considering a relative enrichment (say in the induced vs uninduced population) rather than a flat threshold across all variants.
  
  We regret the lack of clarity in our presentation. We wish to better explain the rationale behind our approach. First, we understand the reviewer’s point on considering relative enrichment to define a threshold. This approach works well in DMS experiments involving genetic selections, which is commonly the case, because activity scales well with selection stringency. One can then pick enrichment/depletion relative to the middle of the read count distribution as a measure of gain or loss of function.
  
  Second, this strategy does not, in practice, work well for cell-sorting screens. While it may be tempting to think of cell sorting as comparably activity-scaled as genetic selections, in reality, the fidelity of fluorescence-activated cell sorters is much lower. Making quantitative claims of activity based on cell sorting enrichment can be risky. It is wiser to treat cell sorting results as a yes/no binary, i.e., does the mutation disrupt allostery or not. More importantly, the yes/no binary classification suffices for our need to identify whether a certain mutation adversely impacts allosteric activity.
  
  Third, the above argument does not imply that all mutations have the same effect size on allostery. They don’t. We capture the effect size on individual residues, not individual mutations, by counting the number of dead mutations at a residue position. This is an important consideration because it safeguards us from minor inconsistencies that inevitably arise from cell sorting.
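
  The per-residue counting described above can be sketched as follows. This is a hedged illustration rather than the pipeline used in the study: the function names and the example mutations are hypothetical, and the upper-quartile cutoff is an assumed stand-in for the "top interquartile range" filter, not a reproduction of the weighted score.

```python
from collections import Counter
import statistics


def dead_counts_per_residue(dead_mutations):
    """Count dead mutations at each residue position.

    dead_mutations is an iterable of (position, substituted_amino_acid) pairs.
    """
    return Counter(position for position, _ in dead_mutations)


def top_quartile_positions(counts):
    """Return positions whose dead-mutation count exceeds the third quartile.

    ASSUMPTION: a strict upper-quartile cutoff is used here for illustration.
    """
    q3 = statistics.quantiles(counts.values(), n=4)[2]
    return sorted(pos for pos, n in counts.items() if n > q3)


# Hypothetical dead mutations (not data from the study).
dead = [(10, "A"), (10, "G"), (10, "P"), (10, "W"),
        (25, "D"), (42, "K"), (42, "E"), (77, "P")]
counts = dead_counts_per_residue(dead)
print(counts[10], counts[42])          # 4 2
print(top_quartile_positions(counts))  # [10]
```

  Scoring residues rather than single mutations means that one stray dead call at a position (such as positions 25 and 77 in this toy input) does not by itself make that position a hotspot.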
  
  Fourth, for a variant to be classified as allosterically dead, it must be present in both the uninduced and induced DNA-bound populations in at least two out of three replicates (four conditions total). This is a stringent criterion for selecting dead variants, resulting in highly consistent regions of importance in the protein even upon varying read count thresholds. To the extent possible, we have minimized the possibility of false positive bleed-through.
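
  The yes/no criterion above can be sketched as a small classifier. This is a minimal sketch under stated assumptions, not the analysis code from the study: the function name, the dictionary layout, and the example read counts are hypothetical, and the counts are assumed to be normalized reads from the non-fluorescent gate of each sort.

```python
from typing import Dict, List


def is_dead(reads: Dict[str, List[int]],
            min_reads: int = 5,
            min_replicates: int = 2) -> bool:
    """Classify a variant as allosterically dead (True) or not (False).

    reads maps each sort condition ("uninduced", "induced") to the
    per-replicate read counts in the non-fluorescent gate. The variant
    must reach min_reads in at least min_replicates replicates of BOTH sorts.
    """
    return all(
        sum(count >= min_reads for count in reads[condition]) >= min_replicates
        for condition in ("uninduced", "induced")
    )


# Hypothetical read counts for three replicates (not data from the study).
variant_a = {"uninduced": [12, 7, 0], "induced": [9, 5, 3]}
variant_b = {"uninduced": [12, 7, 6], "induced": [9, 1, 0]}

print(is_dead(variant_a))                # True
print(is_dead(variant_b))                # False
print(is_dead(variant_a, min_reads=10))  # False
```

  Raising `min_reads` from 5 to 10 in the last call mirrors the threshold comparison described in this response: the classification stays binary, and only the stringency of "present" changes.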
  
  Finally, two separate normalizations were performed on the total sequence reads so that a common read count threshold could be drawn (1) between experimental conditions and replicates and (2) across proteins. First, total sequencing reads were normalized to 200k total across all sample conditions (presorted, -inducer, and +inducer) and replicates for each homolog, allowing comparisons within a single protein. Next, reads were normalized again to account for differences in the theoretical size of each protein’s single-mutant library, allowing for comparisons across proteins by drawing a common read count cutoff. For example, total sequencing reads of RolR (4,332 possible mutants) increased by 1.18x relative to MphR (3,667 possible mutants) for a total of 236k reads.
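
  The arithmetic of the two normalization steps can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and using MphR (3,667 possible mutants) as the reference library size is an assumption based on the RolR example above.

```python
def scale_counts(counts, target_total):
    """Step 1: rescale raw per-variant counts so that they sum to target_total."""
    total = sum(counts.values())
    return {variant: n * target_total / total for variant, n in counts.items()}


def normalized_depth(library_size, reference_size=3667, base_depth=200_000):
    """Step 2: scale the common depth by theoretical library size.

    ASSUMPTION: the reference is the MphR library (3,667 possible mutants).
    """
    return base_depth * library_size / reference_size


# Toy raw counts for one sample (hypothetical variant names and values).
raw = {"A10G": 30, "D25K": 10}
print(scale_counts(raw, target_total=200))  # {'A10G': 150.0, 'D25K': 50.0}

# RolR has 4,332 possible single mutants.
rolr_depth = normalized_depth(4332)
print(f"{4332 / 3667:.2f}x")        # 1.18x
print(f"{rolr_depth / 1000:.0f}k")  # 236k
```

  After both steps, a fixed cutoff such as 5 reads represents a comparable per-variant coverage in each homolog, which is what allows a single threshold to be applied across proteins.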
  
  Changes to manuscript: We have provided substantial additional details in the Fluorescence-activated cell sorting and NGS preparation and analysis sections.
  
  We also added the following in the main text.
  
  “In other words, we use cell sorting as a binary classifier i.e., does the mutation disrupt allostery or not. We capture the effect size on individual residues, not individual mutations, by counting the number of dead mutations at a residue position. This is an important consideration because it safeguards us from minor inconsistencies that inevitably arise from cell sorting.”
  
  Depending on the noise in the data (as captured in the nucleotide-specific q-scores) and the number of nucleotides changed relative to the WT (anywhere between 1-3 for a given amino acid mutation) one might have more or less chance of observing five reads for a given mutation simply due to sequencing noise.
  
  All the reads considered in our analyses pass the Illumina quality threshold of Q-score ≥ 30, which as per Illumina represents “perfect reads with no errors or ambiguities”. This translates into a 1 in 1,000 probability of an incorrect base call, or 99.9% base call accuracy.
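
  The conversion between a Q-score and base call accuracy follows the standard Phred relation P = 10^(-Q/10). A short sketch (the function name is hypothetical) confirms the figures quoted above:

```python
def phred_error_probability(q_score: float) -> float:
    """Probability of an incorrect base call for a Phred quality score Q."""
    return 10 ** (-q_score / 10)


p_error = phred_error_probability(30)
print(f"{p_error:.3f}")        # 0.001
print(f"{1 - p_error:.1%}")    # 99.9%
```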
  
  We use chip-based oligonucleotides to build our DMS library, which allows us to prespecify the exact codon that encodes a point mutation. This means the nucleotide count and protein count are the same. The scenario referred to by the reviewer, i.e., “anywhere between 1-3 for a given amino acid mutation”, only applies to codon-randomized or error-prone PCR library generation. We regret that the chip-based library assembly part was unclear.
  
  Depending on the shape and separation of the induced (fluorescent) and uninduced (non-fluorescent) population distributions, one might have more or less chance of observing five reads by chance in the gated non-fluorescent region. The current single threshold does not account for variation in the dynamic range of the assay across homologs.
  
  We have addressed the concern raised by the reviewer on fluorescent population distributions in answers to questions 10 and 11.
  
  The reviewer makes an important point about the choice of sequencing threshold. We use the sequencing threshold simply to make a binary choice about whether a certain variant exists in the sorted population or not. We do not use the sequencing reads to scale the activity of the variant. To address the reviewer's comment, we have included a new supplementary figure (Fig 1 – figure supplement 4) where we compare the data after adjusting the threshold to two levels, 5 and 10 reads. As is evident in the new figure, the fundamental pattern of allosteric hotspots and the overall data interpretation do not change.
  
  TetR: 5x – 53 hotspots, 10x – 51 hotspots
  
  TtgR: 5x – 51 hotspots, 10x – 51 hotspots
  
  MphR: 5x – 48 hotspots, 10x – 48 hotspots
  
  RolR: 5x – 57 hotspots, 10x – 60 hotspots
  
  In other words, changing the threshold to be more or less strict may have a modest impact on the overall number of hotspots in the dataset. Still, the regions of functional importance are consistent across different thresholds. We have expanded the discussion in the manuscript to address this point.
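
  The size of the threshold effect can be read directly off the four pairs of counts listed above. The sketch below only restates those numbers as absolute and percentage changes; the function name is hypothetical and no data beyond the quoted counts are used.

```python
# Hotspot counts at the 5x and 10x read thresholds, as listed in the response.
HOTSPOTS = {
    "TetR": (53, 51),
    "TtgR": (51, 51),
    "MphR": (48, 48),
    "RolR": (57, 60),
}


def threshold_change(counts):
    """Return {protein: (absolute change, percent change)} going from 5x to 10x."""
    summary = {}
    for protein, (at_5x, at_10x) in counts.items():
        delta = at_10x - at_5x
        summary[protein] = (delta, 100 * delta / at_5x)
    return summary


if __name__ == "__main__":
    for protein, (delta, percent) in threshold_change(HOTSPOTS).items():
        print(f"{protein}: {delta:+d} hotspots ({percent:+.1f}%)")
```

  The largest shift is three hotspots (RolR), which is a change of under 6% relative to the 5x count. Note that these totals do not show whether the same residues are called at both thresholds; that comparison is what the supplementary figure addresses.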
  
  Changes to manuscript: We have now included a new supplementary figure comparing hotspot data at two thresholds: Figure 1 – figure supplement 4.
  
  We also added the following in the main text.
  
  “To assess the robustness of our classification of hotspots, we determined the number of hotspots at two different sequencing thresholds – 5x and 10x. At 5x and 10x, the numbers of hotspots are – TetR: 53, 51; TtgR: 51, 51; MphR: 48, 48 and RolR: 57, 60, respectively. Changing the threshold has a modest impact on the overall number of hotspots and the regions of functional importance are consistent at both thresholds.”
  
  The authors provide a brief written description of the "weighted score" used to define allosteric hotspots (see y-axis for figure 1B), but without an equation, it is not clear what was calculated. Nonetheless, understanding this weighted score seems central to their definition of allosteric hotspots.
  
  We regret the lack of clarity in our presentation. The weighted score was used to quantify the “deadness” of every residue position in the protein. At each position in the protein, the number of mutations that inhibited activity was summed up, and the “deadness” of each mutation was weighted based on how many replicates it appeared to inactivate the protein in. The weighted score at each residue position is given by
  
  Where at position x in the protein, D1 is the number of mutations dead in one replicate only, D2 is the number of mutations dead in 2 replicates, D3 is the number of mutations dead in 3 replicates, and Total is the total number of variants present in the data set (based on sequencing data). Any dead mutation that is seen in only one replicate is discarded and does not contribute to the “deadness” of the residue. Mutations seen in two and three replicates contribute to the score. We have included a new supplementary figure (Fig. 1 – figure supplement 2) to give the reader a detailed heatmap of all mutations and their impact for each protein.
  
  Changes to manuscript: The weighted scoring scheme is now described in greater detail under Materials and Methods in the “NGS preparation and analysis” section.
  
  The authors do not provide some of the standard "controls" often used to assess deep mutational scanning data. For example, one might expect that synonymous mutations are not categorized as allosterically dead using their methods (because they should still respond to ligand) and that most nonsense mutations are also not allosterically dead (because they should no longer repress GFP under either condition). In general, it is not clear how the authors validated the assay/confirmed that it is giving the expected results.
  
  As we state in response to question 12, we use chip-based oligonucleotides to build our DMS library, which allows us to pre-specify the exact codon that encodes a point mutation. We have no synonymous or nonsense mutations in our DMS library. Each protein mutation is encoded by a single unique codon. The only stop codon is at the 3’ end of the gene.
  
  The authors performed three replicates of the experiment, but reproducibility across replicates and noise in the assay is not presented/discussed.
  
  Changes to manuscript: A new supplementary table (Table 1) is now provided with the pairwise correlation coefficients between all replicates for each protein.
  
  In the analysis of long-range interactions, the authors assert that "hotspot interactions are more likely to be long-range than those of non-hotspots", but this was not accompanied by a statistical test (Figure 2 - figure supplement 1).
  
  In response to the reviewer's comment, we now include a paired t-test comparing non-hotspots and hotspots with long-range interactions in the main text.
  
  Changes to manuscript: In all four aTFs, hotspots constituted a higher fraction of LRIs than non-hotspots (Figure 2 – figure supplement 1; P = 0.07).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.01.490188v1
www.biorxiv.org www.biorxiv.org

New submission 16/01/2023, 12:18:58

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In this study, the authors describe an elegant genetic screen for mutants that suppress defects of MCT1 deletions which are deficient in mitochondrial fatty acid synthesis. This screen identified many genes, including that for Sit4. In addition, genes for retrograde signaling factors (Rtg1, Rtg2 and Rtg3), proteins influencing proteasomal degradation (Rpn4, Ubc4) or ribosomal proteins (Rps17A, Rps29A) were found. From this mix of components, the authors selected Sit4 for further analysis. In the first part of the study, they analyzed the effect of Sit4 in the context of MCT1 mutant suppression. This more specific part is very detailed and thorough; the experiments are well controlled and convincing. The second, more general part of the study focused on the effect of Sit4 on the level of the mitochondrial membrane potential. This part is of high general interest, but less well developed. Nevertheless, this study is very interesting as it shows for the first time that phosphate export from mitochondria is of general relevance for the membrane potential even in wild type cells (as long as they live by fermentation), that the Sit4 phosphatase is critical for this process and that the modulation of Sit4 activity influences processes relying on the membrane potential, such as the import of proteins into mitochondria. However, some aspects should be further clarified.
  
  1) It is not clear whether Sit4 is only relevant under fermentative conditions. Does Sit4 also influence the membrane potential in respiring cells? Fig. S2D shows the membrane potential in glucose and raffinose. Both carbon sources lead to fermentative growth. The authors should also test whether Sit4 levels influence the membrane potential when cells are grown under respiratory conditions, such as in ethanol, lactate or glycerol. Even if deletions of Sit4 affect respiration, mutants with altered activity can be easily analyzed.
  
  sit4Δ cells fail to grow on nonfermentable media as shown by us (Figure 2—figure supplement 1C) and others (Arndt et al., 1989; Dimmer et al., 2002; Jablonka et al., 2006). In our opinion, the exact reason is unclear, but there is an interesting observation that addition of aspartate can partially restore growth on ethanol (Jablonka et al., 2006). Despite the lack of thorough investigation of this sit4Δ defect, an early study speculated that it could be related to the cAMP-PKA pathway (Sutton et al., 1991). This study pointed out genetic interactions of SIT4 with multiple genes in cAMP-PKA (Sutton et al., 1991). In addition, sit4Δ cells have phenotypes similar to those of cAMP-PKA null mutants, such as glycogen accumulation, caffeine resistance, and failure to grow on nonfermentable media (Sutton et al., 1991). In our literature search, we did not find any sit4Δ mutants that could grow on nonfermentable media.
  
  2) The authors should give a name to the pathway shown in Fig. 4D. This would make it easier to follow the text in the results and the discussion. This pathway was proposed and characterized in the 90s by George Clark-Walker and others, but never carefully studied on a mechanistic level. Even if the flux through this pathway cannot be measured in this study, the regulatory role of Sit4 for this process is the most important aspect of this manuscript.
  
  We now refer to this mechanism as the mitochondrial ATP hydrolysis pathway.
  
  3) To further support their hypothesis, the authors should show that deletion of Pic1 or Atp1 wipes out the effect of a Sit4 deletion. In these petite-negative mutants, the phosphate export cycle cannot be carried out and thus Sit4 should have no effect.
  
  The mitochondrial phosphate transport activity is electroneutral as it also pumps a proton together with inorganic phosphate. The F1 subunit of the ATP synthase (Atp1 and Atp2) is suggested in many reports to be responsible for the ATP hydrolysis. We performed tetrad dissection to generate atp1Δ or atp2Δ in the pho85Δ background. After streaking single colonies onto a fresh plate, we noticed that atp1Δ mct1Δ and atp2Δ mct1Δ cells are lethal, and knocking out PHO85 rescued this synthetic lethality. It is not surprising that atp1Δ mct1Δ or atp2Δ mct1Δ cells are lethal since the F1 subunit is important to generate a minimum of MMP in mct1Δ cells when the ETC is absent (i.e., rho0 cells). However, knocking out PHO85 can generate MMP independent of the F1 subunit of ATP synthase, which is suggested by the viable atp1Δ mct1Δ pho85Δ and atp2Δ mct1Δ pho85Δ cells. There are many ATPases in the mitochondrial matrix that could theoretically hydrolyze ATP for the ADP/ATP carrier to generate MMP. However, we do not currently know exactly which ATPase(s) is activated by phosphate starvation. These data are now included as Figure 5—figure supplement 1F-G.
  
  4) What is the relevance of Sit4 for the Hap complex which regulates OXPHOS gene expression in yeast? The supplemental table suggests that Hap4 is strongly influenced by Sit4. Is this downstream of the proposed role in phosphate metabolism or a parallel Sit4 activity? This is a crucial point that should be addressed experimentally.
  
  To investigate the role of the Hap complex in MMP generation in sit4Δ cells, we overexpressed and knocked out HAP4, the catalytic subunit of the Hap complex, separately in wild-type and sit4Δ cells. We confirmed the HAP4 overexpression by the enriched abundance of ETC complexes as shown in the BN-PAGE (Figure 2—figure supplement 1E). However, we did not observe any rescue of ETC or ATP synthase in mct1Δ cells when HAP4 was overexpressed. The enriched level of ETC complexes caused by HAP4 overexpression is not sufficient to rescue the MMP (Figure 2—figure supplement 1F).
  
  Next, we knocked out HAP4 in sit4Δ cells. Knocking out SIT4 could still increase MMP in hap4Δ cells with a much-reduced magnitude, which phenocopied ETC subunit and RPO41 deletion in sit4Δ cells (Figure 2—figure supplement 1G).
  
  In conclusion, the Hap complex is involved in the MMP increase when SIT4 is absent. However, it is not sufficient to increase MMP by overexpressing HAP4. The Hap complex discussion is now included in the manuscript, and the data is presented as Figure 2—figure supplement 1E-G.
  
  5) The authors use the accumulation of Ilv2 precursors as proxy for mitochondrial protein import efficiency. Ilv2 was reported before as a protein which, if import into mitochondria is slow, is deviated into the nucleus in order to be degraded (Shakya,..., Hughes. 2021, Elife). Is it possible that the accumulation of the precursor is the result of a reduced degradation of pre-Ilv2 in the nucleus rather than an impaired mitochondrial import? Since a number of components of the ubiquitin-proteasome system were identified with Sit4 in the same screen, a role of Sit4 in proteasomal degradation seems possible. This should be tested.
  
  We thank the reviewer for pointing out this potential caveat with our Ilv2-FLAG reporter. With limited search and tests, we could not find another reporter that behaves like Ilv2-FLAG. Ilv2-FLAG is a well-suited reporter for this study because, in wild-type cells, Ilv2-FLAG is not 100% imported. Therefore, we could demonstrate that mitochondria with higher MMP import more efficiently. Unfortunately, all of the other mitochondrial proteins that we tested were imported efficiently in wild-type cells. To identify other suitable mitochondrial proteins that behave like Ilv2-FLAG, we would need to conduct a more comprehensive screen.
  
  To address the concern of the involvement of protein degradation in obscuring the interpretation of Ilv2-FLAG import, we performed two experiments. First, we measured the proteasomal activity in wild-type and our mutants using a commercial kit (Cayman). We did not observe a statistically significant difference in 20S proteasomal activity between wild-type and sit4Δ cells.
  
  In the second experiment, we reduced the MMP of sit4Δ cells using CCCP treatment and measured the Ilv2-FLAG import. We first treated sit4Δ cells with different doses of CCCP for six hours and measured their MMP. sit4Δ cells treated with 75 µM CCCP had comparable MMP to wild-type cells. When we treated sit4Δ cells with higher concentrations of CCCP, most of the cells did not survive after six hours. Next, we performed the Ilv2-FLAG import assay. We observed a similar level of unimported Ilv2-FLAG (marked with *) in sit4Δ cells treated with 75 µM CCCP. This result confirms that sit4Δ cells have a similar Ilv2-FLAG turnover mechanism and activity to wild-type cells, because when we lower the MMP in the sit4Δ background we observe a similar level of unimported Ilv2-FLAG. We thus feel confident in concluding that the Ilv2-FLAG import results are indeed an accurate proxy for MMP level. These data are now included as Figure 1—figure supplement 1H-J in the manuscript.
  
  Author response image 1.
  
  Reviewer #2 (Public Review):
  
  This study reports interesting findings on the influence of a conserved phosphatase on mitochondrial biogenesis and function. In its absence, many nucleus-encoded mitochondrial proteins, including those involved in ATP generation, are expressed much better than in normal cells. In addition to providing a better understanding of the mechanisms that regulate mitochondrial function, this work may help develop therapeutic strategies for diseases caused by mitochondrial dysfunction. However, there are a number of issues that need clarification.
  
  1) The rationale of the screening assay to identify genes required for the gene expression modifications observed in the mct1 mutant is not clear. Indeed, after crossing with the gene deletion library, the cells become heterozygous for the mct1 deletion and should no longer be deficient in mtFAS. Please clarify this and, if needed, adjust figure S1D to indicate that the mated cells are heterozygous for the mct1 and xxx mutations.
  
  We updated the methods section and the graphic for the genetic screen to clarify these points within the SGA workflow overview. After we created the heterozygote by mating mct1Δ cells with the individual KO cells in the collection, these diploids underwent sporulation and selection for the desired double KO haploid. As a result, the luciferase assay was performed in haploid cells with MCT1 and one additional non-essential gene deleted.
  
  2) The tests shown in Fig. S1E should be repeated on individual subclones (at least 100) obtained after plating for single colonies a glucose culture of mct1 mutant, to determine the proportion of cells with functional (rho+) mtDNA in the mct1 glucose and raffinose cultures. With for instance a 50% proportion of rho- cells, this could substantially influence the results of the analyses made with these cells (including those aiming to evaluate the MMP).
  
  We agree that this would provide a more confident estimate for population-level characterization of these colonies. It is important to note that we randomly chose 10 individual subclones, and 100% of these colonies were verified to be rho+. This suggests the population has functional mtDNA, and we thus felt confident in the identity of our populations.
  
  3) The mitochondria area in mct1 cells (Fig. S1G) does not seem to be consistent with the tests in Fig. 1C that indicate a diminished mitochondrial content in mct1 cells vs wild-type yeast. A better estimate (by WB for instance) of the mitochondrial content in the analyzed strains would make it possible to better evaluate MMP changes monitored with Mitotracker, since the amount of mitochondria in cells correlates with the intensity of the fluorescence signal.
  
  As this reviewer pointed out, we quantified mitochondrial area based on Tom70-GFP signal. This measurement is quantified as mitochondrial area over cell size. Cell size is an important parameter when measuring organelle size, as most organelles scale up and down with cell size. mct1Δ cells generally have a smaller cell size than WT cells. Therefore, the mitochondrial area of mct1Δ cells was not significantly different from that of WT cells when scaled to cell size. We believe this is the best method to compare mitochondrial area. As for quantifying MMP from these microscopy images, we measured the average MitoTracker Red fluorescence intensity of each mitochondrion as defined by Tom70-GFP. This method inherently normalizes out the influence of mitochondrial area when quantifying MMP.
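
  The two quantities described in this response can be sketched on toy pixel grids. This is an illustration only, not the authors' image-analysis pipeline: the function names are hypothetical, masks are plain nested lists of 0/1 values, and segmentation (deriving the mitochondrial mask from Tom70-GFP and the cell outline) is assumed to have been done already.

```python
def mitochondrial_area_fraction(mito_mask, cell_mask):
    """Mitochondrial area divided by cell area, both counted in pixels."""
    mito_pixels = sum(
        1
        for mito_row, cell_row in zip(mito_mask, cell_mask)
        for m, c in zip(mito_row, cell_row)
        if m and c
    )
    cell_pixels = sum(1 for row in cell_mask for c in row if c)
    return mito_pixels / cell_pixels


def mean_intensity_in_mask(intensity, mask):
    """Average dye intensity over the pixels selected by the mask.

    Dividing by the number of masked pixels means a larger mitochondrial
    area does not, by itself, raise the reported value.
    """
    values = [
        value
        for intensity_row, mask_row in zip(intensity, mask)
        for value, m in zip(intensity_row, mask_row)
        if m
    ]
    return sum(values) / len(values)
```

  Because the second function averages rather than sums the signal, doubling the masked area at the same per-pixel brightness leaves the result unchanged, which is the sense in which a per-mitochondrion mean is independent of mitochondrial content.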
  
  4) Page 12: "These data demonstrate that loss of SIT4 results in a mitochondrial phenotype suggestive of an enhanced energetic state: higher membrane potential, hyper-tubulated morphology and more effective protein import." Furthermore, the sit4 mutant shows higher levels of OXPHOS complexes compared to WT yeast.
  
  Despite these beneficial effects on mitochondria, the sit4 deletion strain fails to grow on respiratory substrates. It would be good to know whether the authors have some explanation for this apparent contradiction.
  
  We agree that this was initially puzzling. We provide a more complete explanation above (see comments to reviewer #1 - major concern #1). Briefly, the growth deficiency of sit4Δ cells in non-fermentable media was reported and studied by multiple groups (Arndt et al., 1989; Dimmer et al., 2002; Jablonka et al., 2006). These studies seem to indicate that sit4Δ cells contain more ETC complexes and have a higher OCR but cannot respire on a nonfermentable carbon source. However, we do not think there is yet a clear explanation for this phenotype. One interesting observation reported is that the addition of aspartate partly restores growth on ethanol (Jablonka et al., 2006). One early study speculates that this defect could be related to the cAMP-PKA pathway. Sutton et al. pointed out genetic interactions between sit4 and multiple genes in cAMP-PKA (Sutton et al., 1991). In addition, sit4Δ cells have phenotypes similar to those of cAMP-PKA null mutants, such as glycogen accumulation, caffeine resistance, and failure to grow on non-fermentable media. However, to keep this manuscript succinct, we opted to stay focused on MMP.
  
  Reviewer #3 (Public Review):
  
  In this study, the authors investigate the genetic and environmental causes of elevated Mitochondrial Membrane Potential (MMP) in yeast, and also some physiological effects correlated with increased MMP.
  
  The study begins with a reanalysis of transcriptional data from a yeast mutant lacking the gene MCT1, whose deletion has been shown to cause defects in mitochondrial fatty acid synthesis. The authors note that in raffinose, mct1del cells, unlike WT cells, fail to induce expression of many genes that code for subunits of the Electron Transport Chain (ETC) and ATP synthase. The deletion of MCT1 also causes induction of genes involved in acetyl-CoA production after exposure to raffinose. The authors therefore conduct a screen to identify mutants that suppress the induction of one of these acetyl-CoA genes, Cit2. They then validate the hits from this screen to see which of their suppressor mutants also reduce expression of four other genes induced in a mct1del strain. This yielded 17 genes whose deletion abolished induction of all 5 genes tested in an mct1del background during growth on raffinose.
  
  The authors chose to focus on one of these hits, the gene coding for the phosphatase SIT4 (related to human PP6) which also caused an increase in expression of two respiratory chain genes. The authors then investigated MMP and mitochondrial morphology in strains containing SIT4 and MCT1 deletions and surprisingly saw that sit4del cells had highly elevated MMP, more reticular mitochondria, and were able to fully import the acetolactate synthase protein Ilv2p and form ETC and ATP synthase complexes, even in cells with an mct1del background, rescuing the low MMP, fragmented mitochondria, low import of Ilv2 and an inability to form ETC and ATP synthase complexes phenotypes of the mct1del strain. Surprisingly, the authors find that even though MMP is high and ETC subunits are present in the sit4del mct1del double deletion strain, that strain has low oxygen consumption and cannot grow under respiratory conditions, indicating that the elevated MMP cannot come from fully functional ETC subunits. The authors also observe that deleting key subunits of ETC complex III (QCR2) and IV (COX5) strongly reduced the MMP of the sit4del mutant, which would suggest that the majority of the increase in MMP of the sit4del mutant was dependant on a partially functional ETC. The authors note that there was still an increase in MMP in the qcr2del sit4del and cox4del sit4del strains relative to qcr2del and cox4del strains indicating that some part of the increase in MMP was not dependent on the ETC.
  
  The authors dismiss the possibility that the increase in MMP could have been through the reversal of ATP synthase because they observe that inhibition of ATP synthase with oligomycin led to an increase of MMP in sit4del cells, indicating that ATP synthase is operating in a forward direction in sit4del cells.
  
  Noting that genes for phosphate starvation are induced in sit4del cells, the authors investigate the effects of phosphate starvation on MMP. They found that phosphate starvation caused an increase in MMP and increased Ilv2p import even in the absence of a mitochondrial genome. They find that inhibition of the ADP/ATP carrier (AAC) with bongkrekic acid (BKA) abolishes the increase of MMP in response to phosphate starvation. They speculate that phosphate starvation causes an increase in MMP through the import and conversion of ATP to ADP and subsequent pumping of ADP and inorganic phosphate out of the mitochondria.
  
  They further show that MMP is also increased when the cyclin dependent kinase PHO85 which plays a role in phosphate signaling is deleted and argue that this indicates that it is not a decrease in phosphate which causes the increase in MMP under phosphate starvation, but rather the perception of a decrease in phosphate as signalled through PHO85. Unlike in the case of SIT4 deletion, the increase in MMP caused by the deletion of pho85 is abolished when MCT1 is deleted.
  
  Finally they show an increase in MMP in immortalized human cell lines following phosphate starvation and treatment with the phosphate transporter inhibitor phosphonoformic acid (PFA). They also show an increase in MMP in primary hepatocytes and in midgut cells of flies treated with PFA.
  
  The link between phosphate starvation and elevated MMP is an important and novel finding and the evidence is clear and compelling. Based on their experiments in various mammalian contexts, this link appears likely to be generalizable, and they propose and begin to test an interesting hypothesis for how MMP might occur in response to phosphate starvation in the absence of the Electron Transport Chain.
  
  The link between phosphate starvation and deletion of the conserved phosphatase SIT4 is also interesting and important, and while the authors' experiments and analysis suggest some connection between the two observations, that connection is still unclear.
  
  Major points
  
  Mitotracker is a great fluorescent dye, but it measures membrane potential only indirectly. There is a danger that when cells change growth rates or ion concentrations, or when the pH changes, all MMP-indicating dyes change in fluorescence: their signal is confounded. A change in phosphate levels can possibly do both, altering pH and ion concentrations. Because all conclusions of the manuscript are based on a change in MMP, it would be a sensible precaution to use a dye-independent measure of membrane potential and confirm at least some key results.
  
  Mitochondrial MMP does strongly influence amino acid metabolism, and indeed the SIT4 knockout has a quite striking amino acid profile, with histidine, lysine, arginine, tyrosine being increased in concentration. http://ralser.charite.de/metabogenecards/Chr_04/YDL047W.html Could this amino acid profile support the conclusions of the authors? At least lysine and arginine are down in petites due to a lack of membrane potential and iron sulfur cluster export, and here they are up. Along these lines, according to the same data resource, the knock-outs CSR2, ASF1, SSN8, YLR0358 and MRPL25 share the same metabolic profile. Due to limited time I did not re-analyse the data provided by the authors, but it would be worth checking if any of these genes did come up in the screens of the authors.
  
  We tested the mutants within the same cluster as SIT4 shown in this paper from the deletion collection and measured their MMP. yrl358cΔ cells have similar high MMP as observed in sit4Δ cells. However, this gene has a yet undefined function. Beyond YRL358C, we did not observe similar MMP increases in other gene deletions from this panel, which does not support the notion that amino acids such as histidine, lysine, arginine, or tyrosine play a determining effect in driving MMP.
  
  The media condition and strain used in the suggested paper is very different from what we used in our study. Instead of growing prototrophic cells in minimal media without any amino acids, we used auxotrophic yeast strains and grew them in media containing complete amino acids. So far, none of the other defects or signaling associated with SIT4 deletion could influence MMP as much as the phosphate signaling. We interpret these data to support the hypothesis that the MMP observation in sit4Δ cells is connected with the phosphate signaling as illustrated by the second half of the story in our manuscript.
  
  Author response image 2.
  
  One important claim in the manuscript attempts to explain a mechanism for the MMP increase in response to phosphate starvation which is independent of the ETC and ATP synthase.
  
  It seems to me the only direct evidence to support this claim is that inhibition of the AAC with BKA stops the increase of mitotracker fluorescence in response to phosphate starvation in both WT and rho0 cells (Figs 4B and 4C). It would strengthen the paper if the authors could provide some orthogonal evidence.
  
  This is a similar comment as raised by reviewer #1 - major concern #3. We refer the reviewer to our discussion and the new data above. Briefly, we do not think F1 subunit is responsible for the ATP hydrolysis activity to generate MMP in phosphate depleted situation. We believe there are additional ATPase(s) in the mitochondrial matrix that can be utilized to couple to ADP/ATP carrier for MMP generation during phosphate starvation. However, we have not identified the relevant ATPase(s) at this point, and it is likely that multiple ATPases could contribute to this activity.
  
  Introduction/Discussion: The authors might want to make the reader of the article aware that the 'reversal' of the ATP synthase directionality, i.e. ATP hydrolysis by the ATP synthase as a mechanism to create a membrane potential (in petites), has always been a provocative idea, but one that thus far could never be fully substantiated. Indeed, some people who are very familiar with the topic are skeptical that this happens. For instance, Vowinckel et al 2021 (PMID: 34799698) measured precise carbon balances for petite cells and found no evidence for a futile cycle: petites grow slower, but accumulate the same biomass from glucose as petites that re-evolve a fast growth rate. Perhaps the manuscript could be updated accordingly.
  
  We thank the reviewer for pointing out this additional relevant study. We have rephrased the referenced sentence in the introduction. The MMP generation in phosphate starvation is independent of the F1 portion of ATP synthase. Therefore, our data neither support nor refute either of these arguments.
  
  In the introduction and conclusion there is discussion of MMP set points. In particular the authors state:
  
  "Critically, we find that cells often prioritize this MMP setpoint over other bioenergetic priorities, even in challenging environments, suggesting an important evolutionary benefit."
  
  This does not seem to be consistent with the central finding of the manuscript that MMP changes under phosphate starvation. MMP doesn't seem so much to have a 'set point' but rather be an important physiological variable that reacts to stimuli such as phosphate starvation.
  
  The reviewer raises a rational alternative hypothesis to the one that we have proposed. In reality, both of these are complete speculations to explain the data and we can’t think of any way to test the evolutionary basis for the mechanisms that we describe. We recognize that untested/untestable speculative arguments have limitations and there are viable alternative hypotheses. We have softened our language to ensure that it is clear that this is only a speculation.
  
  The authors suggest that deletion of Pho85 causes an increase in MMP because of cellular signaling. However, they also state in the conclusion:
  
  "Unlike phosphate starvation, the pho85Δ mutant has elevated intracellular phosphate concentrations. This suggests that the phosphate effect on MMP is likely to be elicited by cellular signaling downstream of phosphate sensing rather than some direct effect of environmental depletion of phosphate on mitochondrial energetics."
  
  The authors should cite the study that shows deletion of PHO85 causes increased intracellular phosphate concentrations. It also seems possible that the 'cellular signaling' that causes the increase in MMP could be a result of this increase in intracellular phosphate concentrations, which could constitute a direct effect of an environmental overload of phosphate on mitochondrial energetics.
  
  We now cite the literature that shows higher intracellular phosphate in pho85Δ cells (Gupta et al., 2019; Liu et al., 2017). Depleting phosphate in the media drastically reduced intracellular phosphate concentration, which is the opposite of the situation in pho85Δ cells. Nevertheless, we observed higher MMP in both situations. We concluded from these two observations that the increase in MMP is a response to the signaling activated by phosphate depletion rather than to the intracellular phosphate abundance.
  
  Related to this point, in the conclusion, the authors state:
  
  "We now show that intracellular signaling can lead to an increased MMP even beyond the wild-type level in the absence of mitochondrial genome."
  
  In sum, the data shows that signaling is important here- but signaling alone is only the message - not the biophysical process that creates a membrane potential. The authors then could revise this slightly.
  
  We have rephrased this sentence as suggested, which now reads “We now show that intracellular signaling triggers a process that can lead to an increased MMP even beyond the wild-type level in the absence of mitochondrial genome”.
  
  The authors state in the conclusion that
  
  "We first made the observation that deletion of the SIT4 gene, which encodes the yeast homologue of the mammalian PP6 protein phosphatase, normalized many of the defects caused by loss of mtFAS, including gene expression programs, ETC complex assembly, mitochondrial morphology, and especially MMP (Fig. 1)"
  
  The data shown, though, indicate that, in terms of MMP, deletion of SIT4 causes a huge increase (and a departure from normality) whether or not mct1 is present (Fig 1D).
  
  We changed the word “normalized” to “reversed”. In the discussion section, we also emphasized that many of these increases are independent of mitochondrial dysfunction induced by loss of mtFAS.
  
  The language "SIT4 is required for both the positive and negative transcriptional regulation elicited by mitochondrial dysfunction" feels strong. SIT4 seems to influence positive transcriptional regulation in response to mitochondrial dysfunction caused by MCT1 deletion (but may not be the only thing as there appears to be an increase in CIT2 expression in a sit4del background following a further deletion of MCT1). In terms of negative regulation, SIT4 deletion clearly affects the baseline, but MCT1 deletion still causes down regulation of both examples shown in Fig 1B, showing that negative transcriptional regulation can still occur in the absence of SIT4. The authors might consider showing fold change of expression as they do in later figures (Figs 4B and C) to help the reader evaluate the quantitative changes they demonstrate.
  
  We now display the fold change as suggested. This sentence now reads “These data suggest that SIT4 positively and negatively influences transcriptional regulation elicited by mitochondrial dysfunction”.
  
  The authors induce phosphate starvation by adding increasing amounts of potassium phosphate monobasic at a pH of 4.1 to phosphate dropout media supplemented with potassium. The authors did well to avoid confounding effects of removing potassium. The final pH of YNB is typically around 5.2. Is it possible that the authors are confounding a change in pH with phosphate starvation? One would expect the media in the phosphate starvation condition to have a higher pH than the phosphate replacement or control media. Is a change in pH possibly a confounding factor when interpreting phosphate starvation? Perhaps the authors could quantify the pH of the media they use for the experiment to understand how much of a factor that could be. One needs to be careful with MitoTracker and any other fluorescent dye when pH changes. Albeit having constraints of its own, MitoLoc, as a protein rather than a small-molecule marker of MMP, might be a good complement.
  
  We followed the protocol used by many other studies that depleted phosphate in the media. The reason we and others adjusted the media without inorganic phosphate to a pH of 4.1 is that this is the pH of phosphate monobasic. From there, we could add phosphate monobasic to create +Pi media without changing the media pH. Therefore, media containing different concentrations of phosphate all have the exact same pH. We now emphasize in the manuscript that all media containing different levels of inorganic phosphate have the same pH, to eliminate this concern (see page 18).
  
  Even though all media have the same pH, we also provided complementary data using a parallel approach to measure the MMP by assessing mitochondrial protein import, as demonstrated previously with Ilv2-FLAG, which shares the same principle as MitoLoc.
  
  Reference
  
  Arndt, K. T., Styles, C. A., & Fink, G. R. (1989). A suppressor of a HIS4 transcriptional defect encodes a protein with homology to the catalytic subunit of protein phosphatases. Cell, 56(4), 527–537. https://doi.org/10.1016/0092-8674(89)90576-X
  
  Dimmer, K. S., Fritz, S., Fuchs, F., Messerschmitt, M., Weinbach, N., Neupert, W., & Westermann, B. (2002). Genetic basis of mitochondrial function and morphology in Saccharomyces cerevisiae. Molecular Biology of the Cell, 13(3), 847–853. https://doi.org/10.1091/mbc.01-12-0588
  
  Gupta, R., Walvekar, A. S., Liang, S., Rashida, Z., Shah, P., & Laxman, S. (2019). A tRNA modification balances carbon and nitrogen metabolism by regulating phosphate homeostasis. eLife, 8, e44795. https://doi.org/10.7554/eLife.44795
  
  Jablonka, W., Guzmán, S., Ramírez, J., & Montero-Lomelí, M. (2006). Deviation of carbohydrate metabolism by the SIT4 phosphatase in Saccharomyces cerevisiae. Biochimica et Biophysica Acta (BBA) - General Subjects, 1760(8), 1281–1291. https://doi.org/10.1016/j.bbagen.2006.02.014
  
  Liu, N.-N., Flanagan, P. R., Zeng, J., Jani, N. M., Cardenas, M. E., Moran, G. P., & Köhler, J. R. (2017). Phosphate is the third nutrient monitored by TOR in Candida albicans and provides a target for fungal-specific indirect TOR inhibition. Proceedings of the National Academy of Sciences, 114(24), 6346–6351. https://doi.org/10.1073/pnas.1617799114
  
  Sutton, A., Immanuel, D., & Arndt, K. T. (1991). The SIT4 protein phosphatase functions in late G1 for progression into S phase. Molecular and Cellular Biology, 11(4), 2133–2148.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.10.25.513802v1
www.biorxiv.org

New submission 09/01/2023, 11:51:40

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This study provides further detailed analysis of recently published Fly Atlas datasets supplemented with newly generated single cell RNA-seq data obtained from 6,000 testis cells. Using these data, the authors define 43 germline cell clusters and 22 somatic cell clusters. This work confirms and extends previous observations regarding changing gene expression programs through the course of germ cell and somatic cell differentiation.
  
  This study makes several interesting observations that will be of interest to the field. For example, the authors find that spermatocytes exhibit sex chromosome specific changes in gene expression. In addition, comparisons between the single nucleus and single cell data reveal differences in active transcription versus global mRNA levels. For example, previous results showed that (1) several mRNAs remain high in spermatids long after they are actively transcribed in spermatocytes and (2) defined a set of post-meiotic transcripts. The analysis presented here shows that these patterns of mRNA expression are shared by hundreds of genes in the developing germline. Moreover, variable patterns between the sn- and sc-RNAseq datasets reveal considerable complexity in the post-transcriptional regulation of gene expression.
  
  Overall, this paper represents a significant contribution to the field. These findings will be of broad interest to developmental biologists and will establish an important foundation for future studies. However, several points should be addressed.
  
  In figure 1, I am struck by the widespread expression of vasa outside of the germ cell lineage. Do the authors have a technical or biological explanation for this observation? This point should be addressed in the paper with new experiments or further explanation in the text.
  
  Thank you for pointing this out. We found that our single cell dataset shows a similar (low) level of vasa expression outside the germline, suggesting that this is not due to single nucleus versus single cell RNA-seq (cluster 1, red in the left-hand UMAP).
  
  Analyzing the single nucleus RNA-seq in more detail revealed that, compared to the germline, both the fraction of cells in a cluster expressing vasa and the level at which they express it are very low. This analysis is included in a new Figure 1 – figure supplement 1. It is likely that much of this is due to a technical artifact, such as ambient RNA. Finally, we note in the resubmission that vasa is in fact expressed in embryonic somatic cells, and thus some of the vasa expression we observe may be real (Renault. Biol Open 2012; https://doi.org/10.1242/bio.20121909).
  
  Plots in the original submission drew undue attention to the few somatic cells that exhibited vasa signal, due to the fact that expressing cell points were forced to the front of the plot. Given our new analysis reporting the low levels and fraction of cells exhibiting vasa expression (Figure 1 – figure supplement 1), we have modified the panels of Figure 1, changing point size to more faithfully reflect the small proportion of somatic cells with some vasa expression.
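The per-cluster summary described above (the fraction of cells in a cluster with any vasa signal, and the level among those cells) can be sketched as follows. This is only a minimal illustration, not the authors' pipeline: the function name `summarize_expression`, the input layout, and the toy counts are all assumptions made for the example.

```python
from collections import defaultdict


def summarize_expression(cells, gene):
    """Return {cluster: (fraction_expressing, mean_level_among_expressing)}.

    `cells` is an iterable of (cluster_label, {gene_name: count}) pairs.
    This layout is a hypothetical stand-in for a real expression matrix.
    """
    totals = defaultdict(lambda: {"n": 0, "n_expr": 0, "sum_expr": 0.0})
    for cluster, counts in cells:
        value = counts.get(gene, 0)
        stats = totals[cluster]
        stats["n"] += 1
        if value > 0:  # a cell "expresses" the gene if any count was detected
            stats["n_expr"] += 1
            stats["sum_expr"] += value

    summary = {}
    for cluster, stats in totals.items():
        fraction = stats["n_expr"] / stats["n"]
        mean_level = stats["sum_expr"] / stats["n_expr"] if stats["n_expr"] else 0.0
        summary[cluster] = (fraction, mean_level)
    return summary


# Toy data (invented): a germline cluster with uniformly high signal and a
# somatic cluster where a single cell carries a low count.
toy_cells = [
    ("germline", {"vas": 12}),
    ("germline", {"vas": 8}),
    ("germline", {"vas": 10}),
    ("muscle", {"vas": 1}),
    ("muscle", {}),
    ("muscle", {}),
    ("muscle", {}),
]
summary = summarize_expression(toy_cells, "vas")
for cluster, (fraction, level) in summary.items():
    print(cluster, fraction, level)
```

Reporting the two numbers separately matters because a plot that draws expressing cells on top can make a small fraction of low-level cells look widespread.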
  
  The proposed bifurcation of the cyst cells into head and tail populations is interesting and worth further exploration/validation. While the presented in situ hybridization for Nep4, geko, and shg hint at differences between these populations, double fluorescent in situs or the use of additional markers would help make this point clearer. Higher magnification images would also help in this regard.
  
  We thank the reviewer for their suggestions on clarifying the differences between HCC and TCC populations. As suggested, we have repeated the FISH experiments of Nep4 and geko with higher resolution, and included the additional marker Coracle that demarcates the junction between HCC and TCC (Figure 6O,Q,S,T). These panels replaced previous Nep4 and geko FISH images (see previous Figure 6Q,U,U’). FISH for Nep4 validated the split, and the enrichment of geko strongly suggests that this arm represents one cell type (HCCs). We have not yet identified a gene reciprocally enriched to the other arm. Therefore, in the revised submission, we call the assignment of TCC identity, and to a lesser extent, HCC identity ‘tentative’, but point out that genes predicted to be enriched to one or the other arm represent fertile candidates for the field to test.
  
  Reviewer #2 (Public Review):
  
  In this manuscript the authors explain in greater detail a recent testis snRNAseq dataset that many of these authors published earlier this year as part of the Fly Cell Atlas (FCA) Li et al. Science 2022. As part of the current effort additional collaborators were recruited and about 6,000 whole cell scRNAseq cells were added to the previous 42,000 nuclei dataset. The authors now describe 65 snRNAseq clusters, each representing potential cell types or cell states, including 43 germline clusters and 22 somatic clusters. The authors state that this analysis confirms and extends previous knowledge of the testis in several important areas.
  
  “However, in areas where testis biology is well studied, such as the development of germ cells from GSC to the onset of spermatocyte differentiation, the resolution seems less than current knowledge by considerable margins. No clusters correspond to GSCs, or specific mitotic spermatogonia, and even the major stages of meiotic prophase are not resolved. Instead, the transitions between one state and the next are broad and almost continuous, which could be an intrinsic characteristic of the testis compared to other tissues, of snRNAseq compared to scRNAseq, or of the particular experimental and software analysis choices that were used in this study.”
  
  Note that the referee raises the same issue later in their review. To respond succinctly, we have placed the relevant sentence from a later portion of this referee’s comment here:
  
  “Support for the view that the problems are mostly technical, rather than a reflection of testis biology, comes from studies of scRNAseq in the mouse, where it has been possible to resolve a stem cell cluster, and germ cell pathways that follow known germ cell differentiation trajectories with much more discrete steps than were reported here (for example, Cao et al. 2021 cited by the authors).”
  
  Respectfully, we have a different interpretation of other work as cited by this referee. Our data, as well as that from others, supports the notion that transitions are generally broad and continuous and are indeed a feature of testis biology. As we report here, data from both single cell and single nucleus RNAseq exhibit transitions from one cluster to the next. Thus, this feature cannot be due to the choice of method (single cell versus single nucleus).
  
  In fact, prior scRNA-seq results on systems containing a continuously renewing cell population, such as is the case in the testis, do indeed exhibit a contiguous trajectory rather than discrete, well-separated cell states in gene expression space (that is, in a UMAP presentation). For example, this is the case from single-cell or single-nucleus sequencing from spermatogenesis in mouse (Cao et al 2021), human (Sohni et al 2019), and zebrafish (Qian et al 2022).
  
  Along differentiation trajectories in these tissues, successive clusters are defined by their aggregate transcript repertoire. Indeed, differentially-expressed genes can be identified for clusters, with expression enriched in a given cluster. However, expression is rarely restricted to a cluster. For instance, Cao et al. subcluster spermatogonia into four subgroups, termed SPG1-4. They state clearly that these SPG1-4 “follow a continuous differentiation trajectory,” as can be inferred by marker expression across cells in this lineage. Similar to our findings, while the spermatogonia can fall into discrete clusters, gene expression patterns are contiguous. For example, the “undifferentiated” marker used in Cao et al, Crabp1, clearly shows expression in SPG1-3, annotated as spermatogonial stem cells, undifferentiated spermatogonia, and early differentiated spermatogonia, respectively. Likewise, markers for the “SPG3” state spermatogonia have detectable expression in SPG2 and SPG4, and likewise for markers of the “SPG4” state (with expression found also in SPG3).
  
  An analogous study of human spermatogenesis arrives at a similar conclusion. In that work, although clusters are named as “spermatogonial stem cell (SSC)”, the authors are careful to specifically point out that, “…while we refer to the SSC-1 and SSC-2 cell clusters as ‘‘SSCs,’’ scRNA-seq is not a functional assay and thus we do not know the percentage of cells in these clusters with SSC activity. These subsets almost certainly contain other A-SPG cells [A type spermatogonia], including SPG progenitors that have committed to differentiate.” (Sohni et al 2019)
  
  Thus, the work in several disparate systems, all involving renewing lineages, finds that discrete clusters, such as a “stem cell cluster” are not identified. In the Drosophila testis, germline differentiation flows in a continuous-like manner similar to spermatogenesis in several other organisms studied by scRNA-seq, and our finding is not a function of the methodology, but rather a facet of the biology of the organ.
  
  Operating in parallel with continuous differentiation, we did find evidence of, and extensively discussed in concert with Figure 4, huge and dramatic shifts in transcriptional state in spermatocytes compared to spermatogonia, in early spermatids compared to spermatocytes, and in late spermatid elongation. Lastly, as we describe further below, new data in this resubmission identify four distinct genes with stage-selective expression as predicted by our analysis (new Figure 2 - figure supplement 1), illustrating the utility of our study for the field to find new markers and new genes to test for function.
  
  A goal of the study was to identify new rare cell types, and the hub, a small apical somatic cell region, was mentioned as a target region, since it regulates both stem cell populations, GSCs and CySCs, is capable of regeneration, and has other fascinating properties. However, the analysis of the hub cluster revealed more problems of specificity. 41 of 120 cells in the cluster were discordant with the remaining 79, which did express markers consistent with previous studies. Why these cells co-clustered was not explained and one can only presume that similar problems may be found in other clusters.
  
  Our writing seems not to have been clear enough on this point and we thank the reviewer. We have revised the section. In addition, we have added new data (Figure 7 - figure supplement 2). We had already stated that only 79 of these 120 nuclei were near to each other in 2D UMAP space, while other members of original cluster 90 were dispersed. Thus the 79 hub nuclei in fact clustered together on the UMAP. Other nuclei that mapped at dispersed positions were initially ‘called’ as part of this cluster in the original Fly Cell Atlas (FCA) paper (Li et al., 2022), making it obvious that a correction to that assignment was necessary, which we carried out. To our eye, no other called cluster was represented by such dispersed groupings. For the hub, we definitively established the 79 nuclei to represent hub cells by marker gene analysis, including the identification of a new marker, tup, that was included in the 79 annotated hub nuclei but excluded from the 41 other nuclei (Figure 7). In this resubmission, to independently verify the relationship of the 79 nuclei to each other, we subjected the 120 nuclei from the original cluster 90 defined by the FCA study to hierarchical clustering using only genes that are highly expressed and variable in these nuclei (Figure 7 - figure supplement 2). This computationally distinct approach strongly supported our identification of the 79 definitive hub nuclei.
  
  Indeed, many other indications of specificity issues were described, including contamination of fat body with spermatocytes, the expression of germline genes such as Vasa in many somatic cell clusters like muscle, hemocytes, and male gonad epithelium, and the promiscuous expression of many genes, including 25% of somatic-specific transcription factors, in mid to late spermatocytes. The expression of only one such gene, Hml, was documented in tissue, and the authors, for reasons not explained, did not attempt to decisively address whether this phenomenon is biologically meaningful.
  
  We discussed the question of vasa expression in somatic clusters in some detail above, in response to referee #1, and included new analysis in the resubmission.
  
  With respect to the observation of ‘somatic gene’ expression in spermatocytes, we are also intrigued. We do not believe this is due to “contamination,” but rather a spermatocyte expression program that includes expression of somatic genes. First, these somatic markers were not observed in other germline clusters, which would be expected if this was due to general transcript contamination. Second, we observed expression of somatic markers in spermatocytes independently in the single-cell and single-nucleus data, making it unlikely to be an artifact of preparation of isolated nuclei. Finally, in the resubmission, in addition to Hml, we validated ‘somatic’ marker expression in spermatocytes by FISH of a somatic, tail cyst cell marker, Vsx1. Vsx1 is predicted to be expressed at low levels in spermatocytes in our dataset and is clearly visible in germline cells by FISH (Figure 3 – figure supplement 2G,H). We also refer the referee to Figure 6K, where the mRNA for the somatic cyst cell marker eya was observed by FISH at low levels in spermatocytes.
  
  A truly interesting question mentioned by the authors is why the testis consistently ranks near the top of all tissues in the complexity of its gene expression. In the Li et al. (2022) paper it was suggested that this is due to an inherently greater biological complexity of spermiogenesis than other tissues. It seems difficult to independently and rationally determine "biological complexity," but if a conserved characteristic of testis was to promiscuously express a wide range of (random?) genes, something not out of the question, this would be highly relevant and important.
  
  We agree that the massive transcriptional program found in spermatocytes is, indeed, truly interesting. There are many speculations as to why spermatocytes are so highly transcriptional, including the possibility of “transcriptional scanning” (e.g., Xia et al. 2020) regulating the evolution of new genes. Testing such models is beyond the scope of this paper. However, one must also keep in mind that spermatogenesis involves one of the most dramatic cellular transformations in biology, where cellular components spanning from nuclei to chromatin to Golgi, cell cycle, extensive membrane addition, changes in cell shape, and building of a complex swimming organelle all must occur and be temporally coordinated. Small wonder that many genes must be expressed to accomplish these tasks.
  
  Unfortunately, the most likely problems are simply technical. Drosophila cells are small and difficult to separate as intact cells. The use of nuclei was meant to overcome this inherent problem, but the effectiveness of this new approach is not yet well-documented. Support for the view that the problems are mostly technical, rather than a reflection of testis biology, comes from studies of scRNAseq in the mouse, where it has been possible to resolve a stem cell cluster, and germ cell pathways that follow known germ cell differentiation trajectories with much more discrete steps than were reported here (for example, Cao et al. 2021 cited by the authors).
  
  We respectfully disagree with the referee about this collection of statements. First, the use of snRNASeq has been extensively characterized and compared to scRNA-seq in brain tissue by McLaughlin et al., 2021 (cited in the original submission) and was shown to be effective (McLaughlin, et al. eLife 2021;10:e63856. DOI: https://doi.org/10.7554/eLife.63856). snRNA-seq has a distinct advantage when dealing with long, thin cells, such as neurons or cyst cells (as featured in this work), where cytoplasm can easily be sheared off during cell isolation. Second, in a previous portion of our response to this referee, we discussed how our interpretation of Cao et al., 2021 differs from that expressed by this referee. Lastly, as requested in ‘Essential revision’ 2, we adjusted clustering methods and selected four genes, two predicted to be markers for early stage germline cells, and two for mid-spermatocyte stage development. FISH analysis demonstrates that expression for each of these maps to the appropriate stages (new Figure 2 - figure supplement 1). This confirms that the datasets we present in this manuscript can be mined to identify unique, diagnostic markers for various stages.
  
  The conclusions that were made by the authors seem to either be facts that are already well known, such as the problem that transcriptional changes in spermatocytes will be obscured by the large stored mRNA pool, or promises of future utility. For example, "mining the snRNA-seq data for changes in gene expression as one cluster advances to the next should identify new sub-stage-specific markers." If worthwhile new markers could be identified from these data, surely this could have been accomplished and presented in a supplemental Table. As it currently stands, the manuscript presents the dataset including a fair description of its current limitations, but very little else of novel biological interest is to be found.
  
  “In sum, this project represents an extremely worthwhile undertaking that will eventually pay off. However, some currently unappreciated technical issues, in cell/nuclear isolation, and certainly in the bioinformatic programs and procedures used that mis-clustered many different cells, has created the current difficulties.
  
  Most scRNAseq software is written to meet the needs of mammalian researchers working with cultured cells, cellular giants compared to Drosophila and of generally similar size. Such software may not be ideal for much smaller cells, but which also include the much wider variation in cell size, properties and biological mechanisms that exist in the world of tissues.”
  
  We appreciate the referee’s acknowledgement that this ‘undertaking will eventually pay off’. It was not our intention to address ‘function’ for this study, but rather to make the system accessible to the broadest community possible. We are uncertain if there is any remaining reservation held by this referee. A brief summary of what we covered in the manuscript may help allay any residual concern. Obviously, study of the Drosophila testis and spermatogenesis benefits from the knowledge of a large number of established cell-type and stage-selective markers. Thus, we extensively used the community’s accepted markers to assign identity to clusters in both the sn- and sc-RNA-seq UMAPs. We believe that effort well establishes the validity and reliability of the dataset. Furthermore, we identified upwards of a dozen new markers out of the cluster analysis, and verified their expression by FISH or reporter line in various figures throughout (tup, amph, piwi, geko, Nep4, CG3902, Akr1B, loqs, Vsx1, Drep2, Pxt, CG43317, Vha16-5, l(2)41Ab). To our mind, these contributions, coupled with annotation of the datasets, suggest strongly that they will serve the community well. This is especially true as we provide users with objects that they can feed into commonly used software such as Seurat and Monocle to explore the datasets for their own purposes. Rather than simply relying on default settings within some of the applications, we also adjusted parameters for various clusterings as called for; some of these adjustments were made in response to astute comments from referees and are included in the resubmission. Of course, it is possible that rare issues may arise in the datasets as these are further studied, but that is the case with all scRNA-seq data, and is not specific to work on this model organism.
  
  Reviewer #3 (Public Review):
  
  In this study, the authors use recently published single nucleus RNA sequencing data and a newly generated single cell RNA sequencing dataset to determine the transcriptional profiles of the different cell types in the Drosophila testis. Their analysis of the data and experimental validation of key findings provide new insight into testis biology and create a resource for the community. The manuscript is clearly written, the data provide strong support for the conclusions, and the analysis is rigorous. Indeed, this manuscript serves as a case study demonstrating best practices in the analysis of this type of genomics data and the many types of predictions that can be made from a deep dive into the data. Researchers who are studying the testis will find many starting points for new projects suggested by this work, and the insightful comparison of methods, such as between slingshot and Monocle3 and single cell vs single nucleus sequencing, will be of interest beyond the study of the Drosophila testis.
  
  We greatly appreciate the reviewer’s comments.
  
  Reviewer #4 (Public Review):
  
  This is an extraordinary study that will serve as a key resource for all researchers in the field of Drosophila testis development. The lineages that derive from the germline stem cells and somatic stem cells are described in a detail that has not been previously achieved. The RNAseq approaches have permitted the description of cell states that have not been inferred from morphological analyses, although it is the combination of RNAseq and morphological studies that makes this study exceptional. The field will now have a good understanding of interactions between specific cell states in the somatic lineage with specific states in the germ cell lineage. This resource will permit future studies on precise mechanisms of communication between these lineages during the differentiation process, and will serve as a model for studies of co-differentiation in other stem cell systems. The combination of snRNAseq and scRNAseq has conclusively shown differences in transcriptional activation and RNA storage at specific stages of germ cell differentiation and is a unique study that will inform other studies of cell differentiation.
  
  Could the authors please describe whether genes on the Y chromosome are expressed outside of the male germline? For example, what is represented by the spots of expression within the seminal vesicle observed in Figure 3D?
  
  Prior work demonstrated that proteins encoded by Y-linked genes are not expressed outside of the germline (Zhang et al. Genetics 2020. https://doi.org/10.1534/genetics.120.303324). In our snRNAseq dataset, we find that genes on the Y chromosome are not highly expressed outside of the male germline (on the order of ~100-fold lower in other tissues). In fact, we observe Y chromosome transcripts at this level in many nuclei across tissues collected for the Fly Cell Atlas project, including the ovary. Since we have not followed up on the Fly Cell Atlas observations directly using FISH to examine Y chromosome transcript expression outside the germline, we cannot rule out the possibility that such low-level expression is real. However, the detection across several tissues argues that this is likely a technical artifact. With regard to ‘spots of expression within the seminal vesicle’ (Figure 3D), a spot is colored red if the average expression level of genes on the Y chromosome is greater in that cell than in an average cell on our plot. These red spots are likely due to ambient RNA being carried over.
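The coloring rule described above (a cell is flagged when its mean Y-linked expression exceeds the mean over all plotted cells) can be sketched as below. This is a hedged illustration only: the helper names, the choice of `kl-3` and `kl-5` as example Y-linked genes, and every expression value are assumptions for the example, not values from the dataset.

```python
def mean(values):
    """Arithmetic mean that returns 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def flag_above_average(cells, y_genes):
    """Flag each cell whose mean Y-gene level exceeds the plot-wide mean.

    `cells` is a list of {gene_name: level} dicts (hypothetical layout).
    Returns (flags, plot_wide_mean).
    """
    per_cell = [mean(cell.get(g, 0.0) for g in y_genes) for cell in cells]
    plot_mean = mean(per_cell)
    return [score > plot_mean for score in per_cell], plot_mean


# Toy somatic cells (invented values): most have no Y-linked signal, two carry
# trace amounts of the kind that ambient RNA could plausibly contribute.
y_genes = ["kl-3", "kl-5"]
toy_cells = [
    {},
    {},
    {},
    {"kl-3": 0.6},
    {},
    {"kl-3": 0.2},
]
flags, plot_mean = flag_above_average(toy_cells, y_genes)
print(flags, plot_mean)
```

Because the rule is relative and binary, a cell with a trace amount of signal is colored the same way as a strongly expressing cell whenever most cells on the plot are at zero, which is consistent with the authors' reading that the red spots need not indicate meaningful expression.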
  
  I would appreciate some discussion of the "somatic factors" that are observed to be upregulated in spermatocytes (e.g. Mhc, Hml, grh, Syt1). Is there any indication of functional significance of any of these factors in spermatocytes?
  
  This is an excellent question. Although we validated expression for several (Hml, Vsx1 and eya), we did not test for their function here and this issue remains to be studied. This is now directly stated in the main text.
  
  In the discussion of cyst cell lineage differentiation following cluster 74 the authors state that neither the HCC nor TCC lineages were enriched for eya (Figure 6V). It seems in this panel that cluster 57 shows some enrichment for eya - is this regarded as too low expression to be considered enriched?
  
  We thank the reviewer for their insightful comment and we agree with their conclusions. We have modified the text to reflect the low, but present, expression of eya in the HCC and TCC lineages. The text now reads as follows at line (insert line # here): “Enrichment of eya was dramatically reduced in the clusters along either late cyst cell branch compared to those of earlier lineage nuclei (Figure 6J,U).”
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.07.26.501581v1
www.medrxiv.org

NREM sleep signatures of memory disruption and psychiatric symptoms in young people with 22q11.2 deletion syndrome

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This paper presents analysis of an impressive dataset acquired from sibling pairs, where one child had a specific gene mutation (22q11.2DS), whereas the other child served as a blood-related, healthy control. The authors gathered rich, multi-faceted data, including genetic profile, behavioral testing, neuropsychiatric questionnaires, and sleep PSG.
  
  The analyses explore group differences (gene mutation vs. healthy controls) in terms of sleep architecture, sleep-specific brain oscillations and performance on a memory task.
  
  The authors utilized a solid mixed-model statistical approach, which not only controlled for the multi-comparison problem, but also accounted for between-subject and within-family variance. This was supplemented by mediation analysis, exploring the exact interaction between the variables. Remarkably, the two subject groups were gender balanced, and were matched in terms of age and sex.
  
  Thank you for this endorsement of our approach.
  
  There are some aspects requiring clarification. In the discussion section, some claims come across as too general, or too speculative, and lack proper evidence in the current analysis or in the references.
  
  We have extensively revised our discussion, including introducing more references and adding subheadings, which we hope make our conclusions both more structured and better evidenced (Discussion, pages 27 – 31).
  
  Furthermore, the authors seem to treat their (child) participants with the gene mutation as forerunners of (adult) schizophrenic patients, to whom they repeatedly compare the findings. However, less than half of these children with 22q11.2DS are expected to develop psychotic disorders. In fact, they are at risk of many other neuropsychiatric conditions (incl. intellectual disability, ASD, ADHD, epilepsy) (cf. introduction section).
  
  We have revised our introduction (pages 4 – 5) and discussion to clarify the significant comorbidity in 22q11.2DS. We discuss the limitations of our work and future directions in the discussion (page 30).
  
  Furthermore, the liberal criteria for detecting slow-waves, along with odd topography of the detections, limit the credibility of the slow-wave-related results.
  
  As there is no single common method for SW detection, as noted on page 37, we prioritised rate of detection in order to provide a robust dataset for spindle-SW coupling analysis. We considered the use of an absolute detection threshold (e.g. -75 microvolts). However, because our participants were of a wide range of ages (6 to 20 years), and it is established that the absolute amplitude of the EEG decreases across childhood (e.g. Hahn et al 2020), our view is that an absolute detection threshold would potentially bias the detection of slow waves by age. We have added comments on this matter to the methods section (page 37).
  
  Lastly, we cannot be sure whether the presented memory effects reflect between-group difference in general cognitive capacities, or, as claimed, in overnight memory consolidation.
  
  We have added statistical analysis of the overnight change in performance (results, page 6) to explore this point. We clarify that although 22q11.2DS is associated with slower learning and worse accuracy in the test session, there is not a difference in overnight change in 22q11.2DS.
  
  Generally, the current study introduces a dataset connecting various aspects of 22q11.2DS. It has great potential for complementing the current state of knowledge not only in the clinical, but also in the sleep-science field.
  
  Thank you
  
  Reviewer #2 (Public Review):
  
  This study examines 22q11.2 microdeletion syndrome in 28 individuals and their unaffected siblings. Though the sample size is small, it is on par with many neuroimaging studies of the syndrome. Part of the interest in this disorder arises from the risk this syndrome confers for neuropsychiatric disorders in general and psychosis specifically. The authors examine sleep neurophysiology in 22q11.2DS and their siblings. Principal findings include increased slow wave and spindle amplitudes in deletion carriers as compared to controls.
  
  Strengths of this manuscript include:
  
  The inclusion of siblings as a control group, which minimizes environmental and (other) genetic confounds
  
  The data analyses of the sleep EEG are appropriate and in-depth
  
  High-density sleep EEG allows for topographic mapping
  
  We thank the reviewer for this positive endorsement of our work
  
  Weaknesses of this manuscript include:
  
  The manuscript is framed as an investigation of psychosis and schizophrenia; however, psychotic experiences did not differ between 22q11.2DS and healthy controls. Therefore, the emphasis on schizophrenia and psychosis does not pertain to this sample and the manuscript introduction and discussion should be carefully reframed. The final sentence of the abstract is also not supported by the data: "... our findings may therefore reflect delayed or compromised neurodevelopmental processes which precede, and may be biomarkers for, psychotic disorders".
  
  We have expanded our abstract, introduction and discussion to reflect the complex neurodevelopmental phenotype observed in 22q11.2DS, and discuss the links between our findings and elements of this phenotype.
  
  What is the rationale for using a mediation model to test for the association between genotype and psychiatric symptoms? Given the modest sample size would a regression to test the association between genotype and psychiatric symptoms be more appropriate?
  
  Our rationale for mediation analysis was to expand on making simple group comparisons for various measures by asking if genotype effects on particular psychiatric/behavioural measures were potentially mediated by EEG measures. This is of considerable interest because, as noted above, the behavioural and psychiatric phenotype in 22q11.2DS is complex, and therefore dissection of links between particular EEG features and phenotypes, and asking if EEG measures can be biomarkers for these phenotypes, may give insight into this complexity.
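The mediation logic described above (does an EEG measure carry part of the genotype effect on a symptom score?) can be illustrated with a minimal product-of-coefficients sketch. This is not the authors' model: it ignores the family structure, age covariate, and mixed-model framework mentioned elsewhere in the response, and the 0/1 genotype coding and all variable names are assumptions made for illustration.

```python
from statistics import fmean


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def simple_mediation(x, m, y):
    """Product-of-coefficients mediation with one predictor and one mediator.

    x: genotype coded 0/1 (hypothetical coding)
    m: candidate mediator, e.g. an EEG measure
    y: outcome, e.g. a symptom score
    Fits m ~ x (path a) and y ~ x + m (paths c' and b) by ordinary least squares.
    """
    var_x, var_m = _cov(x, x), _cov(m, m)
    cov_xm, cov_xy, cov_my = _cov(x, m), _cov(x, y), _cov(m, y)
    denom = var_x * var_m - cov_xm ** 2
    if denom == 0:
        raise ValueError("x and m are perfectly collinear")
    a = cov_xm / var_x                                   # x -> m
    b = (cov_my * var_x - cov_xy * cov_xm) / denom       # m -> y, adjusting for x
    direct = (cov_xy * var_m - cov_my * cov_xm) / denom  # x -> y, adjusting for m
    return {"a": a, "b": b, "direct": direct,
            "indirect": a * b, "total": cov_xy / var_x}


# Toy data in which y depends on both the mediator and the genotype.
genotype = [0, 0, 0, 1, 1, 1]
eeg = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
symptom = [2 * m + 0.5 * g for m, g in zip(eeg, genotype)]
result = simple_mediation(genotype, eeg, symptom)
```

In this toy example the total effect splits into a direct part and an indirect part (a × b); a real analysis would also need a confidence interval for the indirect effect, for instance by bootstrapping.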
  
  From Table 1, which presents means, standard deviations and statistics, it is hard to tell if there is a range of symptoms or if there are some participants with 22q11.2DS who met diagnostic criteria for a listed disorder while others have no or sub-threshold symptoms. This is important and informs the statistical analysis. Given the broad range of psychiatric symptoms, I also wonder if a composite score of psychopathology may be more appropriate. What about other psychiatric symptoms such as depression?
  
  We have added a supplementary figure to figure 1 to provide individual participants' scores on psychiatric measures and FSIQ to fully inform the reader about individual data.
  
  We have taken the approach of using symptom scores, rather than using binary cut offs for diagnosis, to maximise the use of our dataset, and given many/all psychiatric phenotypes exist on a spectrum, to reflect the difference between clinical and research diagnoses.
  
  Regarding depression, it has been previously demonstrated in 22q11.2DS that mood disorders are rare at young ages (Chawner et al 2019); given this low frequency, we have not included depression in this dataset.
  
  We have considered the utility of a composite psychopathology score; however, it is already established that 22q11.2DS can be associated with a broad range of psychiatric/behavioural difficulties; in this study we were primarily interested in exploring the links (if any) between specific groups of symptoms, and specific features of the sleep phenotype. Therefore, we feel a composite psychopathology score would not add to the overall clarity of the manuscript.
  
  The age range is very broad spanning 6 to 20 years. As there are marked changes in the sleep EEG with age, it is important to understand the influence of age. The small sample size precludes investigating age by group interactions meaningfully, but the presentation of the ages of 22q11.2DS and controls, rather than means, standard deviations and ranges, would be helpful for the reader to understand the sample.
  
  We have added scatter plots of EEG measures and age to each figure supplement to allow the reader to see changes with age.
  
  Also, a figure showing individual data (e.g., spindle power) as a function of age and group would be informative. The authors should also discuss the possibility that the difference between the groups may vary as a function of age as has been shown for cortical grey matter volume (Bagaiutdinova et al., Molecular Psychiatry, 2021).
  
  We have provided plots of individual data with age for our main figures, in the figure supplements. We also note we have included age as a covariate in all main statistical models (methods, page 39). We thank the reviewer for the additional reference, which has been added to the discussion (page 29).
  
  There is a large group difference with regards to full scale IQ. IQ is associated with sleep spindles (e.g., Gruber et al., Int J of Psychphsy, 2013; Geiger et al., SLEEP, 2011). For this reason, the authors should control for IQ in all analyses.
  
  We note that the relationship between spindle characteristics and IQ has been questioned (e.g. Reynolds et al 2018 performed a meta-analysis which suggests no correlation with FSIQ, which argues against the suggested approach). We also note that genotype effects on FSIQ were not mediated by spindle properties. Furthermore, the phenotype in 22q11.2DS is complex; while lower IQ is a well-evidenced part, it is only one component. We are unclear if it would be justified to regress out only one component of a phenotype.
  
  The authors find greater power in the delta and sigma bands in 22q11.2DS compared to their siblings. Looking at Figure 2, it appears power is elevated across frequencies. If this were the case, this would likely change the interpretation of the findings, and suggest that the sleep EEG likely reflects changes in cortical thickness between controls and 22q11.2DS participants.
  
  We thank the reviewer for this interesting comment. We have now altered the approach taken to our analysis of spectral data in order to probe overall differences in power, using the IRASA approach described by Hahn et al 2020. We present these results on page 13, use measures derived from this analysis in the mediation and behavioural analyses, and discuss these findings in the discussion (page 29).
  
  Along the same lines as the above comment, it would be interesting to examine REM sleep and test how specific to sleep spindles and slow waves these findings are.
  
  We have now added analysis of REM-derived spectral measures, which we believe complement our finding of altered proportions of REM sleep in 22q11.2DS compared to controls (page 13).
  
  Reviewer #3 (Public Review):
  
  In this study, Donnelly and colleagues quantified sleep oscillations and their coordination in young people with 22q11.2 Deletion Syndrome and their siblings. They demonstrate that 22q11.2DS was associated with enhanced power in the slow wave and sleep spindle range, elevated slow-wave and spindle amplitudes and altered coupling between spindles and slow-waves. In addition, spindle and slow-wave amplitudes in 22q11.2DS correlated negatively with the outcomes of a memory test. Overall, the topic and the results of the present study are interesting and timely. The authors employed many thoughtful analyses, making sense out of complicated data. However, some features of the manuscript need further clarification.
  
  1.) Several interesting results of the manuscript are related to altered sleep spindle characteristics in 22q11.2DS (increased power, increased amplitudes and altered coupling with slow waves). On top of that, the authors report that the spindle frequency was correlated with age. I was wondering whether the authors might want to take these individual (age-related) differences into account in their analyses. The authors could detect the peak spindle frequency per participant and inform their spindle detection procedure accordingly. This procedure might lead to an even more clear-cut picture concerning altered spindle activity in 22q11.2DS.
  
  We thank the reviewer for this informative suggestion. We have now implemented this method, detecting spindles for each individual at a frequency defined through IRASA analysis of the EEG (results, page 13; methods, page 35), and then using the properties of spindles detected through this method in further analysis.
  
  We have included age as a covariate in all main models (methods, page 39), and present individual data scattered with age in our figure supplements.
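The idea of individualising spindle detection can be sketched as follows: find each participant's spectral peak within a broad spindle range, then centre the detection band on it. This is only an illustration under assumptions. The 9–16 Hz search range is taken from the slow and fast spindle ranges quoted by the reviewer, the ±1.5 Hz half-width is hypothetical, and the input is assumed to be an already-computed oscillatory spectrum (for example from IRASA); the authors' actual procedure is described in their methods.

```python
def peak_frequency(freqs, power, band=(9.0, 16.0)):
    """Return the frequency (Hz) with maximal power inside `band` (inclusive)."""
    in_band = [(p, f) for f, p in zip(freqs, power) if band[0] <= f <= band[1]]
    if not in_band:
        raise ValueError("no frequency bins fall inside the band")
    return max(in_band)[1]


def detection_band(freqs, power, half_width=1.5):
    """Centre a spindle detection band on the individual peak frequency.

    `half_width` (Hz) is a hypothetical choice for illustration.
    """
    peak = peak_frequency(freqs, power)
    return (peak - half_width, peak + half_width)


# Toy spectrum: strong power outside the spindle range must be ignored.
freqs = [8.0, 10.0, 12.5, 14.0, 17.0]
power = [5.0, 0.2, 1.0, 0.4, 3.0]
individual_peak = peak_frequency(freqs, power)
individual_band = detection_band(freqs, power)
```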
  
  2.) The authors state in the methods section that EEG data was re-referenced to a common average during pre-processing. Did the authors take into account that this reference scheme will lead to a polarity inversion of the signal, potentially over parietal/occipital areas? This inversion will not affect spindle related analyses, but might misguide the detection of slow waves and hence confound related analyses and results.
  
  We have reviewed our data preprocessing pipeline, and updated it based on the latest methods suggested from the EEGlab authors (methods, page 33). As a supplementary analysis we applied a heuristic signal polarity measure described by the authors of the luna software package https://zzz.bwh.harvard.edu/luna/vignettes/nsrr-polarity/ and did not observe any inversion of polarity in our sample.
  
  In the included figure (below) we calculated the Hjorth measure of signal polarity as described in luna, at every electrode and plotted a topoplot of the measure. In the figure numbers > 0 represent signals with a positive polarity, values < 0 a negative polarity. As demonstrated in the figure, there were no electrodes with a positive polarity, although we note that the most peripheral electrodes had an approximately neutral polarity, whereas more central electrodes had a slight negative bias.
  
  We also note that we only detected negative half waves with our slow wave detection algorithm, using a threshold set for each channel based on its own characteristics, so would not necessarily expect alterations in slow-wave detection. Further, other authors have suggested that average referencing does not impact SW detection (e.g. Wennberg 2010).
  
  3.) I have some issues understanding the reported slow wave - spindle coupling results. Figure 5A indicates that ~100 degrees correspond to the down-state of the slow wave. Figure 5E shows that spindles preferentially clustered at fronto-central electrodes between 0 and 90 degrees, hence they seem to peak towards the slow wave downstate. This finding is rather puzzling given the prototypical grouping of sleep spindles by slow wave up-states (Staresina, 2015; Helfrich, 2018; Hahn, 2020). Could it be that the majority of detected spindles represent slow spindles (9-12 Hz; Mölle, 2011)?
  
  We observed peaks of spindle activity in the range of 9 – 24 degrees (so on the descending slope from the positive peak of the slow wave), but average spindle frequencies in the 12 – 13 Hz range. Given we allowed each individual to have an individual spindle detection frequency, as above, and did not observe bimodal distributions of power in the sigma frequency band (Figure 2 Supplement 1), we do not believe our spindles specifically represent slow spindles.
  
  Slow spindles are known to peak rather at the up- to down-state transition (which would fit the reported results) and show a frontal distribution (which again would fit to the spindle amplitude topographies in Fig 3E). If that was the case, it would make sense to specifically look at fast spindles (12-16 Hz) as well, given their presumed role in memory consolidation (Klinzing, 2019).
  
  We agree with the reviewer’s assessment of the distribution of the putative spindles we have detected. However, as we and other authors (Hahn et al 2020) have noted, we did not observe discrete fast and slow spindle frequency peaks in our analysis of the PSD (as has been observed by other authors e.g. Cox et al 2017). For this reason, and to reduce the complexity of the manuscript, we believe the best approach with our dataset is to focus on spindles at large, rather than detecting spindles in arbitrary frequency bands.
  
  In addition, is it possible that the rather strong phase shift from fronto-central to occipital sites is driven by a polarity inversion due to using a common reference (see comment 2)?
  
  As noted above, we do not observe significant polarity inversion in our signals using the luna heuristic measure. We were not able to identify published literature to inform our investigation of this suggestion, but would be happy to consider any specific suggestions from the reviewer.
  
  Apart from that I would suggest to statistically evaluate non-uniformity using e.g. the Rayleigh test (both within and across participants).
  
  We have added an analysis of non-uniformity to the results section (results, page 20).
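For readers unfamiliar with the Rayleigh test suggested above, a minimal sketch is given here. It computes the mean resultant length of a set of coupling phases and an approximate p-value using a commonly used closed-form approximation attributed to Zar; this is an assumption about a reasonable implementation, not a description of the authors' analysis. Applied to all events of one participant it tests non-uniformity within that participant; applied to per-participant mean phases it tests consistency across participants.

```python
import math


def rayleigh_test(phases_rad):
    """Rayleigh test for non-uniformity of circular data.

    phases_rad: phases in radians (e.g. slow-wave phase at each spindle peak).
    Returns (mean resultant length, Rayleigh z, approximate p-value).
    """
    n = len(phases_rad)
    if n == 0:
        raise ValueError("need at least one phase value")
    c = sum(math.cos(p) for p in phases_rad)
    s = sum(math.sin(p) for p in phases_rad)
    resultant = math.hypot(c, s)   # length of the summed unit vectors
    r_bar = resultant / n          # 0 = uniform, 1 = perfectly clustered
    z = n * r_bar ** 2
    # Closed-form approximation; max() guards against tiny negative rounding.
    inner = 1 + 4 * n + 4 * max(0.0, n ** 2 - resultant ** 2)
    p_value = min(1.0, math.exp(math.sqrt(inner) - (1 + 2 * n)))
    return r_bar, z, p_value


# Evenly spread phases: no preferred phase.
uniform = rayleigh_test([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
# Tightly clustered phases: strong phase preference.
clustered = rayleigh_test([0.3] * 20)
```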
  
  4.) Somewhat related to the point raised above. The authors state in the methods that slow wave-spindle events were defined as time windows where the peaks of spindles overlapped with slow waves. How was the duration of slow waves defined in this scenario? If it was up- to up-state the authors might miss spindles which lock briefly after the post down-state upstate, thereby overrepresenting spindles that lock to early phases of slow waves. Why not just define a clear slow wave related time-window, such as slow wave down-state ± 1.5 seconds?
  
  We have implemented this suggestion (methods, page 38).
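The fixed-window definition proposed by the reviewer can be sketched as below: a spindle counts as coupled to a slow wave if its peak falls within ±1.5 s of the slow-wave down-state. The function and variable names are hypothetical, times are assumed to be in seconds, and the inclusive window edges are an illustrative choice; the authors' implementation is in their methods.

```python
from bisect import bisect_left, bisect_right


def spindles_near_troughs(trough_times, spindle_peak_times, half_window=1.5):
    """Map each slow-wave down-state time to the spindle peaks within
    +/- half_window seconds of it (window edges inclusive)."""
    peaks = sorted(spindle_peak_times)
    coupled = {}
    for t in trough_times:
        lo = bisect_left(peaks, t - half_window)
        hi = bisect_right(peaks, t + half_window)
        coupled[t] = peaks[lo:hi]
    return coupled


# Toy example: two slow waves and five detected spindle peaks (seconds).
events = spindles_near_troughs([10.0, 20.0], [8.4, 9.0, 11.5, 12.0, 19.0])
```

A fixed window treats spindles before and after the down-state symmetrically, which addresses the reviewer's concern that an up-to-up-state definition could overrepresent spindles locked to early slow-wave phases.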
  
  5.) The authors correlated the NREM sleep features with the outcomes of a post-sleep memory test (both encoding and an initial memory test took place pre-sleep). If the authors want to show a clear association between sleep-related oscillations and the behavioural expressions of memory consolidation, taking just the post sleep memory task is probably not the best choice. The post-sleep test will, as the pre-sleep test, in isolation rather reflect general memory related abilities. To uncover the distinct behavioural effects of consolidation the authors should assess the relative difference between the pre- and post-sleep memory performance and correlate this metric with their EEG outcomes.
  
  We have added evening-morning performance difference as a measure to the results (page 6); however, as there was no difference between groups in overnight change in performance, we focus on morning performance in relating behaviour to EEG outcomes (explored in results, page 6).
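The metric the reviewer asks for, the overnight change in performance correlated with an EEG measure, can be written down in a few lines. This is a sketch with made-up numbers and hypothetical names; it uses a plain Pearson correlation and ignores the covariates and family structure that the authors' models account for.

```python
import math
from statistics import fmean


def overnight_change(pre_sleep, post_sleep):
    """Per-participant change in performance (post-sleep minus pre-sleep)."""
    return [post - pre for pre, post in zip(pre_sleep, post_sleep)]


def pearson_r(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    if den == 0:
        raise ValueError("correlation is undefined for a constant input")
    return num / den


# Illustrative values only: evening and morning scores, and one EEG measure.
evening = [10, 12, 9, 15]
morning = [12, 13, 9, 18]
eeg_measure = [4.0, 2.0, 0.0, 6.0]
change = overnight_change(evening, morning)
r = pearson_r(change, eeg_measure)
```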
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2021.11.08.21266020v1
www.biorxiv.org www.biorxiv.org

Cell-surface tethered promiscuous biotinylators enable small-scale surface proteomics of human exosomes

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  Cell surface proteins are of vital interest in the functions and interactions of cells and their neighbors. In addition, cells manufacture and secrete small membrane vesicles that appear to represent a subset of the cell surface protein composition.
  
  Various techniques have been developed to allow the molecular definition of many cell surface proteins but most rely on the special chemistry of amino acid residues on the parts of membrane proteins exposed to the cell exterior.
  
  In this report Kirkemo et al. have devised a method that more comprehensively samples the cell surface protein composition by relying on the membrane insertion or protein glycan adhesion of an enzyme that attaches a biotin group to a nearest neighbor cellular protein. The result is a more complex set of proteins and distinctive differences between normal and myc oncogene tumor cells and their secreted extracellular vesicle counterparts. These results may be applied to the identification of unique cell surface determinants in tumor cells that could be targets for immune or drug therapy. The results may be strengthened by a more thorough evaluation of the different EV membrane species represented in the broad collection of EVs used in this investigation.
  
  We thank the reviewer for recognizing the importance of the work outlined in the manuscript. We have addressed the necessary improvements in the essential revisions section above.
  
  Reviewer #2 (Public Review):
  
  This paper describes two methods for labeling cell-surface proteins. Both methods involve tethering an enzyme to the membrane surface to probe the proteins present on cells and exosomes. Two different enzyme constructs are used: a single strand lipidated DNA inserted into the membrane that enables binding of an enzyme conjugated to a complementary DNA strand (DNA-APEX2) or a glycan-targeting binding group conjugated to horseradish peroxidase (WGA-HRP). Both tethered enzymes label proteins on the cell surface using a biotin substrate via a radical mechanism. The method provides significantly enhanced labeling efficiency and is much faster than traditional chemical labeling methods and methods that employ soluble enzymes. The authors comprehensively analyze the labeled proteins using mass spectrometry and find multiple proteins that were previously undetectable with chemical methods and soluble enzymes. Furthermore, they compare the labeling of both cells and the exosomes that are formed from the cells and characterize both up- and down-regulated proteins related to cancer development that may provide a mechanistic underpinning.
  
  Overall, the method is novel and should enable the discovery of many low-abundance cell-surface proteins through more efficient labeling. The DNA-APEX2 method will only be accessible to more sophisticated laboratories that can carry out the protocols but the WGA-HRP method employs a readily available commercial product and gives equivalent, perhaps even better, results. In addition, the method cannot discriminate between proteins that are genuinely expressed on the cell from those that are non-specifically bound to the cell surface.
  
  The authors describe the approach and identify two unique proteins on the surface of prostate cell lines.
  
  Strengths:
  
  Good introduction with appropriate citations of relevant literature
  
  Much higher labeling efficiency and faster than chemical methods and soluble enzyme methods.
  
  Ability to detect low-abundance proteins, not accessible from previous labeling methods.
  
  Weaknesses: The DNA-APEX2 method requires specialized reagents and protocols that are much more challenging for a typical laboratory to carry out than conventional chemical labeling methods.
  
  The claims and findings are sound. The finding of novel proteins and the quantitative measurement of protein up- and down-regulation are important. The concern about non-specifically bound proteins could be addressed by looking at whether the detected proteins have a transmembrane region that would enable them to localize in the cell membrane.
  
  We thank the reviewer for recognizing the strengths and importance of this work. We also thank the reviewer for mentioning the issue of non-specifically bound proteins. As addressed above in the essential revisions sections, we believe that any low affinity, non-specific binding proteins are likely removed in the multiple wash/centrifugation steps on cells or the multiple centrifugation steps and sucrose gradient purification on EVs. Given the likelihood for removal of non-specific binders, we believe that the secreted proteins identified are likely high affinity interactions and their differential expression on either cells or EVs play an important part in the downstream biology of both sample types. However, the previous data presentation did not clarify which proteins pertained to the transmembrane plasma membrane proteome versus secreted protein forms. For further clarity in the data presentation (Figure 3D, 4D, 5D), we have bolded proteins that are also found in the SURFY database that only includes surface annotated proteins with a predicted transmembrane domain (Bausch-Fluck et al., The in silico human surfaceome. PNAS. 2018). We have also italicized proteins that are annotated to be secreted from the cell to the extracellular space (Uniprot classification). We have updated the text and caption as shown below:
  
  New Figure 3:
  
  Figure 3. WGA-HRP identifies a number of enriched markers on Myc-driven prostate cancer cells. (A) Overall scheme for biotin labeling, and label-free quantification (LFQ) by LC-MS/MS for RWPE-1 Control and Myc over-expression cells. (B) Microscopy image depicting morphological differences between RWPE-1 Control and RWPE-1 Myc cells after 3 days in culture. (C) Volcano plot depicting the LFQ comparison of RWPE-1 Control and Myc labeled cells. Red labels indicate upregulation in the RWPE-1 Control cells over Myc cells and green labels indicate upregulation in the RWPE-1 Myc cells over Control cells. All colored proteins are 2-fold enriched in either dataset between four replicates (two technical, two biological, p<0.05). (D) Heatmap of the 15 most upregulated transmembrane (bold) or secreted (italics) proteins in RWPE-1 Control and Myc cells. Scale indicates intensity, defined as (LFQ Area - Mean LFQ Area)/standard deviation. Extracellular proteins with annotated transmembrane domains are bolded and annotated secreted proteins are italicized. (E) Table indicating fold-change of most differentially regulated proteins by LC-MS/MS for RWPE-1 Control and Myc cells. (F) Upregulated proteins in RWPE-1 Myc cells (Myc, ANPEP, Vimentin, and FN1) are confirmed by western blot. (G) Upregulated surface proteins in RWPE-1 Myc cells (Vimentin, ANPEP, FN1) are detected by immunofluorescence microscopy. The downregulated protein HLA-B by Myc over-expression was also detected by immunofluorescence microscopy. All western blot images and microscopy images are representative of two biological replicates. Mass spectrometry data is based on two biological and two technical replicates (N = 4).
  
  New Figure 4:
  
  Figure 4. WGA-HRP identifies a number of enriched markers on Myc-driven prostate cancer EVs. (A) Workflow for small EV isolation from cultured cells. (B) Labeled proteins indicating canonical exosome markers (ExoCarta Top 100 List) detected after performing label-free quantification (LFQ) from whole EV lysate. The proteins are graphed from least abundant to most abundant. (C) Workflow of exosome labeling and preparation for mass spectrometry. (D) Heatmap of the 15 most upregulated proteins in RWPE-1 Control or Myc EVs. Scale indicates intensity, defined as (LFQ Area - Mean LFQ Area)/SD. Extracellular proteins with annotated transmembrane domains are bolded and annotated secreted proteins are italicized. (E) Table indicating fold-change of most differentially regulated proteins by LC-MS/MS for RWPE-1 Control and Myc cells. (F) Upregulated proteins in RWPE-1 Myc EVs (ANPEP and FN1) are confirmed by western blot. Mass spectrometry data is based on two biological and two technical replicates (N = 4). Due to limited sample yield, one replicate was performed for the EV western blot.
  
  New Figure 5:
  
  Figure 5. WGA-HRP identifies a number of EV-specific markers that are present regardless of oncogene status. (A) Matrix depicting samples analyzed during LFQ comparison--Control and Myc cells, as well as Control and Myc EVs. (B) Principal component analysis (PCA) of all four groups analyzed by LFQ. Component 1 (50.4%) and component 2 (15.8%) are depicted. (C) Functional annotation clustering was performed using DAVID Bioinformatics Resource 6.8 to classify the major constituents of component 1 in PCA analysis. (D) Heatmap of the 25 most upregulated proteins in RWPE-1 cells or EVs. Proteins are listed in decreasing order of expression with the most highly expressed proteins in EVs on the far left and the most highly expressed proteins in cells on the far right. Scale indicates intensity, defined as (LFQ Area - Mean LFQ Area)/SD. Extracellular proteins with annotated transmembrane domains are bolded and annotated secreted proteins are italicized. (E) Table indicating fold-change of most differentially regulated proteins by LC-MS/MS for RWPE-1 EVs compared to parent cells. (F) Western blot showing the EV specific markers ITIH4, IGSF8, and MFGE8. Mass spectrometry data is based on two biological and two technical replicates (N = 4). Due to limited sample yield, one replicate was performed for the EV western blot.
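The heatmap scale used in the Figure 3, 4 and 5 captions, (LFQ Area - Mean LFQ Area)/SD, is a per-protein z-score across samples. A minimal sketch of that transformation follows. The captions do not say whether the sample or population standard deviation was used, so the sample SD here is an assumption, as are the function name and the example values.

```python
from statistics import fmean, stdev


def zscore_row(lfq_areas):
    """Scale one protein's LFQ areas across samples:
    (LFQ Area - Mean LFQ Area) / SD, using the sample SD (an assumption)."""
    mean = fmean(lfq_areas)
    sd = stdev(lfq_areas)
    if sd == 0:
        # A protein with identical area in every sample has no variation to show.
        return [0.0 for _ in lfq_areas]
    return [(value - mean) / sd for value in lfq_areas]


# Illustrative LFQ areas for one protein across three samples.
scaled = zscore_row([1.0, 2.0, 3.0])
```

Because each protein is scaled against its own mean and SD, the heatmap colours show relative enrichment between samples rather than absolute abundance.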
  
  Authors mention time-sensitive changes but it is unclear how this method would enable one to obtain this kind of data. How would this be accomplished? The statement "Due to the rapid nature of peroxidase enzymes (1-2 min), our approaches enable kinetic experiments to capture rapid changes, such as binding, internalization, and shuttling events." Yes, it is faster, but not sure I can think of an experiment that would enable one to capture such events.
  
  We thank the reviewer for this comment and giving us an opportunity to elaborate on the types of experiments enabled by this new method. A previous study (Y, Li et al. Rapid Enzyme-Mediated Biotinylation for Cell Surface Proteome Profiling. Anal. Chem. 2021) showed that labeling the cell surface with soluble HRP allowed the researchers to detect immediate surface protein changes in response to insulin treatment. They demonstrated differential surfaceome profiling changes at 5 minutes vs 2 hours following treatment with insulin. Only methods utilizing these rapid labeling enzymes could allow for this type of resolution. A few other biological settings that experience rapid cell surface changes are: response to drug treatment, T-cell activation and synapse formation (S, Valitutti, et al. The space and time frames of T cell activation at the immunological synapse. FEBS Letters. 2010) and GPCR activation (T, Gupte et al. Minute-scale persistence of a GPCR conformation state triggered by non-cognate G protein interactions primes signaling. Nat. Commun. 2019). We also believe the method would be useful for post-translational processes where proteins are rapidly shuttling to the cell surface. We have updated the discussion to elaborate on these types of experiments.
  
  "Due to the fast kinetics of peroxidase enzymes (1-2 min), our approaches could enable kinetic experiments to capture rapid post-translational trafficking of surface proteins, such as response to insulin, certain drug treatments, T-cell activation and synapse formation, and GPCR activation."
  
  The authors do not have any way to differentiate between proteins expressed by cells and presented on their membranes from proteins that non-specifically bind to the membrane surface. Non-specific binding (NSB) is not addressed. Proteins can non-specifically bind to the cell or EV surface. The results are obtained by comparisons (cells vs exosomes, controls vs cancer cells), which is fine because it means that what is being measured is differentially expressed, so even NSB proteins may be up- and down-regulated. But the proteins identified need to be confirmed. For example, are all the proteins being detected transmembrane proteins that are known to be associated with the membrane?
  
  As mentioned above, we utilized the most rigorous informatics analysis available (Uniprot and SURFY) to annotate the proteins we find as having a signal sequence and/or TM domain. Data shown in heatmaps are based on significance (p < 0.05) across all four replicates, which supports that any secreted proteins present are likely due to actual biological differences between oncogenic status and/or sample origin (i.e. EV vs cell). We have addressed this point in a previous comment above.
  
  The term "extracellular vesicles" (EVs) might be more appropriate than "exosomes" to describe the studied preparation.
  
  As we describe above in response to earlier comments, we have systematically changed from using exosomes to small extracellular vesicles and better defined the isolation procedure that we used in the methods section.
  
  Reviewer #3 (Public Review):
  
  The article by Kirkemo et al explores approaches to analyse the surface proteome of cells or cell-derived extracellular vesicles (EVs, called here exosomes, but the more generic term "extracellular vesicles" would be more appropriate because the used procedure leads to co-isolation of vesicles of different origin), using tools to tether proximity-biotinylation enzymes to membranes. The authors determine the best conditions for surface labeling of cells, and demonstrate that tethering the enzymes (APEX or HRP) increases the number of proteins detected by mass-spectrometry. They further use one of the two approaches (where HRP binds to glycans), to analyse the biotinylated proteome of two variants of a prostate cancer cell line, and the corresponding EVs. The approaches are interesting, but their benefit for analysis of cells or EVs is not very strongly supported by the data.
  
  First, the authors honestly show (fig2-suppl figures) that only 35% of the proteins identified after biotinylation with their preferred tool actually correspond to annotated surface proteins. This is only slightly better than results obtained with a non-tethered sulfo-NHS-approach (30%).
  
  We thank the reviewer for this comment. The reason we utilize membrane protein enrichment methods is that membrane protein abundance is low compared to cytosolic proteins and their identification can be overwhelmed by cytosolic contaminants. Nonetheless, despite our best efforts to limit labeling to the membrane proteins, cytosolic proteins can carry over. Thus, we utilize informatics methods to identify the proteins that are annotated to be membrane associated. The Uniprot GOCC (Gene Ontology Cellular Component) Plasma Membrane database is the most inclusive of membrane proteins, only requiring that they contain either a signal sequence, transmembrane domain, GPI anchor or other membrane-associated motifs, yielding a total of 5,746 proteins. This will include organelle membrane proteins. It is known that proteins can traffic from the internal organelles to the cell surface, so these can be bona fide cell surface proteins too. To increase the informatics stringency for membrane proteins we have now applied a new database aggregated from work by the Wollscheid lab, called SURFY (Bausch-Fluck et al., The in silico human surfaceome. PNAS. 2018). This is a machine learning method trained on 735 high confidence membrane proteins from the Cell Surface Protein Atlas (CSPA). SURFY predicts a total of 2,886 cell surface proteins. When we filter our data using SURFY for proteins, peptides and label free quantitation (LFQ) area for three methods, we find that the difference between NHS-Biotin and WGA-HRP expands considerably (see new Figure 3-Supplemental Figure 1 below). We observe these differences when the datasets are searched with either the GOCC Plasma Membrane database or the entire human Uniprot database. The difference is especially large for LFQ analysis, which quantitatively scores peptide intensity as opposed to simply counting the number of hits as for protein and peptide analysis. Cytosolic carryover is the major disadvantage of NHS-Biotin, which suppresses signal strength and is reflected in the lower LFQ values (24% for NHS-biotin compared to 40% for WGA-HRP). We have updated the main text and supplemental figure below:
  
  "Both WGA-HRP and biocytin hydrazide had similar levels of cell surface enrichment on the peptide and protein level when cross-referenced with the SURFY curated database for extracellular surface proteins with a predicted transmembrane domain (Figure 3 - Figure supplement 1A). Sulfo-NHS-LC-LC-biotin and whole cell lysis returned the lowest percentage of cell surface enrichment, suggesting a larger portion of the total sulfo-NHS-LC-LC-biotin protein identifications were of intracellular origin, despite the use of the cell-impermeable format. These same enrichment levels were seen when the datasets were searched with the curated GOCC-PM database, as well as the Uniprot entire human proteome database (Figure 3 - Figure supplement 1B). Importantly, of the proteins quantified across all four conditions, biocytin hydrazide and WGA-HRP returned higher overall intensity values for SURFY-specified proteins than either sulfo-NHS-LC-LC-biotin or whole cell lysis. Importantly, although biocytin hydrazide shows slightly higher cell surface enrichment compared to WGA-HRP, we were unable to perform the comparative analysis at 500,000 cells--instead requiring 1.5 million--as the protocol yielded too few cells for analysis."
  
  Figure 3-Figure Supplement 1. Comparison of surface enrichment between replicates for different mass spectrometry methods. (A) The top three methods (NHS-Biotin, Biocytin Hydrazide, and WGA-HRP) were compared for their ability to enrich cell surface proteins on 1.5 M RWPE-1 Control cells by LC-MS/MS after being searched with the Uniprot GOCC Plasma Membrane database. Shown are enrichment levels on the protein, peptide, and average MS1 intensity of top three peptides (LFQ area) levels. (B) The top three methods (NHS-Biotin, Biocytin Hydrazide, and WGA-HRP) were compared for their ability to enrich cell surface proteins on 1.5 M RWPE-1 Control cells by LC-MS/MS after being searched with the entire human Uniprot database. Shown are enrichment levels on the protein, peptide, and average MS1 intensity of top three peptides (LFQ area) levels. Proteins or peptides detected from cell surface annotated proteins (determined by the SURFY database) were divided by the total number of proteins or peptides detected. LFQ areas corresponding to cell surface annotated proteins (SURFY) were divided by the total area sum intensity for each sample. The corresponding percentages for two biological replicates were plotted.
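  
  The enrichment percentages described in this legend reduce to two ratios: a count ratio (surface-annotated proteins or peptides over all detected) and an intensity ratio (summed LFQ area of surface-annotated proteins over the total area). A minimal Python sketch of both follows. The function names, the toy intensities, and the small `surface_set` are hypothetical stand-ins for illustration; they are not the authors' pipeline or the actual SURFY list.

```python
from typing import Iterable, Mapping, Set


def count_enrichment(detected: Iterable[str], surface: Set[str]) -> float:
    """Percentage of detected identifiers that carry a surface annotation."""
    ids = list(detected)
    if not ids:
        raise ValueError("no identifiers detected")
    hits = sum(1 for i in ids if i in surface)
    return 100.0 * hits / len(ids)


def area_enrichment(lfq_area: Mapping[str, float], surface: Set[str]) -> float:
    """Percentage of the summed LFQ area that comes from surface proteins."""
    total = sum(lfq_area.values())
    if total <= 0:
        raise ValueError("total LFQ area must be positive")
    surface_area = sum(a for p, a in lfq_area.items() if p in surface)
    return 100.0 * surface_area / total


# Hypothetical toy data: gene name -> LFQ area for one sample.
surface_set = {"EGFR", "CD44", "ITGB1"}
sample_area = {"EGFR": 4.0e6, "CD44": 2.0e6, "ACTB": 3.0e6, "GAPDH": 1.0e6}

protein_pct = count_enrichment(sample_area.keys(), surface_set)
area_pct = area_enrichment(sample_area, surface_set)
```

  In this toy sample two of four proteins are surface annotated, but they carry most of the signal, so the area-based percentage is higher than the count-based one. This is the kind of gap between identification counts and LFQ area that the response discusses.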
  
  There are additional advantages to WGA-HRP over NHS-biotin. These include: (i) labeling time is 2 min versus 30 min, which would afford higher kinetic resolution as needed, and (ii) the NHS-biotin labels lysines, which hinders tryptic cleavage and downstream peptide analysis, whereas the WGA-HRP labels tyrosines, eliminating impacts on tryptic patterns. WGA-HRP is slightly below biocytin hydrazide in peptide and protein ID and somewhat more by LFQ. However, WGA-HRP has significant advantages over biocytin hydrazide: (i) sample size for WGA-HRP can be reduced by a factor of 3-5, because the hydrazide protocol loses cells during the multiple washing steps after periodate oxidation and hydrazide labeling, (ii) the time of labeling is dramatically reduced from 3 hr for hydrazide to 2 min for WGA-HRP, and (iii) the HRP enzyme has a large labeling diameter (20-40 nm, but also reported up to 200 nm) and can label non-glycosylated membrane proteins, as opposed to biocytin hydrazide, which only labels glycosylated proteins. The hydrazide method is the current standard for membrane protein enrichment, and we feel that WGA-HRP will compete especially when cell sample size is limited or requires special handling. In the case of EVs, we were not able to perform hydrazide labeling due to the two-step process and small sample size.
  
  Indeed the list of identified proteins in figures 4 and 5 include several proteins whose expected subcellular location is internal, not surface exposed, and whose location in EVs should also be inside (non-exhaustively: SDCBP = syntenin, PDCD6IP = Alix, ARRDC1, VPS37B, NUP35 = nucleopore protein)…
  
  We thank the reviewer for this comment. We have elaborated on this point in a number of response paragraphs above. The proteins that the reviewer points out are annotated as “plasma membrane” in the very inclusive GOCC plasma membrane database. However, this means that they may also spend time in other locations in the cell or reside on organelle membranes. We have done further analysis to remove any intracellular membrane residing proteins that are included in the GOCC plasma membrane database, including the five proteins mentioned above. We also have further highlighted proteins that appear in the SURFY database, as discussed above and in our response to Reviewer 2’s comment. To increase stringency, we have bolded proteins that are found in the more selective SURFY database and italicized secreted proteins. Due to our new analysis and data presentation, it is more clear which markers are bona fide extracellular resident membrane proteins. We have updated the Figures and Figure legends as mentioned above, as well as added this statement in the Data Processing and Analysis methods:
  
  "Additionally, to not miss any key surface markers such as secreted proteins or anchored proteins without a transmembrane domain, we chose to initially avoid searching with a more stringent protein list, such as the curated SURFY database. However, following the analysis, we bolded proteins found in the SURFY database and italicized proteins known to be secreted (Uniprot)."
  
  The membrane proteins identified as different between the control and Myc-overexpressing cells or their EVs, would have been identified as well by a regular proteomic analysis.
  
  To directly compare surfaceomes of EVs to cells, we are compelled to use the same proteomic method. For parental cell surfaceomic analysis, a membrane enrichment method is required due to the high levels of cytosolic proteins that swamp out signal from membrane proteins. Although EVs have a higher proportion of membrane to cytosol, whole EV proteomics would still have significant cytosolic contamination.
  
  Second, the title highlights the benefit of the technique for small-scale samples: this is demonstrated for cells (figures 1-2), but not for EVs: no clear quantitative indication of amount of material used is provided for EV samples. Furthermore, no comparison with other biotinylation techniques such as sulfo-NHS is provided for EVs/exosomes. Therefore, it is difficult to infer the benefit of this technique applied to the analysis of EVs/exosomes.
  
  We appreciate the reviewer for this comment. We have updated the methods as mentioned above in our response to the Essential Revisions. In brief, the yield of EVs post-sucrose gradient isolation was 3-5 µg of protein from 16x15 cm2 plates of cells, totaling 240 mL of media. Since we had previously demonstrated that our method was superior to sulfo-NHS for enriching surface proteins on cells, we proceeded to use the WGA-HRP for the EV labeling experiments.
  
  In addition, the WGA-based tethering approach, which is the only one used for the comparative analysis of figures 4 and 5, possibly induces a bias towards identification of proteins with a particular glycan signature: a novelty would possibly have come from a comparison of this approach with the other initially evaluated, the DNA-APEX one, where tethering is induced by lipid moieties, thus should not depend on glycans. The authors may have then identified by LC-MS/MS specific glycan-associated versus non-glycan-associated proteins in the cells or EVs membranes. Also ideally, the authors should have compared the 4 combinations of the 2 enzymes (APEX and HRP) and 2 tethers (lipid-bound DNA and WGA) to identify the bias introduced by each one.
  
  We thank the reviewer for this comment. We performed analysis to determine whether there was a bias towards Uniprot annotated “Glyco” vs “Non-Glyco” surface proteins within the SURFY database identified across the WGA-HRP, APEX2-DNA, APEX2, and HRP labeling methods. We performed this analysis by measuring the total LFQ area detected for each category (glycoprotein vs non-glycoprotein) and dividing that by the total LFQ area found across all proteins detected in the sample. We found similar normalized areas of non-glyco surface proteins between WGA-HRP and APEX2-DNA suggesting there is not a bias against non-glycosylated proteins in the WGA-HRP sample. There were slightly elevated levels of Glycoproteins in the WGA-HRP sample over APEX2-DNA. It is not surprising to us that there is little bias because the free-radicals generated by biotin-tyramide can label over tens of nanometers and thus can label not just the protein they are attached to, but neighbors also, regardless of glycosylation status. We have added this as Figure 2-Supplement 3, and amended the text in the manuscript below in purple.
  
  Figure 2 – Figure Supplement 3: Comparison of enrichment of Glyco- vs Non-Glyco-proteins. (A) TIC area of Uniprot annotated Glycoproteins compared to Non-Glycoproteins in the SURFY database for each labeling method compared to total TIC area. There was not a significant difference in detection of Non-Glycoproteins detected between WGA-HRP and APEX2-DNA and only a slightly higher detection of Glycoproteins in the WGA-HRP sample over APEX2-DNA.
  
  "As the mode of tethering WGA-HRP involves GlcNAc and sialic acid glycans, we wanted to determine whether there was a bias towards Uniprot annotated 'Glycoprotein' vs 'Non-Glycoprotein' surface proteins identified across the WGA-HRP, APEX2-DNA, APEX2, and HRP labeling methods. We looked specifically at surface proteins found in the SURFY database, which is the most restrictive surface database and requires that proteins have a predicted transmembrane domain (Bausch-Fluck et al., The in silico human surfaceome. PNAS. 2018). We performed this analysis by measuring the average MS1 intensity across the top three peptides (area) for SURFY glycoproteins and non-glycoproteins for each sample and dividing that by the total LFQ area found across all GOCC annotated membrane proteins detected in each sample. We found similar normalized areas of non-glyco surface proteins across all samples (Figure 2 - Figure supplement 4). If a bias existed towards glycosylated proteins in WGA-HRP compared to the glycan agnostic APEX2-DNA sample, then we would have seen a larger percentage of non-glycosylated surface proteins identified in APEX2-DNA over WGA-HRP. Due to the large labeling radius of the HRP enzyme, we find it unsurprising that the WGA-HRP method is able to capture non-glycosylated proteins on the surface to the same degree (Rees et al. Selective Proteomic Proximity Labeling Assay SPPLAT. Current Protocols in Protein Science. 2015). There is a slight increase in the area percentage of glycoproteins detected in the WGA-HRP compared to the APEX2-DNA sample but this is likely due to the fact that a greater number of surface proteins in general are detected with WGA-HRP."
  
  As presented the article is thus an interesting technical description, which does not convince the reader of its benefit to use for further proteomic analyses of EVs or cells. Such info is of course interesting to share with other scientists as a sort of "negative" or "neutral" result. Maybe a novelty of the presented work is the differential proteome analysis of surface enriched EV/cell proteins in control versus myc-expressing cells. Such analyses of EVs from different derivatives of a tumor cell line have been performed before, for instance comparing cells with different K-Ras mutations (Demory-Beckler, Mol Cell proteomics 2013 # 23161513). However, here the authors compare also cells and EVs, and find possibly interesting discrepancies in the upregulated proteins. These results could probably be exploited more extensively. For instance, authors could give clearer info (lists) on the proteins differentially regulated in the different comparisons: in EVs from both cells, in EVs vs cells, in both cells.
  
  We appreciate the reviewer for this critique and have updated the manuscript accordingly. We have changed the title to “Cell surface tethered promiscuous biotinylators enable small-scale comparative surface proteomic analysis of human extracellular vesicles and cells” to more accurately depict the focus of our manuscript which, as the reviewer highlighted, is that this technology allows for comparative analysis between the surfaceomes of cells vs EVs. We appreciate the fine work from the Coffey lab on whole EV analysis of KRAS transformed cells. They identified a mix of surface and cytosolic proteins that change in EVs from the transformed cells, whereas our data focuses specifically on the surfaceome differences in Myc transformed and non-transformed cells and corresponding small EVs. We believe this makes important contributions to the field as well.
  
  To further address the reviewer’s suggestions, we additionally have significantly reorganized the figures to better display the differentially regulated proteins. We have removed the volcano plots and instead included heatmaps with the top 30 (Figure 3 and Figure 4) and top 50 (Figure 5) differentially regulated proteins across cells and EVs. We have also updated the lists of proteins in the supplemental source tables section. See responses to Reviewer 2 above for the updates to Figures 3-5. We have additionally included supplemental figures with lists of differentially upregulated proteins in the EV and Cell samples, which are shown below:
  
  Figure 3 – Supplement 3: List of proteins comparing enriched targets (>2-fold) in Myc cells versus Control cells. Targets that were found enriched (Myc/Control) in the Control cells (left) and Myc cells (right). The fold-change between Myc cells and Control cells is listed in the column to the right of the gene name.
  
  Figure 4 – Supplement 1: List of proteins comparing enriched targets (>1.5-fold) in Myc EVs versus Control EVs. Targets that were found enriched (Myc/Control) in the Control EVs (left) and Myc EVs (right). The fold-change between Myc EVs and Control EVs is listed in the column to the right of the gene name.
  
  Figure 4 – Figure Supplement 2: Venn diagram comparing enriched targets (>2-fold) in Cells and EVs. (A) Targets that were found enriched in the Control EVs (purple) and Control cells (blue) when each is separately compared to Myc EVs and Myc cells, respectively. The 5 overlapping enriched targets in common between Control cells and Control EVs are listed in the center. (B) Targets that were found enriched in the Myc EVs (purple) and Myc cells (blue) when each is separately compared to Control EVs and Control cells, respectively. The 12 overlapping enriched targets in common between Myc cells and Myc EVs are listed in the center.
  
  Figure 5 - Supplement 1: List of proteins comparing enriched targets (>2-fold) in Control EVs versus Control cells and Myc EVs versus Myc cells. (A) Targets that were found enriched (EV/cell) in the Control samples are listed. The fold-change values between Control EVs and Control cells are listed in the column to the right of the gene name. (B) Targets that were found enriched (EV/cell) in the Myc samples are listed. The fold-change values between Myc EVs and Myc cells are listed in the column to the right of the gene name.
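  
  The supplementary lists and the Venn diagram described in these legends rest on two simple operations: thresholding a fold-change ratio between two conditions, then intersecting the resulting sets. The Python sketch below illustrates that logic only. The helper `enriched`, the strict `>` comparison, and all intensity values are hypothetical assumptions; the source does not say how missing values, replicates, or significance filtering were handled.

```python
from typing import Dict, Set


def enriched(numerator: Dict[str, float], denominator: Dict[str, float],
             threshold: float = 2.0) -> Set[str]:
    """Proteins quantified in both conditions whose ratio exceeds threshold."""
    shared = numerator.keys() & denominator.keys()
    return {p for p in shared
            if denominator[p] > 0 and numerator[p] / denominator[p] > threshold}


# Hypothetical intensities: protein name -> quantified area.
myc_cells = {"A": 8.0, "B": 3.0, "C": 1.0, "D": 10.0}
ctrl_cells = {"A": 2.0, "B": 2.0, "C": 4.0, "D": 4.0}
myc_evs = {"A": 9.0, "B": 9.0, "C": 1.0, "D": 1.0}
ctrl_evs = {"A": 3.0, "B": 2.0, "C": 5.0, "D": 2.0}

up_in_myc_cells = enriched(myc_cells, ctrl_cells)  # Myc/Control in cells
up_in_myc_evs = enriched(myc_evs, ctrl_evs)        # Myc/Control in EVs
shared_myc = up_in_myc_cells & up_in_myc_evs       # Venn diagram overlap
```

  Swapping the arguments (`enriched(ctrl_cells, myc_cells)`) gives the targets enriched in Control, which corresponds to the left-hand lists in the legends.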
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.09.22.461393v1
www.biorxiv.org www.biorxiv.org

New submission 01/09/2022, 17:25:03

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  The work proposes a new computational rule for classifying synaptic plasticity outcome based on the geometry of synaptic enzyme dynamics. Specifically, the authors implement a multi-timescale model of hippocampal synaptic plasticity induction that takes into account the dynamics of the membrane potential, calcium concentration as well as CaMKII and calcineurin signalling pathways. They show that the proposed rule could be applied to reproduce the outcomes from nine published experimental studies involving different spike-timing and frequency-dependent plasticity induction protocols, animal ages, and experimental conditions. The model has been also used to generate predictions regarding the effect of spike-timing irregularity on plasticity outcomes. The proposed approach constitutes an interesting and original idea that contributes to the ongoing effort in discovering the rules of synaptic plasticity.
  
  The conclusions of this paper are mostly well supported by data, but some model assumptions and interpretation of modelling results need to be clarified and extended.
  
  1) The proposed model captures well the stochastic nature of the dendritic spine ion channels and receptors except for the calcium-sensitive potassium (SK) channel that has been modelled deterministically. Given that the same justification in terms of small number of channels present in the small dendritic spine compartment applies to the SK channels as well as to the voltage gated calcium channels and the AMPA and NMDA receptors, it is not clear why the authors have chosen a deterministic representation in the case of SK. The implications of this assumption need to be investigated and discussed.
  
  There are several stochastic models of AMPA and NMDA receptors based on single-channel recordings. Additionally, we had enough experimental data on single channel recordings to build a custom Markov chain model of VGCCs. For the SK channel, we could not find enough experimental data (age-dependent activity, temperature sensitivity, etc.) to custom-build a stochastic model. We thus decided to implement a deterministic model. Yet, we understand the reviewer’s comment that in theory, a stochastic model of SK channels could impact our results. We thus now provide a simulation with a stochastic model of SK, comparing it to the deterministic model implemented in the study.
  
  We describe a minimal version of a stochastic model of SK compatible with the deterministic version. The deterministic model of the SK channel, fit at ~35 °C, is described in the Methods section.
  
  Because of the factor ρ_f^SK in the equation, which multiplies r(Ca) by ~2, the equation cannot be related to a 2-state Markov chain (MC). This would probably be possible with a 3-state MC, but we used a different strategy. Noting that ρ_f^SK ∼ 2, we introduce a new equation
  
  As 0 < r(Ca) < 1, it is straightforward to introduce a 2-state MC for which the above equation describes the probability of the open state. We then simulate two such independent (for a given Ca concentration) channels and approximate m_SK as the sum (which belongs to [0, 2N_SK]) of the open states for the 2 channels.
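  
  The equation referred to in this passage is not reproduced in the text, so the following Python sketch rests on an assumption: that the open probability p relaxes as dp/dt = (r(Ca) − p)/τ. Under that assumed form, a two-state chain with opening rate r/τ and closing rate (1 − r)/τ has exactly this mean behaviour, since dp/dt = (r/τ)(1 − p) − ((1 − r)/τ)p = (r − p)/τ. The function name, the fixed-step update, and all parameter values (`n_sk`, `tau`, `dt`, and a constant `r`) are hypothetical choices for illustration and do not come from the authors' implementation.

```python
import numpy as np


def simulate_two_state(r: float, tau: float, n_gates: int, dt: float,
                       n_steps: int, seed: int = 0) -> np.ndarray:
    """Count of open gates over time for independent two-state chains.

    Assumed rates: opening r/tau, closing (1 - r)/tau, so that the mean
    open fraction p follows dp/dt = (r - p)/tau.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    p_open = r / tau * dt            # per-step opening probability
    p_close = (1.0 - r) / tau * dt   # per-step closing probability
    if max(p_open, p_close) > 1.0:
        raise ValueError("dt is too large for the chosen tau")
    rng = np.random.default_rng(seed)
    is_open = np.zeros(n_gates, dtype=bool)   # all gates start closed
    counts = np.empty(n_steps, dtype=int)
    for k in range(n_steps):
        u = rng.random(n_gates)
        opening = ~is_open & (u < p_open)     # closed gates that open
        closing = is_open & (u < p_close)     # open gates that close
        is_open = (is_open | opening) & ~closing
        counts[k] = is_open.sum()
    return counts


# Hypothetical values: two populations of n_sk gates, fixed calcium level.
n_sk = 100
r = 0.3
counts = simulate_two_state(r=r, tau=5.0, n_gates=2 * n_sk, dt=0.01,
                            n_steps=20000)
mean_fraction = counts[10000:].mean() / (2 * n_sk)
```

  Summing over `2 * n_sk` gates mirrors the description of m_SK as a sum bounded by [0, 2N_SK]. In a full model r would change with the calcium concentration at every step rather than being held constant.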
  
  As the reviewer can see in the figure below, we do not find a major difference in the simulations of 3 protocols. Thus, we argue that adding a stochastic version of the SK channels in our current study would not fundamentally alter our main conclusions.
  
  Figure Legend: a comparison using Tigaret et al. 2016 1Pre2Post10 and 1Pre2Post50 protocols, and 900 at 50 Hz protocol from Dudek and Bear 1992 (100 repetitions) between the model with the deterministic SK channel (original model - blue), and the modified model including the stochastic SK channel (stochastic SK - red). Deterministic vs stochastic SK channel does not significantly modify the model’s behaviour.
  
  To explain our rationale for using a deterministic version of the SK channel, we provide this sentence in the Methods when describing the SK channel model: "Due to a lack of single-channel recordings of SK channels, and a lack of published stochastic models of SK channels, we modelled SK channels deterministically. In tests we found that this assumption had only a negligible impact on the outcomes of plasticity protocols (data not shown)" (page 40).
  
  2) Many of the model parameters have been set to values previously estimated from synaptic physiology and biochemistry experiments. However, a significant number of important parameter values have been tuned to reproduce the plasticity experiments targeted in this study. As such, it needs to be explained which of the plasticity outcomes have been reproduced because the parameters are chosen to do so. A clarification would have helped to substantiate the authors' conclusions.
  
  Most parameters were set with values previously defined by experimental work. We referred to these publications where necessary throughout the Methods and Tables in our original manuscript. For the few free parameters that were adjusted, we now provide additional information wherever necessary for the Tables concerned.
  
  ● In the legend of Table 4 (neuron electrical properties), we explain which parameters are different from values obtained from the literature to fit experimental data (Golding et al. 2001; Buchanan et al. 2007).
  
  ● Parameters for the sodium and potassium conductance (Table 5) are labelled as generic since they are intentionally set to produce the BaP dynamics we have shown in the paper.
  
  ● Table 6 has no free parameters.
  
  ● Table 7 caption now includes a description saying ’Note that the buffer concentration, calcium diffusion coefficient, calcium diffusion time constant and calcium permeability were considered free parameters to adjust the calcium dynamics’.
  
  ● In Table 8 we had originally pointed out how we adapted the GluN2B rates from a published GluN2A model (Popescu et al. 2004; and Iacobucci and Popescu 2018). We now describe this adaptation in the Table 8 legend. In this Table, we now also better explain how we adjusted the NMDAr model to reflect the ratio between GluN2B and GluN2A, fitted from Sinclair et al. 2016; and the NMDAr conductance depending on calcium fitted from Maki and Popescu 2014.
  
  ● In Table 9 caption we now explain how the GABAr number and conductance were modified to fit GABAr currents as in Figures 15 b and e. The relevant parameters are indicated in the table.
  
  ● In Table 10 caption we now state the number of VGCCs per subtype that we used as a free parameter to reproduce the calcium dynamics (Figure 12).
  
  3) Adding experimental testing of model predictions, for example, that firing variability can alter the rules of plasticity, in the sense that it is possible to add noise to cause LTP for protocols that did not otherwise induce plasticity would be needed to increase confidence in the presented modelling results.
  
  We agree that it would be interesting in the future to test the many model predictions suggested in this work with biological experiments. This would however require a lot of work and will be the subject of further studies.
  
  Reviewer #3 (Public Review):
  
  This manuscript presents and analyzes a novel calcium-dependent model of synaptic plasticity combining both presynaptic and postsynaptic mechanisms, with the goal of reproducing a very broad set of available experimental studies of the induction of long-term potentiation (LTP) vs. long-term depression (LTD) in a single excitatory mammalian synapse in the hippocampus. The stated objective is to develop a model that is more comprehensive than the often-used simplified phenomenological models, but at the same time to avoid biochemical modeling of the complex molecular pathways involved in LTP and LTD, retaining only its most critical elements. The key part of this approach is the proposed "geometric readout" principle, which allows one to predict the induction of LTP vs. LTD by examining the concentration time course of the two enzymes known to be critical for this process, namely (1) the Ca2+/calmodulin-bound calcineurin phosphatase (CaN), and (2) the Ca2+/calmodulin-bound protein kinase (CaMKII). This "geometric readout" approach bypasses the modeling of downstream pathways, implicitly assuming that no further biochemical information is required to determine whether LTP or LTD (or no synaptic change) will arise from a given stimulation protocol. Therefore, it is assumed that the modeling of downstream biochemical targets of CaN and CaMKII can be avoided without sacrificing the predictive power of the model. Finally, the authors propose a simplified phenomenological Markov chain model to show that such "geometric readout" can be implemented mechanistically and dynamically, at least in principle.
  
  Importantly, the presented model has fully stochastic elements, including stochastic gating of all channels, stochastic neurotransmitter release and stochastic implementation of all biochemical reactions. This makes it possible to address the important question of the effect of intrinsic and external noise on the induction of LTP and LTD, which is studied in detail in this manuscript.
  
  Mathematically, this modeling approach resembles a continuous stochastic version of the "liquid computing" / "reservoir computing" approach: in this case the "hidden layer", or the reservoir, consists of the CaMKII and CaN concentration variables. In this approach, the parameters determining the dynamics of these intermediate ("hidden") variables are kept fixed (here, they are constrained by known biophysical studies), while the "readout" parameters are being trained to predict a target set of experimental observations.
  
  Strengths:
  
  1) This modeling effort is very ambitious in trying to match an extremely broad array of experimental studies of LTP/LTD induction, including the effect of several different pre- and post-synaptic spike sequence protocols, the effect of stimulation frequency, the sensitivity to extracellular Ca2+ and Mg2+ concentrations and temperature, the dependence of LTP/LTD induction on developmental state and age, and its noise dependence. The model is shown to match this large set of data quite well, in most cases.
  
  2) The choice of a stochastic implementation for all parts of the model makes it possible to fully explore the effects of intrinsic and extrinsic noise on the induction of LTP/LTD. This is very important and commendable, since regular noise-less spike firing induction protocols are not very realistic, and not very relevant physiologically.
  
  3) The modeling of the main players in the biochemical pathways involved in LTP/LTD, namely CaMKII and CaN, aims at sufficient biological realism, and as noted above, is fully stochastic, while other elements in the process are modeled phenomenologically to simplify the model and reveal more clearly the main mechanism underlying the LTP/LTD decision switch.
  
  4) There are several experimentally verifiable predictions that are proposed based on an in-depth analysis of the model behavior.
  
  We thank the reviewer for pointing out these strengths.
  
  Weaknesses:
  
  1) The stated explicit goal of this work is the construction of a model with an intermediate level of detail, as compared to simplified "one-dimensional" calcium-based phenomenological models on the one hand, and comprehensive biochemical pathway models on the other hand. However, the presented model comes across as extremely detailed nonetheless. Moreover, some of these details appear to be avoidable and not critical to this work. For instance, the treatment of presynaptic neurotransmitter release is both overly detailed and not sufficiently realistic: namely, the extracellular Ca2+ concentration directly affects vesicle release probability but has no effect on the presynaptic calcium concentration. I believe that the number of parameters and the complexity in the presynaptic model could be reduced without affecting the key features and findings of this work.
  
  This point is largely answered in Essential Revisions point 4 where we argue the choices we made for the presynaptic model. We acknowledge, however, that in this current version, we did not incorporate all biophysical components, such as the modulation of presynaptic calcium concentration with external calcium variations and multivesicular release. The calcium-dependence of presynaptic release, as modeled currently, is however fitted in Figure 8e against data from Hardingham et al. 2006 and Tigaret et al. 2016. These current limitations could be addressed in a next version of our presynaptic model where we also plan to incorporate age and temperature influence.
  
  2) The main hypotheses and assumptions underlying this work need to be stated more explicitly, to clarify the main conclusions and goals of this modeling work. For instance, following much prior work, the presented model assumes that a compartment-based (not spatially-resolved) model of calcium-triggered processes is sufficient to reproduce all known properties of LTP and LTD induction and that neither spatially-resolved elements nor calcium-independent processes are required to predict the observed synaptic change. This could be stated more explicitly. It could also be clarified that the principal assumption underlying the proposed "geometric readout" mechanism is that all information determining the induction of LTP vs. LTD is contained in the time-dependent spine-averaged Ca2+/calmodulin-bound CaN and CaMKII concentrations, and that no extra elements are required. Further, since both CaN and CaMKII concentrations are uniquely determined by the time course of postsynaptic Ca2+ concentration, the model implicitly assumes that the LTP/LTD induction depends solely on spine-averaged Ca2+ concentration time course, as in many prior simplified models. This should be stated explicitly to clarify the nature of the presented model.
  
  We thank the reviewer for the suggestions on how to clarify the main hypotheses and assumptions of our work. We slightly modified the sentences provided by the reviewer and added them to the main text (page 2, line 82 and page 19, line 593).
  
  3) In the Discussion, the authors appear to be very careful in framing their work as a conceptual new approach in modeling STD/STP, rather than a final definitive model: for instance, they explicitly discuss the possibility of extending the "geometric readout" approach to more than two time-dependent variables, and comment on the potential non-uniqueness of key model parameters. However, this makes it hard to judge whether the presented concrete predictions on LTP/LTD induction are simply intended as illustrations of the presented approach, or whether the authors strongly expect these predictions to hold. The level of confidence in the concrete model predictions should be clarified in the Discussion. If this confidence level is low, that would call into question the very goal of such a modeling approach.
  
  These are very good questions. Let us first comment on parameter uniqueness. We believe, as in E. Marder’s work on ion channel expression in neurons, that the synapse can adapt its internal parameters (protein numbers, transition rates, etc.) to produce a given functional behaviour. As a by-product, the parameters associated with a given behaviour are not unique. Additionally, since our model reproduces 9 published experimental outcomes with a single set of parameters, it represents a functioning synapse whose adjusted parameters yield the expected behaviours. By extrapolation, our confidence in the further predictions is therefore high. We modified sentences in the Discussion section to argue this point (page 21, line 707).
  
  Let us now comment on increasing the complexity. We strove to design a plasticity readout that is as simple as possible while still providing a functioning synapse. Given our success in reproducing 9 published experimental outcomes with a single set of parameters, adding more complexity would be akin to overfitting.
  
  4) The authors presented a simplified mechanistic dynamical Markov chain process to prove that the "geometric readout" step is implementable as a dynamical process, at least in principle. However, a more realistic biochemical implementation of the proposed "region indicator" variables may be complex and not guaranteed to be robust to noise. While the authors acknowledge and touch upon some of these issues in their discussion, it is important that the authors will prove in future work that the "geometric readout" is implementable as a biochemical reaction network. Barring such implementation, one must be extra careful when claiming advantages of this approach as compared to modeling work that attempts to reconstruct the entire biochemical pathways of LTP/LTD induction.
  
  We acknowledge this issue and agree this would be an interesting subject for future work.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.30.437703v3
www.biorxiv.org www.biorxiv.org

New submission 10/03/2023, 13:24:10

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  1) Comment: To determine the effect of diseased monocytes on retinal health, light-injured mouse retinas were injected with monocytes isolated from AMD patients (Figure 1 - figure supplement 1). This resulted in a reduction in photoreceptor number and ERG b-wave amplitude. However, the light-injured control eye was injected with PBS only, so no cells were present. The reasoning for using this control was not provided. The appropriate injection control would include monocytes isolated from non-AMD patients. This control should be performed side-by-side with cells from AMD patients.
  
  We thank the reviewer for this important comment. The purpose of the current study was to identify the macrophage subtype that may be associated with cell death in aAMD. We have previously reported that macrophages from AMD patients demonstrate a different phenotype compared with those from healthy individuals in the rodent model of laser-induced CNV (Hagbi-Levi S et al, 2016). Per the reviewer's comment, we performed additional experiments to assess the effect of monocytes from healthy controls in the photic retinal injury model. Results showed that monocytes from AMD patients and healthy individuals have different effects on the retina in this rodent model of aAMD. Interestingly, we found that monocytes from healthy individuals were more neurotoxic to photoreceptors than monocytes from AMD patients. These results are included in the revised ms. as Figure 1- figure supplement 1H. A possible explanation for these findings is discussed in lines 179-190 of the revised manuscript. This finding reinforces the idea that the use of monocytes from AMD patients in the experiments is required to obtain a comprehensive understanding of their involvement in the progression of the disease.
  
  2) Comment: The authors hypothesize, from the experiments presented in Figure 1 - figure supplement 1, that the injected monocytes generated macrophages in the retina, which were responsible for the observed neurotoxicity (Lines 143-145). However, no direct evidence was presented. This idea should be tested in vivo. This could be done by injecting tracer-labeled human AMD-derived monocytes into light-injured mouse retinas. If the authors' hypothesis is true, collected retinas should contain tracer-labeled cells that express macrophage markers. Tracer-labeled M2a macrophage cells should be present since subsequent experiments identify this subclass as being associated with retinal cell death.
  
  Thank you for this important comment. To address the reviewer's comment, retinal sections from mice exposed to photic retinal injury and injected with Dio-tracer-labelled monocytes were stained for two M2a macrophage markers, CD206 (mannose receptor) and VEGF (Kadomoto S et al, 2022; Jayasingam SD et al, 2019). Interestingly, we found co-localization of Dio-tracer staining (representing the injected human macrophages) with the CD206 and VEGF markers in monocytes located in different retinal layers, but not in monocytes remaining in the vitreous cavity. These data indicate that M2a markers are expressed during the polarization of monocytes into the M2a phenotype, which is maintained only upon entry into the retinal tissue. These results were included in Figure 1- figure supplement 1K-S and are discussed in the revised manuscript in lines 179-182.
  
  3) Comment: Photoreceptor number and b-wave amplitudes were measured in light-injured retinas injected with one of four macrophage cell types generated from human AMD-derived monocytes. The authors conclude that only injection of M2a cells reduced photoreceptor number and b-wave amplitudes (Figure 1C, E). This may be true, but it is difficult for the reader to make a conclusion (especially in Fig. 1E) due to the large error bars and five different traces overlapping each other. To make these results easier to interpret, graph control cells with only one experimental sample (cell type) at a time.
  
  Thank you for this comment. Per the reviewer's comment, the graphs were modified in the revised ms. (Figure 1, panels H-K).
  
  4) Comment: Most injected macrophages were located in the vitreous. In the case of M2a cells, the authors note that "several of the cells migrated across the retinal layers reaching the subretinal space" (Lines 167,168). One possible explanation for why M0, M1, and M2c macrophages did not induce retinal degeneration is that they did not migrate to the subretinal space and around the optic nerve head. Supplementary figures should be added to demonstrate that this is not the case.
  
  Thank you for this comment. To address the reviewer's comment, we compared the migration patterns of the different macrophage phenotypes following intravitreal injection in mice exposed to photic injury. Our results indicated that M0, M1 and M2c macrophages, like M2a macrophages, migrated to the subretinal space and around the optic nerve. Thus, the neurotoxic effect of M2a macrophages is not explained by their capacity to infiltrate the retinal tissues. These results were included in Figure 1- figure supplement 2 E-H of the revised manuscript. They are supported by our ex-vivo experiments, showing that co-culture of M2a macrophages with retinal explants was associated with increased photoreceptor cell death compared to M1 macrophages. The results are presented and discussed in the revised manuscript in lines 200-203.
  
  5) Comment: Figure 1 - figure supplement 2: Panel A, B cells were stained with CD206 to demonstrate the presence of M2a macrophages (panel B). The authors conclude that panel A contains M1 and panel B contains M2a cells. The lack of CD206 expression illustrates that panel A cells are not M2a macrophages but do not demonstrate they are M1 macrophages. A control using an M1 cell marker is necessary to show that panel A cells are M1 and M1 cells are not detected in M2a cultures.
  
  Thank you for this comment. We have validated the phenotype of each macrophage subtype by qPCR (Figure 1 panel A). To further address the reviewer's comment, we performed additional immunocytochemistry for M1 macrophages using an anti-CD80 antibody, which is used as an M1 macrophage marker (Bertani FR et al, 2017). The staining confirmed the identity of the M1 macrophages. These new results were included in Figure 1- figure supplement 2A, and are discussed in lines 168-170.
  
  6) Comment: Ex vivo, apoptotic photoreceptor and RPE cells are observed when cultured with M2a macrophages (Figure 2). Do injected M2a cells also induce apoptosis of RPE cells in vivo? This is important to establish that retinal explants are a good model for in vivo experiments.
  
  Thank you for this comment. To address the reviewer's comment, we assessed RPE apoptosis (using TUNEL, Caspase 3 staining and the RPE65 marker) after M2a cell delivery in the in-vivo photic injury model. We could not detect an apoptotic signal in the RPE layers 7 days after photic injury and therefore could not evaluate the effect of M2a macrophages on RPE cells in-vivo (see Author response image 1). One possible explanation is that RPE cells that have undergone apoptosis are rapidly removed from the damaged tissue and, unlike photoreceptors, are no longer detectable. Furthermore, a study that investigated the impact of bright light on RPE cells in-vivo showed that although RPE cells underwent structural and chemical modifications after photic injury, no TUNEL signal was detected because RPE cells die by necrosis rather than apoptosis (Jaadane I et al, 2017). Other studies confirmed that blue light induces RPE necrosis (Song W et al, 2022; Mohamed A et al, 2022). Taken together, it seems that the ex-vivo retinal explant and in-vivo photic injury models both simulate the mechanism of retinal cell death. However, the ex-vivo model makes it possible to establish the direct impact of M2a macrophages on the retina in a non-inflammatory context.
  
  Author response image 1.
  
  7) Comment: Reactive oxygen species (ROS) production was measured to determine if M2a cell-mediated neurotoxicity was due to oxidative stress. It is concluded that a ROS increase is partly responsible (Line 218). The data do not support this conclusion. ROS was detected in cultured M2a macrophages. More importantly, however, there was no increase in oxidative damage in vivo. The in vivo and cell culture results contradict each other so no conclusion can be made. The lack of in vivo confirmation weakens the argument that ROS drives M2a neurotoxicity. Text suggesting a role for ROS in neurotoxicity should be appropriately edited (Lines including 218, 244, 401,406,481).
  
  Thank you for this comment. The manuscript was revised according to the reviewer's suggestion (Lines 250-256).
  
  8) Comment: The authors ask if the photoreceptor cell death is cytokine-mediated. Multiple cytokines were enriched in M2a-conditioned media. Of particular interest were CCR1 ligands MPIF1 and MCP4. The implication is that these two ligands mediate the M2a macrophages to photoreceptor cell death through CCR1. However, there is no attempt to show that either MPIF1 or MCP4 are present in vivo, or are sufficient to induce the retinal response observed. This could be demonstrated by injection of MPIF1 or MCP4. Evidence that either ligand phenocopies M2a macrophage injection would be direct evidence that CCR1 ligands activate the retinal response. Furthermore, co-injection with BX174 should block the effect of these ligands if they work through CCR1.
  
  Thank you for this comment. The identification of CCR1 ligand expression by M2a-polarized macrophages directed our decision to study CCR1 in the context of atrophic AMD. We do not claim that these specific CCR1 ligands are sufficient to activate CCR1 and cause retinal injury. The mechanism is likely more complex. Yet, to address the reviewer's comment, we performed the suggested experiments. Mice were exposed to photic injury and immediately injected in one eye with MPIF1, MCP-4, or a combination of both, and in the second eye with PBS as vehicle. Intravitreal cytokine delivery was repeated two days later (following the half-life of these cytokines) and ERGs were recorded two days after the last injection. Injection of cytokines at a concentration of 300 ng per eye did not exacerbate photoreceptor death. The same experiment was then repeated with two higher concentrations of cytokine, 1.2 ug/eye and 2 ug/eye, but no changes were observed between the cytokine-treated eyes and the vehicle-treated eyes. Based on previous studies reporting the physiological concentrations of different cytokines in the eyes of healthy and unhealthy individuals, and on experiments in which different cytokines were injected into rodent eyes (Estevao C et al, 2021; Zeng Y et al, 2019; Roybal CN et al, 2018; Mugisho OO et al, 2018), the cytokine concentrations used in our experiment are in the range in which an effect on the retina is expected.
  
  It is likely that a synergistic effect of M2a-secreted proteins in a particular microenvironment is necessary to increase the level of retinal damage (Bartee E et al, 2013). It is also likely that in the photic retinal injury model there is upregulation of cytokines that may mask the effect of additional exogenous cytokines. A comprehensive understanding of the complex interactions of these cytokines during retinal degeneration is beyond the scope of the current manuscript, which does not focus on identifying ligand-induced CCR1 activation and its consequences. Additionally, we suggest that, due to cytokine redundancy (Nicola NA, 1994), demonstrating that MPIF-1 or MCP-4 can increase photoreceptor death is not required to prove CCR1 receptor involvement.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.01.03.522541v1
www.biorxiv.org www.biorxiv.org

New submission 02/08/2022, 15:49:47

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  Charpentier et al. use facial recognition technology to show that mothers in a group of mandrills lead their offspring to associate with phenotypically similar offspring. Mandrills are a species of primate that live in large, matrilineal troops, with a single, dominant male that fathers the majority of the offspring. Male breeder turnover and extra-pair mating by females can lead to variation in relatedness between group members and the potential for kin-selected benefits from preferentially cooperating with closer relatives within the group. The authors argue that the strategy of influencing the social network of their offspring could be favoured by "second-order kin selection", a mechanism by which inclusive fitness benefits are accrued to female actors through kin-selected benefits to their offspring. This interpretation is supported by a theoretical model.
  
  The paper highlights a previously unappreciated mechanism for favouring association between non-kin in social groups and also contributes a nice insight into the complexity of social interactions in a relatively understudied wild primate species. The conclusions are strengthened by data showing associations between mothers were not influenced by the facial similarity of their offspring -- this suggests that mothers are making decisions based on the appearance of offspring and not their mothers.
  
  Some remaining questions regarding the strength of the authors' interpretation exist: Given the challenges of studying mandrills in the field, the fact that the study reports data from a single group is understandable but potential issues remain with the independence of data points. There may be an additional issue arising from the fact that this troop is semi-captive.
  
  The study group is not semi-captive. Instead, it originated from two release events of a few captive individuals into the wild (in 2002 and 2006). The population is now composed of more than 250 individuals and all of them, except for 7 founder females (<3%), were born in the wild. In addition, the study group is not fed and occasionally wanders into a fenced protected area. The fences of the park do not represent a boundary for mandrills and most of the time (ca. 80% of days), the study group ranges outside the park. We have clarified this misunderstanding.
  
  Regarding the independence of data points, we would be grateful if this reviewer could clarify her/his thoughts. As a tentative response, we indeed have access to a single (although large) study group, but that is unfortunately often the case when studying primates or other large mammals. Regarding our study questions, we have clearly demonstrated increased nepotism among paternally related mandrills in two different social groups (Charpentier et al. 2007: semi-captive mandrills; Charpentier et al. 2020: wild mandrills). More generally, we do not see any parsimonious explanation for why the studied mandrills would behave differently from other wild mandrill groups, or would have experienced selective pressures that shaped their genetic structure and social organization differently.
  
  The number of genotyped offspring is relatively small (n = 15) and paternity is inferred from the identity of the dominant male. However, the authors also refer to the fact that it's normal for female mandrills to mate with several males during ovulation.
  
  Indeed, both sexes mate promiscuously during the mating season. We have very recently (June 2022) obtained new genetic profiles for a subset of the study infants (it took two years to obtain these data). We have now increased our sample size of infants with a known father, from 15 to 32. With these new data, we were able to distinguish between four categories of infant-infant dyads: those sharing the same father (PHS), those not sharing the same father (not PHS), those conceived during the same alpha male tenure, and those that were not (both infants with unknown dads). The graph below shows the average facial distance among individuals for each of these four categories. It shows that infants conceived during the same alpha male tenure are significantly more similar to each other than infants sired by different fathers or during the tenure of different alpha males, but they are also significantly less similar to each other than infants born to the same father (the four categories are all significantly different from each other, except when comparing infants born to different fathers with those conceived during different alpha male tenures). As suggested by this reviewer, the fact that females mate predominantly with the alpha male, but to some extent also with other males, likely explains the difference between “same father” and “same alpha male tenure”. Importantly, however, considering all infants conceived during the same alpha male tenure as “PHS” is highly conservative. It is thus likely that knowing the paternity of every infant would produce even clearer effects (and indeed, increasing the data set from 15 to 32 strengthened this result). We have now updated this result (first model) based on this new sample.
  
  What evidence is there to support a beneficial effect of nepotism in this species?
  
  In mandrills, females who affiliate more (groom more/associate more) with their groupmates (kin or non-kin) during juvenility also reproduce 1 year earlier than females that are poorly socially integrated (Charpentier et al. 2012). These results are similar to what is known in many mammalian species (for a review, see Snyder-Mackler et al. 2020). The positive effects of a rich social life are, however, generally triggered by all group members, not only close kin. Nevertheless, if beneficial social relationships impact the direct fitness of individuals, as reported in mandrills and other species, then kin selection theory predicts that these effects should further translate into indirect fitness benefits.
  
  We have now added this relevant reference (Charpentier et al. 2012) in the revised version of our manuscript and present the results of this early study on mandrills.
  
  What form could nepotism take and does it necessarily have to involve full sibs?
  
  We are unsure why this reviewer is mentioning full-sibs here. For this reviewer's information, of the 2556 study dyads (model 1 on the impact of maternal and paternal origins on facial distance), only one dyad was a full-sib pair. Full-sibs are therefore very rare in the study population due to male migration patterns and generally short alpha male tenures.
  
  If a female did not associate with offspring as shown here, would nepotistic interactions simply arise between her offspring and offspring that were less facially similar?
  
  We guess that facial similarity would not be a predictor of spatial association anymore. Indeed, we think that young mandrills do not use self-referent phenotype matching, precluding the self-evaluation of those infants that look like them. However, as stated below, we cannot fully exclude the possibility that other social partners, such as fathers, may also influence infant-infant relationships, although we think that this alternative mechanism is less parsimonious than the one we propose and test.
  
  Reviewer #2:
  
  This paper uses data on patterns of spatial association and facial similarity in mandrills to develop a new hypothesis for the evolution of kin recognition based on facial cues. Previous work on this system has shown that, among females, paternal half-sibs resemble each other visually more than maternal half-sisters do. The authors hypothesise that this paternally inherited facial similarity provides opportunities for kin selection, but it is unclear how offspring themselves could recognise kin using phenotype matching since they are unable to see their own face. One answer to this puzzle is that third parties -- mothers -- may promote social interactions between their own offspring and other offspring that resemble them since these other offspring are likely to share the same father. In support of this hypothesis, the authors find that mothers and offspring show spatial proximity to infants that are facially more similar than average. They also use an analytical evolutionary model to confirm the logic of this hypothesis. The model shows that mothers can gain inclusive fitness benefits by encouraging reciprocal social interaction among their offspring and other paternally-related offspring. They term this idea 'second-order' kin selection and identify a range of other circumstances in which it might play an important role in shaping the evolution of social behaviour.
  
  The main strengths of the paper are the interesting mandrill data and the cutting-edge methods used to analyse facial similarity, which have stimulated the development of a theoretically interesting hypothesis about the evolution of facially based kin recognition. The theoretical model enhances the generality and rigour of the work. The paper will be of wide interest and the concept of second-order kin selection may be applicable to other social circumstances, such as interactions among in-laws in close-knit family groups. Thus, I can see that this paper will be a stimulus for future work.
  
  We are grateful for these positive comments.
  
  The data are, I think, rather overinterpreted in terms of the degree to which they support the hypothesis. The spatial proximity data are interesting, but on their own, they are not definitive support for the hypothesis or model. A more critical approach to the hypothesis, clearly setting out the limitations of the data, and what tests in future could be used to falsify the hypothesis or model, would make for a stronger paper.
  
  We agree with this general comment and have addressed it by 1. adding a model on grooming relationships between females and infants, 2. toning down our interpretation throughout the manuscript and 3. proposing future directions of research.
  
  Overall the authors have presented data that support a fascinating new mechanism by which natural selection can influence social interactions among the members of family groups, in potentially surprising ways. I also find it remarkable that 60 years after the development of the kin selection theory new implications of this theory are still being uncovered. The concept of second-order kin selection may prove important in understanding the evolution of social organisation and behaviour in species that live in groups containing a mixture of kin and non-kin, such as many primates and of course humans.
  
  We are grateful to this reviewer for this very positive comment. We fully agree with the fact that 60 years after the kin selection theory has emerged, we are still discovering further implications!
  
  Reviewer #3:
  
  This is a very interesting and impressive manuscript. It is complex in its multiple components, and in some ways that makes it a difficult manuscript to evaluate. There is a lot in it, including empirical analyses of a face dataset and of behavioral association data, combined with a theoretical model.
  
  We are very grateful for this positive comment and are glad that you liked our manuscript.
  
  The three main findings are: 1) Paternal siblings look alike (similar to, and building on, a recent manuscript the authors published elsewhere); 2) Infants that are more facially similar tend to associate; and 3) mothers tend to be found in association with other unrelated infants that look more like their own infants. Such results are interesting, and indeed one potential interpretation, perhaps even the most likely, is that mothers are behaving in such a way that promotes association between their own infants and the paternal kin of their infants.
  
  Nonetheless, the evidence provided is logically only consistent with the authors' hypothesis, rather than being strong direct evidence for it. As such, the current framing and indeed the title, "Primate mothers promote proximity between their offspring and infants who look like them", are both problematic. (In addition, the title should be about mandrills, not "primates", since this manuscript does not provide evidence from any other species.) The evidence provided is consistent with the hypothesis, but also consistent with other potential hypotheses. The evidence given to dismiss other potential hypotheses is not strong, and rests on the fact that many males are not around all year to influence things, and that "males that were present during a given reproductive cycle are not responsible for maintaining proximity with either infants or their mothers (MJEC and BRT, pers. obs.)".
  
  We agree with this comment. After examining several alternative mechanisms in the light of the natural history of mandrills, we are confident that the proposed mechanism is at work in that species, although we cannot firmly exclude some of these alternative mechanisms. To address this comment, we have changed the title of our manuscript, which now reads “Mandrill mothers associate with infants who look like their own offspring using phenotype matching”. We have also included an additional model on grooming relationships (see response to R1) and have toned down the interpretation of our results throughout our revised manuscript. Finally, we have further discussed alternative scenarios, in particular the one involving fathers (see details above).
  
  My opinion is that these are really interesting analyses and data, which are being somewhat undermined by the insistence that only one hypothesis can explain the observed association patterns. It could easily be presented differently, as a demonstration that paternal siblings look alike and that they associate. The authors could then go on to explore different possible explanations for this using their association data, make the case that maternal behavior is the most plausible (but not the only) explanation, and present their model of how such behavior could bring fitness benefits.
  
  In my view, such a presentation would be both more cautious and more appropriate, without in any way reducing the impact or importance of the data. In the current iteration, I think there are issues because the data do not provide sufficient support for the surety of the title and conclusion, as presented.
  
  We think that the current organization of our manuscript is not that different from the one proposed here and follows a reasoning already proposed in a former manuscript (Charpentier et al. 2020). Indeed, we start by reminding the reader what we already know from previous studies: paternal siblings look alike and they associate. We then go on to explore different mechanisms. That being said, and as suggested, we have been more cautious in interpreting our results, which are indeed only correlative.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.09.491128v1
www.biorxiv.org www.biorxiv.org

New submission 23/12/2023, 17:13:42

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In this work George et al. describe RatInABox, a software system for generating surrogate locomotion trajectories and neural data to simulate the effects of a rodent moving about an arena. This work is aimed at researchers that study rodent navigation and its neural machinery.
  
  Strengths:
  
  The software contains several helpful features. It has the ability to import existing movement traces and interpolate data with lower sampling rates. It allows varying the degree to which rodents stay near the walls of the arena. It appears to be able to simulate place cells, grid cells, and some other features.
  
  The architecture seems fine and the code is in a language that will be accessible to many labs.
  
  There is convincing validation of velocity statistics. There are examples shown of position data, which seem to generally match between data and simulation.
  
  Weaknesses:
  
  There is little analysis of position statistics. I am not sure this is needed, but the software might end up more powerful and the paper higher impact if some position analysis was done. Based on the traces shown, it seems possible that some additional parameters might be needed to simulate position/occupancy traces whose statistics match the data.
  
  Thank you for this suggestion. We have added a new panel to figure 2 showing a histogram of the time the agent spends at positions of increasing distance from the nearest wall. As you can see, RatInABox is a good fit to the real locomotion data: positions very near the wall are under-explored (in the real data this is probably because whiskers and physical body size block positions very close to the wall) and positions just away from, but close to, the wall are slightly over-explored (an effect known as thigmotaxis, already discussed in the manuscript).
  
  As you correctly suspected, fitting this warranted a new parameter controlling the strength of the wall repulsion, which we call “wall_repel_strength”. The motion model hasn’t changed mathematically; all we did was take a parameter which was originally a fixed constant (1), unavailable to the user, and make it a variable which can be changed (see methods section 6.1.3 for maths). The curves fit best when wall_repel_strength ≈ 2. The methods and parameters table have been updated accordingly. See Fig. 2e.
  
  The overall impact of this work is somewhat limited. It is not completely clear how many labs might use this, or have a need for it. The introduction could have provided more specificity about examples of past work that would have been better done with this tool.
  
  At the point of publication we, like yourself, didn’t know to what extent there would be a market for this toolkit; however, we were pleased to find that there was. In its initial 11 months RatInABox has accumulated a growing, global user base, over 120 stars on GitHub and north of 17,000 downloads through PyPI. We have accumulated a list of testimonials[5] from users of the package vouching for its utility and ease of use, four of which are abridged below. These testimonials come from a diverse group of 9 researchers spanning 6 countries across 4 continents and varying career stages, from pre-doctoral researchers with little computational exposure to tenured PIs. Finally, not only does the community use RatInABox, they are also building it: at the time of writing RatInABox has logged 20 GitHub “Issues” and 28 “pull requests” from external users (i.e. those who aren’t authors on this manuscript), ranging from small discussions and bug-fixes to significant new features, demos and wrappers.
  
  Abridged testimonials:
  
  ● “As a medical graduate from Pakistan with little computational background…I found RatInABox to be a great learning and teaching tool, particularly for those who are underprivileged and new to computational neuroscience.” - Muhammad Kaleem, King Edward Medical University, Pakistan
  
  ● “RatInABox has been critical to the progress of my postdoctoral work. I believe it has the strong potential to become a cornerstone tool for realistic behavioural and neuronal modelling” - Dr. Colleen Gillon, Imperial College London, UK
  
  ● “As a student studying mathematics at the University of Ghana, I would recommend RatInABox to anyone looking to learn or teach concepts in computational neuroscience.” - Kojo Nketia, University of Ghana, Ghana
  
  ● “RatInABox has established a new foundation and common space for advances in cognitive mapping research.” - Dr. Quinn Lee, McGill, Canada
  
  The introduction continues to include the following sentence highlighting examples of past work which relied on generating artificial movement and/or neural data and which, by implication, could have been done better (or at least accelerated and standardised) using our toolbox.
  
  “Indeed, many past[13, 14, 15] and recent[16, 17, 18, 19, 6, 20, 21] models have relied on artiﬁcially generated movement trajectories and neural data.”
  
  Presentation: Some discussion of case studies in Introduction might address the above point on impact. It would be useful to have more discussion of how general the software is, and why the current feature set was chosen. For example, how well does RatInABox deal with environments of arbitrary shape? T-mazes? It might help illustrate the tool's generality to move some of the examples in supplementary ﬁgure to main text - or just summarize them in a main text ﬁgure/panel.
  
  Thank you for this question. Since the initial submission of this manuscript RatInABox has been upgraded and environments have become substantially more “general”. Environments can now be of arbitrary shape (including T-mazes), boundaries can be curved, they can contain holes and can also contain objects (0-dimensional points which act as visual cues). A few examples are showcased in the updated ﬁgure 1 panel e.
  
  To further illustrate the tool’s generality beyond the structure of the environment we continue to summarise the reinforcement learning example (Fig. 3e) and neural decoding example in section 3.1. In addition to this we have added three new panels into figure 3 highlighting new features which, we hope you will agree, make RatInABox significantly more powerful and general and satisfy your suggestion of clarifying utility and generality in the manuscript directly.
  
  On the topic of generality, we wrote the manuscript in such a way as to demonstrate the rich variety of ways RatInABox can be used without providing an exhaustive list of potential applications. For example, RatInABox can be used to study neural decoding and it can be used to study reinforcement learning, not because it was purpose-built with these use-cases in mind but because it contains a set of core tools designed to support spatial navigation and neural representations in general. For this reason we would rather keep the demonstrative examples as supplements and implement your suggestion of further raising attention to the large array of tutorials and demos provided on the GitHub repository by modifying the final paragraph of section 3.1 to read:
  
  “Additional tutorials, not described here but available online, demonstrate how RatInABox can be used to model splitter cells, conjunctive grid cells, biologically plausible path integration, successor features, deep actor-critic RL, whisker cells and more. Despite including these examples we stress that they are not exhaustive. RatInABox provides the framework and primitive classes/functions from which highly advanced simulations such as these can be built.”
  
  Reviewer #3 (Public Review):
  
  George et al. present a convincing new Python toolbox that allows researchers to generate synthetic behavior and neural data specifically focusing on hippocampal functional cell types (place cells, grid cells, boundary vector cells, head direction cells). This is highly useful for theory-driven research where synthetic benchmarks should be used. Beyond just navigation, it can be highly useful for novel tool development that requires jointly modeling behavior and neural data. The code is well organized and written and it was easy for us to test.
  
  We have a few constructive points that they might want to consider.
  
  Right now the code only supports X,Y movements, but Z is also critical and opens new questions in 3D coding of space (such as grid cells in bats, etc). Many animals effectively navigate in 2D, as a whole, but they certainly make a large number of 3D head movements, and modeling this will become increasingly important and the authors should consider how to support this.
  
  Agents now have a dedicated head direction variable (before head direction was just assumed to be the normalised velocity vector). By default this just smoothes and normalises the velocity but, in theory, could be accessed and used to model more complex head direction dynamics. This is described in the updated methods section.
  
  In general, we try to tread a careful line. For example we embrace certain aspects of physical and biological realism (e.g. modelling environments as continuous, or fitting motion to real behaviour) and avoid others (such as the biophysics/biochemistry of individual neurons, or the mechanical complexities of joint/muscle modelling). It is hard to decide where to draw the line but we have a few guiding principles:
  
  RatInABox is most well suited for normative modelling and neuroAI-style probing questions at the level of behaviour and representations. We consciously avoid unnecessary complexities that do not directly contribute to these domains.
  
  Compute: To best accelerate research we think the package should remain fast and lightweight. Certain features are ignored if computational cost outweighs their beneﬁt.
  
  Users: If, and as, users require complexities e.g. 3D head movements, we will consider adding them to the code base.
  
  For now we believe proper 3D motion is out of scope for RatInABox. Calculating motion near walls is already surprisingly complex and to do this in 3D would be challenging. Furthermore all cell classes would need to be rewritten too. This would be a large undertaking probably requiring rewriting the package from scratch, or making a new package RatInABox3D (BatInABox?) altogether, something which we don’t intend to undertake right now. One option: if users really needed 3D trajectory data they could quite straightforwardly simulate a 2D Environment (X,Y) and a 1D Environment (Z) independently. With this method (X,Y) and (Z) motion would be entirely independent, which is unrealistic but, depending on the use case, may well be sufficient.
  
  Alternatively, as you said that many agents effectively navigate in 2D but show complex 3D head and other body movements, RatInABox could interface with and feed data downstream to other software (for example Mujoco[11]) which specialises in joint/muscle modelling. This would be a very legitimate use-case for RatInABox.
  
  We’ve ﬂagged all of these assumptions and limitations in a new body of text added to the discussion:
  
  “Our package is not the ﬁrst to model neural data[37, 38, 39] or spatial behaviour[40, 41], yet it distinguishes itself by integrating these two aspects within a uniﬁed, lightweight framework. The modelling approach employed by RatInABox involves certain assumptions:
  
  It does not engage in the detailed exploration of biophysical[37, 39] or biochemical[38] aspects of neural modelling, nor does it delve into the mechanical intricacies of joint and muscle modelling[40, 41]. While these elements are crucial in speciﬁc scenarios, they demand substantial computational resources and become less pertinent in studies focused on higher-level questions about behaviour and neural representations.
  
  A focus of our package is modelling experimental paradigms commonly used to study spatially modulated neural activity and behaviour in rodents. Consequently, environments are currently restricted to being two-dimensional and planar, precluding the exploration of three-dimensional settings. However, in principle, these limitations can be relaxed in the future.
  
  RatInABox avoids the oversimpliﬁcations commonly found in discrete modelling, predominant in reinforcement learning[22, 23], which we believe impede its relevance to neuroscience.
  
  Currently, inputs from diﬀerent sensory modalities, such as vision or olfaction, are not explicitly considered. Instead, sensory input is represented implicitly through eﬃcient allocentric or egocentric representations. If necessary, one could use the RatInABox API in conjunction with a third-party computer graphics engine to circumvent this limitation.
  
  Finally, focus has been given to generating synthetic data from steady-state systems. Hence, by default, agents and neurons do not explicitly include learning, plasticity or adaptation. Nevertheless we have shown that a minimal set of features such as parameterised function-approximator neurons and policy control enable a variety of experience-driven changes in behaviour and cell responses[42, 43] to be modelled within the framework.”
  
  What about other environments that are not "Boxes" as in the name - can the environment only be a Box, what about a circular environment? Or Bat flight? This also has implications for the velocity of the agent, etc. What are the parameters for the motion model to simulate a bat, which likely has a higher velocity than a rat?
  
  Thank you for this question. Since the initial submission of this manuscript RatInABox has been upgraded and environments have become substantially more “general”. Environments can now be of arbitrary shape (including circular), boundaries can be curved, they can contain holes and can also contain objects (0-dimensional points which act as visual cues). A few examples are showcased in the updated ﬁgure 1 panel e.
  
  Whilst we don’t know the exact parameters for bat ﬂight users could fairly straightforwardly ﬁgure these out themselves and set them using the motion parameters as shown in the table below. We would guess that bats have a higher average speed (speed_mean) and a longer decoherence time due to increased inertia (speed_coherence_time), so the following code might roughly simulate a bat ﬂying around in a 10 x 10 m environment. Author response image 1 shows all Agent parameters which can be set to vary the random motion model.
  
  Author response image 1.
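  
  As a rough sketch of what such a configuration could look like in text form: the parameter names speed_mean and speed_coherence_time come from the response above, while the "scale" key, the import paths and every numeric value below are assumptions made for illustration (they are not the values from the authors' image and are not fitted to bat data).

```python
# Placeholder values for illustration only; NOT fitted to bat data and
# NOT the values from the authors' code image.
ENV_PARAMS = {"scale": 10}        # assumed: side length of the arena in metres
BAT_PARAMS = {
    "speed_mean": 5.0,            # assumed mean speed in m/s (a guess)
    "speed_coherence_time": 3.0,  # assumed decoherence time in s (a guess)
}


def build_bat_agent(n_steps=100):
    """Create an Agent with the parameters above and take a few random-motion steps.

    Requires the `ratinabox` package; it is imported lazily so that the
    parameter dictionaries can be inspected without it installed.
    """
    from ratinabox.Environment import Environment
    from ratinabox.Agent import Agent

    env = Environment(params=ENV_PARAMS)
    agent = Agent(env, params=BAT_PARAMS)
    for _ in range(n_steps):
        agent.update()
    return agent
```

  Keeping the parameters in plain dictionaries makes it easy to sweep over candidate values when trying to match real flight data.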
  
  Semi-related, the name suggests limitations: why Rat? Why not Agent? (But it's a personal choice)
  
  We came up with the name “RatInABox” when we developed this software to study hippocampal representations of an artiﬁcial rat moving around a closed 2D world (a box). We also ﬁtted the random motion model to open-ﬁeld exploration data from rats. You’re right that it is not limited to rodents but for better or for worse it’s probably too late for a rebrand!
  
  A future extension (or now) could be the ability to interface with common trajectory estimation tools; for example, taking in the (X, Y, (Z), time) outputs of animal pose estimation tools (like DeepLabCut or such) would also allow experimentalists to generate neural synthetic data from other sources of real-behavior.
  
  This is actually already possible via our “Agent.import_trajectory()” method. Users can pass an array of time stamps and an array of positions into the Agent class, which will be loaded and smoothly interpolated, as shown in Fig. 3a and demonstrated in two new papers[9,10] which used RatInABox by loading in behavioural trajectories.
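  
  To illustrate the general idea of resampling a sparsely sampled trajectory onto a finer time grid, here is a minimal standard-library sketch. The helper name interpolate_positions is hypothetical, and it uses plain linear interpolation as a stand-in; the interpolation RatInABox actually performs may well differ.

```python
from bisect import bisect_right


def interpolate_positions(times, positions, query_times):
    """Linearly interpolate (x, y) positions at each query time.

    `times` must be sorted ascending; queries outside the recorded range
    are clamped to the first or last recorded position.
    """
    out = []
    for t in query_times:
        if t <= times[0]:
            out.append(tuple(positions[0]))
            continue
        if t >= times[-1]:
            out.append(tuple(positions[-1]))
            continue
        i = bisect_right(times, t) - 1           # index of the sample at or before t
        w = (t - times[i]) / (times[i + 1] - times[i])
        x = positions[i][0] + w * (positions[i + 1][0] - positions[i][0])
        y = positions[i][1] + w * (positions[i + 1][1] - positions[i][1])
        out.append((x, y))
    return out


# Three tracked samples (e.g. from a pose-estimation tool), resampled more finely.
times = [0.0, 1.0, 2.0]
positions = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
dense = interpolate_positions(times, positions, [0.0, 0.5, 1.5, 2.0])
```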
  
  What if a place cell is not encoding place but is influenced by reward or encodes a more abstract concept? Should a PlaceCell class inherit from an AbstractPlaceCell class, which could be used for encoding more conceptual spaces? How could their tool support this?
  
  In fact PlaceCells already inherit from a more abstract class (Neurons) which contains basic infrastructure for initialisation, saving data, and plotting data etc. We prefer the solution that users can write their own cell classes which inherit from Neurons (or PlaceCells if they wish). Then, users need only write a new get_state() method which can be as simple or as complicated as they like. Here are two examples we’ve already made which can be found on the GitHub:
  
  Author response image 2.
  
  Phase precession: PhasePrecessingPlaceCells(PlaceCells)[12] inherit from PlaceCells and modulate their ﬁring rate by multiplying it by a phase dependent factor causing them to “phase precess”.
  
  Splitter cells: Perhaps users wish to model PlaceCells that are modulated by the recent history of the Agent, for example which arm of a figure-8 maze it just came down. This is observed in hippocampal “splitter cells”. In this demo[1] SplitterCells(PlaceCells) inherit from PlaceCells and modulate their firing rate according to which arm was last travelled along.
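  
  The inheritance pattern described above can be sketched with small stand-in classes. Everything here is hypothetical and simplified: these Neurons and PlaceCells classes are not the RatInABox ones (whose get_state() signature and internals differ), and RewardModulatedPlaceCells is an invented example of a user-defined cell type whose rate is boosted near a reward location.

```python
import math


class Neurons:
    """Stand-in base class (NOT the real RatInABox class): stores history."""

    def __init__(self, n):
        self.n = n
        self.history = []

    def get_state(self, pos):
        raise NotImplementedError("subclasses define their own firing rates")

    def update(self, pos):
        rates = self.get_state(pos)
        self.history.append(rates)
        return rates


class PlaceCells(Neurons):
    """Stand-in place cells with Gaussian tuning around fixed centres."""

    def __init__(self, centres, width=0.2):
        super().__init__(len(centres))
        self.centres = centres
        self.width = width

    def get_state(self, pos):
        return [
            math.exp(-((pos[0] - cx) ** 2 + (pos[1] - cy) ** 2) / (2 * self.width ** 2))
            for cx, cy in self.centres
        ]


class RewardModulatedPlaceCells(PlaceCells):
    """Hypothetical user-defined cells: only get_state() is overridden."""

    def __init__(self, centres, reward_pos, gain=1.0, width=0.2):
        super().__init__(centres, width=width)
        self.reward_pos = reward_pos
        self.gain = gain

    def get_state(self, pos):
        base = super().get_state(pos)
        d2 = (pos[0] - self.reward_pos[0]) ** 2 + (pos[1] - self.reward_pos[1]) ** 2
        boost = 1.0 + self.gain * math.exp(-d2 / (2 * self.width ** 2))
        return [r * boost for r in base]


cells = RewardModulatedPlaceCells(centres=[(0.5, 0.5)], reward_pos=(0.5, 0.5))
rates_at_reward = cells.update((0.5, 0.5))
```

  Because initialisation, history and update logic live in the base class, the new cell type only has to say how its rate depends on the current state.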
  
  This is a bit odd in the Discussion: "If there is a small contribution you would like to make, please open a pull request. If there is a larger contribution you are considering, please contact the corresponding author3" This should be left to the repo contribution guide, which ideally shows people how to contribute and your expectations (code formatting guide, how to use git, etc). Also this can be very off-putting to new contributors: what is small? What is big? We suggest using more inclusive language.
  
  We’ve removed this line and left it to the GitHub repository to describe how contributions can be made.
  
  Could you expand on the run time for BoundaryVectorCells, namely, for how long of an exploration period? We found it was on the order of 1 min to simulate 30 min of exploration (which is of course fast, but mentioning relative times would be useful).
  
  Absolutely. How long it takes to simulate BoundaryVectorCells will depend on the discretisation timestep and how many neurons you simulate. Assuming you used the default values (dt = 0.1, n = 10) then the motion model should dominate compute time. This is evident from our analysis in Figure 3f which shows that the update time for n = 100 BVCs is on par with the update time for the random motion model, therefore for only n = 10 BVCs, the motion model should dominate compute time.
  
  So how long should this take? Fig. 3f shows the motion model takes ~10^-3 s per update. One hour of simulation equals 3600/dt = 36,000 updates, which would therefore take about 36,000 × 10^-3 s = 36 seconds. So your estimate of 1 minute seems to be in the right ballpark and consistent with the data we show in the paper.
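  
  The back-of-envelope estimate can be written out as a short calculation. The function name is hypothetical; the inputs (dt = 0.1 s and roughly 10^-3 s per motion-model update) are the figures quoted in the response, and the estimate ignores cell updates and any plotting or saving overhead.

```python
def estimated_runtime_s(sim_duration_s, dt_s, time_per_update_s):
    """Wall-clock estimate: number of updates times cost per update."""
    n_updates = round(sim_duration_s / dt_s)
    return n_updates * time_per_update_s


one_hour = estimated_runtime_s(3600, 0.1, 1e-3)   # 36,000 updates
half_hour = estimated_runtime_s(1800, 0.1, 1e-3)  # 18,000 updates (the reviewer's 30 min case)
```

  Under these assumptions one simulated hour costs about 36 s and 30 simulated minutes about 18 s, the same order of magnitude as the reviewer's measurement of roughly one minute.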
  
  Interestingly this corroborates the results in a new inset panel where we calculated the total time for cell and motion model updates for a PlaceCell population of increasing size (from n = 10 to 1,000,000 cells). It shows that the motion model dominates compute time up to approximately n = 1000 PlaceCells (for BoundaryVectorCells it’s probably closer to n = 100) beyond which cell updates dominate and the time scales linearly.
  
  These are useful and non-trivial insights as they tell us that the RatInABox neuron models are quite eﬃcient relative to the RatInABox random motion model (something we hope to optimise further down the line). We’ve added the following sentence to the results:
  
  “Our testing (Fig. 3f, inset) reveals that the combined time for updating the motion model and a population of PlaceCells scales sublinearly O(1) for small populations n < 1000 where updating the random motion model dominates compute time, and linearly for large populations n > 1000. PlaceCells, BoundaryVectorCells and the Agent motion model update times will be additionally affected by the number of walls/barriers in the Environment. 1D simulations are significantly quicker than 2D simulations due to the reduced computational load of the 1D geometry.”
  
  And this sentence to section 2:
  
  “RatInABox is fundamentally continuous in space and time. Position and velocity are never discretised but are instead stored as continuous values and used to determine cell activity online, as exploration occurs. This differs from other models which are either discrete (e.g. “gridworld” or Markov decision processes) or approximate continuous rate maps using a cached list of rates precalculated on a discretised grid of locations. Modelling time and space continuously more accurately reflects real-world physics, making simulations smooth and amenable to fast or dynamic neural processes which are not well accommodated by discretised motion simulators. Despite this, RatInABox is still fast; to simulate 100 PlaceCells for 10 minutes of random 2D motion (dt = 0.1 s) it takes about 2 seconds on a consumer grade CPU laptop (or 7 seconds for BoundaryVectorCells).”
  
  Regarding the Geometry and Boundary conditions, would supporting hyperbolic distance be useful, given the interest in alternative geometry of representations (i.e., https://www.nature.com/articles/s41593-022-01212-4)?
  
  Whilst this would be very interesting it would likely represent quite a signiﬁcant edit, requiring rewriting of almost all the geometry-handling code. We’re happy to consider changes like these according to (i) how simple they will be to implement, (ii) how disruptive they will be to the existing API, (iii) how many users would beneﬁt from the change. If many users of the package request this we will consider ways to support it.
  
  In general, the set of default parameters might want to be included in the main text (vs in the supplement).
  
  We also considered this but decided to leave them in the methods for now. The exact values of these parameters are subject to change in future versions of the software. Also, we’d prefer for the main text to provide a low-detail high-level description of the software and the methods to provide a place for keen readers to dive into the mathematical and coding specifics.
  
  It still says you can only simulate 4 velocity or head directions, which might be limiting.
  
  Thanks for catching this. This constraint has been relaxed. Users can now simulate an arbitrary number of head direction cells with arbitrary tuning directions and tuning widths. The methods have been adjusted to reﬂect this (see section 6.3.4).
  
  The code license should be mentioned in the Methods.
  
  We have added the following section to the methods:
  
  6.6 License. RatInABox is currently distributed under an MIT License, meaning users are permitted to use, copy, modify, merge, publish, distribute, sublicense and sell copies of the software.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.10.503541v3
www.biorxiv.org www.biorxiv.org

New submission 11/08/2022, 15:19:16

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  “The authors wish to relate beat-to-beat coordination of cardiac function (in this case as measured left ventricular pressure) to the activity of sympathetic neuron spiking within the stellate ganglion. A strength includes the challenging measurements from multiple stellate neuron activity over long durations in situ in the anesthetized pig.”
  
  We thank the reviewer for their feedback.
  
  “A major and overriding weakness is the founding assumption of the analysis that the underlying sympathetic neurons are all cardiac functioning in nature - an assumption that is overwhelmingly unlikely given the evidence in other species including humans that stellate postganglionic neurons are functionally mixed and have functional noncardiac targets. The use of broad and poorly explained/defined terms such as "event entropy" is difficult to follow and find meaning from. The manuscript is filled with difficult-to-follow text like "The neural specificity metric (Sudarshan et al., 2021). Fig. 5", is used to evaluate the degree to which neural activity is biased toward control target states taken here as LVP" and "The neural specificity is reduced from a multivariate signal to a univariate signal by computing the Shannon entropy at each timestamp of the mapped neural specificity metric". The figures are difficult to understand with axes that often bear no units or are quite compressed obscuring the intuitive meaning of the data trends. Fundamentally, cardiac pressure cycles with each heartbeat - roughly once per second - yet fluctuations in the depicted mean spike rate data with changes perhaps ten times in 25 minutes. Such plots are disorienting and difficult to associate with cardiac or neuron "functioning". Only 17 of the 38 references are not self-citations and thus the cited literature represents a narrow view of sympathetic regulation and sympathetic/stellate ganglion knowledge. Much of the foundations are self-professed in earlier publications by the present group and assumed to be accepted.”
  
  “Fundamentally, cardiac pressure cycles with each heartbeat - roughly once per second - yet fluctuations in the depicted mean spike rate data with changes perhaps ten times in 25 minutes. Such plots are disorienting and difficult to associate with cardiac or neuron "functioning".”
  
  We would like to clarify this point with the understanding that the reviewer is referring to the time axis in Figure 3C in the manuscript.
  
  The coactivity matrix constructed in Figure 3C computes the cross correlation in sliding mean/std spike activities for different pairs of channels. The mean spiking activities across channels, as the reviewer correctly pointed out, do indeed have a weak autocorrelation with the period of the heart rate. The weak correlation for the heart rate period, possibly due to slow firing rates, was seen across all channels of both control and HF animals. However, the cause of a large proportion of channel-pairs exhibiting high coactivity, termed cofluctuation (shown as red tracings in Fig 3D), is not known and cannot be directly associated with cardiac functioning.
  
  The cofluctuation was also found to be aperiodic in nature approximating a lognormal distribution (Fig R1) with the HF animals containing heavy tails outside their confidence intervals (Fig R1B). The event rate computed from the cofluctuation time series (shown as blue steps in Fig 3E) for an animal is a measure of spatial coherence among SG neural populations and was developed as a novel metric to be used in future studies.
  
  Figure R1: Cofluctuation histograms (calculated from mean or standard deviation of sliding spike rate, referred to as Cofluctuation_MEAN and Cofluctuation_STD, respectively) and log-normal fits for each animal group. μ_FIT and σ_FIT are the respective mean and standard deviation (STD) of the fitted distribution, used for 68% confidence interval bounds. A-B: Control animals have narrower bounds and represent a better fit to log-normal distribution. C-D: Heart failure (HF) animals display more heavily skewed distributions that indicate heavy tails.
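  
  One plausible reading of the caption, sketched below, is that a log-normal is fitted by taking the mean and standard deviation of the log-transformed values, and that the 68% bounds are one standard deviation either side of the mean in log space. This is an assumption for illustration: the response does not spell out the fitting procedure, and the function names and sample values here are hypothetical.

```python
import math
import statistics


def lognormal_fit(values):
    """Return (mu, sigma) of log(values), i.e. a moment fit in log space."""
    logs = [math.log(v) for v in values]
    return statistics.mean(logs), statistics.pstdev(logs)


def central_68_bounds(mu, sigma):
    """Map mu +/- sigma from log space back to the original scale."""
    return math.exp(mu - sigma), math.exp(mu + sigma)


# Toy positive-valued samples standing in for a cofluctuation series.
samples = [math.exp(0.0), math.exp(1.0), math.exp(2.0)]
mu_fit, sigma_fit = lognormal_fit(samples)
lower, upper = central_68_bounds(mu_fit, sigma_fit)
```

  Values falling well outside such bounds more often than expected would be one simple indication of the heavy tails described for the HF group.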
  
  “Only 17 of the 38 references are not self-citations and thus the cited literature represents a narrow view of sympathetic regulation and sympathetic/stellate ganglion knowledge. Much of the foundations are self-professed in earlier publications by the present group and assumed to be accepted.”
  
  We thank the reviewer for pointing this out. We have added four additional citations that include methods such as neural population bias and spatiotemporal dynamics linkages to control targets in the neuroscience literature. We have added these citations to page 15 in the “Conclusion” section of the manuscript. In addition, our group specialises in carrying out these cardiac nervous system experiments; we are not aware of another group collecting multi-electrode array data from the cardiac nervous system and studying population dynamics of cardiac neurons. Hence we build on our previous findings. The most relevant literature (not necessarily related to the cardiac nervous system) can be found in the neuroscience references we cited, which contain applications of neural population recordings for different brain areas, mainly in the neuropsychiatry domain to understand disease dynamics.
  
  “For the expert or even the uninformed reader, this report is broadly confused and confusing. The premises (beat to beat or whether LVP conveys cardiac function) are poorly supported. The conclusions are quite vague.”
  
  Thank you for your feedback. To simplify the understanding, we moved all mathematical details to the supplementary material, rewrote the abstract and the conclusion from scratch, and split the methods figures that may have been confusing. We believe that our novel metrics, event rate and entropy, capture non-trivial linkages between heart failure status, cardiac neural activity (spike activity), and peripheral activity (LVP). We have supported our metrics with 17 animals with state-of-the-art surgical techniques and technology, and reported our results with detailed statistical analyses. Our manuscript essentially highlights that event rate and entropy metrics are significantly different between control animals and animals with heart failure. These metrics can be used to design future studies with these animal models to provide a more quantitative approach to heart disease, rather than binary (yes or no) descriptions.
  
  “Discussion: The abstract does not convey conclusions from the findings and contains broad statements such as "signatures based on linking neuronal population cofluctuation and examine differences in "neural specificity" of SG network" that have little substantive value or conclusion for the reader. Fundamentally what does the title "signatures based on linking neuronal population" cofluctuation mean to the reader? What changed in HF?”
  
  Thank you for this comment. We completely revised the abstract and conclusion as detailed in our response to Essential Revision #1. Event rate is a metric related to neural activity recordings and entropy is related to the association of neural activity to left ventricular blood pressure. Our findings suggest that both the neural population activity itself (event rate) and its ability to pay attention to cycles of left ventricular pressure (neural specificity) are significantly higher in animals with HF compared to controls.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.09.28.462183v3
www.biorxiv.org www.biorxiv.org

Deep-body feelings: ingestible pills reveal gastric correlates of emotions

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  (1.1) The work by Porciello and colleagues provides scientific evidence that the acidic content of the stomach covaries with the experienced level of disgust and fear evoked by disgusting videos. The workings of the inside of the gut during cognitive or emotional processes have remained elusive due to the invasiveness of the methods to study it. The major strength of the paper is the use of the non-invasive smart pill technology, which senses changes in pH, pressure and temperature as it travels through the gut, allowing authors to investigate how different emotions induced with validated video clips modulate the state of the gut. The experimental paradigm used to evoke distinct emotions was also successful, as participants reported the expected emotions after each emotion block. While the reported evidence is correlational in nature, I believe these results open up new avenues for studying brain-body interactions during emotions in cognitive neuroscience, and future causal manipulations will shed more insight on this phenomenon. Indeed, this is the first study to provide evidence for a link between gastric acidity and emotional experience beyond single patient studies, and it has major implications for the advancement of our understanding of disorders with psycho-somatic influences, such as stress and its influence on gastritis.
  
  1.1 First of all, we want to thank Reviewer#1 for his cogent comments and for highlighting that our findings may inspire future research on brain-body interactions. We took all the remarks into the highest consideration and changed the manuscript accordingly.
  
  (1.2) As for the limitations, little insight is provided on the mechanisms, time scales, and inter-individual variability of the link between gastric pH and emotional induction. Since this is a novel phenomenon, it would be important to further validate and characterize this finding. On this line, one of the most well known influences of disgust on the gut is tachygastria, the acceleration of the gastric rhythm. It would be important to understand how acid secretion induced by disgusting films is related to tachygastria, but the authors only examine the influence of disgusting films on the normogastric frequency range.
  
  1.2 We are aware that at the moment our data are mainly descriptive and do not provide a clear picture of the causal mechanisms. However, to deal with this outstanding issue we added a new series of analyses.
  
  Most of the data on gastric activity come from analysis of the normogastric band. However, information about the EGG tachygastric rhythm in humans is of potentially great importance. To address the reviewer’s comment, and considering the previously published literature, we re-examined the EGG data focusing on the tachygastric rhythm. The methodology remained consistent with the process described for normogastric peak extraction but this time we extracted the peak in the tachygastric band, specifically 0.067 to 0.167 Hz (i.e., 4–10 cpm). The ANOVA performed over the tachygastric cycle revealed a significant main effect of the type of video clip (F(4, 112) = 2.907, p = 0.025, partial η² = 0.09). However, the Bonferroni-corrected post hoc tests did not show any significant difference between the different types of emotional video clips and the neutral condition. The sole significant comparison was observed between participants viewing happy and fearful video clips, indicating that participants’ tachygastric cycles were faster when exposed to happy rather than fearful video clips (p = 0.038). For a visual representation of the outcomes, please see Fig S6.
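  
  As a minimal sketch of band-limited peak extraction, the snippet below picks the frequency with the highest power inside the 0.067 to 0.167 Hz band quoted above and converts it to cycles per minute. The helper names and the toy spectrum are hypothetical, and the authors' actual EGG pipeline (filtering, spectral estimation, artefact handling) is not described here and is likely more involved.

```python
TACHY_BAND_HZ = (0.067, 0.167)  # band quoted in the response (about 4-10 cpm)


def hz_to_cpm(freq_hz):
    """Convert a frequency in Hz to cycles per minute."""
    return freq_hz * 60.0


def peak_in_band(freqs_hz, power, low_hz, high_hz):
    """Return the frequency with maximal power within [low_hz, high_hz]."""
    in_band = [(p, f) for f, p in zip(freqs_hz, power) if low_hz <= f <= high_hz]
    if not in_band:
        raise ValueError("no spectral samples fall inside the requested band")
    return max(in_band)[1]


# Toy power spectrum: the largest overall peak (0.05 Hz) lies outside the band.
freqs = [0.03, 0.05, 0.08, 0.10, 0.15, 0.20]
power = [1.0, 9.0, 2.0, 5.0, 3.0, 8.0]
tachy_peak_hz = peak_in_band(freqs, power, *TACHY_BAND_HZ)
tachy_peak_cpm = hz_to_cpm(tachy_peak_hz)
```

  Restricting the search to the band matters: in the toy spectrum the global maximum sits at 0.05 Hz, yet the tachygastric peak is at 0.10 Hz (6 cpm).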
  
  We revised the main text (Page 17, lines: 472-482) to include this analysis. The revised text now reads as follows:
  
  “Finally, we explored whether normogastric and/or tachygastric cycles changed in response to specific emotional experiences. After checking that normogastric and tachygastric peak frequencies were normally distributed (all ps > 0.05), we ran two separate ANOVAs on the individual peak frequencies in the normogastric and tachygastric range. Each analysis had the type of video clip as within-subjects factor. The ANOVA performed on the normogastric rhythm was not significant (F(4, 44) = 1.037, p = 0.399), suggesting that the gastric rhythm did not change while participants observed the different emotional video clips. In contrast, the ANOVA performed on the tachygastric rhythm did show a significant main effect (F(4, 112) = 2.907, p = 0.025, Eta2 (partial) = 0.09). However, the only comparison that survived the Bonferroni correction was the one between happy and fearful video clips; namely, participants’ tachygastric cycle was faster when they observed happy vs fearful video clips (p = 0.038); see Fig. S6 for a graphical representation of the results.”
  
  To address the Reviewer’s comment, we also correlated the average pH value with the corresponding frequency of the tachygastric cycle recorded during the disgusting, happy, and fearful video clips, namely the emotions associated with changes in pH. The only significant correlation was found during the disgusting video clips (r = 0.435; p = 0.023; all other rs ≤ 0.351, all other ps ≥ 0.073). Contrary to our expectations, the correlation was positive, suggesting that when participants were exposed to disgusting video clips, the less acidic the pH, the higher the frequency of the tachygastric cycle. However, we know from our pill data that disgusting video clips are associated with more acidic values, and from the literature (not replicated by us) with a faster gastric rhythm. Since the EGG analysis did not provide strong support for a relationship between the gastric rhythm and the emotional experience, we believe that additional evidence is needed to clarify the relationship between pH and gastric rhythm.
  
  (1.3) Additionally, only one channel of the electrogastrogram (EGG) was used to measure the gastric rhythm, and no information is provided on the quality of the recordings. With only one channel of EGG, it is often impossible to identify the gastric rhythm as the position of the stomach varies from person to person, yielding inaccurate estimates of the frequency of the gastric rhythm.
  
  1.3 We agree with Reviewer 1 on this point. We acknowledge the potential limitation associated with one-channel EGG recording in our study. To address this remark, in a separate (ongoing) study (N = 25 participants) we recorded the electrogastrogram following the methodology outlined by Wolpert et al., 2020, published in Psychophysiology. Thus, in order to study the EGG in association with the emotional experience, we used a bipolar 4-channel montage while participants observed the same emotional video clips used in our current study (see picture below for the montage set-up).
  
  Author response image 1 shows the 4-channel bipolar EGG recording montage, reproducing the one proposed by Wolpert et al., 2020.
  
  Author response image 1.
  
  Then, we extracted the gastric cycle in both the normogastric and the tachygastric bands.
  
  After checking that data were normally distributed (Kolmogorov-Smirnov ds > 0.10; ps > .20), for the gastric cycle extracted in the normogastric band we ran a repeated-measures ANOVA with the type of video clip as the only within-subjects factor, with 5 levels (i.e., the five types of video clip: Disgusting, Fearful, Happy, Neutral, and Sad). The ANOVA showed that the gastric cycle recorded during the different video clips did not differ (F(4, 96) = 0.39; p = 0.81); see the plot in Author response image 2.
  
  Author response image 2.
  
  Gastric cycle (normogastric band) recorded via multi-channel electrogastrogram (EGG) during the emotional experience. The plot shows the gastric cycle extracted in the normogastric band while participants were observing the five categories of video clips (i.e. those inducing disgust, fear, happiness, sadness and, as control, a neutral state).
  
  We also extracted the gastric cycle in the tachygastric band. Because the distribution of the data was not normal in one condition (Kolmogorov-Smirnov ds > 0.27; p < 0.05), we ran a Friedman ANOVA to compare the gastric cycle during the different emotional experiences. The Friedman ANOVA was not statistically significant (χ2 (4) = 2.88; p = 0.58), suggesting that, like the gastric cycle extracted in the normogastric band, the one extracted in the tachygastric band was not clearly associated with the investigated emotional states; see Author response image 3.
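
  For readers unfamiliar with the Friedman test, the statistic can be sketched as below. This is an illustrative implementation of the textbook formula without tie correction, not the software the authors used; the function names and toy data are hypothetical. With five video-clip conditions the statistic has 4 degrees of freedom, matching the χ2 (4) reported above.

```python
def average_ranks(values):
    """Rank values from 1..k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_chi2(data):
    """Friedman statistic (no tie correction) and its degrees of freedom.

    data: one row per participant, one column per condition.
    """
    n = len(data)
    k = len(data[0])
    rank_sums = [0.0] * k
    for row in data:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi2, k - 1


# Toy data (made-up): 3 participants x 5 conditions, identical ordering in each row.
toy = [
    [6.1, 6.4, 6.9, 7.2, 7.8],
    [5.0, 5.5, 6.0, 6.5, 7.0],
    [6.3, 6.6, 6.7, 7.1, 7.4],
]
chi2, df = friedman_chi2(toy)
```

  Because every toy participant ranks the conditions in the same order, the statistic reaches its maximum of n(k - 1) = 12; real data with inconsistent orderings across participants give smaller values.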
  
  Author response image 3.
  
  Gastric cycle (tachygastric band) recorded via multi-channel electrogastrogram (EGG) during the emotional experience. The plot above shows the gastric cycle extracted in the tachygastric band while participants were observing the five categories of video clips (i.e. those inducing disgust, fear, happiness, sadness and, as control, a neutral state).
  
  Results from this control study seem to suggest that the non-significant effect on the gastric cycle was probably not due to our use of a one-channel EGG montage, at least as far as the gastric cycle extracted from the normogastric band is concerned.
  
  As for the tachygastric frequency associated with the emotional experience, the results from the multi-channel EGG recording seem to go in the same direction as the normogastric ones: the frequency of the gastric cycle recorded during the emotional video clips did not differ from that of the control condition.
  
  The only significant difference that we found in our 1-channel EGG study was the one between the happy and the fearful video clips (see Fig. S6 contained in the supplementary materials and above). Specifically, we found that happy video clips were associated with a higher gastric frequency compared to the fearful ones. However, we did not replicate these findings in our multi-channel EGG study.
  
  Although suggestive, this evidence is not conclusive. Indeed, we are aware that a final word on the results of our multi-channel study can be said only when a larger sample is obtained.
  
  (1.4) Finally, I believe that the results do not show evidence in favor of the discrete nature of emotions theory as they claim in the discussion. Authors chose to use stimuli inducing discrete emotions, and only asked subjective reports of these same discrete emotions, so these results shed no light on whether emotions are represented discretely vs continuously in the brain.
  
  We revised the discussion in order to better describe our results and toned down the interpretation that the present findings directly support the discrete nature of emotions, as suggested by this Reviewer.
  
  Pages 21-22, lines 622-631, now read as follows:
  
  “Overall, and in line with theoretical and empirical evidence (Damasio, 1999; Harrison et al., 2010; James, 1994; Lettieri et al., 2019; Stephens et al., 2010), our findings may suggest that specific patterns of subjective, behavioural, and physiological measures are linked to unique emotional states... We acknowledge that our results, although novel, are restricted to a sample of male participants, and more importantly they need to be replicated. We also acknowledge that future studies should better investigate the mechanisms underlying the role of the pH in the emergence of specific emotions. For instance, pharmacologically manipulating stomach pH during emotional induction, not only for basic emotions but also for exploring complex emotions such as moral disgust (Rozin et al., 2009), would enable researchers to generalize these findings and examine the directionality of this relationship.”
  
  Reviewer #2 (Public Review):
  
  To measure the role of gastric state in emotion, the authors used an ingestible smart pill to measure pH, pressure, and temperature in the gastrointestinal tract (stomach, small bowel, large bowel) while participants watched videos that induced disgust, fear, happiness, sadness, or a control (neutral). The study has a number of strengths, including the novelty of the measurement (very few studies have ever measured these gut properties during emotion processing) and the apparent robustness of their main finding (that during disgusting video clips, participants who experienced more feelings of disgust (and to a lesser degree which might not survive more stringent multiple comparison correction, fear) had more acidic stomach measurements, while participants who experienced more happiness during the disgusting video clips had a less acidic (more basic) stomach pH). Although the study is correlational (which all discussion should carefully reflect) and is restricted to a moderately sized, homogeneous sample, the results support their general conclusion that stomach pH is related to emotion experience during disgust induction. There may be additional analyses to conduct in order for the authors to claim this effect is specific to the stomach. Nevertheless, this work is likely to have a large impact on the field, which currently tends to rely on noninvasive measures of gastric activity such as electrogastrography (which the authors also collect for comparison); the authors' minimally-invasive approach yields new and useful measurements of gastric state. These new measures could have relevance beyond emotion processing in understanding the role of gut pH (and perhaps temperature and pressure) in cognitive processes (e.g. interoception) as well as mental and physical health.
  
  We are very grateful to Reviewer#2 for skilfully managing the paper and highlighting its strengths, particularly the innovative measurement approach and the potential implications these findings might offer for future research into the impact of gastric signals on emotional experiences and potentially on many other higher-order cognitive functions. Additionally, we would like to thank her for the highly valuable feedback. We have incorporated all the comments into the revised manuscript, aiming to enhance its quality.
  
  Reviewer #3 (Public Review):
  
  This study used novel ingestible pills to measure pH and other gastric signals, and related these measures to self-report ratings of emotions induced by video clips. The main finding was that when participants viewed videos of disgust, there was an association between gastric pH and feelings of disgust and fear, and (in the opposite direction) happiness. These findings may be the first to relate objective measures of gastric physiology to emotional experience. The methods open up many new questions that can be addressed by future studies and are thus likely to have an impact on the field.
  
  We also thank Reviewer#3 very much for the accurate reading of our manuscript, for highlighting the strengths of our study, and for providing valuable feedback. Below is a point-by-point response to all the comments raised by this Reviewer. We have incorporated their comments, and we hope they are satisfied with the new version of the manuscript.
  
  (3.1) My main concern is with the reliability of the results. The study associates many measures (pH, temperature, pressure, EGG) in stomach, small bowel, and large bowel with multiple emotion ratings. This amounts to many statistical tests. Only one of these measures (pH in the stomach) shows a significant effect. Furthermore, the key findings, as displayed in Figure 4, do not look particularly convincing. Perhaps this is a display issue, but the relations between stomach pH and VAS ratings of disgust, fear, and happiness were not apparent from the scatter plot and may be influenced by outliers (e.g., happiness).
  
  3.1 We thank Reviewer#3 for raising this issue, which was also raised by Reviewer#1 and #2 (see replies above). As reported above, we worked on the data analysis in order to provide more evidence supporting our claim, i.e. that pH plays a role in the emotional experience of disgust, happiness and fear. We modified Figure 4 (now 5), as also requested by Reviewers 1 and 2, and we hope that it is now clearer. We included a new analysis in which we used all the datapoints recorded from the ingestible device and performed a mixed-models analysis with pH as dependent variable, type of video clip and number of datapoints (‘Time’) as fixed factors, and the by-subject intercepts as random effects. This analysis not only supported the results of the original one but provided evidence for a causal role of the emotional induction on the pH of the stomach. Results of this analysis are described in point 1.7 in the response to Reviewer#1, and the results of the new analysis and the revised version of the main figure can be found in track changes in the main text of the manuscript (Page 15&16, lines: 408-439) and are copied below.
  
  “To explore how the emotional induction could modulate the pH of the stomach and how the length of the exposure to that specific emotional induction could also play a role in modulating pH variations, we ran an additional model, Model 2. This model included all the pH datapoints registered using the Smartpill as dependent variable, the type of video clip and the number of the datapoints (“Time”) as fixed effects, and the by-subject intercepts as random effects (see Supplementary information for a detailed description of the model). Model 2 had a marginal R2 = 0.014 and a conditional R2 = 0.79. Visual inspection of the plots did reveal some small deviations from homoscedasticity; visual inspection of the residuals did not show important deviations from normality. As for collinearity (tested by means of the vif function of the car package), all independent variables had a (GVIF^(1/(2*Df)))^2 < 10.
  
  Type III analysis of variance of Model 2 showed a statistically significant main effect of the Time (F = 20.237, p < 0.001, Eta2 < 0.01) suggesting that independently from the type of video clip observed, the stomach pH significantly decreased as a function of the time of exposure to the induction. A significant main effect of the type of video clip was also found (F = 22.242, p < 0.001, Eta2 = 0.01) suggesting that pH of the stomach changes when participants experienced different types of emotions. In particular, post hoc analysis revealed that pH was more acidic when participants observed disgusting compared to fearful (t= -11.417; p < 0.001), happy (t= -15.510; p < 0.001) and neutral (t= -3.598; p = 0.003) video clips.
  
  Also, pH was more acidic when participants observed fearful compared to happy (t= -4.064; p < 0.001), and less acidic compared to neutral (t= 7.835; p < 0.001) and sad scenarios (t= 9.743; p < 0.001). Finally, pH was less acidic when participants observed happy compared to neutral (t= 11.923; p < 0.001) and sad video clips (t= 13.806; p < 0.001), see Fig.6, left panel. Interestingly, also the double interaction Time X Type of video clip was significant (F = 3.250, p = 0.0113, Eta2 < 0.01), suggesting that the time of the exposure to the induction differentially influenced the pH of the stomach depending on the type of the observed video clip. Simple slope analysis showed that while pH did not change over time when observing disgusting (t= -1.2691; p = 0.2045) and happy (t= 0.4466; p = 0.6552) clips, it did significantly decrease over time when observing fearful (t= -4.4212; p < 0.001), sad (t= -2.0487; p = 0.0405) and neutral video clips (t= -2.7956; p = 0.0052), see Fig.6, right panel.”
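
  The marginal and conditional R2 quoted for Model 2 can be read as variance fractions. The sketch below assumes the commonly used variance-partitioning definition for a random-intercept model (fixed-effect variance, random-intercept variance, residual variance); the response does not state which implementation was used, and the variance components are hypothetical values chosen only so that the output matches the reported 0.014 and 0.79.

```python
def r2_marginal_conditional(var_fixed, var_random, var_residual):
    """Return (marginal R2, conditional R2) from three variance components.

    marginal    = variance explained by fixed effects alone
    conditional = variance explained by fixed effects plus random intercepts
    """
    total = var_fixed + var_random + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components (not taken from the study).
r2m, r2c = r2_marginal_conditional(var_fixed=0.014, var_random=0.776, var_residual=0.210)
```

  Under these assumptions, the large gap between the two values indicates that most of the explained variance is attributable to between-participant differences in baseline pH rather than to video type or time.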
  
  We believe that the new evidence reported supports our claims, and we hope that the reviewer agrees with us. However, as we also mentioned in the paper, we are aware that replications are needed, and we are already working on this.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.02.17.528509v1
www.biorxiv.org www.biorxiv.org

Differential dopaminergic modulation of spontaneous cortico-subthalamic activity in Parkinson’s disease

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  The largest concern with the manuscript is its use of resting-state recordings in Parkinson's Disease patients on and off levodopa, which the authors interpret as indicative of changes in dopamine levels in the brain but not indicative of altered movement and other neural functions. For example, when patients are off medication, their UPDRS scores are elevated, indicating they likely have spontaneous movements or motor abnormalities that will likely produce changed activations in MEG and LFP during "rest". Authors must address whether it is possible to study a true "resting state" in unmedicated patients with severe PD. At minimum this concern must be discussed in the manuscript.
  
  We agree that Parkinson’s disease can lead to unwanted movements such as tremor as well as hyperkinesias. This would of course be a deviation from a resting state in healthy subjects. However, such movements are part of the disease and occur unwillingly. The main tremor in Parkinson’s disease is a rest tremor and, as the name already suggests, it occurs while not doing anything. Therefore, such movements can arguably be considered part of the resting state of Parkinson’s disease. Resting state activity with and without medication is therefore still representative of changes in brain activity in Parkinson’s patients and indicative of alterations due to medication.
  
  To further investigate the effect of movement in our patients, we subdivided the UPDRS part 3 score into tremor and non-tremor subscores. For the tremor subscore we took the mean of items 15 and 17 of the UPDRS, whereas for the non-tremor subscore items 1, 2, 3, 9, 10, 12, 13, and 14 were averaged. Following Spiegel et al., 2007, we classified patients as akinetic-rigid (non-tremor score at least twice the tremor score), tremor-dominant (tremor score at least twice as large as the non-tremor score), and mixed type (for the remaining scores). Of the 17 patients, 1 was tremor-dominant and 1 was classified as mixed type (their non-tremor score was greater than their tremor score). None of our patients exhibited hyperkinesias during the recording. To exclude the possibility that our results are driven by tremor-related movement, we re-ran the HMM without the tremor-dominant and the mixed-type patient (see Figure R1 in the response letter).
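
  The subtype rule described above can be written out as a short sketch. The item numbers and thresholds are those given in the response; the function name, the dictionary layout of the item scores, and the example patients are hypothetical.

```python
# UPDRS part 3 items as listed in the response.
TREMOR_ITEMS = (15, 17)
NON_TREMOR_ITEMS = (1, 2, 3, 9, 10, 12, 13, 14)


def mean_of_items(item_scores, items):
    """Average the scores of the given item numbers."""
    return sum(item_scores[i] for i in items) / len(items)


def classify_subtype(item_scores):
    """Classify a patient from a dict mapping item number -> score.

    Assumption: a patient with both subscores equal to zero falls into the
    first branch (akinetic-rigid); the response does not cover that case.
    """
    tremor = mean_of_items(item_scores, TREMOR_ITEMS)
    non_tremor = mean_of_items(item_scores, NON_TREMOR_ITEMS)
    if non_tremor >= 2 * tremor:
        return "akinetic-rigid"
    if tremor >= 2 * non_tremor:
        return "tremor-dominant"
    return "mixed"


def make_patient(tremor_score, non_tremor_score):
    """Build a made-up patient with uniform scores within each item group."""
    scores = {i: tremor_score for i in TREMOR_ITEMS}
    scores.update({i: non_tremor_score for i in NON_TREMOR_ITEMS})
    return scores


subtype_a = classify_subtype(make_patient(tremor_score=1, non_tremor_score=2))
subtype_b = classify_subtype(make_patient(tremor_score=3, non_tremor_score=1))
subtype_c = classify_subtype(make_patient(tremor_score=2, non_tremor_score=3))
```

  The third example mirrors the mixed-type patient mentioned above: the non-tremor subscore exceeds the tremor subscore, but by less than a factor of two.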
  
  ON medication results for all HMM states remained the same. OFF medication results for the Ctx-Ctx and STN-STN state remained the same as well. The Ctx-STN state OFF medication was split into two states: Sensorimotor-STN connectivity was captured in one state and all other types of Ctx-STN connections were captured in another state (see Figure R1 in the response letter). The important point is that the biological conclusions stand across these solutions. Regardless, both with and without the two subjects, a stable covariance matrix entailing sensorimotor-STN connectivity was determined, which is the main finding for the Ctx-STN state OFF medication.
  
  We therefore discuss this issue now within the limitation section (page 20):
  
  “Both motor impairment and motor improvement can cause movement during the resting state in PD. While such movement is a deviation from a resting state in healthy subjects, such movements are part of the disease and occur unwillingly. Therefore, such movements can arguably be considered part of the resting state of Parkinson’s disease. None of the patients in our cohort experienced hyperkinesia during the recording. All patients except for two were of the akinetic-rigid subtype. We verified that tremor movement is not driving our results. Recalculating the HMM states without these 2 subjects, even though it slightly changed some particular aspects of the HMM solution, did not materially affect the conclusions.”
  
  Figure R1: States obtained after removing one tremor dominant and one mixed type patient from analysis. Panel C shows the split OFF medication cortico-STN state. Most of the cortico-STN connectivity is captured by the state shown in the top row (Figure 1 C OFF). Only the motor-STN connectivity in the alpha and beta band (along with a medial frontal-STN connection in the alpha band) is captured separately by the states labeled “OFF Split” (Figure 1 C OFF SPLIT).
  
  This reviewer was unclear on why increased "communication" in the medial OFC in delta and theta was interpreted as a pathological state indicating deteriorated frontal executive function. Given that the authors provide no evidence of poor executive function in the patients studied, the authors must at least provide evidence from other studies linking this feature with impaired executive function.
  
  If we understand the comment correctly, it refers to the following statement in the abstract: “Dopaminergic medication led to communication within the medial and orbitofrontal cortex in the delta/theta frequency range. This is in line with deteriorated frontal executive functioning as a side effect of dopamine treatment in Parkinson’s disease.”
  
  This statement is based on the dopamine overdose hypothesis reported in the Parkinson’s disease (PD) literature (Cools 2001; Kelly et al. 2009; MacDonald and Monchi 2011; Vaillancourt et al. 2013). We have elaborated upon the dopamine overdose hypothesis in the discussion on page 16. In short, dopaminergic neurons are primarily lost from the substantia nigra in PD, which causes a higher dopamine depletion in the dorsal striatal circuitry than within the ventral striatal circuits (Kelly et al. 2009; MacDonald and Monchi 2011). Thus, dopaminergic medication to treat the PD motor symptoms leads to increased dopamine levels in the ventral striatal circuits including frontal cortical activity, which can potentially explain the cognitive deficits observed in PD (Shohamy et al. 2005; George et al. 2013). We adjusted the abstract to read:
  
  “Dopaminergic medication led to coherence within the medial and orbitofrontal cortex in the delta/theta frequency range. This is in line with known side effects of dopamine treatment such as deteriorated executive functions in Parkinson’s disease.”
  
  In this article, authors repeatedly state their method allows them to delineate between pathological and physiological connectivity, but they don't explain how dynamical systems and discrete-state stochasticity support that goal.
  
  To recapitulate, the HMM divides a continuous time series into discrete states. Each state is a time-delay embedded covariance matrix reflecting the underlying connectivity between brain regions as well as the specific temporal dynamics in the data when that state is active. See Packard et al. (1980) for details about how a time-delay embedding characterises a linear dynamical system.
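
  To make the idea of a time-delay embedded covariance matrix concrete, here is a minimal single-channel sketch. It is not the authors' HMM pipeline (which operates on multi-channel MEG and LFP data): the function names, the number of lags, and the toy signal are hypothetical and serve only to show how lagged copies of a signal are stacked and summarized by a covariance matrix.

```python
def delay_embed(series, n_lags):
    """Return one row per time point, each holding n_lags consecutive samples."""
    if n_lags < 1 or n_lags > len(series):
        raise ValueError("n_lags must be between 1 and len(series)")
    return [list(series[t:t + n_lags]) for t in range(len(series) - n_lags + 1)]


def covariance_matrix(rows):
    """Sample covariance (n - 1 denominator) across rows; columns are lags."""
    n = len(rows)
    k = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(k)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in rows) / (n - 1)
            for b in range(k)
        ]
        for a in range(k)
    ]


# Toy signal (made-up): a linear ramp embedded with three lags.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
embedded = delay_embed(signal, n_lags=3)
cov = covariance_matrix(embedded)
```

  The off-diagonal entries of such a matrix describe how the signal co-varies with lagged copies of itself, which is how an embedded covariance can carry temporal (and hence spectral) information in addition to instantaneous coupling.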
  
  Please note that the HMM was used as a data-driven, descriptive approach without explicitly assuming any a-priori relationship with pathological or physiological states. The relation between biology and the HMM states thus emerged purely from the data; i.e., it is empirical. What we claim in this work is simply that the features captured by the HMM hold some relation with the physiology even though the estimation of the HMM was completely unsupervised (i.e. blind to the studied conditions). We have also added this point to the limitations of the study on page 19, and added the following to the introduction to guide the reader more intuitively (page 4):
  
  “To allow the system to dynamically evolve, we use time delay embedding. Theoretically, delay embedding can reveal the state space of the underlying dynamical system (Packard et al., 1980). Thus, by delay-embedding PD time series OFF and ON medication we uncover the differential effects of a neurotransmitter such as dopamine on underlying whole brain connectivity.”
  
  Reviewer #2:
  
  Sharma et al. investigated the effect of dopaminergic medication on brain networks in patients with Parkinson's disease combining local field potential recordings from the subthalamic nucleus and magnetoencephalography during rest. They aim to characterize both physiological and pathological spectral connectivity.
  
  They identified three networks, or brain states, that are differentially affected by medication. Under medication, the first state (termed hyperdopaminergic state) is characterized by increased connectivity of frontal areas, supposedly responsible for deteriorated frontal executive function as a side effect of medical treatment. In the second state (communication state), dopaminergic treatment largely disrupts cortico-STN connectivity, leaving only selected pathways communicating. This is in line with current models that propose that alleviation of motor symptoms relates to the disruption of pathological pathways. The local state, characterized by STN-STN oscillatory activities, is less affected by dopaminergic treatment.
  
  The authors utilize sophisticated methods with the potential to uncover the dynamics of activities within different brain networks, which opens the avenue to investigate how the brain switches between different states, and how these states are characterized in terms of spectral, local, and temporal properties. The conclusions of this paper are mostly well supported by data, but some aspects, mainly about the presentation of the results, remain:
  
  We would like to thank the reviewer for his succinct and clear understanding of our work.
  
  1) The presentation of the results is suboptimal and needs improvement to increase readers' comprehension. At some points this section seems rather unstructured, some results are presented multiple times, and some passages already include points rather suitable for the discussion, which adds too much information for the results section.
  
  We have removed repetitions in the results sections and removed the rather lengthy introductory parts of each subsection. Moreover, we have now moved all parts that were already an interpretation of our findings to the discussion.
  
  2) It is intriguing that the hyperdopaminergic state is not only identified under medication but also in the off-state. This is intriguing, especially with the results on the temporal properties of states showing that the time of the hyperdopaminergic state is unaffected by medication. When such a state can be identified even in the absence of levodopa, is it really optimal to call it "hyperdopaminergic"? Do the results not rather suggest that the identified network is active both off and on medication, while during the latter state its activities are modulated in a way that could relate to side effects?
  
  The reviewer’s interpretations of the results pertaining to the hyper-dopaminergic state are correct. The states had been named post-hoc as explained in the results section. The hyper-dopaminergic state’s name derived from it showing the overdosing effects of dopamine. Of course, these results are only visible on medication. But off medication, this state also exists without exhibiting the effects of excess dopamine. To avoid confusion or misinterpretation of the findings and also following the relevant comment by reviewer 1, we renamed all states to be more descriptive:
  
  Hyperdopaminergic > Cortico-cortical state
  
  Communication > Cortico-STN state
  
  Local > STN-STN state.
  
  3) Some conclusions need to be improved/more elaborated. For example, the coherence of bilateral STN-STN did not change between the medication off and on states. Yet it is argued that a) "Since synchrony limits information transfer (Cruz et al. 2009; Cagnan, Duff, and Brown 2015; Holt et al. 2019), local oscillations are a potential mechanism to prevent excessive communication with the cortex" (line 436) and b) "Another possibility is that a loss of cortical afferents causes local basal ganglia oscillations to become more pronounced" (line 438). Can these conclusions really be drawn if the local oscillations did not change in the first place?
  
  We apologize for the unclear description. Our conclusion was based on the following results:
  
  a) We state that STN-STN connectivity as measured by the magnitude of STN-STN coherence does not change OFF vs ON medication in the Cortico-STN state. This result is obtained using inter-medication analysis.
  
  b) But ON medication, STN-STN coherence in the Cortico-STN state was significantly different from mean coherence within the ON condition. These results are obtained using intra-medication analysis.
  
  Based on this, we conclude that in the Cortico-STN state, although OFF vs ON medication the magnitude of STN-STN coherence was unchanged, the STN-STN coherence was significantly different from mean coherence in the ON medication condition. The emergence of synchronous STN-STN activity may limit information exchange between STN and cortex ON medication.
  
  An alternative explanation for these findings might be a mechanism preventing connectivity between cortex and the STN ON medication. This missing interaction between STN and cortex might cause STN-STN oscillations to increase compared to the mean coherence within the ON state. Unfortunately, we cannot test such causal influences with our analysis.
  
  We have added the following discussion to the manuscript on page 17 in order to improve the exposition:
  
  “Bilateral STN–STN coherence in the alpha and beta band did not change in the cortico-STN state ON versus OFF medication (InterMed analysis). However, STN-STN coherence was significantly higher than the mean level ON medication (IntraMed analysis). Since synchrony limits information transfer (Cruz et al. 2009; Cagnan, Duff, and Brown 2015; Holt et al. 2019), the high coherence within the STN ON medication could prevent communication with the cortex. A different explanation would be that a loss of cortical afferents leads to increased local STN coherence. The causal nature of the cortico-basal ganglia interaction is an endeavour for future research.”
  
  Reviewer #3:
  
  In PD, pathological neuronal activity along the cortico-basal ganglia network notably consists in the emergence of abnormal synchronized oscillatory activity. Nevertheless, synchronous oscillatory activity is not necessarily pathological and also serves crucial cognitive functions in the brain. Moreover, the effects of dopaminergic medication on oscillatory network connectivity occurring in PD are still poorly understood. To clarify these issues, Sharma and colleagues simultaneously recorded MEG-STN LFP signals in PD patients and characterized the effect of dopamine (ON and OFF dopaminergic medication) on oscillatory whole-brain networks (including the STN) in a time-resolved manner. Here, they identified three physiologically interpretable spectral connectivity patterns and found that cortico-cortical, cortico-STN, and STN-STN networks were differentially modulated by dopaminergic medication.
  
  Strengths:
  
  1) Both the methodological and experimental approaches used are thoughtful and rigorous.
  
  a) The use of an innovative data-driven machine learning approach (by employing a hidden Markov model), rather than hand-crafted analyses, to identify physiologically interpretable spectral connectivity patterns (i.e., distinct networks/states) is undeniably an added value. In doing so, the results are not biased by human expertise and subjectivity, which makes them even more solid.
  
  b) So far, the recurrent oscillatory patterns of transient network connectivity within and between the cortex and the STN reported in PD have been assessed only in terms of specific cortico-STN spectral connectivity. Conversely, whole-brain MEG studies in PD patients did not account for cortico-STN and STN-STN connectivity. Here, the authors studied, for the first time, the whole-brain connectivity including the STN (whole brain-STN approach) and therefore provide new evidence of the brain connectivity reported in PD, as well as new information regarding the effect of dopaminergic medication on the recurrent oscillatory patterns of transient network connectivity within and between the cortex and the STN reported in PD.
  
  2) Studying the temporal properties of the recurrent oscillatory patterns of transient network connectivity both ON and OFF medication is extremely important and provides interesting and crucial information for delineating pathological versus physiologically relevant spectral brain connectivity in PD.
  
  We would like to thank the reviewer for their valuable feedback and correct interpretation of our manuscript.
  
  Weaknesses:
  
  1) In this study, the authors implied that the ON dopaminergic medication state corresponds to a physiological state. However, as correctly mentioned in the limitations of the study, they did not have (for obvious reasons) a control/healthy group. Moreover, no one can exclude the emergence of compensatory and/or plasticity mechanisms in the brain of the PD patients related to the duration of the disease and/or the history of the chronic dopamine-replacement therapy (DRT). Duration of the disease and DRT history should therefore be considered when characterizing the recurrent oscillatory patterns of transient network connectivity within and between the cortex and the STN reported in PD, as well as when examining the effect of the dopaminergic medication on the functioning of these specific networks.
  
  We would like to thank the reviewer for pointing this out. We regressed duration of disease (year of measurement – year of onset) on the temporal properties of the HMM states. We found no relationship between any of the temporal properties and disease duration. Similarly, we regressed levodopa equivalent dosage for each subject on the temporal properties and found no relationship. We now discuss this point in the manuscript (page 20):
  
  “A further potential influencing factor might be the disease duration and the amount of dopamine patients are receiving. Both factors were not significantly related to the temporal properties of the states.”
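  
  As an illustration only, the check described in this response can be sketched as a simple least-squares regression of one temporal property on disease duration. The response does not state which software or model was used, so the function name `simple_regression`, the choice of a single-predictor fit, and all numbers below are assumptions made for the sketch, not the authors' analysis or data.

```python
from math import sqrt


def simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x, plus Pearson r."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical values for illustration (not data from the study).
year_of_measurement = [2019, 2019, 2020, 2020, 2021, 2021]
year_of_onset = [2017, 2015, 2015, 2013, 2012, 2009]
# Disease duration as defined in the response: year of measurement - year of onset.
disease_duration = [m - o for m, o in zip(year_of_measurement, year_of_onset)]
mean_state_lifetime_ms = [110.0, 95.0, 120.0, 100.0, 115.0, 105.0]

slope, intercept, r = simple_regression(disease_duration, mean_state_lifetime_ms)
print(f"slope={slope:.3f} ms/year, r={r:.3f}")
```

  A slope and correlation close to zero would be in line with the reported absence of a relationship; a formal conclusion would additionally need a significance test, which this sketch leaves out.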
  
  2) Here, the authors recorded LFP activity in the STN. LFP represents sub-threshold (e.g., synaptic input) activity at best (Buzsaki et al., 2012; Logothetis, 2003). Recent studies demonstrated that mono-polar, but also bi-polar, BG LFPs are largely contaminated by volume conductance of cortical electroencephalogram (EEG) activity even when re-referenced (Lalla et al., 2017; Marmor et al., 2017). Therefore, it is likely that STN LFPs do not accurately reflect local cellular activity. In this study, the authors examined and measured coherence between cortical areas and STN. However, they cannot guarantee that STN signals were not contaminated by volume conducted signals from the cortex.
  
  We appreciate this concern and thank the reviewer for bringing it up. Marmor et al. (2017) investigated this on humans and is therefore most closely related to our research. They find that re-referenced STN recordings are not contaminated by cortical signals. Furthermore, the data in Lalla et al. (2017) is based on recordings in rats, making a direct transfer to human STN recordings problematic due to the different brain sizes. Since we re-referenced our LFP signals as recommended in the Marmor paper, we think that contamination due to cortical signals is relatively minor; see Litvak et al. (2011), Hirschmann et al. (2013), and Neumann et al. (2016) for additional references supporting this. That being said, we now discuss this potential issue in the paper on page 20.
  
  “Lastly, we recorded LFPs from within the STN – an established recording procedure during the implantation of DBS electrodes in various neurological and psychiatric diseases. Although for Parkinson patients results on beta and tremor activity within the STN have been reproduced by different groups (Reck et al. 2010, Litvak et al. 2011, Florin et al. 2013, Hirschmann et al. 2013, Neumann et al. 2016), it is still not fully clear whether these LFP signals are contaminated by volume-conducted cortical activity. However, while volume conduction seems to be a larger problem in rodents even after re-referencing the LFP signal (Lalla et al. 2017), the same was not found in humans (Marmor et al. 2017).”
  
  3) The methods and data processing are rigorous but also very sophisticated, which makes it difficult to interpret the results in terms of oscillatory activity and neural synchronization.
  
  To aid intuition on how to interpret the result in light of the methods used, one can compare the analysis pipeline to a windowing approach. In a more standard approach, windows of different time length can be defined for different epochs within the time series and for each window coherence and connectivity can be determined. The difference in our approach is that we used an unsupervised learning algorithm to select windows of varying length based on recurring patterns of whole brain network activity. Within those defined windows we then determine the oscillatory properties via coherence and power – which is the same as one would do in a classical analysis. We have added an explanation of the concept of “oscillatory activity” within our framework to the introduction (page 2 footnote):
  
  “For the purpose of our paper, we refer to oscillatory activity or oscillations as recurrent, but transient frequency–specific patterns of network activity, even though the underlying patterns can be composed of either sustained rhythmic activity, neural bursting, or both (Quinn et al. 2019).”
  
  Moreover, we provide a more intuitive explanation of the analysis within the first section of the results (page 4):
  
  “Using an HMM, we identified recurrent patterns of transient network connectivity between the cortex and the STN, which we henceforth refer to as an ‘HMM state’. In comparison to classic sliding-window analysis, an HMM solution can be thought of as a data-driven estimation of time windows of variable length (within which a particular HMM state was active): once we know the time windows when a particular state is active, we compute coherence between different pairs of regions for each of these recurrent states.”
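  
  The comparison with sliding windows can be made concrete with a small sketch. Assuming the HMM has already been fitted and decoded into one state label per sample (for example a Viterbi path), the variable-length windows are simply the runs of identical labels. The function names, the toy state sequence, and the two summary measures (fractional occupancy and mean lifetime, common choices for "temporal properties") are assumptions for illustration; the coherence computation within each window is omitted.

```python
from itertools import groupby


def state_windows(state_sequence):
    """Run-length encode a per-sample state sequence into (state, start, stop) windows.

    `stop` is exclusive, so a window covers samples start .. stop - 1.
    """
    windows = []
    start = 0
    for state, run in groupby(state_sequence):
        length = sum(1 for _ in run)
        windows.append((state, start, start + length))
        start += length
    return windows


def temporal_properties(state_sequence, sampling_rate_hz):
    """Fractional occupancy and mean lifetime (in seconds) for each state."""
    windows = state_windows(state_sequence)
    total = len(state_sequence)
    props = {}
    for state in sorted(set(state_sequence)):
        lengths = [stop - start for s, start, stop in windows if s == state]
        props[state] = {
            "fractional_occupancy": sum(lengths) / total,
            "mean_lifetime_s": sum(lengths) / len(lengths) / sampling_rate_hz,
        }
    return props


# Hypothetical decoded state sequence, one label per sample.
states = [0, 0, 0, 1, 1, 2, 2, 2, 2, 0, 0, 1]
for window in state_windows(states):
    print(window)
print(temporal_properties(states, sampling_rate_hz=4))
```

  In an analysis of this kind, spectral measures such as coherence and power would then be estimated from the samples falling inside all windows belonging to the same state.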
  
  4) Previous studies have shown that abnormal oscillations within the STN of PD patients are limited to its dorsolateral/motor region, thus dividing the STN into a dorsolateral oscillatory/motor region and ventromedial non-oscillatory/non-motor region (Kuhn et al. 2005; Moran et al. 2008; Zaidel et al. 2009, 2010; Seifreid et al. 2012; Lourens et al. 2013, Deffains et al., 2014). However, the authors do not provide clear information about the location of the LFP recordings within the STN.
  
  We selected the electrode contacts based on intraoperative microelectrode recordings (for details, see page 23). The first directional recording height after the entry into the STN was selected to obtain the three directional LFP recordings from the respective hemisphere. This practice has been proven to improve target location (Kochanski et al., 2019; Krauss et al., 2021). The common target area for DBS surgery is the dorsolateral STN. To confirm that the electrodes were actually located within this part of the STN, we have now reconstructed the DBS location with Lead-DBS (Horn et al. 2019). All electrodes – except for one – were located within the dorsolateral STN (see figure 7 of the manuscript). To exclude that our results were driven by an outlier, we reanalysed our data without this patient. No change in the overall connectivity pattern was observed (see figure R3 of the response letter).
  
  Figure R2: Lead DBS reconstruction of the location of electrodes in the STN for different subjects. The red electrodes have not been placed properly in the STN. The contacts marked in red represent the directional contacts from which the data was used for analysis.
  
  Figure R3: HMM states obtained after running the analysis without the subject with the electrode outside the STN.
  
  References:
  
  Buzsáki G, Anastassiou CA, Koch C. The origin of extracellular fields and currents-EEG, ECoG, LFP and spikes. Nat Rev Neurosci 2012; 13: 407–20.
  
  Cagnan H, Duff EP, Brown P. The relative phases of basal ganglia activities dynamically shape effective connectivity in Parkinson’s disease. Brain 2015; 138: 1667–78.
  
  Cools R. Enhanced or impaired cognitive function in Parkinson’s disease as a function of dopaminergic medication and task demands. Cereb Cortex 2001; 11: 1136–43.
  
  Cruz A V., Mallet N, Magill PJ, Brown P, Averbeck BB. Effects of dopamine depletion on network entropy in the external globus pallidus. J Neurophysiol 2009; 102: 1092–102.
  
  Florin E, Erasmi R, Reck C, Maarouf M, Schnitzler A, Fink GR, et al. Does increased gamma activity in patients suffering from Parkinson’s disease counteract the movement inhibiting beta activity? Neuroscience 2013; 237: 42–50.
  
  George JS, Strunk J, Mak-Mccully R, Houser M, Poizner H, Aron AR. Dopaminergic therapy in Parkinson’s disease decreases cortical beta band coherence in the resting state and increases cortical beta band power during executive control. NeuroImage Clin 2013; 3: 261–70.
  
  Hirschmann J, Özkurt TE, Butz M, Homburger M, Elben S, Hartmann CJ, et al. Differential modulation of STN-cortical and cortico-muscular coherence by movement and levodopa in Parkinson’s disease. Neuroimage 2013; 68: 203–13.
  
  Holt AB, Kormann E, Gulberti A, Pötter-Nerger M, McNamara CG, Cagnan H, et al. Phase-dependent suppression of beta oscillations in parkinson’s disease patients. J Neurosci 2019; 39: 1119–34.
  
  Horn A, Li N, Dembek TA, Kappel A, Boulay C, Ewert S, et al. Lead-DBS v2: Towards a comprehensive pipeline for deep brain stimulation imaging. Neuroimage 2019; 184: 293–316.
  
  Kelly C, De Zubicaray G, Di Martino A, Copland DA, Reiss PT, Klein DF, et al. L-dopa modulates functional connectivity in striatal cognitive and motor networks: A double-blind placebo-controlled study. J Neurosci 2009; 29: 7364–78.
  
  Kochanski RB, Bus S, Brahimaj B, Borghei A, Kraimer KL, Keppetipola KM, et al. The impact of microelectrode recording on lead location in deep brain stimulation for the treatment of movement disorders. World Neurosurg 2019; 132: e487–95.
  
  Krauss P, Oertel MF, Baumann-Vogel H, Imbach L, Baumann CR, Sarnthein J, et al. Intraoperative neurophysiologic assessment in deep brain stimulation surgery and its impact on lead placement. J Neurol Surgery, Part A Cent Eur Neurosurg 2021; 82: 18–26.
  
  Lalla L, Rueda Orozco PE, Jurado-Parras MT, Brovelli A, Robbe D. Local or not local: Investigating the nature of striatal theta oscillations in behaving rats. eNeuro 2017; 4: 128–45.
  
  Litvak V, Jha A, Eusebio A, Oostenveld R, Foltynie T, Limousin P, et al. Resting oscillatory cortico-subthalamic connectivity in patients with Parkinson’s disease. Brain 2011; 134: 359–74.
  
  MacDonald PA, MacDonald AA, Seergobin KN, Tamjeedi R, Ganjavi H, Provost JS, et al. The effect of dopamine therapy on ventral and dorsal striatum-mediated cognition in Parkinson’s disease: Support from functional MRI. Brain 2011; 134: 1447–63.
  
  MacDonald PA, Monchi O. Differential effects of dopaminergic therapies on dorsal and ventral striatum in Parkinson’s disease: Implications for cognitive function. Parkinsons Dis 2011; 2011: 1–18.
  
  Marmor O, Valsky D, Joshua M, Bick AS, Arkadir D, Tamir I, et al. Local vs. volume conductance activity of field potentials in the human subthalamic nucleus. J Neurophysiol 2017; 117: 2140–51.
  
  Neumann WJ, Degen K, Schneider GH, Brücke C, Huebl J, Brown P, et al. Subthalamic synchronized oscillatory activity correlates with motor impairment in patients with Parkinson’s disease. Mov Disord 2016; 31: 1748–51.
  
  Packard NH, Crutchfield JP, Farmer JD, Shaw RS. Geometry from a time series. Phys Rev Lett 1980; 45: 712–6.
  
  Quinn AJ, van Ede F, Brookes MJ, Heideman SG, Nowak M, Seedat ZA, et al. Unpacking Transient Event Dynamics in Electrophysiological Power Spectra. Brain Topogr 2019; 32: 1020–34.
  
  Reck C, Himmel M, Florin E, Maarouf M, Sturm V, Wojtecki L, et al. Coherence analysis of local field potentials in the subthalamic nucleus: Differences in parkinsonian rest and postural tremor. Eur J Neurosci 2010; 32: 1202–14.
  
  Shohamy D, Myers CE, Grossman S, Sage J, Gluck MA. The role of dopamine in cognitive sequence learning: Evidence from Parkinson’s disease. Behav Brain Res 2005; 156: 191–9.
  
  Spiegel J, Hellwig D, Samnick S, Jost W, Möllers MO, Fassbender K, et al. Striatal FP-CIT uptake differs in the subtypes of early Parkinson’s disease. J Neural Transm 2007; 114: 331–5.
  
  Vaillancourt DE, Schonfeld D, Kwak Y, Bohnen NI, Seidler R. Dopamine overdose hypothesis: Evidence and clinical implications. Mov Disord 2013; 28: 1920–9.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.09.24.308122v3
www.medrxiv.org www.medrxiv.org

New submission 03/06/2023, 09:52:05

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This study provided evidence to interpret and understand the aging and developmental processes in children. The main strength of the study is that it measures a set of biological age measures and a set of developmental measures, thus providing multi-faceted evidence to explain the associations between aging and development in children. The main weakness of this study is that how to measure and test the aging hypotheses of "a buildup of biological capital model" and "wear and tear" is not well explained. Why would the observed associations between biological age measures and developmental measures support the aforementioned aging theories?
  
  Thank you. On reflection we agree that how to test the aging hypotheses of "a buildup of biological capital model" and "wear and tear" is not well-explained in the manuscript. We have addressed this issue in the point-by-point responses below:
  
  1) Abstract - conclusion: The aging hypothesis of "a buildup of biological capital model" and "wear and tear" were mentioned in the conclusion without an explanation of these theories in the previous section. Readers who are not experts in the field may not understand the logic.
  
  We have replaced these phrases in the abstract with the following interpretation, which we hope will be more readily understood:
  
  “Patterns of associations suggested that accelerated immunometabolic age may be beneficial for some aspects of child development while accelerated DNA methylation age and telomere attrition may reflect early detrimental aspects of biological ageing, apparent even in children.”
  
  2) Result - Biological age marker performance: the correlation between transcriptome age and chronological age is very strong (r =0.94). I am afraid that very little age-independent information could be captured by the transcriptome age. Is it possible to down-regulate the age dependency of the transcriptome age in the training process?
  
  Thank you for this important comment: We agree the high accuracy of this clock may in fact reduce its relevance as a biological age marker and note that this is a concern generally in the field. We have explored the possibility of using a less accurate transcriptome age model as follows: Instead of elastic net modelling we tested using the lasso penalisation only, which will result in more parsimonious (sparse) models as less important features are dropped as the strength of the lambda parameter is increased. Plotting the correlation in the test set against number of features in models, as the lambda is sequentially increased, we can see (as shown in Author response image 1 by the blue line) that after the inclusion of around 200 features, the gain in accuracy becomes less steep.
  
  Author response image 1.
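  
  The link between the penalty strength lambda and the number of retained features can be illustrated with a deliberately simplified sketch. It assumes orthonormal predictors, in which case the lasso solution for the objective (1/2)·||y − Xb||² + lambda·||b||₁ is the soft-thresholded least-squares coefficient vector. Real transcriptome data are not orthonormal, and the authors' fitting procedure is not reproduced here; the function names and coefficient values are hypothetical.

```python
def soft_threshold(value, lam):
    """Soft-thresholding operator: shrink towards zero by lam, clipping at zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_orthonormal(ols_coefficients, lam):
    """Lasso solution under the simplifying assumption of orthonormal predictors."""
    return [soft_threshold(b, lam) for b in ols_coefficients]


def sparsity_path(ols_coefficients, lambdas):
    """Number of non-zero coefficients at each penalty strength."""
    return [
        sum(1 for b in lasso_orthonormal(ols_coefficients, lam) if b != 0.0)
        for lam in lambdas
    ]


# Hypothetical least-squares coefficients, ordered from strong to weak features.
ols = [3.0, -2.0, 0.8, -0.3, 0.05]
lambdas = [0.0, 0.1, 0.5, 1.0, 2.5]
print(sparsity_path(ols, lambdas))
```

  As lambda grows, the weakest features are removed first, which is the behaviour exploited when plotting test-set correlation against the number of features.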
  
  We then tested the sensitivity of a model optimised for sparsity at the expense of some prediction accuracy, selected based on visual inspection (blue line, r in test set =0.87, number of features= 187) of the above plot, against developmental measures, compared to the most accurate model as presently included in the manuscript:
  
  Author response image 2.
  
  We find that, across all outcomes tested, the less accurate model, based on only the most important features, does not provide an improvement in sensitivity to developmental outcomes compared to the currently used model.
  
  We therefore prefer to keep the more accurate model in this study, especially as it is consistent with the methodology used in the Horvath and Immunometabolic age models and generally in the field, and otherwise it is not obvious how the biological clock should be trained (especially for children without mortality data) without altering the whole approach of the study. We have acknowledged and discussed this issue on page 15.
  
  3) The study population comes from several cohorts, which might influence the results. How were the cohort effects controlled for in the analyses?
  
  The possible influence of cohort is a limitation of the study which we have discussed on page 16. We did not include cohort as a predictor in any of the candidate biological clocks since this may reduce detection of some age-related features. Instead, we include a variable for cohort as a fixed effect in all analyses with risk factors and developmental outcomes and examined the performance of candidate biological clocks in predicting chronological age within each cohort. As a further check, we have added an additional sensitivity analysis (Figure 4-figure supplement 6), against developmental outcomes significant in the main analysis, stratified by cohort. We find generally consistent effects across cohorts.
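  
  What a cohort fixed effect does can be sketched for the simplest case of one predictor: including a dummy variable per cohort is equivalent to subtracting each cohort's mean from predictor and outcome before fitting the slope. The function name and the toy numbers are hypothetical, and the models in the study contain further covariates that are not represented here.

```python
from collections import defaultdict


def within_cohort_slope(x, y, cohort):
    """Slope of y on x after removing cohort-specific means (cohort fixed effects)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for xi, yi, c in zip(x, y, cohort):
        sums[c][0] += xi
        sums[c][1] += yi
        sums[c][2] += 1
    sxy = 0.0
    sxx = 0.0
    for xi, yi, c in zip(x, y, cohort):
        sum_x, sum_y, n = sums[c]
        dx = xi - sum_x / n  # deviation from the cohort mean of x
        dy = yi - sum_y / n  # deviation from the cohort mean of y
        sxy += dx * dy
        sxx += dx * dx
    return sxy / sxx


# Hypothetical data: both cohorts share a slope of 2, but cohort "B" has an
# outcome that is shifted upwards by 10 units.
age_acceleration = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
outcome = [2.0, 4.0, 6.0, 12.0, 14.0, 16.0]
cohort = ["A", "A", "A", "B", "B", "B"]
print(within_cohort_slope(age_acceleration, outcome, cohort))
```

  The cohort-level shift does not affect the estimated slope, which is the purpose of adjusting for cohort in this way.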
  
  4) Figure 3 only showed the number of p values. Can the author also provide the number of point estimates and 95% confidence intervals, perhaps in the supplemental table?
  
  This information was originally provided in supplemental table 5 (now Supplementary file 7), combined with the sensitivity analyses. To make this information easier to find, we have made this a stand-alone table (table 3). We now direct readers to this information within the caption of Figure 4 (previously figure 2).
  
  Reviewer #2 (Public Review):
  
  The study had an especially relevant aim for aging research and utilized various data types in an especially interesting human population. The multi-omics perspective adds great value to the work. The researchers aimed to evaluate how different indicators of biological age (BA) behave in children during their developmental stage. In the analysis, relationships between indicators of BA, health risk factors, and developmental factors were assessed in cross-sectional data comprising children aged 5-12 years. The manuscript is well-written and easy to follow. The methodology is good. The authors largely succeeded in reaching their aim.
  
  In the study, previously known and unknown biological age indicators were used. Known indicators included telomere length and Horvath's epigenetic age. Unknown (novel) indicators, transcriptomic and immunometabolic clocks, were developed in the present study and they showed a strong correlation with calendar age in this population, also in the validation data set. Although the transcriptomic and immunometabolic clocks have the potential of being true indicators of biological age, they are still lacking scientific evidence of being such indicators in adults. That is, their associations with age-related diseases and mortality are yet to be shown. Thus, the major remark of the study relates to the phrasing: these novel transcriptomic and immunometabolic clocks should be presented as BA indicator candidates waiting for the needed evidence.
  
  Thank you for this important observation. However, we still find that “biological age indicator” is a useful umbrella term in this manuscript and there is not an obvious alternative. We therefore have added the following sentence on page 8, and highlighted the difference between the markers at key points in the abstract, introduction, results and discussion.
  
  “We note that since a common definition of markers of biological age is that they should be associated with age-related disease and mortality [69] these new clocks may only currently be considered “candidate” biological age markers. However, we have referred to both the established and candidate markers as biological age markers throughout to simplify presentation.”
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

medrxiv.org/content/10.1101/2023.01.23.23284901v1
www.biorxiv.org www.biorxiv.org

New submission 04/12/2022, 10:59:28

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review)
  
  [...] One potential issue is that the high myelination signal is associated with the compartment in V2 (pale stripes) which was not functionally defined itself but by the absence of specific functional activations. No difference was reported between those stripes that were defined functionally. Other explanations for the differential pattern of qMRI signals, e.g. that the ROI distribution for presumed pale stripes is uneven (more foveal) or that ROIs with low activations due to some other factor show higher myelin-related signals, cannot be excluded based on the analysis presented.
  
  Indeed, it would have been advantageous to directly functionally delineate pale stripes in V2. Since we were not able to achieve this by fMRI, we needed an indirect method to infer pale stripe contributions in the analysis. We also added a statement in the discussion section to emphasize this more (p. 9, lines 286–288).
  
  Furthermore, different myelination between thin and thick stripes was not tested, since we did not have a concrete hypothesis on this. Despite the conflicting findings of stronger myelination in dark or pale CO stripes in the literature, no histological study stated myelination differences between dark CO thin and thick stripes. Therefore, our primary interest and hypothesis lay in comparing the myelination of thin/thick and pale stripes using MRI.
  
  Thank you very much for this comment about potential other sources of differential qMRI parameter patterns. Indeed, based on the original analysis we could not exclude that the absence of functional activation around the foveal representation may have biased our analysis. We therefore added a supporting analysis, in which we excluded the region around the foveal representation from the analysis. The excluded cortical region was kept consistent between participants by excluding the same eccentricity range in all maps. We added more details in the results section of the revised manuscript (p. 8, lines 189–202). In Figure 5-Supplement 1 and Figure 5-Supplement 3, results from this supporting analysis are shown which reproduced the primary findings from the main analysis, particularly the relatively higher myelination of pale stripes.
  
  ROI definitions solely based on fMRI activation amplitude have additional limitations. However, we find it unlikely that a small fMRI effect size and low contrast-to-noise ratio (i.e. a stochastic cause of low statistical parameter values/“activation”) have impacted the results, since Figure 3 shows that we could achieve a high degree of reproducibility for each participant.
  
  We would note that the fact that we found consistent differences across MPM and MP2RAGE sessions makes some potential artifacts driving the differences unlikely. We also find it unlikely that systematic cerebral blood volume differences between stripes would have driven the results. A higher local blood volume would lead to increased BOLD responses but also to a higher R1 value due to the deoxy-hemoglobin induced relaxation, which is opposite to the observation of higher activity in the thick/thin stripes but lower R1 values.
  
  Further studies using other functional metrics (e.g. VASO, ASL etc.) may help us to even more clearly demonstrate specificity but were out of the scope of this already rather extensive study. Although we have added extensive further analyses in the revised manuscript such as controlling for foveal effects or registration performance, we did not see a possibility to fully exclude a systematic bias that might potentially be caused by unknown factors.
  
  Another theoretical and practical issue is the question of "ground truth" for the non-invasive qMRI measures, as the authors - as their starting point - roundly dismiss direct histological tissue studies as conflicting, rather than take a critical look at the merit of the conflicting study results and provide a best hypothesis. If so, they need to explain better how they calibrate their non-invasive MR measurements of myelin.
  
  We agree and have now further elaborated on the limits of specificity of the R1 and R2* signal as a cortical myelin marker (p. 2, lines 68–88; p. 6, line 163; p. 8, line 216; p. 9, lines 257–260). However, we still think that it is important for the reader to appreciate the conflicting results in histological studies using staining methods for myelin, which adds to the study’s background.
  
  We did not intend to give the impression that MRI provides the missing ground-truth to adjudicate histological controversies, but that it provides an alternative and additional view on the open questions. We changed the introduction to better reflect the aspect that the study offers a unique view by providing myelination proxies and functional measures in the same individual, which allows for direct comparison and investigation of structure-function relationships (see p. 2, lines 68–70; p. 3, lines 93–95), which is not accessible to any other approach. Nevertheless, we would like to note that R1 has been well established as a myelin marker under particular conditions (Kirilina et al., 2020; Mancini et al., 2020; Lazari and Lipp, 2021). It has also been widely used for cortical myelin mapping across a variety of populations, systems and field strengths. We added this statement to the introduction (see p. 2, lines 82-85). We note that we excluded volunteers with pathologies or neurological disorders from the study and their mean age was about 28 years. Thus, we had conditions comparable to previous (validation) studies.
  
  Because of the contradictory findings of histological studies, we could not further finesse the hypothesis beyond our previous a priori hypothesis that we expected differences in the myelin sensitive MRI metrics between the thin/thick versus pale stripes. To improve the contextual understanding, we added a paragraph in the discussion section covering in more depth how the MRI results relate to known histological findings (see pp. 8–9, lines 216–240).
  
  While this paper makes an important contribution to the question of the association of specific myelination patterns defining the columnar architecture in V2, it is not entirely clear whether the authors can fully resolve it with the data presented.
  
  Indeed, we agree that non-invasive aggregate measures, such as the R1 metrics, offer limited specificity which precludes a fully conclusive inference about cortical myelination. We have further emphasized this on several occasions in the text (see p. 2, lines 68–88; p. 6, line 163; p. 8, line 216; p. 9, lines 257–260). Since the correspondence of cortical myelin levels and R1 (and other metrics) is an active area of research, we expect that the understanding, sensitivity and specificity of R1 to cortical myelination will further improve. We note that the use of qMRI is a substantial advance over weighted MRI typically used, which suffers from lack of specificity due to instrumental idiosyncrasies and varying measurement conditions.
  
  Reviewer #2 (Public Review)
  
  [...] Unfortunately, this particular study seems to fall into an unhappy middle ground in terms of the conclusions that can be drawn: the relaxometry measures lack the specificity to be considered "ground truth", while the authors claim that the literature lacks consensus regarding the structures that are being studied. The authors propose that their results resolve whether or not stripes differ in their patterns of myelination, but R1 lacks the specificity to do this. While myelin is a primary driver of relaxation times in cortex, relaxometry cannot be considered to be specific to myelin. It is possible that the small observed changes in R1 are driven by myelin, but they could also reflect other tissue constituents, particularly given the small observed effect sizes. If the literature was clear on the pattern of myelination across stripes, this study could confirm that R1 measurements are sensitive to and consistent with this pattern. But the authors present the work as resolving the question of how myelination differs between stripes, which over-reaches what is possible with this method. As it stands, the measured differences in R1 between functionally-defined cortical regions are interesting, but require further validation (e.g., using invasive myelin staining).
  
  We agree that we have inadvertently overstated the specificity of R1 on several occasions in the text. We therefore toned down the statements concerning the correspondence between R1 and myelin throughout the manuscript (e.g. see p. 2, lines 68–88; p. 6, line 163; p. 8, line 216; p. 9, lines 257–260).
  
  We also removed the phrase that gave the impression that MRI can conclusively resolve the conflicting results found in histological studies. In the Introduction, we changed the corresponding paragraph by emphasizing the alternative view, which can be obtained from MRI by the possibility to investigate structure-function relationships in the living human brain, which would not be possible by invasive myelin staining (see p. 2, lines 68–70; p. 3, lines 93–95).
  
  We acknowledge that – perhaps aside from electron microscopy – all common markers have shortcomings, which limit their specificity. For example, classic histology is not quantitative and has yielded conflicting results. It also faces the fundamental issue that the composition of myelin (e.g., its lipid composition; González de San Román et al., 2018) varies significantly across the brain and within brain areas. Thus, we regard the different invasive/non-invasive measures as complementary. R1 adds to this arsenal of measures and can be acquired non-invasively. It has been shown to be a reliable myelin marker under certain circumstances. It follows the known myeloarchitecture patterns of the human brain, which was also checked for the data of the present study (see Figure 4 and Appendix 2). It is responsive to traumatic changes (Freund et al., 2019), development (Whitaker et al., 2016; Carey et al., 2018; Natu et al., 2019) and plasticity (Lazari et al., 2022). Since we studied healthy volunteers with no known pathologies who were sampled randomly from the population, we believe that the previous results generally apply and suggest sufficient specificity of the R1 marker. Of course, we cannot fully exclude bias due to unknown factors that have not been investigated/discovered by validation studies yet. However, in this case we expect that the systematic differences between stripe types would remain an important result most likely pointing to another interesting biological difference between stripes.
  
  While more research is needed to clarify the precise role of R1 for cortical myelin, we think that the meaningful determination of quantitative MR parameters within one cortical area is still interesting for the neuroscientific community.
  
  Moreover, the results make clear that R1 differences are not sufficiently strong to provide an independent measure of this structure (e.g., for segmentation of stripes). As such, one would still require fMRI to localise stripes, making it unclear what role R1 measures would play in future studies.
  
  Indeed, the small effect sizes observed in the present study still require a functional localization with fMRI. We expected small effect sizes using R1 and R2* due to the known small inter-areal or intra-cortical differences of MRI myelin markers. Therefore, this study aimed at a proof of concept, investigating whether intra-areal R1 differences at the spatial scale of columnar structures can be detected using non-invasive MRI. Our study shows that these differences can be seen, but currently not at the single-voxel level. We anticipate that with further improvements in sequence development and scanner hardware, high-resolution R1 estimates with sufficient SNR can be acquired, making fMRI redundant (for this kind of investigation). Please see the reply to the next comment concerning the impact of using R1 in future studies.
  
  The Introduction concludes with the statement that "Whereas recent studies have explored cortical myelination ... using non-quantitative, weighted MR images... we showed for the first time myelination differences using MRI on a quantitative basis". As written, this sentence implies that others have demonstrated that simpler non-quantitative imaging can achieve the same aims as qMRI. Simply showing that a given method is able to achieve an aim would not be sufficient: the authors should demonstrate that this constitutes an important advance.
  
  Thank you for this comment. It goes to the heart of the concerns raised about the specificity and sensitivity of MRI-based myelin metrics. We elaborate here on the main advantage of using qMRI in our current study and why it is more specific than weighted MR imaging. However, we emphasize that a thorough comparison between qMRI and weighted MRI is highly complex and beyond the scope of our paper; we refer to our recent review paper on qMRI for further details (Weiskopf et al., 2021). The signal in weighted MRI, even when optimized for the tissue of interest, additionally depends on inhomogeneities in both the RF transmit and receive (bias) fields. Other methods, such as using a ratio image (T1w/T2w), can cancel out the receive field bias entirely (provided there is no subject movement between scans) but not the transmit field bias. This hampers the direct analysis and interpretation of signal differences between distant regions of the brain. For high-resolution imaging applications, the use of high magnetic fields such as 7 T is beneficial or even mandatory due to signal-to-noise ratio (SNR) penalties. With increasing field strength, these inhomogeneities also affect small regions such as V2. In these cases, qMRI is advantageous since it provides metrics that are free from these technical biases, significantly improving the specificity. As high-field MRI has the potential to non-invasively study the structure and function of the human brain at the spatial scale of cortical layers and cortical columns, we believe that the results of our current study, which successfully demonstrate the applicability of qMRI to robustly detect small differences at the level of columnar systems, are relevant for future studies in the field of neuroscience.
  
  We emphasized these considerations in the revised manuscript (see p. 9, lines 273–285).
  
  The study includes a very small number of participants (n=4). The advantage of non-invasive in-vivo measurements, despite the fact that they are indirect measures, should be that one can study a reasonable number of subjects. So this low n seems to undermine that point. I rarely suggest additional data collection, but I do feel that a few more subjects would shore up the study's impact.
  
  The present study was conducted in line with a deep phenotyping approach. That is, we focused on acquiring highly reliable datasets on individuals. We did not intend to capture the population variance, which is often the goal of group studies, since low-level and basic features such as stripes in V2 are expected to be present in all healthy individuals. Thus, we prioritized test-retest measurements for fMRI sessions and an alternative MP2RAGE acquisition over a larger number of individuals. This resulted in 6–7 scanning sessions on different days for each individual, summing up to 26 long scanning sessions in total. We also note that the sample size used is not smaller than in other studies with a similar research question. For example, another fMRI study investigating V2 stripes in humans used the same sample size of n=4 (Dumoulin et al., 2017).
  
  The paper overstates what can be concluded in a number of places. For example, the paper suggests that R1 and R2 are highly specific to myelin in a number of places. For example, on p7 the text reads: "We tested whether different stripe types are differentially myelinated by comparing R1 and R2..." Relaxation times lack the specificity to definitively attribute these changes purely to myelin. Similarly, on p11: "Our study showed that pale stripes which exhibit lower oxidative metabolic activity according to staining with CO are stronger myelinated than surrounding gray matter in V2." This implies that the study directly links CO staining to myelination. In addition to using non-specific estimates of myelination, the study does not actually measure CO.
  
  We agree that we did not clearly point out the limitations of R1 myelin mapping. Therefore, we toned down the statements about the connection between cortical myelin and R1. The mentioned statements in the reviewer’s comment were changed accordingly (see p. 6, line 163; p. 11, lines 353–354). We also included a small paragraph to clarify the used terminology (color-selective thin stripes, disparity-selective thick stripes) in the manuscript (see p. 4, lines 110–114) to avoid the inadvertent conflation of CO staining and actually measured brain activity.
  
  I'm confused by the analysis in Figure 5. I can appreciate why the authors are keen to present a "tripartite" analysis (thick, thin, and pale stripes). But I find the gray curves confusing. As I understand it, the gray curves as generated include both the stripe of interest (red or blue plots) and the pale stripes. Why not just generate a three-way classification? Generating these plots in effect has already required hard classification of thin and thick stripes, so it is odd to create the gray plots, which mix two types of stripes. Alternatively, could you explicitly model the partial volume for a given cortical location (e.g., under the assumption that partial volume of thick and thin stripes is indicated by the z-score) for the corresponding functional contrast? One could then estimate the relaxation times as a simple weighted sum of stripe-wise R1 or R2.
  
  Figure on weighted average of stripe-wise R1 and R2. (a) shows the weighted sum of R1 (de-meaned and de-curved) over all V2 voxels. z-scores from color-selective thin stripe experiments and disparity-selective thick stripe experiments were used as weights in the left and middle group of bars, respectively. An intermediate threshold of zmax=1.96 was used, i.e., final weights were defined as weights=(z-1.96). Weights with z<0 were set to 0. For pale stripes (right group of bars), we used the maximum z-score value from thin and thick stripe measurements. We then set all weights with z≥1.96 to 0 and used the inverse as final weights, i.e., weights = -1 * (max(z)-1.96). (b) shows the same analysis for R2. Error bars indicate 1 standard error of the mean.
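
The weighting scheme described in the figure caption above can be sketched in a few lines of Python. This is only an illustrative reading of the caption, not the authors' analysis code: the function names (`stripe_weights`, `pale_weights`, `weighted_mean`) are hypothetical, negative weights are assumed to be clipped to zero, and the weighted sum is assumed to be normalized by the sum of weights, which the caption does not state explicitly.

```python
# Illustrative sketch of z-score-based weighting of voxel-wise R1 values.
# Assumptions (not taken from the source): negative weights are clipped
# to zero, and the weighted sum is normalized by the total weight.

Z_THRESHOLD = 1.96  # intermediate threshold named in the caption


def stripe_weights(z_scores, threshold=Z_THRESHOLD):
    """Weights for thin or thick stripes: (z - threshold), clipped at 0."""
    return [max(z - threshold, 0.0) for z in z_scores]


def pale_weights(z_thin, z_thick, threshold=Z_THRESHOLD):
    """Weights for pale stripes: -(max(z) - threshold) where max(z) < threshold."""
    weights = []
    for zt, zk in zip(z_thin, z_thick):
        z_max = max(zt, zk)
        # Voxels at or above threshold in either experiment get zero weight.
        weights.append(0.0 if z_max >= threshold else threshold - z_max)
    return weights


def weighted_mean(values, weights):
    """Weighted mean of voxel values (normalization is an assumption)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(v * w for v, w in zip(values, weights)) / total


if __name__ == "__main__":
    # Made-up example values for four voxels (de-meaned R1, arbitrary units).
    r1 = [0.010, -0.005, 0.002, -0.001]
    z_thin = [3.1, 0.4, 1.0, 2.2]
    z_thick = [0.2, 2.8, 0.5, 0.1]
    print("thin :", weighted_mean(r1, stripe_weights(z_thin)))
    print("thick:", weighted_mean(r1, stripe_weights(z_thick)))
    print("pale :", weighted_mean(r1, pale_weights(z_thin, z_thick)))
```

Under this reading, a voxel contributes to a stripe type in proportion to how far its z-score exceeds the threshold, and to the pale compartment in proportion to how far its maximum z-score falls below it.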
  
  (1) Yes, indeed. We agree that modeling the partial volume of each compartment (thin, thick and pale stripes) in each V2 voxel would be the most elegant approach. However, we note that z-scores between thin and thick stripe experiments may not reflect the voxel-wise partial volume effect, since they are a purely statistical measure and not a partial volume model. Having said this, we think that this general approach can give some additional insights and we provide results for a similar analysis here. We calculated the weighted sum of R1 and R2 values over all V2 voxels for each stripe compartment (thin, thick and pale stripes) independently (see above figure). For R1, we see the same pattern of R1 between stripe types as in the manuscript (Figure 5). Additionally, we show the differences here for each subject, which further demonstrates the reproducibility across subjects in our study. For R2, no clear pattern across subjects emerged, confirming the results in our manuscript. Since this analysis did not add relevant new information, we refrained from adding this figure to the manuscript in order not to overload it.
  
  (2) In our current study, we were not primarily interested in investigating differences between thin/thick stripes and pale stripes. While histological analyses found differences (though not consistent ones) between CO dark stripes (more myelinated; Tootell et al., 1983) and CO pale stripes (more myelinated; Krubitzer and Kaas, 1989), no study has reported myelin differences between CO dark stripes. This does not fully exclude the possibility of myelination differences but suggests that if myelination differences between CO dark stripes existed, they would presumably be smaller than differences between CO dark and CO pale stripes. Thus, they would be even more difficult to demonstrate than the hypothesis of this manuscript.
  
  Therefore, we decided to directly test two compartments against each other instead of modeling all three compartments within a single model. In our analysis, we thereby loosely followed the analysis methods described in Li et al. (2019), which compared myelin differences between thin/thick and pale stripes in macaques. We note that this demonstrates further consistency, since it is not trivial that both thick and thin stripes show lower R1 values than the pale stripes. For example, there may be no or opposite differences.
  
  (3) Just for clarification, the plots in Figure 5 show the comparison of R1 (or R2*) between two compartments in V2. The red (blue) curve includes the thin (thick) stripe of interest. The gray curve includes everything in V2 minus contributions from thick (thin) stripes of interest. If we take the thin stripe comparison as an example (Figure 5a), then red contains the thin stripes of interest while gray contains everything minus the thick stripes. Therefore, assuming a tripartite stripe arrangement, the gray curve contains both thin and pale stripe contributions.
  
  References
  
  Carey D, Caprini F, Allen M, Lutti A, Weiskopf N, Rees G, Callaghan MF, Dick F. Quantitative MRI provides markers of intra-, inter-regional, and age-related differences in young adult cortical microstructure. Neuroimage 2018; 182:429–440.
  
  Dumoulin SO, Harvey BM, Fracasso A, Zuiderbaan W, Luijten PR, Wandell BA, Petridou N. In vivo evidence of functional and anatomical stripe-based subdivisions in human V2 and V3. Sci Rep 2017; 7:733.
  
  Freund P, Seif M, Weiskopf N, Friston K, Fehlings MG, Thompson AJ, Curt A. MRI in traumatic spinal cord injury: from clinical assessment to neuroimaging biomarkers. Lancet Neurol 2019; 18:1123–1135.
  
  González de San Román E, Bidmon H-J, Malisic M, Susnea I, Küppers A, Hübbers R, Wree A, Nischwitz V, Amunts K, Huesgen PF. Molecular composition of the human primary visual cortex profiled by multimodal mass spectrometry imaging. Brain Struct Funct 2018; 223:2767–2783.
  
  Kirilina E, Helbling S, Morawski M, Pine K, Reimann K, Jankuhn S, Dinse J, Deistung A, Reichenbach JR, Trampel R, Geyer S, Müller L, Jakubowski N, Arendt T, Bazin P-L, Weiskopf N. Superficial white matter imaging: Contrast mechanisms and whole-brain in vivo mapping. Sci Adv 2020; 6:eaaz9281.
  
  Krubitzer LA, Kaas JH. Cortical integration of parallel pathways in the visual system of primates. Brain Res 1989; 478:161–165.
  
  Lazari A, Lipp I. Can MRI measure myelin? Systematic review, qualitative assessment, and meta-analysis of studies validating microstructural imaging with myelin histology. Neuroimage 2021; 230:117744.
  
  Lazari A, Salvan P, Cottaar M, Papp D, Rushworth MFS, Johansen-Berg H. Hebbian activity-dependent plasticity in white matter. Cell Rep 2022; 39:110951.
  
  Li X, Zhu Q, Janssens T, Arsenault JT, Vanduffel W. In Vivo Identification of Thick, Thin, and Pale Stripes of Macaque Area V2 Using Submillimeter Resolution (f)MRI at 3 T. Cereb Cortex 2019; 29:544–560.
  
  Mancini M, Karakuzu A, Cohen-Adad J, Cercignani M, Nichols TE, Stikov N. An interactive meta-analysis of MRI biomarkers of myelin. Elife 2020; 9:e61523.
  
  Natu VS, Gomez J, Barnett M, Jeska B, Kirilina E, Jaeger C, Zhen Z, Cox S, Weiner KS, Weiskopf N, Grill-Spector K. Apparent thinning of human visual cortex during childhood is associated with myelination. PNAS 2019; 116:20750–20759.
  
  Tootell RBH, Silverman MS, De Valois RL, Jacobs GH. Functional Organization of the Second Cortical Visual Area in Primates. Science 1983; 220:737–739.
  
  Weiskopf N, Edwards LJ, Helms G, Mohammadi S, Kirilina E. Quantitative magnetic resonance imaging of brain anatomy and in vivo histology. Nat Rev Phys 2021; 3:570–588.
  
  Whitaker KJ, Vértes PE, Romero-Garcia R, Váša F, Moutoussis M, Prabhu G, Weiskopf N, Callaghan MF, Wagstyl K, Rittman T, Tait R, Ooi C, Suckling J, Inkster B, Fonagy P, Dolan RJ, Jones PB, Goodyer IM, NSPN Consortium, Bullmore ET. Adolescence is associated with genomically patterned consolidation of the hubs of the human brain connectome. PNAS 2016; 113:9105–9110.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.28.489865v2
www.biorxiv.org www.biorxiv.org

New submission 10/03/2023, 14:06:47

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Determination of the biomechanical forces and downstream pathways that direct heart valve morphogenesis is an important area of research. In the current study, potential functions of localized Yap signaling in cardiac valve morphogenesis were examined. Extensive immunostainings were performed for Yap expression, but Yap activation status as indicated by nuclear versus cytoplasmic localization, Yap dephosphorylation, or expression of downstream target genes was not examined.
  
  We thank the reviewer for appreciating the significance of this work and for the constructive suggestions. Following these suggestions, we have improved our analysis of YAP activation status and used nuclear versus cytoplasmic localization to quantify YAP activation. To address the reviewer’s concerns, we have conducted additional qPCR analysis of YAP downstream target genes and YAP upstream genes in the Hippo pathway. Please find the detailed revisions in our responses to the Recommendations for authors.
  
  The goal of the work was to determine Yap activation status relative to different mechanical environments, but no biomechanical data on developing heart valves were provided in the study.
  
  We thank the reviewer for raising this concern. We have previously published the biomechanical data of developing chick embryonic heart valves in the following study:
  
  Buskohl PR, Gould RA, Butcher JT. Quantification of embryonic atrioventricular valve biomechanics during morphogenesis. Journal of Biomechanics. 2012;45(5):895-902.
  
  In that study, we used micropipette aspiration to measure the nonlinear biomechanics (strain energy) of chick embryonic heart valves at different developmental stages. In the present study, we used the same method to measure the strain energy of YAP activated/inhibited cushion explants and compared it to the data from our previous study. Our findings are summarized in the Results section “YAP inhibition elevated valve stiffness”, and the detailed measurements, including images and data, are presented in Figure S4.
  
  There are several major weaknesses that diminish enthusiasm for the study.
  
  1) The Hippo/Yap pathway activation leads to dephosphorylation of Yap, nuclear localization, and induced expression of downstream target genes. However, there are no data included in the study on Yap nuclear/cytoplasmic ratios, phosphorylation status, or activation of other Hippo pathway mediators. Analysis of Yap expression alone is insufficient to determine activation status since it is widely expressed in multiple cells throughout the valves. The specificity for activated Yap signaling is not apparent from the immunostainings.
  
  We thank the reviewer for pointing out this weakness. We have now implemented nuclear versus cytoplasmic localization as recommended to quantify YAP activation. We have also conducted additional qPCR experiments to analyze YAP downstream target genes and YAP upstream genes in the Hippo pathway. Please see the detailed revisions in our responses to the Recommendations for authors.
  
  2) The specific regionalized biomechanical forces acting on different regions of the valves were not measured directly or clearly compared with Yap activation status. In some cases, it seems that Yap is not present in the nuclei of endothelial cells surrounding the valve leaflets that are subject to different flow forces (Fig 1B) and the main expression is in valve interstitial subpopulations. Thus the data presented do not support differential Yap activation in endothelial cells subject to different fluid forces. There is extensive discussion of different forces acting on the valve leaflets, but the relationship to Yap signaling is not entirely clear.
  
  We thank the reviewer for these important questions. The region-specific biomechanics have been well mapped and studied using computational fluid dynamics supported by ultrasound velocity and pressure measurements. For example:
  
  Yalcin, H.C., Shekhar, A., McQuinn, T.C. and Butcher, J.T. (2011), Hemodynamic patterning of the avian atrioventricular valve. Dev. Dyn., 240: 23-35.
  
  Bharadwaj KN, Spitz C, Shekhar A, Yalcin HC, Butcher JT. Computational fluid dynamics of developing avian outflow tract heart valves. Ann Biomed Eng. 2012 Oct;40(10):2212-27. doi: 10.1007/s10439-012-0574-8.
  
  Ayoub S, Ferrari G, Gorman RC, Gorman JH, Schoen FJ, Sacks MS. Heart Valve Biomechanics and Underlying Mechanobiology. Compr Physiol. 2016 Sep 15;6(4):1743-1780.
  
  Salman HE, Alser M, Shekhar A, Gould RA, Benslimane FM, Butcher JT, et al. Effect of left atrial ligation-driven altered inflow hemodynamics on embryonic heart development: clues for prenatal progression of hypoplastic left heart syndrome. Biomechanics and Modeling in Mechanobiology. 2021;20(2):733-50.
  
  Ho S, Chan WX, Yap CH. Fluid mechanics of the left atrial ligation chick embryonic model of hypoplastic left heart syndrome. Biomechanics and Modeling in Mechanobiology. 2021;20(4):1337-51.
  
  Those studies have shown that unidirectional shear stress (USS) develops on the inflow surface of valves while oscillatory shear stress (OSS) develops on the outflow surface, and that compressive stress (CS) develops in the tip region of valves while tensile stress (TS) develops in the regions of elongation and compaction. In this study, we mimicked those forces in our in-vitro and ex-vivo models. This allowed us to study the direct effect of each specific force on YAP activity in different cell lineages. The results showed that OSS promoted YAP activation in VECs while USS inhibited it, and that CS promoted YAP activation in VICs while TS inhibited it. These results explain the spatiotemporal distribution of YAP activation in Figure 1. For example, nuclear YAP was mostly found in VECs on the fibrosa side, where OSS develops, whereas YAP was not expressed in the nuclei of VECs on the atrialis/ventricularis side, where USS develops. It is also worth noting that the formation of OSS on the outflow side is slower, and thus the side-specific YAP activation in VECs was not in effect at the early stages, from E11.5 to E14.5.
  
  3) The requirement for Yap signaling in heart valve remodeling as described in the title was not demonstrated through manipulation of Yap activity.
  
  With respect, it is unclear what the reviewer is asking for, given that no experiments are suggested and no alternative interpretations of our results that argue against a YAP requirement are elaborated. It has been previously shown that YAP signaling is required for early EMT stages of valvulogenesis using conditional YAP deletion in mice:
  
  Zhang H, von Gise A, Liu Q, Hu T, Tian X, He L, et al. Yap1 Is Required for Endothelial to Mesenchymal Transition of the Atrioventricular Cushion. Journal of Biological Chemistry. 2014;289(27):18681-92.
  
  Signaling roles for early regulators at these later fetal stages are different from, and sometimes opposite to, those at early EndMT stages, thus contraindicating reliance on these early data to explain later events:
  
  Bassen D, Wang M, Pham D, Sun S, Rao R, Singh R, et al. Hydrostatic mechanical stress regulates growth and maturation of the atrioventricular valve. Development. 2021;148(13).
  
  However, embryos with YAP deletion failed to form endocardial cushions and could not survive long enough for the study of its roles in later cushion growth and remodeling into valve leaflets. In this work, we first showed the localization of YAP activity and its direct link with local shear or pressure domains. Then we explicitly applied controlled gain and loss of function of YAP via specific molecules. We also applied critical mechanical gain or loss of function studies to demonstrate YAP mechanoactivation necessity and sufficiency to achieve growth and remodeling.
  
  Reviewer #2 (Public Review)
  
  This study by Wang et al. examines changes in YAP expression in embryonic avian cultured explants in response to high and low shear stress, as well as tensile and compressive stress. The authors show that YAP expression is increased in response to low, oscillatory shear stress, as well as high compressive stress conditions. Inhibition of YAP signaling prevents compressive stress-induced increases in circularity, decreased pHH3 expression, and increases VE-cadherin expression. On the other hand, YAP gain of function prevents tensile stress-induced decreases in pHH3 expression and VE-cadherin expansion. It also decreases the strain energy density of embryonic avian cushion explants. Finally, using an avian model of left atrial ligation, the authors demonstrate that unloaded regions within the primitive valve structures are associated with increased YAP expression, compared to regions of restricted flow where YAP expression is low. Overall, this study sheds light on the biomechanical regulation of YAP expression in developing valves.
  
  We thank the reviewer for the accurate summary and their enthusiasm for this work.
  
  Strengths of the manuscript include:
  
  Novel insights into the dynamic expression pattern of YAP in valve cell populations during post-EMT stages of embryonic valvulogenesis.
  
  Identify the positive regulation of YAP expression in response to low, oscillatory shear stress, as well as high compressive stress conditions.
  
  Identify a link between YAP signaling in regulating stress-induced cell proliferation and valve morphogenesis.
  
  The inclusion of the left atrial ligation model is innovative, and the data showing distinguishable YAP expression levels between restricted and non-restricted flow regions are insightful.
  
  We thank the reviewer for appreciating the strengths of this work.
  
  This is a descriptive study that focuses on changes in YAP expression following exposure to diverse stress conditions in embryonic avian cushion explants. Overall, the study currently lacks mechanistic insights, and conclusions based on data are highly over-interpreted, particularly given that the majority of experimental protocols rely on one method of readout.
  
  We thank the reviewer for constructive suggestions.
  
  Reviewer #3 (Public Review)
  
  In this manuscript, Wang et al. assess the role of wall shear stress and hydrostatic pressure during valve morphogenesis at stages where the valve elongates and takes shape. The authors elegantly demonstrate that shear and pressure have different effects on cell proliferation by modulating YAP signaling. The authors use a combination of in vitro and in vivo approaches to show that YAP signaling is activated by hydrostatic pressure changes and inhibited by wall shear stress.
  
  We thank the reviewer for their enthusiasm for the impact of our work.
  
  There are a few elements that would require clarification:
  
  1) The impact of YAP on valve stiffness was unclear to me. How is YAP signaling affecting stiffness? Is it through cell proliferation changes? I was unclear about the model put forward:
  
  Is it cell proliferation (does cell proliferation fluidize the tissue, while non-proliferating tissue is stiffer)?
  
  Is it through differential gene expression?
  
  This needs clarification.
  
  We thank the reviewer for raising this important question. Cell proliferation can affect valve stiffness but is a minor factor compared with ECM deposition and cell contractility. Our micropipette aspiration data showed that the higher cell proliferation rate induced by YAP activation did lead to stiffer valves when compared to the controls. This may be because at the early stages, cells are more elastic than the viscous ECM. However, the stiffness of YAP activated valves was only about half of that of YAP inhibited valves, showing that the transcriptional level factor plays a more important role. This also suggests that YAP inhibited valves exhibited a more mature phenotype. An analogous role of YAP has also been found in cardiomyocytes. Many theories propose that in cardiomyocytes, when YAP is activated the proliferation programs are turned on, while when YAP is inhibited the proliferation programs are turned off and maturation programs are released. Similarly, here we hypothesize that YAP works like a mechanobiological switch, converting mechanical signaling into the decision between growth and maturation. We have revised the Discussion to include this hypothesis.
  
  2) The model proposes an early asymmetric growth of the cushion leading to different shear forces (oscillatory vs unidirectional shear stress). What triggers the initial asymmetry of the cushion shape? Is YAP involved?
  
  Although the initial geometry of the cushion model is symmetric, the force acting on it is asymmetric. The detailed numerical simulation of how the initial forces trigger the asymmetric morphogenesis can be found in our previous publication:
  
  Buskohl PR, Jenkins JT, Butcher JT. Computational simulation of hemodynamic-driven growth and remodeling of embryonic atrioventricular valves. Biomechanics and Modeling in Mechanobiology. 2012;11(8):1205-17.
  
  The color maps represent the dilatation rates when a) only pressure is applied, b) only shear stress is applied, and c) both pressure and shear stress are applied. It is this loading that initiates an asymmetric morphological change, as shown in d). In addition, we believe YAP is involved during the initiation because it is directly nuclear activated by CS and OSS or cytoplasmically activated by TS and LSS.
  
  3) The differential expression of YAP and its correlation to cell proliferation is a little hard to see in the data presented. Drawings highlighting the main areas would help the reader to visualise the results better.
  
  We thank the reviewer for this helpful suggestion. We have improved the visualization of Figure 3C and Figure 4C with insets of higher magnification.
  
  4) The origin of osmotic/hydrostatic pressure in vivo. While shear is clearly dependent upon blood flow, it is less clear that hydrostatic pressure is solely dependent upon blood flow. For example, it has been proposed that ECM accumulation such as hyaluronic acid could modify osmotic pressure (see for example Vignes et al., PMID: 35245444). Could the authors clarify the following questions:
  
  How does blood flow affect osmotic pressure in vivo?
  
  Is ECM a factor that could affect osmotic pressure in this system?
  
  We thank the reviewer for sharing this interesting study. Osmotic pressure plays a critical role in mechanotransduction and the development of many tissues, including cardiovascular tissues and cartilage. As proposed in the reference, osmotic pressure is an interstitial force generated by cardiac contractility. The hydrostatic pressure in our study is different: it is an external force applied by flowing blood. According to Bernoulli's law, when an incompressible fluid flows around a solid, the static pressure it applies on the solid is equal to its total pressure minus its dynamic pressure.
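
For reference, the Bernoulli relation invoked above can be written out explicitly. This is the standard textbook form for steady, incompressible, inviscid flow along a streamline with gravity neglected; the symbols $p_{\text{static}}$, $p_{\text{total}}$, $\rho$ (fluid density) and $v$ (local flow speed) are introduced here for illustration only and do not appear in the original response.

```latex
p_{\text{static}} \;=\; p_{\text{total}} \;-\; \underbrace{\tfrac{1}{2}\,\rho\, v^{2}}_{\text{dynamic pressure}}
```

Under these assumptions, a region of faster flow (larger $v$) exerts a lower static pressure on the tissue surface for the same total pressure.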
  
  Despite the difference, the osmotic pressure can mimic the effect of hydrostatic pressure in-vitro. The in-vitro osmotic pressure model has been widely used in cartilage research, for example:
  
  P. J. Basser, R. Schneiderman, R. A. Bank, E. Wachtel, and A. Maroudas, “Mechanical properties of the collagen network in human articular cartilage as measured by osmotic stress technique.,” Arch. Biochem. Biophys., vol. 351, no. 2, pp. 207–19, 1998.
  
  D. A. Narmoneva, J. Y. Wang, and L. A. Setton, “Nonuniform swelling-induced residual strains in articular cartilage,” J. Biomech., vol. 32, no. 4, pp. 401–408, 1999.
  
  C. L. Jablonski, S. Ferguson, A. Pozzi, and A. L. Clark, “Integrin α1β1 participates in chondrocyte transduction of osmotic stress,” Biochem. Biophys. Res. Commun., vol. 445, no. 1, pp. 184–190, 2014.
  
  Z. I. Johnson, I. M. Shapiro, and M. V. Risbud, “Extracellular osmolarity regulates matrix homeostasis in the intervertebral disc and articular cartilage: Evolving role of TonEBP,” Matrix Biol., vol. 40, pp. 10–16, 2014.
  
  When maturing cushions shift from a GAG-dominated ECM to a collagen-dominated ECM, the water and ion retention capacity of the tissue would change greatly, thus reducing the osmotic pressure. This could in turn accelerate the maturation of cushions. By contrast, the ECM of growing cushions remains GAG-dominated, which would delay maturation and prolong growth.
  
  The revised second section of Results is as follows:
  
  Shear and hydrostatic stress regulate YAP activity
  
  In addition to being a co-effector of the Hippo pathway, YAP is also a key mediator in mechanotransduction. Indeed, the spatiotemporal activation of YAP correlated with the changes in the mechanical environment. During valve remodeling, unidirectional shear stress (USS) develops on the inflow surface of valves, where YAP is rarely expressed in the nuclei of VECs (Figure 2A). On the other side, OSS develops on the outflow surface, where VECs with nuclear YAP were localized. The YAP activation in VICs also correlated with hydrostatic pressure. The pressure generated compressive stress (CS) in the tips of valves, where VICs with nuclear YAP were localized (Figure 2B), whereas tensile stress (TS) was created in the elongated regions, where YAP was absent from VIC nuclei.
  
  To study the effect of shear stress on the YAP activity in VECs, we applied USS and OSS directly onto a monolayer of freshly isolated VECs. The VECs were obtained from AV cushions of chick embryonic hearts at HH25. The cushions were placed on collagen gels with the endocardium adherent to the collagen and incubated to enable the VECs to migrate onto the gel. We then removed the cushions and immediately applied the shear flow to the monolayer for 24 hours. The low stress OSS (2 dyn/cm²) promoted YAP nuclear translocation in VECs (Figure 2C, E), while high stress USS (20 dyn/cm²) restrained YAP in the cytoplasm.
  
  To study the effect of hydrostatic stress on the YAP activation in VICs, we used media with different osmolarities to mimic the CS and TS. CS was induced by the hypertonic condition while TS was created by the hypotonic condition, and the Unloaded (U) condition refers to the osmotically balanced media. Notably, in-vivo hydrostatic pressure is generated by flowing blood, while in-vivo osmotic pressure is generated by cardiac contractility and plays a critical role in the mechanotransduction during valve development (30). Despite their different in-vivo origins, the osmotic pressure provides a reliable model to mimic the hydrostatic pressure in-vitro (31). We cultured HH34 AV cushion explants under different loading conditions for 24 hours and found that the trapezoidal cushions adopted a spherical shape (Figure 2D). TS loaded cushions significantly compacted, and the YAP activation in VICs of TS loaded cushions was significantly lower than that in CS loaded VICs (Figure 2F).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.11.24.517814v1
www.biorxiv.org www.biorxiv.org

https://biorxiv.org/cgi/content/10.1101/2022.03.11.483948

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Huang et al. sought to study the cellular origin of Tuft cells and the molecular mechanisms that govern their specification in severe lung injury. First, the authors show ectopic emergence of Tuft cells in airways and distal parenchyma following different injuries. The authors also used lineage tracing models and uncovered that p63-expressing cells and, to some extent, Scgb1a1 lineage-labeled cells contribute to tuft cells after injury. Further, the authors modulated multiple pathways and claim that Notch inhibition blocks tuft cells whereas Wnt inhibition enhances Tuft cell development in basal cell cultures. Finally, the authors used Trpm5 and Pou2f3 knock-out models to claim that tuft cells are indispensable for alveolar regeneration.
  
  In summary, the findings described in this manuscript are somewhat preliminary. The claim that the cellular origin of Tuft cells in influenza infection was not determined is incorrect. Current data from pathway modulation is preliminary and this requires genetic modulation to support their claims.
  
  We thank the reviewer for the comments and we have performed extensive experiments to address the reviewer’s comments. In the revised manuscript we provide additional data including genetic modulation findings to support our model.
  
  Major comments:
  
  1) The abstract sounds incomplete and does not cover all key aspects of this manuscript. Currently, it is mainly focusing on the cellular origin of Tuft cells and the role of Wnt and notch signaling. However, it completely omits the findings from Trpm5 and Pou2f3 knock-out mice. In fact, the title of the manuscript highlights the indispensable nature of tuft cells in alveolar regeneration.
  
  We have modified the abstract and title accordingly.
  
  2) In lines 93-94, the authors state that "It is also unknown what cells generate these tuft cells.....". This statement is incorrect. Rane et al., 2019 used the same p63-creER mouse line and demonstrated that all tuft cells that ectopically emerge following H1N1 infection originate from p63+ lineage labeled basal cells. Therefore, this claim is not new.
  
  We thank the reviewer for the comment. Although Rane et al. reported that p63-expressing lineage-negative epithelial stem/progenitor cells (LNEPs) could contribute to the ectopic tuft cells after PR8 virus infection, it is still not clear whether the p63+ cells give rise to tuft cells directly or through EBCs. Thus, unlike Rane et al. (2019), who performed TMX injection before viral infection, we performed TMX injection after PR8 infection to show that the ectopic tuft cells are derived from EBCs, as shown in revised Figure 2.
  
  3) Lines 152-153 state that "21.0% +/- 2.0 % tuft cells within EBCs are labeled with tdT when examined at 30 dpi...". It is not clear what the authors meant here ("within EBC's")? And also, the same sentence states that "......suggesting that club cell-derived EBCs generate a portion of tuft cells....". In this experiment, the authors used club cell lineage tracing mouse lines. So, how do the authors know that the club cell lineage-derived tuft cells came through intermediate EBC population? Current data do not show evidence for this claim. Is it possible that club cells can directly generate tuft cells?
  
  We apologize for the confusion and have revised the text accordingly. Here, “within EBCs” means within the “pods” area where p63+ basal cells are ectopically present. The sentence is revised as “21.0% +/- 2.0 % tuft cells that are ectopically present in the parenchyma are labeled by tdT. Notably, these lineage labeled tuft cells were co-localized with EBCs.” We don’t know whether the club cell lineage-derived tuft cells transit through intermediate EBCs and that is why we used “suggest”. It is also possible that club cells can directly generate tuft cells. To avoid confusion, we have deleted the sentence.
  
  4) Based on the data from Fig-3A, the authors claim that treatment with C59 significantly enhances tuft cell development in ALI cultures. Porcupine is known to facilitate Wnt secretion. So, which cells are producing Wnt in these cultures? It is important to determine which cells are producing Wnt and also which Wnt. Further, based on DBZ treatments, it appears that active Notch signaling is necessary to induce Tuft cell fate in basal cells. Where are Notch ligands expressed in these tissues? Is Notch active only in a small subset of basal cells (and hence generates rare tuft cells)? This is one of the key findings in this manuscript. Therefore, it is important to determine the expression pattern of Wnt and Notch pathway components.
  
  We thank the reviewer for the interesting questions and agree on the importance of identifying the specific ligands and receptors for relevant Wnt and Notch signaling during tuft cell derivation. That being said, we think the topic is beyond the scope of this study, which is focused on the role of tuft cells in alveolar regeneration. The point is well taken and we will investigate the topic in a future study.
  
  5) How do the authors explain different phenotypes observed in Trpm5 knockout and Pou2f3 mutants? Is it possible that Trpm5 knockout mice have a subset of tuft cells and that they might be something to do with the phenotypic discrepancy between two mutant models?
  
  Again we thank the reviewer for the interesting question. As discussed in the discussion section, Trpm5 is also reported to be expressed in B lymphocytes (Sakaguchi et al., 2020). It is possible that loss of Trpm5 modulates the inflammatory responses following viral infection, which may contribute to improved alveolar regeneration. However, it is also possible that Trpm5-/- mice keep a subset of tuft cells that facilitate lung regeneration as suggested by the reviewer.
  
  6) One of the key findings in this manuscript is that Wnt and Notch signaling play a role in Tuft cell specification. All current experiments are based on pharmacological modulation. These need to be substantiated using genetic gain- and loss-of-function models.
  
  We have performed the genetic studies.
  
  Reviewer #2 (Public Review):
  
  In this manuscript, the authors describe the ectopic differentiation of tuft cells that were derived from lineage-tagged p63+ cells post influenza virus infection. These tuft cells do not appear to proliferate or give rise to other lineages. They then claim that Wnt inhibitors increase the number of tuft cells while inhibiting Notch signaling decreases the number of tuft cells within Krt5+ pods after infection in vitro and in vivo. The authors further show that genetic deletion of Trpm5 in p63+ cells post-infection results in an increase in AT2 and AT1 cells in p63 lineage-tagged cells compared to control. Lastly, they demonstrate that depletion of tuft cells caused by genetic deletion of Pou2f3 in p63+ cells has no effect on the expansion or resolution of Krt5+ pods after infection, implying that tuft cells play no functional role in this process.
  
  Overall, in vivo and in vitro phenotypes of tuft cells and alveolar cells are clear, but the lack of detailed cellular characterization and molecular mechanisms underlying the cellular events limits the value of this study.
  
  We thank the reviewer for the comments and acknowledging that our findings are clear. In the revised manuscript we provide more detailed characterization and genetic evidence to elucidate the role of tuft cells in lung regeneration.
  
  1) Origin of tuft cells: Although the authors showed the emergence of ectopic tuft cells derived from labelled p63+ cells after infection, it cannot be ruled out that pre-existing p63+Krt5- intrapulmonary progenitors, as previously reported, can also contribute to tuft cell expansion (Rane et al. 2019; by labelling p63+ cells prior to infection, they showed that the majority of ectopic tuft cells are derived from p63+ cells after viral infection). It would be more informative if the authors show the differentiation of tuft cells derived from p63+Krt5+ cells by tracing Krt5+ cells after infection, which will tell us whether ectopic tuft cells are differentiated from ectopic basal cells within Krt5+ pods induced by virus infection.
  
  We thank the reviewer for the helpful suggestion. We have performed the experiment accordingly.
  
  2) Mechanisms of tuft cell differentiation: The authors tried to determine which signaling pathways regulate the differentiation of tuft cells from p63+ cells following infection. Although Wnt/Notch inhibitors affected the number of tuft cells derived from p63+ labelled cells, it remains unclear whether these signals directly modulate differentiation fate. The authors claimed that Wnt inhibition promotes tuft cell differentiation from ectopic basal cells. However, in Fig 3B, Wnt inhibition appears to trigger the expansion of p63+Krt5+ pod cells, resulting in increased tuft cell differentiation rather than directly enhancing tuft cell differentiation. Further, in Fig 3D, Notch inhibition appears to reduce p63+Krt5+ pod cells, resulting in decreased tuft cell differentiation. Importantly, a previous study has reported that Notch signalling is critical for Krt5+ pod expansion following influenza infection (Vaughan et al. 2015; Xi et al. 2017). Notch inhibition reduced Krt5+ pod expansion and induced their differentiation into Sftpc+ AT2 cells. In order to address the direct effect of Wnt/Notch signaling in the differentiation process of tuft cells from EBCs, the authors should provide a more detailed characterization of cellular composition (Krt5+ basal cells, club cells, ciliated cells, AT2 and AT1 cells, etc.) and activity (proliferation) within the pods with/without inhibitors/activators.
  
  Again we thank the reviewer for the insightful suggestions. We agree that it will be interesting to further address the direct effect of Wnt/Notch signaling in the differentiation process of tuft cells from EBCs. In this revised manuscript we added new findings of EBC differentiation into tuft cells in mice with genetic deletion of Rbpjk.
  
  3) Impact of Trpm5 deletion in p63+ cells: It is interesting that Trpm5 deletion promotes the expansion of AT2 and AT1 cells derived from labelled p63+ cells following infection. It would be informative to check whether Trpm5 regulates Hif1a and/or Notch activity, which has been reported to induce AT2 differentiation from ectopic basal cells (Xi et al. 2017). Although the authors stated that there was no discernible reduction in the size of Krt5+ pods in mutant mice, it would be interesting to investigate the relationship between AT2/AT1 cell retaining pods and the severity of injury (e.g., whether large Krt5+ pods retain more or fewer AT2/AT1 cells than small pods). What about other cell types, such as club and goblet cells, in Trpm5 mutant pods? Again, it cannot be ruled out that pre-existing p63+Krt5- intrapulmonary progenitor cells can directly convert into AT2/AT1 cells upon Trpm5 deletion rather than p63+Krt5+ cells induced by infection.
  
  We thank the reviewer for the comments and suggestions. Our new data using the KRT5-CreER mouse line confirmed that pod cells (Krt5+) do not contribute to AT2/AT1 cells, consistent with previous studies (Kanegai et al., 2016; Vaughan et al., 2015). Our data also show that p63-CreER lineage labeled AT2/AT1 cells are separate from the pod cell area, suggesting that pod cells and these AT2/AT1 cells are generated from different cells of origin. We also checked the Notch activity in pod cells in Trpm5-/- mice, and some pod cell-derived cells are Hes1 positive, whereas some are Hes1 negative (RLFigure 1). As indicated in the Discussion, we think that AT2/AT1 cells are possibly derived from pre-existing AT2 cells that transiently express p63 after PR8 infection. It will be interesting to test whether Trpm5 regulates Hif1a in this population (p63+,Krt5-), and this will be our next plan.
  
  RLFigure 1. Representative area staining in Trpm5-/- mice at 30 dpi. Area 1: Notch signaling is active (Hes1+, arrows) in pod cells following viral infection. Area 2: pod cells exhibit reduced Notch activities. Note few Hes1+ cells in pods (arrows). Scale bar: 50 µm.
  
  4) Ectopic tuft cells in COVID-19 lungs: The previous study by the authors' group revealed the presence of ectopic tuft cells in COVID-19 patient samples (Melms et al. 2021). There appears to be no additional information in this manuscript.
  
  In Melms et al., Nature, 2021 (Melms et al., 2021), we showed tuft cell expansion in COVID-19 lungs but not the potential origin of tuft cells. In this manuscript we show some cells co-expressing POU2F3 and KRT5, suggesting a pod-to-tuft cell differentiation.
  
  5) Quantification information and method: Overall, the quantification method should be clarified throughout the manuscript. Further, in the method section, the authors stated that the production of various airway epithelial cell types was counted and quantified on at least 5 "random" fields of view. However, virus infection causes spatially heterogeneous injury, resulting in a difficult to measure "blind test". The authors should address how they dealt with this issue.
  
  We clarified the quantification method as suggested. For the in vitro cell culture assays on the signaling pathways, we took pictures from at least five random fields of view for quantification. For lung sections, we tile-scanned the lung sections including at least three lung lobes and performed quantification.
  
  Reviewer #3 (Public Review):
  
  In this manuscript Huang et al. study how the lung regenerates after severe injury due to viral infection. They focus on how tuft cells may affect regeneration of the lung by ectopic basal cells and come to the conclusion that they are not required. The manuscript is intriguing but also very puzzling. The authors claim they are specifically targeting ectopic basal progenitor cells and show that they can regenerate the alveolar epithelium in the lung following severe injury. However, it is not clear that the p63-CreERT2 line the authors are using only labels ectopic basal cells. The question is what is a basal cell? Is an ectopic basal progenitor cell only defined by Trp63 expression?
  
  The accompanying manuscript by Barr et al. uses a Krt5-CreERT2 line to target ectopic basal cells, and using that tool the authors do not see a significant contribution of ectopic basal cells towards alveolar epithelial regeneration. As such, the claim that ectopic basal cell progenitors drive alveolar epithelial regeneration is not well-founded.
  
  We appreciate the reviewer for the positive comments and agreeing that our findings are interesting.
  
  The title itself is also not very informative and is a bit misleading. That being said I think the manuscript is still very interesting and can likely easily be improved through a better validation of which cells the p63-CreERT2 tool is targeting.
  
  We have revised the title accordingly and performed extensive experiments to address the reviewer’s concerns.
  
  I, therefore, suggest the following experiments.
  
  1) Please analyze which cells p63-CreERT2 labels immediately after PR8 and tamoxifen treatment. Are all the tdTomato labeled cells also Krt5 and p63 positive or are some alveolar epithelial cells or other airway cell types also labeled?
  
  We thank the reviewer for the question. To answer it, we performed PR8 infection (250 pfu) on three Trp63-CreERT2;R26tdT mice and TMX treatment at days 5 and 7 post viral infection. We didn't perform TMX injection immediately as the mice were sick during the first few days post infection. The lung samples were collected at 14 dpi. We observed that tdT+ cells are present in the airways (rebuttal letter RLFigure 2A, B), and it appears that the lineage labeled cells (tdT+) include club cells (CC10+) that are underlain by tdT+Krt5+ basal cells (RLFigure 2C). We think that these labeled basal cells give rise to club cells. However, we also noticed that rare club cells and ciliated cells (FoxJ1+) are labeled by tdT in areas lacking surrounding tdT+ basal cells (RLFigure 2D). Moreover, a minor population of tdT+ SPC+ cells is present in the terminal airways that were disrupted by viral infection (RLFigure 2E and D). We did not see any pods formed in this experiment and we did not observe any tdT+ cells in the intact alveoli (uninjured area).
  
  RLFigure 2. Trp63-CreERT2 lineage-labeled cells are present in the airways but not the alveoli when tamoxifen was administered at days 5 and 7 after PR8 H1N1 viral infection. Trp63-CreERT2;R26-tdT mice were infected with PR8 at 250 pfu and Tmx was delivered at a dose of 0.25 mg/g bodyweight by oral gavage. Lung samples were collected and analyzed at 14 dpi. Antibody stainings are as indicated. Scale bar: 100 µm.
  
  2) Please also show if p63-CreERT2 labels any cells in the adult lung parenchyma in the absence of injury after tamoxifen treatment.
  
  Dr. Wellington Cardoso’s group demonstrated that Trp63-CreERT2 only labels very few cells in the airways but not the lung parenchyma in the absence of injury after tamoxifen treatment (Yang et al., 2018). Dr. Ying Yang has revisited the data and she did not observe any labeling in the lung parenchyma (n = 2).
  
  3) Please analyze if p63-CreERT2 labels any cells with tdTomato in the absence of injury or after PR8 infection but without tamoxifen treatment.
  
  We performed the experiment and didn't observe any labeled cells in the lung parenchyma without Tamoxifen treatment (n = 4).
  
  4) Please analyze when after PR8 infection do the first p63-CreERT2 labeled tdTomato positive alveolar epithelial cells appear.
  
  We administered tamoxifen at days 5 and 7 after PR8 infection and harvested lung tissues at day 14. As shown in Figure 1, we observed a few tdT+ SPC+ cells in the terminal airways that are disrupted by viral infection. Notably, we did not observe any lineage labeled cells in the intact alveoli (uninjured) in this experiment.
  
  5) A clonal analysis of p63-CreERT2 labeled cells using a confetti reporter might also help interpret the origin of p63-CreERT2 labeled cells.
  
  We thank the reviewer for the suggestion. Our new data demonstrate that a rare population of SPC+tdT+ cells is present in the disrupted terminal airways of Trp63-CreERT2;R26tdT mice. Our data in the original manuscript and the new data suggest that the initial SPC+;tdT+ cells are rare because we have to administer multiple doses of tamoxifen to label them. Given the lower labeling efficiency of confetti mice compared with R26tdT mice, it is possible that we would not be able to label these SPC+ cells. Moreover, our original manuscript clearly shows individual clones of SPC+tdT+ cells in the regenerated lung, and they do not appear to be composed of multiple clones. Therefore, we think that the use of confetti mice may not add new information.
  
  6) Lastly, could the authors compare the single-cell RNAseq transcription profile of p63-CreERT2 labeled cells immediately after PR8 and tamoxifen treatment and also at 60 dpi? A pseudotime analysis and trajectory inference analysis could help elucidate the identity of p63-CreERT2 labeled cells that are actually not ectopic basal progenitor cells.
  
  We appreciate the reviewer’s suggestion and agree that single cell RNA sequencing with pseudotime analysis can provide further information regarding the origin of the lineage labeled alveolar cells of Trp63-CreERT2;R26tdT mice. That said, our new data clearly show that KRT5-CreER lineage labeled cells do not give rise to AT1/2 cells as previously described (Kanegai et al., 2016; Vaughan et al., 2015), suggesting that the ectopic basal progenitor cells do not generate alveolar cells. By contrast, Trp63-CreERT2 lineage labeled cells do give rise to AECs, suggesting that the p63+ cell population capable of generating AECs is different from Krt5+ ectopic basal progenitor cells. Our single cell core has an extremely long waiting list due to the pandemic and we hope that our new findings are enough to address the reviewer’s concern without the need for single cell analysis.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.11.483948v1
www.biorxiv.org www.biorxiv.org

Information transfer in mammalian glycan-based communication

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This manuscript applies the framework of information theory to study a subset of cellular receptors (called lectins) that bind to glycan molecules, with a specific focus on the kinds of glycans that are typical of fungal pathogens. The authors use the concentration of various types of ligands as the input to the signaling channel, and measure the "response" of individual cells using a GFP reporter whose expression is driven by a promoter that responds to NFκB. While this work is overall technically solid, I would suggest that readers keep several issues in mind while evaluating these results.
  
  1) One of the largest potential limitations of the study is the reliance of the authors on exogenous expression of the relevant receptors in U937 cells. Using a cell-line system like this has several advantages, most notably the fact that the authors can engineer different reporters and different combinations of receptors easily into the same cells. This would be much more difficult with, say, primary cells extracted from a mouse or a human. While the ability to introduce different proteins into the cells is a benefit, the problem is that it is not clear how physiologically relevant the results are. To their credit, the authors perform several controls that suggest that differences in transfection efficiency are not the source of the differences in channel capacity between, say, dectin-1 and dectin-2. As the authors themselves clearly demonstrate, however, the differences in the properties of these signaling systems are not based on receptor expression levels, but rather on some other property of the receptor. Now, it could be that the dectin-2 receptor is somehow just more "noisy" in terms of its activity compared to, say, dectin-1. This seems a somewhat less likely explanation, however, and so it is likely that downstream details of the signaling systems differ in some way between dectin-2 and the more "information efficient" receptors studied by the authors.
  
  The channel capacity of a cell signaling network depends critically on the distributions of the downstream signaling molecules in question: see the original paper by Cheong et al. (2011, Science 334 (6054), 354-8) and subsequent papers (notably Selimkhanov et al. (2014) Science 346 (6215), 1370-3 and Suderman et al. (2018) Interface Focus 8 (6), 20180039). The U937 cells considered here clearly don't serve the physiological function of detecting the glycans considered by the authors; despite the fact that this is an artificial cell line, the fact the authors have to exogenously express the relevant receptors indicates that these cells are not necessarily a good model for the types of cells in the body that actually have evolved to sense these glycan molecules.
  
  Signaling molecules readily exhibit cell-type-specific expression levels that influence cellular responses to external stimuli (Rowland et al. (2017) Nat Commun 8, 16009). So it is unclear whether the distributions of downstream signaling molecules in U937 cells mirror those that would be observed in the immune cell types relevant to this response. As such, the physiological relevance of the differences between dectin-2 channel capacities and those exhibited by the other receptors is currently unclear.
  
  We appreciate Reviewer #1’s in-depth comments on the physiological relevance of U937 cells. A major benefit of using information theory to investigate a biological communication channel is that it quantifies the information the channel transmits without requiring detailed measurements of the spatiotemporal dynamics of receptors and downstream signaling cascades. In addition, comparing the amounts of information measured under different conditions in turn yields reasonable predictions about the underlying signaling mechanisms. For example, we investigated how transmission of glycan information from dectin-2 is synergistically modulated in the presence of either dectin-1, DC-SIGN or mincle. Our approach allows us to investigate how individual lectins on immune cells contribute to glycan information transmission and how their signals are integrated in the presence of other types of lectins. Therefore, the findings describe how physiologically relevant lectins integrate the extracellular signal in a more defined way. Furthermore, we found that our model cell line has one order of magnitude higher expression of dectin-2 compared with primary human monocytes and exhibits a similar zymosan binding pattern (described in Recommendations for the authors and Figure R8).
  
  We fully agree that acquiring more data on the information transmission capability of primary immune cells would increase physiological relevance. In the revised manuscript we addressed this concern by comparing the receptor expression levels of our model cell lines with those of primary monocytes, and we find agreement in cellular heterogeneity. However, we would also like to point out that the very basic nature of our question, namely how information stored in glycans is processed by lectins, is not tightly bound to these differences between primary cells and cell lines.
  
  Line 382: Finally, it is important to take into consideration that our conclusions came from model cell lines, which were used as a surrogate for cell-type-specific lectin expression patterns of primary immune cells. Human monocytes and dectin-2 positive U937 cells have comparable receptor densities and respond similarly to stimulation with zymosan particles (SI Fig. 6A and B).
  
  2) Another issue that readers might want to keep in mind is that the details of the channel capacity calculation are a bit unclear as the manuscript is currently written. The authors indicate that their channel capacity calculations follow the approach of Cheong et al. (2011) Science 334 (6054), 354-8. However, the extent to which they follow that previous approach is not obvious. For instance, the calculations presented in the 2011 work use a combined bootstrapping/linear extrapolation approach to estimate the mutual information at infinite population size in order to deal with known inaccuracies in the calculation that arise from finite-size effects. The Cheong approach also deals with the question of how many bins to use in order to estimate the joint probability distribution across signal and response.
  
  They do this by comparing the mutual information they calculate for the real data with that calculated for random data to ensure that they are not calculating spuriously high mutual information based on having too many bins. While the Cheong et al. paper does a great job explaining why these steps need to be undertaken, a subsequent paper by Suderman et al. (2017, PNAS 114 (22), 5755-60) explains the approach in even greater detail in the supporting information. Those authors also implemented several improvements to the general approach, including a bootstrap method for more accurately estimating the error in the mutual information and channel capacity estimates.
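The finite-size correction mentioned above can be sketched as follows. This is a minimal illustration, assuming a plug-in (histogram) estimator and an ordinary least-squares fit of mutual information against inverse sample size. The function names, subsample fractions, and replicate count are hypothetical choices made for the sketch; they are not taken from Cheong et al., Suderman et al., or the manuscript under review.

```python
import numpy as np


def plugin_mi_bits(x_idx, y_idx, n_x, n_y):
    """Plug-in mutual information (bits) from integer-coded inputs and output bins."""
    joint = np.zeros((n_x, n_y))
    np.add.at(joint, (x_idx, y_idx), 1.0)    # joint count table
    joint /= joint.sum()                     # counts -> joint probabilities
    p_x = joint.sum(axis=1, keepdims=True)   # input marginal, shape (n_x, 1)
    p_y = joint.sum(axis=0, keepdims=True)   # output marginal, shape (1, n_y)
    nz = joint > 0                           # empty cells contribute nothing
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (p_x * p_y)[nz])))


def extrapolated_mi_bits(x_idx, y_idx, n_x, n_y,
                         fractions=(0.6, 0.7, 0.8, 0.9, 1.0),
                         n_rep=20, seed=0):
    """Fit MI against 1/n over random subsamples.

    The intercept of the straight-line fit is taken as the estimate of the
    mutual information in the limit of infinite sample size.
    """
    rng = np.random.default_rng(seed)
    n_total = len(x_idx)
    inv_n, mi = [], []
    for frac in fractions:
        n_sub = int(round(frac * n_total))
        for _ in range(n_rep):
            pick = rng.choice(n_total, size=n_sub, replace=False)
            inv_n.append(1.0 / n_sub)
            mi.append(plugin_mi_bits(x_idx[pick], y_idx[pick], n_x, n_y))
    slope, intercept = np.polyfit(inv_n, mi, 1)
    return float(intercept)
```

For statistically independent inputs and outputs the true mutual information is zero, so the plug-in estimate is pure bias; the extrapolated value is expected to lie closer to zero than the raw estimate.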
  
  The problem here is that, while the authors claim to follow the approach of Cheong et al., it seems that they have re-implemented the calculation, and they do not provide sufficient detail to evaluate the extent to which they are performing the same exact calculation. Since estimates of mutual information are technically challenging, specific details of the steps in their approach would be helpful in order to understand how closely their results can be compared with the results of previous authors. For instance, Cheong et al. estimate the "channel capacity" by trying a set of likely unimodal and bimodal distributions for the input to the channel, and choosing the maximal value as the channel capacity. This is clearly a very approximate approach, since the channel capacity is defined as the supremum over an (uncountably infinite) set of input probability distributions. In any case, the authors of the current manuscript use a different approach to this maximization problem. Although it is a bit unclear how their approach works, it seems that they treat the probability of each input bin as an independent parameter (under the constraint that the probabilities sum to one) and then use an optimization algorithm implemented in Python to maximize the mutual information. In principle, this could be a better approach, since the set of input distributions considered is potentially much larger. The details of the optimization algorithm matter, however, and those are currently unclear as the paper is written.
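For a discretized channel, the maximization over input distributions discussed above can be illustrated with the Blahut-Arimoto algorithm, a standard iterative method for the capacity of a discrete memoryless channel. The sketch below is an assumption-laden illustration only: it is not the optimization routine used in the manuscript (whose details are not given here), and the function names, tolerance, and iteration limit are hypothetical.

```python
import numpy as np


def mutual_information_bits(p_x, p_y_given_x):
    """Mutual information (bits) for input distribution p_x and channel matrix P(y|x)."""
    joint = p_x[:, None] * p_y_given_x
    p_y = joint.sum(axis=0)
    nz = joint > 0
    ratio = joint[nz] / (p_x[:, None] * p_y[None, :])[nz]
    return float(np.sum(joint[nz] * np.log2(ratio)))


def blahut_arimoto(p_y_given_x, tol=1e-9, max_iter=10_000):
    """Return (capacity in bits, maximizing input distribution).

    p_y_given_x has one row per input level and one column per output bin;
    each row must sum to one.
    """
    n_inputs = p_y_given_x.shape[0]
    p_x = np.full(n_inputs, 1.0 / n_inputs)   # start from a uniform input
    for _ in range(max_iter):
        p_y = p_x @ p_y_given_x
        # Per-input Kullback-Leibler divergence D(P(y|x) || P(y)) in nats.
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(p_y_given_x > 0,
                                 np.log(p_y_given_x / p_y), 0.0)
        divergence = np.sum(p_y_given_x * log_ratio, axis=1)
        new_p_x = p_x * np.exp(divergence)     # multiplicative update
        new_p_x /= new_p_x.sum()
        converged = np.max(np.abs(new_p_x - p_x)) < tol
        p_x = new_p_x
        if converged:
            break
    return mutual_information_bits(p_x, p_y_given_x), p_x
```

In an experimental setting, each row of `p_y_given_x` would be the normalized output histogram measured at one ligand dose. Because every input probability is a free parameter, the search space is larger than a fixed family of unimodal and bimodal input distributions.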
  
  We thank Reviewer #1 for the recommendation, which increases the rigor of the calculation. In the revised manuscript we explain the channel capacity calculation procedures in more detail, including the statistical approaches adopted from Cheong et al. (2011) and Suderman et al. (2018) (SI sections 1 and 2). Furthermore, we chose the number of bins based not only on a random dataset but also on the total number of samples, as shown below:
  
  Figure R1. A) Extrapolated channel capacity values of random dataset at infinitely subsampled distribution under various total number of samples and output binning. The white line in the heatmap represents the channel capacity value at 0.01 bit. B) Extrapolated channel capacity values at infinite subsample size of U937 cells’ input (TNF-a doses) and output (GFP reporter) response.
  
  Figure R1 describes channel capacity values from the random (A) and experimental (B, TNFAR + TNF-a) datasets. The channel capacity values from random data indicate how the channel capacity depends on the number of output bins and the total number of samples. Based on this heatmap, we set the allowed bias to 0.01 bits, as shown by the contour line in Figure R1A. Since the smallest dataset that we used for channel capacity calculation in the absence of labelled input contains nearly 90,000 samples, the expected bias in the channel capacity calculation is less than 0.01 bits for bin numbers ranging from 10 to 1000, as shown in Figure R1A.
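The idea of using random data to bound the bias for a given number of bins can be sketched as a shuffle test: permuting the input labels destroys any dependence on the output, so whatever mutual information remains is spurious. The code below is a hypothetical illustration under that assumption. It uses plain mutual information with linear bins rather than the authors' full channel capacity pipeline; only the 0.01-bit threshold comes from the text, and all function names are invented for the sketch.

```python
import numpy as np


def plugin_mi_bits(x_idx, y_values, n_x, n_bins):
    """Plug-in mutual information (bits) with linearly spaced output bins."""
    edges = np.linspace(y_values.min(), y_values.max(), n_bins + 1)
    y_idx = np.digitize(y_values, edges[1:-1])   # bin index in 0 .. n_bins - 1
    joint = np.zeros((n_x, n_bins))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    p_x = joint.sum(axis=1, keepdims=True)
    p_y = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (p_x * p_y)[nz])))


def shuffled_bias_bits(x_idx, y_values, n_x, n_bins, n_rep=5, seed=0):
    """Mean spurious MI after shuffling the input labels."""
    rng = np.random.default_rng(seed)
    values = [plugin_mi_bits(rng.permutation(x_idx), y_values, n_x, n_bins)
              for _ in range(n_rep)]
    return float(np.mean(values))


def largest_unbiased_bin_count(x_idx, y_values, n_x, candidates,
                               max_bias_bits=0.01):
    """Largest candidate bin count whose shuffled-data bias stays under the threshold."""
    accepted = [b for b in candidates
                if shuffled_bias_bits(x_idx, y_values, n_x, b) <= max_bias_bits]
    return max(accepted) if accepted else None
```

The bias grows with the number of bins and shrinks with the number of samples, which is why a larger dataset tolerates a finer binning at the same threshold.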
  
  Furthermore, we carried out the mutual information maximization using predefined unimodal and bimodal input distributions and compared it with the systematic method that we used in this work. We found no noticeable difference in the channel capacity value between the two approaches (SI Figure 3M).
  
  3) Another issue to be careful about when interpreting these findings is the fact that the authors use logarithmic bins when calculating the channel capacity estimates. This is equivalent to saying that the "output" of the cell signaling channel is not the amount of protein produced under the control of the NFκB promoter, but rather the log of the protein level. Essentially, the authors are considering a case where the relevant output of the system is not the amount of protein itself, but the fold change in the amount of protein. That might be a reasonable assumption, especially if the protein being produced is a transcription factor whose own promoters have evolved to detect fold changes. For many proteins, however, the cell is likely responsive to linear changes in protein concentration, not fold changes. And so choosing the log of the protein level as the output may not make sense in terms of understanding how much information is actually contained in this particular output variable. Regardless, choosing logarithmic bins is not purely a matter of convenience or arbitrary choice, but rather corresponds to a very strong statement about what the relevant output of the channel is.
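The equivalence between logarithmic bins and a log-transformed output can be checked numerically. The sketch below is illustrative only: `histogram_counts` is a hypothetical helper, and the data it would be applied to are synthetic stand-ins for a reporter signal, not experimental measurements.

```python
import numpy as np


def histogram_counts(values, n_bins, spacing):
    """Histogram counts with 'linear' or 'log' spaced bin edges spanning the data range."""
    lo, hi = values.min(), values.max()
    if spacing == "linear":
        edges = np.linspace(lo, hi, n_bins + 1)
    elif spacing == "log":
        edges = np.geomspace(lo, hi, n_bins + 1)   # requires strictly positive values
    else:
        raise ValueError(f"unknown spacing: {spacing}")
    counts, _ = np.histogram(values, bins=edges)
    return counts
```

Calling `histogram_counts(signal, n, "log")` and `histogram_counts(np.log(signal), n, "linear")` on the same positive-valued array should give essentially the same counts (up to floating-point ties at bin edges), whereas `histogram_counts(signal, n, "linear")` gives a different partition of the data. This is the sense in which log binning treats the fold change, rather than the absolute level, as the channel output.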
  
  We understand Reviewer #1’s concern regarding the choice of log binning. We found that if the number of bins is higher than 200, the estimated channel capacities converge to the same value regardless of the binning method (linear, logarithmic, or equal frequency). The only difference is how quickly the values approach the converged channel capacity as the number of bins increases (shown in Figure R2). In the revised manuscript, we used linear binning to better represent the relevant protein signaling, as the Reviewer suggested. Note that the channel capacity values calculated with linear binning do not differ noticeably from our previously calculated values.
  
  On the other hand, linear binning generates significant bias if we include labelled input (i.e., continuous input) in the channel capacity calculation, due to the increase of binning in the input region.
  
  Figure R2. Output binning number and binning method dependence of channel capacity value for experimental dataset. The inset plots show the relative difference of channel capacity value to the maximum channel capacity value in the entire binning range (i.e., from 10 to 1000) of the corresponding binning method.
  
  Following Reviewer #1’s comment, we have changed the binning method from logarithmic to linear binning for the whole experimental dataset, except where labelled input (i.e., dectin-2 antibody) is present. When we consider the channel capacity between the labelled input and the NF-κB reporter, equal frequency binning is used for every layer of the channel capacity (i.e., labelled input-binding, binding-GFP, labelled input-GFP).
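
The three binning schemes compared in this response (linear, logarithmic, and equal frequency) can be sketched as follows. This is only an illustration under assumptions: the helper names are hypothetical, the edges are interior bin boundaries, and logarithmic binning assumes strictly positive values. It is not the authors' implementation.

```python
import bisect
import math


def linear_edges(values, n_bins):
    """Interior edges of n_bins equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + k * step for k in range(1, n_bins)]


def log_edges(values, n_bins):
    """Interior edges equally spaced in log space (values must be > 0)."""
    lo, hi = math.log(min(values)), math.log(max(values))
    step = (hi - lo) / n_bins
    return [math.exp(lo + k * step) for k in range(1, n_bins)]


def equal_frequency_edges(values, n_bins):
    """Interior edges at sample quantiles, so bins hold similar counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def digitize(values, edges):
    """Map each value to a bin index in 0 .. len(edges)."""
    return [bisect.bisect_right(edges, v) for v in values]
```

For fold-change-like data such as `[1, 2, 4, 8, 16, 32, 64, 128]`, linear edges place most values in the first bin, whereas logarithmic and equal-frequency edges spread them evenly. This is one way to see why the schemes can approach the converged value at different rates when the number of bins is small.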
  
  Reviewer #2 (Public Review):
  
  My expertise is more on the theoretical than the experimental aspects of this paper, so those will be the focus of these comments.
  
  Signal transduction is an important area of study for mathematical biologists and biophysicists. This setting is a natural one for information-theoretic methods, and such methods are attracting increasing research interest. Experimental results that attempt to directly quantify the Shannon capacity of signal transduction are particularly interesting. This paper represents an important contribution to this emerging field.
  
  My main comments are about the rigorousness and correctness of the theoretical results. More details about these results would improve the paper and help the reader understand the results.
  
  We understand Reviewer #2’s comment regarding the rigor and correctness of the theoretical results of this work. In the revised manuscript, we added the following content to help the reader better understand the channel capacity calculation procedures.
  
  • A general illustrative introduction to how we measured the input and output datasets and how we processed those data to prepare the joint probability distribution (SI sections 1.1 and 1.2).
  
  • An example of the mutual information maximization procedure using experimental and arbitrary datasets (SI section 1.3).
  
  The calculation of channel capacity, given in the methods, is quite a standard calculation and appears to be correct. However, I was confused by the use of the "weighting value" w_i, which is not specified in the manuscript. The input distribution appears to be a product of the weight w_i and the input probability value p_i, and these appear always to occur together as a product w_i p_i. (In joint probabilities w_i p(i,j), the input probability can be extracted using Bayes' rule, leaving w_i p_i p(j|i).) This leads me to wonder two things. First, what role does w_i play (is it even necessary)? Second, of particular interest here is the capacity-achieving input distribution p_i, but w_i obscures it; is the physical input distribution p_i equal to the capacity-achieving distribution? If not, what is the meaning of capacity?
  
  We thank Reviewer #2 for the comment regarding the arbitrariness of the weightings. We realize that the original manuscript lacked an explanation of the weighting values. $P_x(i)$ is the marginal probability distribution of the input from the original dataset, and $P'_x(i)$ is the marginal probability distribution of the modified input that maximizes the mutual information. In the usual case $P_x(i)$ is not equal to $P'_x(i)$, and therefore one needs to find $P'_x(i)$ from $P_x(i)$. Because $P'_x(i)$ is a linear combination of $P_x(i)$, it can be expressed as $w(i)P_x(i)$, where $w(i)$ are the weightings, under the constraint $\sum_i w(i)P_x(i) = 1$, with the sum running over all input values $i$. The changed input distribution, in turn, modifies the joint probability distribution as $P'_{xy}(i, j) = w(i)P_{xy}(i, j)$. To help readers understand this work, we expanded the Appendix with illustrative descriptions.
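
The reweighting described in this response can be sketched on a toy channel. The example assumes a binary symmetric channel with a 10% error rate and a skewed measured input distribution; both the data and the function names `mutual_information` and `reweight` are hypothetical and chosen only for illustration. Here the weights are picked by hand to make the input uniform, whereas an actual analysis would search over the weights.

```python
import math


def mutual_information(joint):
    """I(X;Y) in bits for a joint table joint[i][j] = P(x=i, y=j)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


def reweight(joint, w):
    """Return the joint table w(i) * P(i, j), checking sum_i w(i) P(i) = 1."""
    px = [sum(row) for row in joint]
    total = sum(wi * pi for wi, pi in zip(w, px))
    if abs(total - 1.0) > 1e-9:
        raise ValueError("weights must satisfy sum_i w(i) * P(i) = 1")
    return [[wi * p for p in row] for wi, row in zip(w, joint)]


# Toy data: input marginal (0.9, 0.1), each input flipped with probability 0.1.
joint = [[0.81, 0.09],
         [0.01, 0.09]]
# Weights that turn the skewed input marginal into a uniform one.
w = [0.5 / 0.9, 0.5 / 0.1]
reweighted = reweight(joint, w)
```

In this toy case the reweighted table has higher mutual information than the measured one because only the input marginal changes while the conditional distribution of the output given the input is untouched. This is the sense in which the weights select a different input distribution for the same channel.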
  
  A more minor but important point: the inputs and outputs of the communication channel are never explicitly defined, which makes the meaning of the results unclear. When evaluating the capacity of an information channel, the inputs X and outputs Y should be carefully defined, so that the mutual information I(X;Y) is meaningful; the mutual information is then maximized to obtain capacity. Although it can be inferred that the input X is the ligand concentration, and the output Y is the expression of GFP, it would be helpful if this were stated explicitly.
  
  We agree with the Reviewer’s suggestion to better describe the input and output in the manuscript. Therefore, we have modified Figure 1 A and B and the main text to describe the source of the input and output more clearly, as follows:
  
  Line 92: Accounting for the stochastic behavior of cellular signaling, information theory provides robust and quantitative tools to analyze complex communication channels. A fundamental metric of information theory is entropy, which determines the amount of disorder or uncertainty of variables. In this respect, cellular signaling pathways having high variability of the initiating input signals (e.g. stimulants) and the corresponding highly variable output response (i.e. cellular signaling) can be characterized as having high entropy. Importantly, input and output can have mutual dependence, and therefore knowing the input distribution can partly provide information about the output distribution. If noise is present in the communication channel, input and output have reduced mutual dependence. This mutual dependence between input and output is called mutual information. Mutual information is, therefore, a function of the input distribution, and the upper bound of mutual information is called channel capacity (SI section 1) (Cover and Thomas, 2012). In this report, a communication channel describes the signal transduction pathway of a C-type lectin receptor, which ultimately leads to NF-κB translocation and finally GFP expression in the reporter model (Fig. 1A). To quantify the signaling information of the communication channels, we used channel capacity. Importantly, the channel capacity does not merely describe the resulting maximum intensity of the reporter cells. The channel capacity takes cellular variation and activation across the whole range of incoming stimulus in single-cell-resolved data into account and condenses all of that data into a single number.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.05.10.443458v1
www.biorxiv.org www.biorxiv.org

New submission 11/01/2023, 11:47:09

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  The authors examine the role of secreted BAFF in senescence phenotypes in THP1 AML cells and primary human fibroblasts. In the former, BAFF is found to potentiate the inflammatory phenotype (SASP) and in the latter to potentiate cell cycle arrest. This is an important study because the SASP is still largely considered in generic and monolithic terms, and it is necessary to deconvolute the SASP and examine its many components individually and in different contexts.
  
  Although the results show differences for BAFF in the two cell models, there are many places where key results are missing and the results over-interpreted and/or missing controls.
  
  1) Figure 1. Test whether the upregulation of BAFF is specific to senescence, or also in reversible quiescence arrest.
  
  We appreciate the Reviewer’s request. We performed experiments in fibroblasts and THP-1 cells to assess BAFF levels in quiescence. As shown below in the figure for Reviewers, we induced quiescence in fibroblasts by serum starvation (0.1%) for 96 h and confirmed the quiescent state by measuring two markers of quiescence (reduction of CCND1 mRNA and reduction of phospho-S6 when compared to cycling cells, following markers established previously; PMID 25483060) (panel A). In this case, the level of BAFF mRNA was increased upon quiescence (panel B).
  
  In THP-1 cells, we tried to induce quiescence by serum starvation and glutamine depletion for 96 h. Unfortunately, however, inducing quiescence in THP-1 cells was rather challenging, likely because they are cancer cells. Thus, we observed a reduction of cell proliferation in both conditions, but we observed a reduction in phospho-S6 only in the samples without glutamine (panel C). We failed to see increased BAFF mRNA levels in quiescent THP-1 cells after either serum starvation or glutamine depletion (panel D).
  
  In summary, further studies will be necessary to fully understand if the increased expression of BAFF seen in senescent cells is also observed in other conditions of growth suppression (such as quiescence or differentiation), as well as whether this effect is specific to different cell types.
  
  2) Figure 1, Supplement 1G. Show negative control IgG for immunofluorescence.
  
  We thank the Reviewer for this suggestion. Along with other changes during the revision, we decided to remove the immunofluorescence data in order to include more informative data.
  
  3) All results with siRNA should be validated with at least 2 individual siRNAs to eliminate the possibility of off-target effects.
  
  We agree with the Reviewer on the importance of testing individual siRNAs. For BAFF, we originally tested two independent siRNAs (BAFF#1 and BAFF#2) individually, but we also pooled them for additional analysis (the pool is referred to simply as “BAFFsi” throughout the manuscript). In the revised version of our manuscript, we included the key experiments performed with these two individual BAFF siRNAs. Upon BAFF silencing in THP-1 cells, we observed a reduction of SASP factors and SA-β-Gal activity levels with each individual siRNA (Figure 4-Figure Supplement 1D-F) and with the pooled siRNAs (Figure 4C). For WI-38 cells, we observed a reduction of p53 levels with individual and pooled siRNAs (Figure 7-Figure Supplement 1A), as well as a reduction in IL6 levels and SA-β-Gal activity (Figure 6-Figure Supplement 1D,E). After IRF1 silencing, we observed a reduction in BAFF pre-mRNA with two different pairs of CTRLsi and IRF1si pools (Figure 2I and supplementary Figure 2E). For the data on BAFF receptors, we used SMARTpools from Dharmacon, which are combinations of 4 siRNAs designed by the company to minimize off-target effects. These additions and clarifications are indicated in the revised manuscript.
  
  4) To confirm a role for IRF1 in the activation of BAFF, the authors should confirm the binding of IRF1 to the BAFF promoter by ChIP or ChIP-seq.
  
  We thank the Reviewer for this suggestion. We performed ChIP-qPCR analysis in THP-1 cells that were either proliferating or rendered senescent after exposure to IR (Figure 2H, Materials and methods section), and we confirmed the binding of IRF1 to the proximal promoter region of BAFF. As anticipated, this interaction was stronger after inducing senescence.
  
  5) Key antibodies should be validated by siRNA knockdown of their targets, for example, TACI, BCMA, and BAFF-R in Figure 5. Note that there is an apparent discrepancy between BCMA data in Figure 5B vs 5C.
  
  We fully agree with the Reviewer on this point and we thank him/her for helping us to improve this part of our manuscript. To address the discrepancy regarding BCMA western blot analysis and flow cytometry data, we silenced BCMA in THP-1 cells and tested two different antibodies advertised to recognize BCMA. This experiment allowed us to identify the correct band for BCMA by western blot analysis. We then confirmed that BCMA is upregulated in senescence, as observed by both western blot and flow cytometry analyses. We have modified the manuscript to reflect these changes. Please find these data in Figure 5A,B and Figure 5-Figure Supplement 1A of the revised manuscript.
  
  6) Figure 5E. Negative/specificity controls for this assay should be shown.
  
  We thank the Reviewer for this comment and regret that we were unable to provide a negative control. The kit only provides a competitive wild-type oligomer used to test the specificity of the binding. For each sample (CTRLsi, BAFFsi, CTRLsi IR, BAFFsi IR) and each antibody tested (p65, p50, p52, RelB and c-Rel), we evaluated the reduction in signal upon addition of excess competitive oligomer (20 pmol/well) compared to wells with an inactive oligomer. However, this control was performed only as a single replicate, due to the limited quantity of nuclear extracts and the high number of samples and antibodies analyzed. We therefore consider this control to be ‘qualitative’ rather than fully ‘quantitative’.
  
  7) Hybridization arrays such as Figure 5H, Figure 6 - Supplement 1I, and Figure 6H should be shown as quantitated, normalized data with statistics from replicates.
  
  We appreciate this request. We have included the quantification and statistics for the phosphoarrays used for THP-1 and WI-38 cells, which had been performed in triplicate (Figure 7A, Figure 5-Figure Supplement 1D). The original arrays are shown in the respective Source Data Files. In the interest of space, we removed the cytokine array performed on IMR-90 cells and left instead the quantitative ELISA for IL6 (Figure 6-Figure Supplement 1F). The data obtained from the cytokine array analysis in Figure 4F and Figure 4-Supplemental Figure 1C are supported by quantitative multiplex ELISA measurements (Figure 4E and Figure 4C).
  
  8) Figure 6B - Supplement 1. Controls to confirm fractionation (i.e., non-contamination by cytosolic and nuclear proteins) should be shown.
  
  We thank the Reviewer for this suggestion. We tested the efficiency of fractionation and we did in fact observe some degree of contamination from cytosolic proteins using the earlier version of the kit (Pierce, cat. 89881). We therefore purchased an improved version of the kit (Pierce, cat. A44390) and repeated the surface fractionation assay, which this time showed improved fractionation (Figure 7-Figure Supplement 1B). Interestingly, with the improved fractionation strategy, we observed that BAFF receptors in fibroblasts were almost exclusively localized inside the cell and not on the surface, as we found in THP-1 cells. Further validation of BAFF receptor antibodies has been provided in Figure 5-Figure Supplement 1A. As described in the text, the intracellular localization of BAFF receptors was previously reported in other cell types and conditions (PMID 31137630, PMID 19258594, PMID 30333819, PMID 10903733), and thus it is possible that BAFF may act through non-canonical mechanisms in WI-38 cells. Nonetheless, we did detect a small amount of BAFFR on the cell surface, and furthermore, BAFFR silencing reduced the level of p53 in fibroblasts. Therefore, we propose that BAFFR may be the primary receptor involved in p53 regulation in fibroblasts (Figure 7-Figure Supplement 1B,C). Our data on BAFF receptors deserve deeper characterization in a future study of the functions of BAFF receptors in senescence.
  
  9) Figure 6A. Knockdown of BAFF should be shown by western blot.
  
  Yes, definitely. We appreciate this comment and have included BAFF knockdown data in fibroblasts by western blot analysis (Figure 7B).
  
  10) Figure 6G. Although BAFF knockdown decreases the expression of p53, p21 increases. How do the authors explain this?
  
  We thank the Reviewer for the interesting question. We too were surprised to observe that the p53-dependent transcripts regulated by BAFF did not include CDKN1A (p21) mRNA, as confirmed by western blot analysis. The accumulation of p21 in senescence can also be regulated by p53-independent pathways and in p53-/- cells, for example by p90RSK, SP1, and ZNF84 (PMID 24136223, PMID 25051367, PMID 33925586). Eventually, we removed the data related to p21 and γ-H2AX in favor of other data and to streamline the content of this manuscript for the reader.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.10.25.513730v1
www.biorxiv.org www.biorxiv.org

New submission 13/09/2022, 11:48:46

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  1-1. I do have some concerns that the differences in network clustering reported in Fig 6 may be due to noise and I think the comparisons against the HCP parcellation could be more robust. Specifically, with regard to the network clustering in Fig 6. The authors use a clustering algorithm (which is not explained) to cluster the parcels into different functional networks. They achieve this by estimating the mean time series for each parcel in each individual, which they then correlate between the n regions, to generate an nxn connectivity matrix. This they then binarise, before averaging across individuals within an age group. It strikes me that binarising before averaging will artificially reduce connections for which only a subset of individuals are set to zero. Therefore averaging should really occur before binarising. Then I think the stability of these clusters should be explored by creating random repeat and generation groups (as done for the original parcels) or just by bootstrapping the process. I would be interested to see whether after all this the observation that the posterior frontoparietal expands to include the parahippocampal gyrus from 3-6 months and then disappears at 9 months - remains.
  
  We thank the reviewer for this insightful comment on our clustering process. For the step of “binarizing before averaging”, we followed the method proposed by Yeo et al (1). In this method, all correlation matrices are binarized according to the individual-specific thresholds. Specifically, each individual-specific threshold is determined according to the percentile, and only 10% of connections are kept and set to 1, while all other connections are set to 0. Yeo et al. (1) explained their motivation for doing so as “the binarization of the correlation matrix leads to significantly better clustering results, although the algorithm appears robust to the particular choice of the threshold”. We consider that the possible reason is that the binarization of connectivity in each individual offers a certain level of normalization so that each subject can contribute the same number of connections. If averaging occurs before binarizing, the actual connectivity contributed by different subjects would be different, which leads to bias. Meanwhile, we tested the stability of ‘binarizing first’ and ‘averaging first’, and the result is shown in Fig. R1 below. This figure suggests a similar conclusion as (1), where binarizing first before averaging leads to better clustering stability. We added the motivation of binarizing before averaging in the revised manuscript between line 577 and line 581.
  
  Fig. R1. The comparison of clustering stability of different methods. The red line refers to the clustering stability when binarizing the correlation matrices first and then averaging the matrices across individuals, while the blue line refers to the clustering stability when averaging the correlation matrices across individuals first and then binarizing the average matrix.
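
The two orderings compared in Fig. R1 can be made concrete with a small sketch. It assumes square correlation matrices stored as nested lists and a per-matrix threshold that keeps the strongest fraction of off-diagonal entries (10% in the response). The function names are hypothetical, and ties at the threshold are all kept, which is a simplification.

```python
def binarize_top_fraction(matrix, keep=0.10):
    """Set the strongest `keep` fraction of off-diagonal entries to 1, else 0."""
    n = len(matrix)
    values = sorted((matrix[i][j] for i in range(n) for j in range(n) if i != j),
                    reverse=True)
    n_keep = max(1, int(round(keep * len(values))))
    threshold = values[n_keep - 1]  # entries tied with the threshold are kept
    return [[1 if i != j and matrix[i][j] >= threshold else 0
             for j in range(n)] for i in range(n)]


def average(matrices):
    """Element-wise mean of equally sized square matrices."""
    n, m = len(matrices[0]), len(matrices)
    return [[sum(mat[i][j] for mat in matrices) / m for j in range(n)]
            for i in range(n)]


def binarize_then_average(matrices, keep=0.10):
    """Threshold each individual first, then average the 0/1 matrices."""
    return average([binarize_top_fraction(mat, keep) for mat in matrices])


def average_then_binarize(matrices, keep=0.10):
    """Average the raw correlations first, then threshold the mean matrix."""
    return binarize_top_fraction(average(matrices), keep)
```

With one subject whose correlations are uniformly stronger than another's, averaging first lets the stronger subject dominate the threshold, whereas binarizing first gives each subject the same number of retained connections. This matches the normalization argument made in the response, although it does not by itself show which ordering clusters better.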
  
  For the final clustering results, we repeated our clustering method on 100 bootstrap samples and took the final result as a majority vote for each parcel. The comparison of these two results is shown in Fig. R2. Overall, we observe good repeatability between the two results. However, some parcels show different patterns between the two results, especially parcels located around the boundaries of networks or the medial wall. The observation that “the posterior frontoparietal expands to include the parahippocampal gyrus from 3-6 months and then disappears at 9 months” was not reproduced in the bootstrapped results. These results suggest that the clustering method is quite robust and the discovered patterns are relatively stable, and that the differences between our original results and the bootstrapped results might be caused by noise or inter-subject variability.
  
  Fig. R2. Top panel: the network clustering results using all data in the original manuscript. Bottom panel: the network clustering results using majority voting through 100 times of bootstrapping. Black circles and red arrows point to the parahippocampal gyrus, which was included in the posterior frontoparietal network, and is not well repeated in the bootstrapped results. (M: months)
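
The bootstrap with majority voting behind Fig. R2 can be sketched as below. The clustering step itself is not specified here, so `cluster_fn` is a hypothetical placeholder that takes a list of scans and returns one network label per parcel. The sketch also assumes that labels are already matched across runs, which real clustering output would need an explicit label-alignment step to guarantee.

```python
import random
from collections import Counter


def bootstrap_majority_vote(scans, cluster_fn, n_boot=100, seed=0):
    """Cluster n_boot resampled cohorts and return the modal label per parcel."""
    rng = random.Random(seed)
    votes = None
    for _ in range(n_boot):
        # Resample scans with replacement, keeping the cohort size fixed.
        sample = [rng.choice(scans) for _ in scans]
        labels = cluster_fn(sample)
        if votes is None:
            votes = [Counter() for _ in labels]
        for counter, label in zip(votes, labels):
            counter[label] += 1
    return [counter.most_common(1)[0][0] for counter in votes]


def toy_cluster_fn(sample):
    """Stand-in clustering: the most common label per parcel across scans."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*sample)]
```

Parcels whose label flips between bootstrap runs end up with split vote counts, so inspecting the counters rather than only the winning label would give a per-parcel stability measure.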
  
  1-2. Then with regard to the comparison against the HCP parcellation, this is only qualitative. The authors should see whether the comparison is quantitatively better relative to the null clusterings that they produce.
  
  Thank you for this great suggestion! As suggested, we added this quantitative comparison using the Hausdorff distance. Similar to the comparison in parcel variance and homogeneity, the 1,000 null parcellations were created by randomly rotating our parcellation with small angles on the spherical surface 1,000 times. We compared our parcellation and the null parcellations by accordingly evaluating their Hausdorff distances to some specific areas of the HCP parcellation on the spherical space, including Brodmann's area 2, 3b, 4+3a, 44+45, V1, and MT+MST. The results are listed in Figure 4. From the results, we can observe that our parcellation generally shows statistically much lower Hausdorff distances to the HCP parcellation, suggesting that our parcellation generates parcel borders that are closer to HCP parcellations compared to the null parcellations.
  
  However, we noticed that a very small number of null parcellations show smaller Hausdorff distances than our parcellation. A possible reason is that our surface registration to the HCP template is based purely on cortical folding, without using functional gradient density maps, which are not available in the HCP template. As a result, high-quality functional alignment between our infant data and the HCP space is not ensured, which inevitably increases the Hausdorff distance between our parcellation and the HCP parcellation.
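
The comparison against null parcellations can be sketched as follows. The example assumes that parcel borders are given as lists of unit vectors on the sphere and that distance is measured as great-circle angle. It also assumes an empirical one-sided p-value with the common add-one correction. None of these choices is confirmed by the response, and the function names are hypothetical.

```python
import math


def great_circle(p, q):
    """Angle in radians between two unit vectors on the sphere."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


def directed_hausdorff(border_a, border_b, dist=great_circle):
    """Largest distance from a point of border_a to its nearest point in border_b."""
    return max(min(dist(a, b) for b in border_b) for a in border_a)


def hausdorff(border_a, border_b, dist=great_circle):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(border_a, border_b, dist),
               directed_hausdorff(border_b, border_a, dist))


def empirical_p_value(observed, null_values):
    """Fraction of null distances at or below the observed one (add-one corrected)."""
    at_or_below = sum(1 for v in null_values if v <= observed)
    return (1 + at_or_below) / (1 + len(null_values))
```

Under this sketch, a handful of rotated null parcellations landing closer to an HCP area than the real parcellation would raise the p-value slightly without changing the overall conclusion, which is consistent with the situation described above.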
  
  1-3. … not all individuals appear (from Fig 8) to be acquired exactly at the desired timepoints, so maybe the authors might comment on why they decided not to apply any kernel weighted or smoothing to their averaging? Pg. 8 'and parcel numbers show slight changes that follow a multi-peak fluctuation, with inflection ages of 9 and 18 months' explain - the parcels per age group vary - with age with peaks at 9 and 18 - could this be due to differences in the subject numbers, or the subjects that were scanned at that point?
  
  We do agree with the reviewer that subjects are not scanned at similar time points. This is designed in the data acquisition protocol to seamlessly cover the early postnatal stage so that we will have a quasi-continuous observation of the dynamic early brain development.
  
  We did not apply a kernel weighted average or smoothing when generating the parcellation, as we would like each scan to contribute equally, so that each parcellation map is representative of the whole cohort of the covered age instead of only part of it. Meanwhile, our final ‘age-common parcellation’ is representative of all subjects from birth to 2 years of age. However, we do agree that for a parcellation map designed for use at a specific age only, e.g., 1-year-olds, a kernel weighted average or even a more restricted age range could be a more appropriate solution.
  
  To test whether the parcel number fluctuates with subject numbers, we added an experiment in which we randomly selected 100 scans, considering the minimum scan number in each age group, using bootstrapping, and repeated this process 100 times. The average parcel number of each age is reported in the following Table R1. We did not observe strong changes in parcel numbers when reducing scan numbers, which further demonstrates that our parcel numbers do not show a strong relation to subject numbers. However, the parcel number does not increase greatly from 18M to 24M in the bootstrapping results, so we modified the statement in the manuscript about the parcel number to ‘… all parcel numbers fall between 461 to 493 per hemisphere, where the parcel number attains a maximum at around 9 months and then reduces slightly and remains relatively stable afterward. …’, which can be found between line 121 and line 122.
  
  1-4. I also have some residual concerns over the number of parcels reported, specifically as to whether all of this represents fine-grained functional organisation, or whether some of it represents noise. The number of parcels reported is very high. While Glasser et al 2016 reports 360 as a lower bound, it seems unlikely that the number of parcels estimated by that method would greatly exceed 400. This would align with the previous work of Van Essen et al (which the authors cite as 53) which suggests a high bound of 400 regions. While accepting Eickhoff's argument that a more modular view of parcellation might be appropriate, these are infants with underdeveloped brain function.
  
  We thank the reviewer for this insightful comment. We agree that some of the parcels might reflect noise, as noise exists in each step, such as data acquisition, image processing, surface reconstruction, and registration, especially considering that functional MRI is noisier than structural MRI. Though our experiments show that our parcellation is fine-grained and suitable for the study of infant brain functional development, it is hard to validate directly and quantitatively, as there is no ground truth available.
  
  Despite this, we are still motivated to create fine-grained parcellations because, with the growing availability of larger and higher resolution imaging datasets and advanced computational methods, parcellations with more fine-grained regions are desired for downstream analyses, especially considering the hierarchical nature of the brain organization (2). The main reason that our method generates much finer parcellation maps is that both our registration and parcellation processes are based on the functional gradient density, which characterizes a fine-grained feature map based on fMRI. This leads to both better inter-subject alignment of functional boundaries and finer region partitions. This strategy is different from that of Glasser et al. (3), which jointly considers multimodal information for defining parcel boundaries; thus, parcels revealed purely by functional MRI might be ignored in the HCP parcellation. We hope our parcellation framework can be a useful reference for this research direction. We added this discussion in the revised manuscript between line 268 and line 271.
  
  Regarding the parcel number, even without performing surface registration based on fine-grained functional features, recent adult fMRI-based parcellations have greatly increased parcel numbers, such as up to 1,000 parcels in Schaefer et al. (4), 518 parcels in Peng et al. (5), and 1,600 parcels in Zhao et al. (6). For infants, we do agree that functional connectivity might not be as strong as in adults. However, there are opinions (7-9) that the basic units of functional organization are likely to be present in infant brains, and that brain functional development gradually shapes the brain networks. Therefore, the functional parcel units in infants could possibly be on a scale comparable to that in adults. Even so, we do agree that more research needs to be performed on larger datasets for better evaluations. We added this discussion in the revised manuscript between line 275 and line 280.
  
  1-5. Further comparisons across different subjects based on small parcels increases the chances of downstream analyses incorporating image registration noise, since as Glasser et al 2016 noted, there are many examples of topographic variation, which diffeomorphic registration cannot match. Therefore averaging across individuals would likely lose this granularity. I'm not sure how to test this beyond showing that the networks work well for downstream analyses but I think these issues should be discussed.
  
  We agree with the reviewer that averaging across individuals inevitably brings some registration errors to the parcellation, especially for regions with high topographic variation across subjects, which would lead to loss of granularity in these regions. We believe this is an important issue that exists in most methods on group-level parcellations, and the eventual solution might be individualized parcellation, which will be our future work. We added this discussion in the revised manuscript between line 288 and line 292.
  
  We also agree with the reviewer that downstream analyses are important evaluations for parcellations. We provided a beta version of our parcellation with 602 parcels (10) to our colleagues, and they tested our parcellation in the task of infant individual recognition across ages using functional connectivity, to explore infant functional connectome fingerprinting (10). We compared the performance of different parcellations with 602 ROIs (our beta version), 360 ROIs (HCP MMP parcellation (3)), and 68 ROIs (FreeSurfer parcellation (11)). The results (Fig. R3) show that our parcellation with a higher parcellation number yields better accuracy compared to other parcellations. We added a description of this downstream application in the discussion between line 284 and line 287.
  
  Fig. R3. The comparison of different parcellations for infant individual recognition across age based on functional connectivity (figure source: Hu et al. (10)). The parcellation with 602 ROIs is the beta version of our parcellation, 360 ROIs stands for HCP MMP parcellation (3) and 68 ROIs stands for the FreeSurfer parcellation (11). This downstream task shows that a higher parcellation number does lead to better accuracy in the application.
  
  1-6. Finally, I feel the methods lack clarity in some areas and that many key references are missing. In general I don't think that key methods should be described only through references to other papers. And there are many references, particular to FSL papers, that are missing.
  
  We thank the reviewer for this great suggestion. We added related references for FLIRT, FSL, MCFLIRT, and TOPUP. For the alignment to the HCP 32k_LR space, we first aligned all subjects to the fsaverage space using spherical demons, then used part of the HCP pipeline (12) to map the surface from the fsaverage space to the HCP 164k_LR space, and finally downsampled to the 32k_LR space. We modified this citation to reference the HCP pipeline by Glasser et al. (12) instead and detailed this registration process in the revised manuscript between line 434 and line 440, as below:
  
  “… The population-mean surface maps were mapped to the HCP 164k ‘fs_LR’ space using the deformation field that deforms the ‘fsaverage’ space to the ‘fs_LR’ space released by Van Essen et al. (13), which was obtained by landmark-based registration. By concatenating the three deformation fields of steps 1, 3, and 4, we directly warped all cortical surfaces from individual scan spaces to the HCP 164k_LR space and then resampled them to 32k_LR using the HCP pipeline (12), thus establishing vertex-to-vertex correspondences across individuals and ages …”
  
  Reviewer #2 (Public Review):
  
  2-1. Diminishing enthusiasm is the lack of focus in the result section, the frequent use of jargon, and figures that are often difficult to interpret. If those issues are addressed, the proposed atlas could have a high impact in the field especially as it is aligned with the template of the Human Connectome Project.
  
  We’d like to thank Reviewer #2 for the appreciation of our atlas. According to the reviewer’s suggestion, we went through the manuscript again by focusing on correcting the use of jargon, clarity in the result section, as well as figures and figure captions. We hope our corrections can help explain our work to a broader community. Our revisions are accordingly detailed in the following. Meanwhile, our parcellation maps have been aligned with the templates in HCP and FreeSurfer and made available via NITRC at: https://www.nitrc.org/projects/infantsurfatlas/.
  
  References
  
  1. B. Thomas Yeo, F. M. Krienen, J. Sepulcre, M. R. Sabuncu, D. Lashkari, M. Hollinshead, J. L. Roffman, J. W. Smoller, L. Zöllei, J. R. Polimeni, The organization of the human cerebral cortex estimated by intrinsic functional connectivity. Journal of Neurophysiology 106, 1125-1165 (2011).
  
  2. S. B. Eickhoff, R. T. Constable, B. T. Yeo, Topographic organization of the cerebral cortex and brain cartography. NeuroImage 170, 332-347 (2018).
  
  3. M. F. Glasser, T. S. Coalson, E. C. Robinson, C. D. Hacker, J. Harwell, E. Yacoub, K. Ugurbil, J. Andersson, C. F. Beckmann, M. Jenkinson, S. M. Smith, D. C. Van Essen, A multi-modal parcellation of human cerebral cortex. Nature 536, 171-178 (2016).
  
  4. A. Schaefer, R. Kong, E. M. Gordon, T. O. Laumann, X.-N. Zuo, A. J. Holmes, S. B. Eickhoff, B. T. Yeo, Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cerebral Cortex 28, 3095-3114 (2018).
  
  5. L. Peng, Z. Luo, L.-L. Zeng, C. Hou, H. Shen, Z. Zhou, D. Hu, Parcellating the human brain using resting-state dynamic functional connectivity. Cerebral Cortex (2022).
  
  6. J. Zhao, C. Tang, J. Nie, Functional parcellation of individual cerebral cortex based on functional MRI. Neuroinformatics 18, 295-306 (2020).
  
  7. W. Gao, S. Alcauter, J. K. Smith, J. H. Gilmore, W. Lin, Development of human brain cortical network architecture during infancy. Brain Structure and Function 220, 1173-1186 (2015).
  
  8. W. Gao, H. Zhu, K. S. Giovanello, J. K. Smith, D. Shen, J. H. Gilmore, W. Lin, Evidence on the emergence of the brain's default network from 2-week-old to 2-year-old healthy pediatric subjects. Proceedings of the National Academy of Sciences 106, 6790-6795 (2009).
  
  9. K. Keunen, S. J. Counsell, M. J. Benders, The emergence of functional architecture during early brain development. NeuroImage 160, 2-14 (2017).
  
  10. D. Hu, F. Wang, H. Zhang, Z. Wu, Z. Zhou, G. Li, L. Wang, W. Lin, G. Li, UNC/UMN Baby Connectome Project Consortium, Existence of Functional Connectome Fingerprint during Infancy and Its Stability over Months. Journal of Neuroscience 42, 377-389 (2022).
  
  11. R. S. Desikan, F. Ségonne, B. Fischl, B. T. Quinn, B. C. Dickerson, D. Blacker, R. L. Buckner, A. M. Dale, R. P. Maguire, B. T. Hyman, An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage 31, 968-980 (2006).
  
  12. M. F. Glasser, S. N. Sotiropoulos, J. A. Wilson, T. S. Coalson, B. Fischl, J. L. Andersson, J. Xu, S. Jbabdi, M. Webster, J. R. Polimeni, The minimal preprocessing pipelines for the Human Connectome Project. NeuroImage 80, 105-124 (2013).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.11.24.469844v1
www.biorxiv.org www.biorxiv.org

Fluidics System for Resolving Concentration-Dependent Effects of Dissolved Gases on Tissue Metabolism

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  The authors present a system that allows the measurement of OCR on diverse tissues. Using two optodes, one before the tissue under examination, and one after, allows the OCR to be measured as the difference between the concentration of O2 in the in-flow gas and the concentration of O2 in the out-flow gas. The system maintains the tissue at a set concentration of dissolved O2 so that experiments can be performed over a long period of time. The authors have provided ample data and full methods and their conclusions are most likely reliable.
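  The inflow/outflow measurement described above amounts to a steady-state mass balance. The following is a minimal sketch of that arithmetic, not the authors' implementation: the function name, the units, and the example numbers are all assumptions chosen for illustration.

```python
def oxygen_consumption_rate(c_in, c_out, flow_rate, tissue_amount=1.0):
    """Steady-state mass balance for a perifusion chamber.

    c_in, c_out   : O2 concentration before and after the tissue (e.g. nmol/mL)
    flow_rate     : medium flow through the chamber (e.g. mL/min)
    tissue_amount : normalisation factor (e.g. number of islets or mg tissue)

    Returns O2 consumed per unit time per unit tissue. Units are assumptions
    of this sketch and simply follow from the inputs.
    """
    if flow_rate <= 0 or tissue_amount <= 0:
        raise ValueError("flow_rate and tissue_amount must be positive")
    return flow_rate * (c_in - c_out) / tissue_amount


# Illustrative (made-up) numbers: 200 nmol/mL upstream, 180 nmol/mL downstream,
# 0.1 mL/min flow, 100 islets in the chamber.
example_ocr = oxygen_consumption_rate(200.0, 180.0, 0.1, tissue_amount=100)
```

  In this form the estimate depends only on the concentration difference and the flow rate, which is why the stability of the two O2 detectors matters for the reliability of the result.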
  
  Currently, we know that O2 is critical for diverse physiological processes; however, it is rarely controlled for as well as non-gas solutes such as glucose, as we lack methods to control its delivery and infer its consumption. By addressing this need, the authors contribute something valuable to the field, which will hopefully be built on by others. The authors have already begun to show the utility of their system by exploring the complicated biology of H2S. As delivering this gas in a controlled manner is hard, often people use NaHS instead. In line with previous studies (well cited by the authors), differences are observed.
  
  Specific points
  
  1) The gas control system is used with islets, INS-1 832/12 cells, retinas, and liver tissue, demonstrating its broad applicability.
  
  2) The system as a platform can have diverse extra measurement modalities attached to it, for example visible-wavelength absorbance and fluorescence. Metabolite concentrations in the tissue culture outflow could also be measured.
  
  3) The reduction state of cyt c and cyt c oxidase are measured from the second derivative of absorbance at 550 and 605 nm. Ideally, to reliably decompose these signals full spectra around 550-605 nm would be collected. As the authors are only using cytochrome reduction state as a qualitative measure and appear careful to avoid over-interpretation this method should be fine. However, the authors ought to show a representative time course including the fully oxidised and reduced states demonstrating this approach as making these measurements is demanding and will depend on the exact spectroscopic set-up. Without this information it is hard to judge the reliability of the paper.
  
  We appreciate the reviewer giving us latitude for a less robust measurement. However, we did in fact do what the reviewer suggests. That is, with the Ocean Optics spectrophotometer, we measure the full light spectrum from 400 to 650 nm. Using these spectral data, we calculate the first and second derivatives of the absorption. We have previously published our approach to spectral analysis, as well as the inclusion of the fully oxidized and reduced states (Sweet IR, G Khalil, AR Wallen, M Steedman, KA Schenkman, JA Reems, SE Kahn, JB Callis. Continuous measurement of oxygen consumption by pancreatic islets. Diabetes Tech. Ther. 4: 661-672, 2002; Sweet IR, Cook DL, DeJulio E, Wallen AR, Khalil G, Callis JB, Reems JA: Regulation of ATP/ADP in pancreatic islets. Diabetes 53:401-409, 2004), so we did not include all the details. To ensure that our description is clear, we have added a more thorough explanation that we used spectral analysis and not just data obtained at single wavelengths.
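  To illustrate why a second derivative computed over a full spectrum is more robust than a single-wavelength reading, here is a small sketch on synthetic data. It is not the authors' analysis code: the spectrum, the 1 nm grid, the Gaussian band at 550 nm, and the sloping baseline are all invented for the example.

```python
import math


def second_derivative(values, step):
    """Central-difference second derivative on an evenly spaced grid.

    Returns one value per interior point (the two end points are dropped).
    """
    return [
        (values[i - 1] - 2.0 * values[i] + values[i + 1]) / step ** 2
        for i in range(1, len(values) - 1)
    ]


def synthetic_absorbance(wavelength_nm):
    """Made-up spectrum: sloping linear baseline plus a narrow band at 550 nm."""
    baseline = 0.5 - 0.001 * (wavelength_nm - 400)
    band = 0.05 * math.exp(-((wavelength_nm - 550) ** 2) / (2 * 5.0 ** 2))
    return baseline + band


wavelengths = list(range(400, 651))  # 400-650 nm in 1 nm steps (assumed grid)
absorbance = [synthetic_absorbance(w) for w in wavelengths]

d2 = second_derivative(absorbance, step=1.0)
interior_wavelengths = wavelengths[1:-1]

# The linear baseline vanishes in the second derivative, so the most negative
# value marks the centre of the absorbance band.
band_centre = interior_wavelengths[min(range(len(d2)), key=d2.__getitem__)]
```

  Because any linear drift in the baseline differentiates to zero, the band stands out even when the raw absorbance at 550 nm is dominated by background.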
  
  Reviewer #2 (Public Review):
  
  The present project is an extension of prior work from this work group in which they describe a technological advancement to their published flow-culture system. Such improvements now incorporate technology that allows for metabolic characterization of mammalian tissues while precisely controlling the concentration of abundant gases (e.g., O2), as well as trace gases (e.g., H2S). The present article demonstrates the utility of this system in the context of hypoxia/re-oxygenation experiments, as well as exposure to H2S. Although the methodology described herein is clearly capable of detecting nuanced metabolic changes in response to variations in O2 or H2S, the lack of a head-to-head comparison with other techniques makes it difficult to discern the potential impact of the technology.
  
  We understand the benefit of comparing a new method with the currently utilized methods. However, the novelty of our methodology is that it is able to control the exposure of tissue to levels of both abundant and trace dissolved gas composition, functions that neither of these existing instruments provide. In addition, continuous flow of media allows maintenance and assessment of tissue models that cannot be accommodated by static or spinner systems. Since we are the first to report an entirely novel technology, the direct comparison to benchmarks is not possible. In the past, however, we have tested liver slices and retina in a Seahorse and the tissue died within 120 minutes presumably due to the lack of flow/reoxygenation in the tissue. In addition, islets placed in spinner systems such as the Oxygraph become fragmented and broken very rapidly. So, a head-to-head comparison on the tissue OCR response to changes in gas composition cannot be meaningfully carried out for the facets of our method that we highlighted. The methodology we present has capabilities that do not exist in any other commercially available system. We have stated this latter point in the last line of the second paragraph of the Introduction. Regarding the general reliability of the O2 consumption measurement: the unprecedented accuracy and stability of the O2 detectors and the ability of our flow system to maintain tissue for days while generating accurate and reproducible measurements of O2 consumption has previously been established (Sweet IR, Gilbert M, Sabek O, Fraga DW, Gaber AO, Reems JA. Glucose Stimulation of Cytochrome c Reduction and Oxygen Consumption as Assessment of Human Islet Quality. Transplantation 80: 1003-1011, 2005; Neal AS, Rountree AM, Philips CW, Kavanagh TJ, Williams DP, Newham P, Khalil G, Cook DL, Sweet IR. Quantification of low-level drug effects using real-time, in vitro measurement of oxygen consumption rate. Toxicological Sciences 148: 594-602, 2015).
  
  In addition, diffusion gradients both in the bath, as well as the tissue itself likely impact the accuracy of the metabolic measurements. This is likely relevant for the liver slices experiments.
  
  We agree that there are certainly concentration gradients within tissue, and these are increased in the absence of capillary flow. Nonetheless, the gradients will certainly be less than what occurs in static systems. In general, the optimal size of tissue pieces is a trade-off between potential for hypoxia if the tissue is too large, and a lack of untraumatized tissue if it is too small. We have added text to address this concern that these effects are to be considered when choosing the size and shape of the liver slices or other tissue models to place into the flow system.
  
  Following resection, liver tissue can be mechanically permeabilized (PMID: 12054447). In the present experiments, no controls were put in place to discern if the tissue was permeabilized. This could be checked by adding in adenylates and additional carbon substrates and assessing the impact on OCR. Similar controls likely need to be implemented for the islet and retina experiments.
  
  We have used flow systems in the past to maintain islets and liver for 24 hours and more (Neal AS, Rountree AM, Kernan K, Van Yserloo B, Zhang H, Reed BJ, Osborne W, Wang W, Sweet IR. Real time imaging of intracellular hydrogen peroxide in pancreatic islets. Biochem. J. 473:4443-4456, 2016; Neal AS, Rountree AM, Philips CW, Kavanagh TJ, Williams DP, Newham P, Khalil G, Cook DL, Sweet IR. Quantification of low-level drug effects using real-time, in vitro measurement of oxygen consumption rate. Toxicological Sciences 148: 594-602, 2015) and, based on stable OCR, concluded that the tissue is viable. However, it is possible that the membranes of some of the tissue would become permeabilized, which would affect the responses to test compounds. We considered this issue from two perspectives: 1. whether established models that we used to test the BaroFuse were prone to high cell permeability; and 2. whether loading and maintenance of the tissue models in the fluidics system resulted in increased permeability. We performed experiments measuring the ADP responses in OCR by islets and retina within the fluidics system. Effects were observable but small. However, these results are not definitive, because it was difficult to know what the response in permeabilized tissue was (and permeabilizing tissue slices was difficult). We then used propidium iodide staining to visualize and quantify the level of permeability. In islets, the fluorescence in isolated islets before and after perifusion was negligible compared to that in islets permeabilized by H2O2 treatment (see below).
  
  Fig. 1. Staining of isolated rat islets with the indicator of cell membrane integrity propidium iodide. Islets were stained either before or after a 3-hour perifusion. As a positive control for PI staining, islets were treated with 500 µM H2O2 for 30 minutes and incubated overnight. Each data point was the average +/- SE for an n of 3.
  
  There was some fluorescence in retina and liver; however, it was difficult to interpret these data in terms of the fraction of the tissue that is permeabilized, because dye close to the surface of the tissue is preferentially imaged. So, we finally assessed the amount of permeabilized tissue in retina and liver by comparing uptake of 3H-labeled H2O and an extracellular marker, 14C-labeled sucrose.
  
  Fig. 2. Fraction of tissue water space that is accessible to the extracellular marker sucrose. Left: Mouse retina. Right: Rat liver slice. Each data point was the average +/- SE for an n of 3.
  
  Extracellular water in liver and retina is well established to be about 25%, close to the volume of distribution of sucrose. Thus, we cannot rule out that a small percentage of cells is permeabilized, but the vast majority are not.
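  The dual-label comparison above can be written out as a ratio of two distribution volumes. The sketch below shows one common way to do that arithmetic; it is an assumption about the calculation, not a description of the authors' exact protocol, and the counts are invented.

```python
def marker_space_ul(tissue_dpm, medium_dpm_per_ul):
    """Volume of medium (in µL) that would contain the counts found in the tissue."""
    if medium_dpm_per_ul <= 0:
        raise ValueError("medium specific activity must be positive")
    return tissue_dpm / medium_dpm_per_ul


def sucrose_accessible_fraction(sucrose_tissue_dpm, sucrose_medium_dpm_per_ul,
                                water_tissue_dpm, water_medium_dpm_per_ul):
    """Fraction of total tissue water (3H2O space) reached by 14C-sucrose."""
    sucrose_space = marker_space_ul(sucrose_tissue_dpm, sucrose_medium_dpm_per_ul)
    water_space = marker_space_ul(water_tissue_dpm, water_medium_dpm_per_ul)
    return sucrose_space / water_space


# Made-up counts: sucrose space 0.5 µL, total water space 2.0 µL.
example_fraction = sucrose_accessible_fraction(500.0, 1000.0, 4000.0, 2000.0)

# A value close to the expected extracellular fraction (about 0.25 in the text)
# suggests sucrose is confined to the extracellular space; a clearly larger
# value would point to permeabilized cells.
```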
  
  Additional comments are detailed below:
  
  -The experiments with H2S are particularly interesting, as this system does seem well suited to investigate the metabolic effects of H2S.
  
  Thanks! We are excited by the potential for this method to assess the effects of H2S and other trace gases.
  
  -The authors state the transient rise in O2 consumption was surprising; however, accumulation of succinate during ischemia and rapid oxidation upon reperfusion has been previously demonstrated (PMID: 32863205).
  
  This is an interesting paper which describes findings that speak to the role of succinate in supplying fuel that could drive the transient changes in O2 consumption observed following hypoxia. It would be an interesting experiment to perform our hypoxia-reoxygenation experiment in the absence and presence of the permeable malonate to see if the spike in O2 consumption following reoxygenation was absent in the presence of the drug. We have removed the word surprising and cited this paper.
  
  -In the paper, Zaprinast was used to block pyruvate uptake. However, the rationale to use this compound, as opposed to the more specific MPC inhibitor UK5099 is unclear.
  
  We could have used UK5099, but we had used Zaprinast in past studies (Du J, Cleghorn WM, Contreras L, Lindsay K, Rountree AM, Chertov AO, Turner SJ, Sahaboglu A, Linton J, Sadilek M, Satrústegui I, Sweet IR, Paquet-Durand F, Hurley JB. Inhibition of mitochondrial pyruvate transport by Zaprinast causes massive accumulation of aspartate at the expense of glutamate in retinas. J Biol. Chem, 288:36129-40, 2013), so we knew that in our hands it blocked mitochondrial pyruvate uptake and would therefore be a good test of the rapid transfer of pyruvate across the plasma membrane.
  
  -Throughout the paper, the authors list 'COVID-19' as a potential application. It is not clear how this technology could be used in the context of COVID-19.
  
  Reference to COVID-19 has been removed.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.07.434330v1
www.biorxiv.org www.biorxiv.org

New submission 28/05/2022, 08:29:41

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Wang and Dudko derive analytical equations for one special case of a model of Ca-dependent vesicle fusion, in the attempt to find a "general theory" of synaptic transmission. They use a model with 2 kinetically distinct fast and slow pools (equation 1).
  
  Critique
  
  1) Overall, the analytical approach applied here remains limited to the quite arbitrarily chosen 2-pool model. Thus, while the authors are able to re-capitulate the kinetics of transmitter release under a series of defined intracellular Ca-concentration steps, [Ca]i (see Fig. 2B; data from Woelfel et al. 2007 J. Neuroscience), this is nevertheless not surprising because the data by Woelfel et al. was originally also fit with a 2-pool model. More importantly, the 2-pool model is valid for describing release kinetics at high [Ca]i, but it cannot account for other important phenomena of synaptic transmission like e.g. spontaneous and asynchronous release which happen at lower [Ca]i, with different Ca cooperativity (Lou et al., 2005). Along the same lines, the derivations of the equations by Wang and Dudko are not valid in the range of low [Ca]i below about 1 micromolar (see "private recommendations" for details). This, however, limits the applicability of the model to AP-driven transmitter release, and it shows that based on one specific arbitrarily chosen model (here: the 2-pool model), one cannot claim to build a realistic and full "theory" for synaptic transmission.
  
  Our two-pool description is far from being “arbitrarily chosen”. It is based on experimental facts that have been established by multiple independent laboratories: namely, the observed two distinct vesicle fusion kinetics due to the presence of the readily releasable and reserve pools in vivo and due to the presence of two dominant vesicle morphologies in vitro. The two-pool picture has been confirmed and successfully used in numerous experimental papers previously. That being said, our two-pool description refers to a more general notion of separation of timescales and is thus more flexible than a literal interpretation might suggest.
  
  The data from Woelfel et al. 2007 J. Neuroscience, while of excellent quality, are not the only measured kinetics of the action-potential triggered vesicle fusion that our theory has been able to recapitulate (see other experimental data in Fig.2 and Fig.3 of the manuscript). The theory also recapitulates the kinetic measurements from fifteen other independent experimental studies, on ten different types of synapses. The dynamic range (peak release rate) of these synapses varies by 10 orders of magnitude, and the range of Ca2+ concentrations spans more than 3 orders of magnitude. Our work recapitulates these 16 datasets not through 16 different ad-hoc models but through a single, fully analytically solved, theoretical framework. Importantly, beyond recapitulating the existing data, our analytically tractable theory enables one to extract the unique sets of microscopic parameters for particular synapses, such as the activation energies and kinetic rates of their synaptic machinery, the sizes of the vesicle pools and the critical number of SNAREs. We verify that these predictions from our theory have reasonable values for each of the data sets; this is an additional, non-trivial check of our theory. The fact that our theory reproduces observations on such strikingly diverse systems, and has such a degree of predictive power, cannot be dismissed as an artifact or coincidence. We are not aware of any other theory, nor fitting model, of comparable generality and the ability to generate concrete predictions.
  
  Reviewer #1 is mistaken in stating that the derivations of our equations are not valid below 1 micromolar Ca2+ concentrations. It is evident already from Figure R1 below (Fig.2 in the revised manuscript) that the theory performs flawlessly at concentrations as low as 0.1µM. There are indeed non-linear effects at ultra-low Ca2+ concentrations that are not displayed by the experimental data in Fig. R1. Our theory is also applicable in that regime: one simply needs to include a second coordinate (in addition to the number of Ca2+ ions bound, 𝑄‡ ) to account for the multidimensionality of the free energy landscape, analogous to the calculations of the rate constants for multidimensional activated rate processes in chemical physics. This illustrates just one of the many ways in which our theory will enable detailed studies of mechanistic aspects of synaptic transmission.
  
  With further regard to generality, as stated in our Abstract, this paper is concerned with providing a physical theory to describe “rapid and precise neuronal communication” enabled by “a highly synchronous release” of neurotransmitters. Typically, more than 90% of the neurotransmitters are released through synchronous release during the action potential. By applying our theory to each of the multiple Ca2+ sensors, one will be able to cover the remaining <10% of the neurotransmitters and thus simultaneously describe spontaneous, asynchronous and synchronous release. While detailed studies of these effects are clearly beyond the scope of this work, our theory opens a door for such studies by providing a foundation in the form of a conceptual, analytically tractable framework.
  
  2) In their derivations, Wang and Dudko collapse the intracellular Ca-concentration [Ca]i, a parameter directly quantified in the several original experiments that went into Fig. 2A, into a dimensionless relative [Ca]i "c" (see equation 7). Similarly, the release rates are collapsed into a dimensionless quantity. With these normalizations, Ca-dependent transmitter release measured in several preparations seems to fall onto a single theoretical prediction (Fig. 2A). The deeper meaning behind the equalization of the data was unclear, except a demonstration that the data from these different experiments can in general be described with a two-pool model, which is at the core of the dimensionless equations. One issue might be that many of the original data sets used here derive from the same preparation (the calyx of Held), and therefore the previous data might not scatter strongly between studies. This could be clarified by the authors by also plotting the data from all studies on the non-normalized [Ca]i axis for comparison. Furthermore, it would be useful to include data from other preparations, like the inner hair cells (Beutner et al. 2001 Neuron; their Fig. 3) which likely have a lower Ca-sensitivity, i.e. are right-shifted as compared to the calyx (see discussion in Woelfel & Schneggenburger 2003 J. Neuroscience). Thus, it is unclear why normalization of [Ca]i to "c" should be an advantage, because differences in the intracellular Ca sensitivity of vesicle fusion exist between synapses (see above), and likely represent important physiological differences between secretory systems.
  
  We thank the Reviewer for challenging our work with the hypothesis that the demonstrated universal scaling of the experimental data could in fact be an artefact caused by pre-selecting the data with the same preparation – addressing this hypothesis is indeed a compelling test to probe the true limits of generality of our theory. Below we carry out this test. We implemented the two suggestions of the Reviewer: (i) we added datasets on markedly different synaptic preparations, including the inner hair cells as suggested by the Reviewer, as well as retina bipolar cell, hippocampal mossy fiber, cerebellar basket cell, chromaffin cell, insulin-secreting cell, and additional data on Calyx of Held from multiple laboratories, and (ii) we plotted the data on the non-normalized axis of [Ca2+] to reveal the full extent of scatter among the data sets. The resulting plot (Fig. R1 below) speaks for itself: in vivo data for the release rate span 4 orders of magnitude at low [Ca2+] and 6 orders of magnitude at high [Ca2+], and there is a 10 orders of magnitude difference between the release rates from in vivo and in vitro data. The scatter across 4-10 orders of magnitude allows one to appreciate the vastly different sensitivities to [Ca2+] between synaptic preparations (Fig.R1, left). Yet, all these data collapse beautifully on the master curve established by our theory (Fig.R1, right).
  
  Fig. R1. Despite 10 orders of magnitude variation in the release rate of different synaptic preparations and more than 3 orders of magnitude range of calcium concentration (left), the data collapse onto a universal curve predicted by the theory (right). The collapse indicates that the established scaling (Eq. 7) is universal across different synapses. The distinct sets of parameters for individual synapses (Appendix 3 Table 2) demonstrate the predictive power of the theory as a tool for extracting the unique properties of each synapse from experimental data.
  
  What the Reviewer refers to as “the equalization of the data” is known in statistical physics as universality. The deeper meaning of a universal scaling is its indication that the observed phenomena realized in seemingly unrelated systems are in fact governed by common physical principles. The collapse of the data onto the universal curve in Fig. R1 is a demonstration that the present theory has uncovered, quantitatively, unifying physical principles underneath the striking diversity and bewildering complexity of chemical synapses. The Referee is of course correct that the differences in [Ca2+] sensitivities among synapses likely represent important physiological differences between distinct synapses and distinct secretory systems. The present theory does not negate these differences, but it in fact allows one to quantify these differences through the unique sets of extracted parameters for individual synapses (see Appendix 3 Table 2). We are not aware of any other theory that has demonstrated universality in synaptic transmission through a simple, single scaling relation across 10 orders of magnitude in dynamic range and at the same time allowed the extraction of the microscopic parameters that are unique for the individual synapses and thus reflect the diversity of their synaptic machinery. We included Fig. R1 shown here in the revised manuscript (Figure 2).
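  The general idea of a data collapse can be shown with a toy example. The sketch below does not use the manuscript's Eq. 7, which is not reproduced in this response; it assumes a hypothetical Hill-type rate law and made-up parameters purely to show how curves that differ by many orders of magnitude can coincide once each axis is rescaled by synapse-specific constants.

```python
import math


def toy_release_rate(ca_um, k_max, ca_half_um, n=4):
    """Hypothetical Hill-type release rate; NOT the manuscript's Eq. 7."""
    u = ca_um / ca_half_um
    return k_max * u ** n / (1.0 + u ** n)


# Made-up parameters for three imaginary synapses: (k_max in 1/s, ca_half in µM).
toy_synapses = {"A": (1e3, 5.0), "B": (1e-2, 20.0), "C": (1e5, 1.0)}

# Dimensionless concentrations at which every synapse is evaluated.
scaled_concentrations = [0.1, 1.0, 10.0]

collapsed = {}
for name, (k_max, ca_half) in toy_synapses.items():
    collapsed[name] = [
        # Evaluate at the synapse's own concentration, then rescale the rate.
        toy_release_rate(c * ca_half, k_max, ca_half) / k_max
        for c in scaled_concentrations
    ]

# Raw peak rates differ by 7 orders of magnitude (1e-2 vs 1e5), yet the
# rescaled values in `collapsed` agree across all three toy synapses.
```

  In the toy model the collapse holds by construction; the substantive claim in the response is that measured data from different preparations follow one such master curve.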
  
  3) Finally, the authors use their model to derive the number of SNARE proteins necessary for vesicle fusion, and they arrive at the quite strong conclusion that N = 2 SNAREs are required. Nevertheless, this estimate doesn't fit with the number of n = 4-5 Ca2+ ions which the original studies of Fig. 2A consistently found. The Ca-sensitivity at the calyx of Held, and the steepness of the release rate versus [Ca]i relation is determined by Ca-binding to Synaptotagmin-2 (the specific Ca sensor isoform found at the calyx synapse), as has been determined in molecular studies at the calyx synapse (see Sun et al. 2007 Nature; Kochubey & Schneggenburger 2011 Neuron). Furthermore, in other secretory cells, the number of SNARE proteins has been estimated to be ≥ 3 (Mohrmann et al., Science 2010).
  
  The Reviewer is incorrect in their claim that there is any discrepancy here. The number of SNAREs N and the number of Ca2+ ions 𝑄‡, extracted from the fit to our theory, are actually in good agreement with the findings from the studies mentioned by the Reviewer. To clarify, the parameter 𝑄‡ is the number of Ca2+ ions bound to a SNARE at the transition state (not final state) of the free energy landscape of a SNARE complex. Appendix 3 Table 2 shows that, for all synaptic preparations, the extracted values at the transition state are 𝑄‡ < 4 − 5, which is indeed consistent with n = 4 − 5 at the final state. We note that, in addition, our theory enables one to extract the key energetic parameter that governs synaptic vesicle fusion: the activation free energy barrier ∆𝐺‡ of SNARE conformational transition (in the range 8-34 kBT for different synaptic preparations, see Appendix 3 Table 2), which, to our knowledge, has not been possible to extract from these experiments before.
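  To give a sense of scale for the quoted barrier range, the sketch below evaluates a plain Boltzmann factor exp(−ΔG‡/kBT) at the two ends of the range. This assumes a generic Arrhenius-type dependence, which is an assumption of this illustration and not necessarily the rate expression used in the manuscript; the function name is hypothetical.

```python
import math


def boltzmann_factor(barrier_in_kt):
    """exp(-barrier / kBT) for a barrier already expressed in units of kBT."""
    return math.exp(-barrier_in_kt)


low_barrier_kt = 8.0    # lower end of the range quoted in the response
high_barrier_kt = 34.0  # upper end of the range quoted in the response

# Ratio of the two factors, and the same ratio in orders of magnitude.
ratio = boltzmann_factor(low_barrier_kt) / boltzmann_factor(high_barrier_kt)
orders_of_magnitude = math.log10(ratio)  # equals 26 / ln(10)
```

  Under this simple assumption, a 26 kBT difference in barrier height alone corresponds to a rate difference of roughly eleven orders of magnitude, before any other synapse-specific factor is taken into account.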
  
  The specific value N=2 was extracted from a particular data set for Calyx of Held (Woelfel et al 2007), for which the temporal curves of cumulative release at different Ca2+ concentrations were available. It is quite possible that the value of N will be different for some other synapses. As we emphasize in the manuscript (see Discussion), the present theory does not declare the same value of N for all types of synapses; the power of the theory lies in providing a fitting tool for extracting this value for a system of interest.
  
  Taken together, the derivation of the analytical equations for the kinetic scheme of a 2-pool model is mathematically interesting, and the scholarly derived equations are trustworthy. Nevertheless, the derived analytical model in fact captures only a specific stage of synaptic transmission focusing on Ca-dependent fusion of vesicles from two pools at [Ca]i >1 microM. Other important processes and mechanistic components (e.g. spontaneous, asynchronous release, Ca-dependent pool replenishment, postsynaptic factors) are either over-simplified or remained out of the scope of the theory. Therefore, the paper is far from providing a general "theory for synaptic transmission", as the title promises.
  
  We appreciate that the Reviewer sees our analytical derivations as being mathematically interesting, scholarly derived, and trustworthy. We believe that we have convincingly refuted the Reviewer’s criticisms regarding perceived limitations. We have shown that our universal scaling and collapse is not limited to high calcium concentrations, and have presented checks using data from vastly different synaptic preparations. As noted above, the generality of a theory is determined not by the amount of details packed in it but by the ability of the theory to reproduce observations and generate predictions regarding the phenomenon of interest (here: rapid and precise neuronal communication) while containing as few details as possible. Our theory accomplishes just that; it delivers precisely what our title promises.
  
  Reviewer #2 (Public Review):
  
  The present MS describes an effort to create a general mathematical model of synaptic neurotransmission. The authors invested great efforts to create a complex model of the presynaptic mechanisms, but their approach of the postsynaptic mechanisms is way oversimplified. The authors claim that their model is consistent with lots of in vivo and in vitro experimental data, but this might be true for a small subselection of experimental papers (they cite 7 experimental papers regularly in the MS!). The authors also indicate that their modeling has a realistic foundation, namely they can relate some parameters in their equations to molecules/molecular mechanisms. One example is the parameter N, which they claim indicates the number of SNARE complexes required for fusion. The reviewer finds it rather misleading because it alludes that there is a parameter for complexin, Rim1, Rim-BP, Munc13-1 etc... The equations clearly cannot formulate and reflect diversity due to different isoforms of even the above mentioned key presynaptic molecules.
  
  We appreciate that the Reviewer found 7 different experimental papers – covering different synapses and different experimental setups – to be “a small subselection”. We believe that Fig. R1 above (response to Reviewer #1 point 2), which uses 16 different experimental papers, leaves no further doubts that the claims about the consistency between the theory and data are fully justified. Despite up to 10 orders of magnitude variation in the release rate of different synaptic preparations and more than 3 orders of magnitude range of calcium concentrations (Fig. R1, left), all the data collapse onto a universal curve predicted by our theory (Fig. R1, right). These data represent different systems – from the central nervous system to the secretory system – and come from in vivo and in vitro experiments. The data we have used cover the measurements on all synaptic systems that we could find in the literature on the action potential-driven neurotransmitter release. If the Reviewer is aware of any existing data on other synaptic systems that we might have missed, we will gratefully appreciate the opportunity to apply the theory to those data as well.
  
  The diversity of the molecular components in different synapses is captured in our theory through different values of the microscopic parameters Δ𝐺‡, 𝑄‡ and 𝑘₀. These parameters describe, respectively, the activation energy barrier, the number of bound Ca2+ ions, and the intrinsic rate of the conformational transition of the SNARE complexes that drive synaptic vesicle fusion in a given synapse. Different isoforms of the individual components of SNARE complexes and scaffold proteins, including the proteins mentioned by the Reviewer, will be reflected in different values of Δ𝐺‡, 𝑄‡ and 𝑘₀ for specific synaptic preparations, as can be seen in Appendix 3 Table 2 in the manuscript. These parameters capture the energetic and kinetic properties of the synaptic fusion machinery as a complex rather than as a collection of isolated molecules. Because the molecular components within a SNARE complex act collectively (hence the name “complex”) to drive vesicle fusion, it is natural (and indeed fortunate) that the predictive power of the theory can be preserved with only a few key parameters of the molecular machinery as opposed to requiring a long list of parameters for every specific isoform of each of the many individual molecular components.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.12.17.423159v2
www.biorxiv.org www.biorxiv.org

New submission 02/12/2022, 10:16:00

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors present a strong set of experiments to uncover what type of role non-mutant stromal cells might be playing in the development of VM and AST, two vascular lesions that share some similarities.
  
  Questions about experimental design.
  
  1) For quantification of gene expression in VM and AST specimens in Figure 2, the methods say qPCR data were normalized to housekeeping genes, but it would be helpful to normalize to endothelial content. It might be that increased TGFa is due to increased endothelium.
  
  We thank the Reviewer for this excellent suggestion. We have now added these new data as suggested, normalizing TGFA mRNA to the endothelial marker PECAM-1/CD31 mRNA. A trend towards increased expression of TGFA mRNA was detected in VM/AST specimens in comparison to the control group. We also show in the manuscript that, besides CD31-positive vascular structures, TGFA is expressed in intervascular areas, i.e. between the vessels, in the patients’ lesions (Fig. 2) and in lesion-derived CD31-negative intervascular stromal cells. Altogether, these data i) demonstrate that TGFA is also expressed in cell types other than endothelial cells and ii) indicate that the increased expression of TGFA in lesion samples is not only due to increased vasculature/endothelium in the patient samples.
  
  The new RT-qPCR data has now been added to the manuscript as a new Fig. 2 - figure supplement 1.
  
  2) The mutant allelic frequency for the HUVEC-PIK3CA WT versus HUVEC-PIK3CA H1047R should be provided. This is critically needed for the interpretation of the results.
  
  Thank you for this valuable comment. To confirm that PIK3CA H1047R is still present in transduced HUVECs at the end-point of the mouse xenograft experiment, we performed a new ddPCR analysis detecting the fractional abundance of PIK3CA p.H1047R in the matrigel plug samples. In these new data, the mean fractional abundance of PIK3CA p.H1047R in fibroblast-containing PIK3CA H1047R EC plugs was 27.1 % (variation 26.5-27.8 %; n=2 mice in duplicate). This corresponds to ~54 % PIK3CA p.H1047R mutation-positive cells in the plug, assuming that each mutation-positive cell carries a single mutant copy out of two PIK3CA alleles. In the control group, no positivity was detected in samples with fibroblasts and PIK3CAwt ECs, as all the cells express the wild-type form of the PIK3CA gene. Please see Author Response Image 1 for representative 2D amplification plots of the mutation analysis. Fractional abundances of PIK3CA mutations in the patient tissue samples and patient-derived CD31+ cells can also be seen in Table 1 and were in a range of 5-12 % (whole tissue) and 44-51 % (EC fraction).
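  The conversion from ddPCR fractional abundance to the share of mutation-positive cells can be written out explicitly. The following is a minimal sketch only: the function name and the `alleles_per_cell` and `mutant_copies_per_cell` parameters are hypothetical names introduced here for illustration, and the calculation assumes two PIK3CA alleles per cell with one mutant copy in each mutation-positive cell.

```python
def mutant_cell_fraction(fractional_abundance, alleles_per_cell=2, mutant_copies_per_cell=1):
    """Estimate the fraction of mutation-positive cells from a fractional abundance.

    fractional_abundance: mutant copies / (mutant + wild-type copies), between 0 and 1.
    alleles_per_cell and mutant_copies_per_cell are illustrative assumptions
    (diploid locus, heterozygous mutation), not values measured in the study.
    """
    if not 0.0 <= fractional_abundance <= 1.0:
        raise ValueError("fractional_abundance must lie between 0 and 1")
    fraction = fractional_abundance * alleles_per_cell / mutant_copies_per_cell
    # A cell fraction cannot exceed 100 %, so cap the estimate.
    return min(fraction, 1.0)


# Values reported in the response: mean 27.1 %, observed spread 26.5-27.8 %.
for label, value in [("low", 0.265), ("mean", 0.271), ("high", 0.278)]:
    print(f"{label}: {mutant_cell_fraction(value):.1%} mutation-positive cells")
```

  Under these assumptions the mean fractional abundance of 27.1 % doubles to roughly 54 %, matching the figure quoted above. If mutant cells carried more than one mutant copy, the estimated cell fraction would be correspondingly lower.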
  
  3) From Figure 5, it appears that the human primary fibroblasts are not required for the mutant ECs to form perfused vessels (panel H).
  
  We thank the Reviewer for the comment and agree that, based on our H&E staining and erythrocyte analysis, perfused vessels are evident in PIK3CA mutant plugs containing ECs with fibroblasts but also in plugs containing ECs alone. This was expected, as PIK3CA mutation in ECs alone has been shown to be a driver of venous malformation. However, prior to our study the role of fibroblasts in PIK3CA-driven lesions had not been studied. To better understand the role of fibroblasts in lesion formation, we have now added new data to the manuscript containing example images of the PIK3CA H1047R plugs with or without fibroblasts, and added a new quantitation of their erythrocyte amount. Please see Author Response Image 2. Our data demonstrate that there are significantly i) more CD31-positive vascular structures (Fig. 5E-G), ii) larger lumens (Fig. 5D-F) and iii) more erythrocyte-containing regions, indicative of perfused vessels (new Fig. 5H), in lesions with fibroblasts in comparison to plugs containing ECs alone. This implies that fibroblasts further induce PIK3CA-driven EC lesion formation.
  
  Author Response Image 2. Vascular structures formed with PIK3CA H1047R ECs alone and PIK3CA H1047R ECs + FBs in mouse xenograft plugs. In the figure panel, H&E staining on each individual plug in these groups is presented. Equal size close-up images were taken from the middle of each plug covering > 50% of plug area (scale bar 250µm). More erythrocytes (red) are seen in the plugs with fibroblasts in comparison to ECs alone. Scanned images of the H&E stained whole tissue sections can be seen in the Fig. 5 – source file.
  
  A new quantitative analysis of the erythrocyte-positive area in relation to the whole plug area was additionally performed using the SproutAngio quantification tool. The analysis was done in a blinded manner and showed a significantly increased erythrocyte amount in the plugs containing PIK3CA H1047R ECs and fibroblasts (in comparison to ECs alone). A description of the analysis has now been added to the manuscript (p. 42, rows 839-843). Figures 5G and 5H in the manuscript were updated to show statistics and automated intensity-based quantification of the erythrocyte-positive area per plug instead of erythrocyte scoring (scale 0-3).
  
  Is it possible that TGFa from the ECs is sufficient to drive vascular malformation?
  
  Mutations in genes such as PIK3CA, TEK and KRAS have been shown to drive the formation of vascular anomalies. Thus, it is unlikely that a single growth factor, such as increased expression of TGFA, would drive this process alone. That being said, our data show that TGFA is able to regulate the proliferation of PIK3CA-mutated ECs via a secondary mechanism (Fig. 4F), and we show that inhibition of the EGFR pathway is able to reduce PIK3CA-driven lesion growth in mice (Fig. 7). As our bulk RNA-sequencing data from patient-derived cells also showed expression of other growth factors in lesion ECs (Table 3), it is likely that multiple angiogenic growth factors are involved in lesion formation, similarly to tumors, and that their expression is driven primarily by mutated cells and secondarily by cell-cell crosstalk with other lesion cell types. Thus, targeting multiple signalling pathways could be a beneficial treatment strategy in the future.
  
  Reviewer #2 (Public Review):
  
  In this manuscript, Ilmonen H. et al explored potential crosstalk between endothelial cells and fibroblasts in a context of sporadic vascular malformation (venous malformation and angiomatoses of soft tissue). With a high level of evidence, they found that mutated endothelial cells secrete TGFA that will activate surrounding fibroblasts, leading in turn to VEGFA secretion that will stimulate endothelial cell sprouting and vascular malformation development. Experiments are well-designed and support their hypothesis. Some controls are missing, particularly in Fig. 2. Indeed, it is mandatory to provide data from healthy skin biopsies (that are available in many laboratories): TGFa, CD31, P-EGFR staining.
  
  We thank the Reviewer for the comments. Although VM commonly presents in skin, in this work we focused solely on intramuscular and subcutaneous AST and VM patient samples and excluded samples containing skin from this study. We performed TGFA immunostaining on healthy skeletal muscle, which can be seen in Figure 2 – figure supplement 2B. CD31 staining of vessels in healthy skeletal muscle near the resection margin can be seen in Figure 1B. Please also see below the tissue locations of all VM and AST samples in this study:
  
  • Intramuscular, 42.1 % of lesions (n=16)
  
  • Intramuscular and subcutaneous, 21.1 % of lesions (n=8)
  
  • Intramuscular, subcutaneous and synovial membrane, 5.3 % of lesions (n=2)
  
  • Intramuscular and synovial membrane, 2.6 % of lesions (n=1)
  
  • Subcutaneous and synovial membrane, 2.6 % of lesions (n=1)
  
  • Subcutaneous only, 26.3 % of lesions (n=10)
  
  • Skin, none of the lesions
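  As a quick consistency check, the percentages in the list above can be recomputed from the lesion counts. This is a sketch only; the dictionary keys simply restate the location labels, and the total is derived by summing the listed counts rather than taken from the manuscript.

```python
# Lesion counts per tissue location, as listed in the response.
lesion_counts = {
    "Intramuscular": 16,
    "Intramuscular and subcutaneous": 8,
    "Intramuscular, subcutaneous and synovial membrane": 2,
    "Intramuscular and synovial membrane": 1,
    "Subcutaneous and synovial membrane": 1,
    "Subcutaneous only": 10,
    "Skin": 0,
}

# Total number of lesions, derived from the counts themselves.
total_lesions = sum(lesion_counts.values())

# Percentage of lesions per location, rounded to one decimal place.
percentages = {
    location: round(100 * count / total_lesions, 1)
    for location, count in lesion_counts.items()
}

for location, pct in percentages.items():
    print(f"{location}: {pct} % (n={lesion_counts[location]})")
print(f"Total lesions: {total_lesions}")
```

  The recomputed values agree with the listed percentages, and the counts sum to 38 lesions.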
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.09.23.509204v1
www.biorxiv.org www.biorxiv.org

Differences in the gut microbiomes of distinct ethnicities within the same geographic area are linked to host metabolic health

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  The key question that the authors were addressing was how ethnicity differentially affects the microbiota of subjects living in a particular area (in this case East Asians and Caucasians living in San Francisco who have been enrolled in an 'Inflammation, Diabetes, Ethnicity and Obesity' cohort, although inflammatory disease was apparently excluded in these subjects).
  
  The existence of differences between different populations allows potential discrimination of the underlying factors - such as host genetics, diet, lifestyle, physiological parameters, body habitus or other environmental influences. In this case body habitus has been selected as a stratification factor between the two ethnicities. Immigration potentially allows distinction of environmental and host genetic influences.
  
  The strength of the study is in the level of robust analysis of the microbiotas by a very experienced group of researchers, distinguishing the microbiota differences, especially in lean subjects, with analysis of associations that may be driving the differences. It is interesting that diet is not one of the apparent associations in this study, yet the relationship of microbiota diversity to body habitus is strong in Caucasian subjects. These associations cannot easily be extrapolated to causation or mechanism - a fact well recognized in the paper - but remain important observations that rationalize in vivo modeling with experimental animals or in vitro analyses of microbial interactions between different taxa simulating the context of differences in the intestinal milieu. The paper includes work showing that differences of the microbiota can be recapitulated after transfer to germ-free mice, at least over the short term: this is important to provide tools to model the reasons for differences in consortial composition.
  
  A very large amount of work was required to assemble the samples and the clinical phenotypic metadata set, making the data an important and definitive contribution for the subjects studied. Of course, this is one sample of extremely variable human conditions and lifestyles that will help build the overall picture of how differences in our genetics and environment shape our intestinal microbiota.
  
  We appreciate the reviewer's positive summary of our manuscript and agree with the reviewer’s assessment of the need for both mechanistic follow-on studies and extensions to larger and more diverse cohorts.
  
  Reviewer #2 (Public Review):
  
  The study's primary aim is to test for differences in the microbiome between self-identified East Asian and White subjects from the San Francisco area in the new IDEO cohort. The study builds on a growing literature which describes variations among ethnic groups. The major conclusion, to "emphasize the utility of studying diverse ethnic groups", is not novel to the literature.
  
  It was not our intention to imply that our study is novel in studying two distinct ethnic groups, but rather to emphasize that differences exist between ethnicities with regard to the gut microbiome and to provide a systematic analysis of this including gnotobiotic mouse models along a key health disparity in Asian Americans. We include references of prior examples of this work in our introduction (including several references in our introductory paragraph). We have modified our abstract to clarify this point further:
  
  “Taken together, our findings add to the growing body of literature describing variation between ethnicities and provide a starting point for defining the mechanisms through which the microbiome may shape disparate health outcomes in East Asians.”
  
  Overall, the strength of the results is that they confirm patterns from different cohorts/studies and demonstrate that ethnic-related differences are common. The results are subject to sample size concerns that may underpin some of the conflicting or lack of significant results. For instance, there is no overlap in highlighted species-level taxonomy differences between 16S and metagenomic analyses, which precludes a clear interpretation of the meaning of those differences and whether taxa should be highlighted in the abstract; there are low AUC values for the random forest modelling; and there is a lack of significance in correlations between BMI and East Asian subjects in F4a where there may be a correlation. While a minor point, it serves to highlight the sample sizes as the range of the variation in East Asian subjects is not as substantial as the White subjects because there are fewer East Asian data points above a 30 BMI (~N=5) relative to those of White subjects (~N=11).
  
  We agree that our study was limited by sample size and that future studies increasing sample size would be valuable to assess the intersection of metabolic health in colocalized EA and W subjects. We include this in our discussion:
  
  “Due to the investment of resources into ensuring a high level of phenotypic information on each cohort member, and due to its restricted geographical catchment area, the IDEO cohort was relatively small at the time of this analysis (n=46 individuals). This study only focused on two of the major ethnicities in the San Francisco Bay Area; as IDEO continues to expand and diversify its membership, we hope to study a sufficient number of participants from other ethnic groups in the future.”
  
  The microbiome transfers from humans to mice also demonstrate that certain features of interpersonal or ethnic-related differences can be established in mice. This is useful for future studies, but it is not unexpected in and of itself given the robustness of transferring microbiome differences in other human-to-mouse studies. If the phenotype data were more compelling, then the utility of these transfers could be valuable.
  
  We respectfully disagree with this point. To our knowledge, this is the first study demonstrating that ethnicity-associated differences in the gut microbiota are stable following transplantation, which is certainly not guaranteed given the marked and currently unpredictable variations between donor and recipient microbiotas shown here and in prior studies by us (Nayak et al., 2021; Turnbaugh et al., 2009b) and others (Walter et al., 2020).
  
  We state this rationale in our results section:
  
  “Taken together, our results support the hypothesis that there are stable ethnicity-associated signatures within the gut microbiota of lean EA vs. W individuals that are independent of diet. To experimentally test this hypothesis, we transplanted the gut microbiotas of two representative lean W and lean EA individuals into germ-free male C57BL/6J mice…Next, we sought to assess the reproducibility of these findings across multiple donors and in the context of a distinctive dietary pressure. We fed 20 germ-free male mice a high-fat, high-sugar (HFHS) diet for 4 weeks prior to colonization with a gut microbiota from one of 5 W and 5 EA donors....”
  
  Furthermore, while the phenotypic data may not be as dramatic as the reviewer had hoped, this is to our knowledge the first demonstration that ethnicity-associated differences in the gut microbiota play a causal role in host phenotypes, as highlighted in our discussion:
  
  “Our results in humans and mouse models support the broad potential for downstream consequences of ethnicity-associated differences in the gut microbiome for metabolic syndrome and potentially other disease areas. However, the causal relationships and how they can be understood in the context of the broader differences in host phenotype between ethnicities require further study.”
  
  However, in the current state, I am concerned with the experimental design since the LFPP experiments used N=1 donor per ethnicity for establishing the mice colonies and are resultantly confounded by mice pseudo-replication with recipient mice derived from one donor of each ethnicity. This concern is relevant to interpreting results back to interpersonal or interethnic variation. Are phenotypic differences due to individual differences or ethnic differences? It's not clear.
  
  We presented our data in summary form integrating the results from 3 independent experiments across two figures. To account for pseudoreplication as the reviewer suggests, we have restricted permutational space to account for one donor for multiple recipient mice using the parameters outlined in the adonis software package. Analyzing our results from 3 separate experiments, our results are statistically significant, which we mention in the revised text:
  
  “In a pooled analysis of all gnotobiotic experiments accounting for one donor for multiple recipient mice, ethnicity and diet were both significantly associated with variations in the gut microbiota (Fig. S9), consistent with the extensive published data demonstrating the rapid and reproducible impact of a HFHS diet on the mouse and human gut microbiota (Bisanz et al., 2019).”
  
  Figure S9. Combined analysis of recipient mice reveals significant associations with donor ethnicity and recipient diet. A PhILR PCoA is plotted based on 16S-Seq data from all gnotobiotic experiments. Individual mice are colored by (A) donor ethnicity or (B) the recipient’s diet. Both ethnicity and diet were statistically significant contributors to variance (ADONIS p-values and estimated variance displayed using blocks restricted by donor identifiers to account for one donor going to multiple recipient mice). We also observed a trend for interaction between diet and ethnicity in this model (p=0.068, R2=0.047, ADONIS).
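  The idea of accounting for one donor going to multiple recipient mice can be illustrated with a donor-level permutation test. The sketch below is an assumption-laden illustration and not the authors' adonis analysis: it computes a PERMANOVA-style pseudo-F from a distance matrix and shuffles ethnicity labels across donors, so that all mice colonized from the same donor always share a label. The function names, the toy coordinates and the choice to permute whole donors are hypothetical; the block-restricted permutations used in the response may be configured differently.

```python
import math
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """PERMANOVA-style pseudo-F for a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    # Assumes ss_within > 0, i.e. samples within a group are not all identical.
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def donor_level_permutation_test(dist, donor_of_mouse, ethnicity_of_donor, n_perm=999, seed=0):
    """Permute ethnicity labels across donors, never across mice of the same donor."""
    rng = random.Random(seed)
    donors = sorted(ethnicity_of_donor)
    f_obs = pseudo_f(dist, [ethnicity_of_donor[d] for d in donor_of_mouse])
    donor_labels = [ethnicity_of_donor[d] for d in donors]
    n_extreme = 0
    for _ in range(n_perm):
        shuffled = donor_labels[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(donors, shuffled))
        f_perm = pseudo_f(dist, [relabel[d] for d in donor_of_mouse])
        if f_perm >= f_obs or math.isclose(f_perm, f_obs):
            n_extreme += 1
    return f_obs, (n_extreme + 1) / (n_perm + 1)


# Toy data (hypothetical): 5 EA and 5 W donors, 2 recipient mice per donor,
# with each mouse placed in a 2D ordination space.
donors = [f"EA{i}" for i in range(5)] + [f"W{i}" for i in range(5)]
ethnicity_of_donor = {d: ("EA" if d.startswith("EA") else "W") for d in donors}
donor_of_mouse = [d for d in donors for _ in range(2)]
points = []
for d in donors:
    i = int(d[-1])
    centre = 0.0 if ethnicity_of_donor[d] == "EA" else 5.0
    for k in range(2):
        points.append((centre + 0.1 * i, 0.1 * k + 0.05 * i))
dist = [[math.dist(p, q) for q in points] for p in points]

f_obs, p_value = donor_level_permutation_test(dist, donor_of_mouse, ethnicity_of_donor)
print(f"pseudo-F = {f_obs:.1f}, permutation p = {p_value:.3f}")
```

  Because labels move only between donors, the number of distinct relabelings is set by the number of donors rather than the number of mice, which is why designs with a single donor per ethnicity cannot reach small p-values under this scheme.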
  
  The HFHS experiment also used N=5 donors, which somewhat mitigates these concerns, but mixed sexes were used here and there can be sex-specific human microbiome differences.
  
  Our study was designed to evaluate ethnicity and metabolic health. As we report in our original and updated analysis, we found no significant associations between the gut microbiota and biological sex (Figs. 2E and S4) in the IDEO cohort, perhaps due to the small effect size of sex reported in prior studies by other groups (Arumugam et al., 2011; Ding and Schloss, 2014; Schnorr et al., 2014; Zhang et al., 2021) coupled to the limited size of the current IDEO cohort.
  
  The Turnbaugh and Koliwad labs use mixed sexes as donors for studies in conventionally raised and gnotobiotic mice due to our active funding from the NIH, which has clear guidelines meant to prevent continued discrimination against studies in females. The following link has additional information for your consideration: https://orwh.od.nih.gov/sex-gender/nih-policy-sex-biological-variable.
  
  Importantly, our study was not confounded by sex due to the use of similar numbers of male and female donors (2 males and 2 females in the LFPP experiments and 3 females and 2 males for both ethnicities in the HFHS experiment). All of our recipient mice were male, as specified in our methods section and our revised main text:
  
  “To experimentally test this hypothesis, we transplanted the gut microbiotas of two representative lean W and lean EA individuals into germ-free male C57BL/6J mice…Next, we sought to assess the reproducibility of these findings across multiple donors and in the context of a distinctive dietary pressure. We fed 20 germ-free male mice a high-fat, high-sugar (HFHS) diet for 4 weeks prior to colonization with a gut microbiota from one of 5 W and 5 EA donors....”
  
  To further investigate any potential sex-specific signal, we have stratified our analysis for the HFHS experiment by the sex of the donors (Reviewer Figure 2). This reveals that the significance between ethnicities in the microbiota transplantation experiments is preserved in mice that received stool from male donors (Reviewer Fig. 2A) but not female donors (Reviewer Fig. 2B). In Reviewer Fig. 1 above, LFPP1 and LFPP2 were conducted using different donors of different biological sex. Splitting our LFPP experiments up revealed the consistent signal for ethnicity in microbial community composition that we report above. The small sample sizes in this stratified analysis make it difficult to conclude that there are reproducible sex-specific differences in the microbiome transplant experiments, but we agree with the reviewer that this question should be more thoroughly explored in future work.
  
  We have added a brief note to the discussion to emphasize this important point:
  
  “...differences between the human donor and recipient mouse microbiotas inherent to gnotobiotic transplantation warrant further investigation as do differences in the stability of the gut microbiotas of male versus female donors”
  
  Reviewer Figure 2. (A,B) Principal coordinate analysis of PhILR Euclidean distances of stool from germ-free recipient mice transplanted with stool microbial communities from (A) male (n=2 EA and n=2 W donors) or (B) female (n=3 EA and n=3 W) donors of either ethnicity and fed a HFHS diet. Significance was assessed by ADONIS. Pairs of germ-free mice receiving the same donor sample are connected by a dashed line (n=2 recipient mice per donor). Experimental designs are shown in Fig. S7.
  
  Finally, experimental results are not always consistent and sometimes show opposite trends that may be related to the sampling sizes. For instance, fat and lean mass increased and decreased respectively in LFPP, but there were no statistically-similar differences in HFHS. Moreover, the metabolic fat mass outcomes in mice do not match the expected human donor data. For instance, in LFPP1, White subjects had lower fat mass in humans but recipient mice on average gained more fat. It is difficult to reconcile these differences to a biological or sampling scheme reason.
  
  We wholeheartedly agree with this point and were also surprised that the recipient mouse phenotypes did not match our original hypothesis based upon the observed health disparities between EA and W individuals. These surprising and perhaps counter-intuitive results demand further study and mechanistic dissection. We have tried to capture potential explanations for these findings while highlighting the limitations of our current study in our expanded discussion. With respect to the glucose tolerance data, the lack of a microbiome-driven phenotype might be due to the use of genetically identical mice that are not prone to metabolic illness without significant perturbation. If we had used mice prone to metabolic disease, such as non-obese diabetic (NOD) germ-free recipient mice, in which the microbiome is known to impact the development of diabetes, we might have seen between-ethnicity differences in glucose tolerance.
  
  Our revised discussion, with key points underlined, is copied below for your convenience:
  
  “Our results in humans and mouse models support the broad potential for downstream consequences of ethnicity-associated differences in the gut microbiome for metabolic syndrome and potentially other disease areas. However, the causal relationships and how they can be understood in the context of the broader differences in host phenotype between ethnicities require further study. While these data are consistent with our general hypothesis that ethnicity-associated differences in the gut microbiome are a source of differences in host metabolic disease risk, we were surprised by both the nature of the microbiome shifts and their directionality. Based upon observations in the IDEO (Alba et al., 2018) and other cohorts (Gu et al., 2006; Zheng et al., 2011), we anticipated that the gut microbiomes of lean EA individuals would promote obesity or other features of metabolic syndrome. In humans, we did find multiple signals that have been previously linked to obesity and its associated metabolic diseases in EA individuals, including increased Firmicutes (Basolo et al., 2020; Bisanz et al., 2019), decreased A. muciniphila (Depommier et al., 2019; Plovier et al., 2017), decreased diversity (Turnbaugh et al., 2009a), and increased acetate (Perry et al., 2016; Turnbaugh et al., 2006). Yet EA subjects also had higher levels of Bacteroidota and Bacteroides, which have been linked to improved metabolic health (Johnson et al., 2017). More importantly, our microbiome transplantations demonstrated that the recipients of the lean EA gut microbiome had less body fat despite consuming the same diet. These seemingly contradictory findings may suggest that the recipient mice lost some of the microbial features of ethnicity relevant to host metabolic disease or alternatively that the microbiome acts in a beneficial manner to counteract other ethnicity-associated factors driving disease.
  
  EA subjects also had elevated levels of the short-chain fatty acids propionate and isobutyrate. The consequences of elevated intestinal propionate levels are unclear given the seemingly conflicting evidence in the literature that propionate may either exacerbate (Tirosh et al., 2019) or protect from (Lu et al., 2016) aspects of metabolic syndrome. Clinical data suggests that circulating propionate may be more relevant for disease than fecal levels (Müller et al., 2019), emphasizing the importance of considering both the specific microbial metabolites produced, their intestinal absorption, and their distribution throughout the body. Isobutyrate is even less well-characterized, with prior links to dietary intake (Berding and Donovan, 2018) but no association with obesity (Kim et al., 2019). Unlike SCFAs, we did not identify consistent differences in BCAAs, potentially due to differences in both extraction and standardization techniques inherent to GC-MS and NMR analysis (Cai et al., 2016; Lynch and Adams, 2014; Qin et al., 2012).
  
  There are multiple limitations of this study. Due to the investment of resources into ensuring a high level of phenotypic information on each cohort member coupled to the restricted geographical catchment area, the IDEO cohort was relatively small at the time of this analysis (n=46 individuals). The current study only focused on two of the major ethnicities in the San Francisco Bay Area. As IDEO continues to expand and diversify its membership, we hope to study a sufficient number of participants from other ethnic groups. Stool samples were collected at a single time point and analyzed in a cross-sectional manner. While we used validated tools from the field of nutrition to monitor dietary intake, we cannot fully exclude subtle dietary differences between ethnicities (Johnson et al., 2019), which could be interrogated through controlled feeding studies (Basolo et al., 2020). Our mouse experiments were all performed in wild-type adult males. The use of a microbiome-dependent transgenic mouse model of diabetes (Brown et al., 2016) would be useful to test the effects of inter-ethnic differences in the microbiome on insulin and glucose tolerance. Additional experiments are warranted using the same donor inocula to colonize germ-free mice prior to concomitant feeding of multiple diets, allowing a more explicit test of the hypothesis that diet can disrupt ethnicity-associated microbial signatures. These studies, coupled to controlled experimentation with individual strains or more complex synthetic communities, would help to elucidate the mechanisms responsible for ethnicity-associated changes in host physiology and their relevance to disease.”
  
  Reviewer #3 (Public Review):
  
  The authors aimed to characterise how the gut microbiota differs between ethnic groups in terms of bacterial richness and community structure. They also wanted to address how this is associated with ethnic group within a defined geographical location. They started their story by comparing the fecal microbiota of a relatively small cohort consisting of 46 lean and obese East Asian and White participants living in the San Francisco Bay Area. For that purpose they used 16S and shotgun metagenomics. They demonstrated that ethnicity-associated differences in the gut microbiota are stronger in lean individuals, whereas obese individuals did not show a clear difference in the gut microbiota profile between ethnic groups, suggesting either that established obesity or its associated dietary patterns can overwrite long-lasting microbial signatures or, alternatively, that there is a shared ethnicity-independent microbiome type that predisposes individuals to obesity. The authors also showed metabolic differences between these ethnic groups; the major differences were in the branched-chain amino acids and the short-chain fatty acids. To prove their point, at this stage they also used a different metabolomic methodology. Although some aspects of the work are not very novel, the work does provide additional insights into the effect(s) of ethnicity, current living location and diet on shaping the microbiota. Honestly, while reading through the manuscript, I had several questions where I believed that clarification was needed. But somehow, I felt like the authors had been reading my mind every step of the way. At the end of each section, whatever I questioned was addressed in the next paragraph. There are, however, a few points on which I would like to hear the authors' clarification.
  
  The authors pursued the story using 16S data. However, they have shotgun metagenomics data, which give more power and resolution to the microbiota profile. Is there any specific reason why the story was not built with the shotgun metagenomic data? In any case, it would be nice to specify in the text or legend exactly which dataset each figure was built with.
  
  As discussed above, 16S rRNA gene and metagenomic sequencing both have strengths and weaknesses. For example, 16S-seq is inexpensive and allows analysis of low abundance species, whereas metagenomics permits analysis of gene and pathway abundances of abundant taxa. As requested, we have now expanded Figure 2 (metagenomics) to better match Figure 1 (16S-seq). The type of technology is defined within each legend and the relevant text within our results.
  
  Even though the authors mention in the discussion that they have not used the same inocula from a donor across different diets, it would be nice if the authors further commented on whether they would expect the same or slightly different results with each different inoculum.
  
  As requested, we have modified the text in our discussion to include these comments:
  
  “Additional experiments are warranted using the same donor inocula to colonize germ-free mice prior to concomitant feeding of multiple diets, allowing a more explicit test of the hypothesis that diet can disrupt ethnicity-associated microbial signatures. These studies, coupled to controlled experimentation with individual strains or more complex synthetic communities, would help to elucidate the mechanisms responsible for ethnicity-associated changes in host physiology and their relevance to disease.”
  
  Overall, the study is well executed and claims and conclusions seem relatively well justified by the provided evidence. The findings are interesting for a broad audience of biologists.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.10.23.352807v1
www.biorxiv.org

Slow oscillation-spindle coupling strength predicts real-life gross-motor learning in adolescents and adults

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  Overall, the authors have done a nice job covering the relevant literature, presenting a story out of complicated data, and performing many thoughtful analyses.
  
  However, I believe the paper requires quite major revisions.
  
  We thank the reviewer for their encouraging assessment of our manuscript. We are grateful for their valuable and especially detailed feedback that helped us to substantially improve our manuscript.
  
  Major issues:
  
  I do not believe the current results present a clear, comprehensible story about sleep and motor memory consolidation. As presented, sleep predicts an increase in the subsequent learning curve, but there is a negative relationship between learning curve and task proficiency change (which is, as far as I can tell, similar to "memory retention"). This makes it seem as if sleep predicts more forgetting on initial trials within the subsequent block (or worse memory retention) - is this true? Regardless of whether it is statistically true, there appears another story in these data that is being sacrificed to fit a story about sleep. To my eye, the results may first and foremost tell a circadian (rather than sleep) story. Examining the data in Figure 2A and 2B, it appears that every AM learning period has a higher learning curve (slope) than every PM period. While this could, of course, be due to having just slept, the main story gleaned from such a result is not a sleep effect on retention, which has been the emphasis on motor memory consolidation research in the last couple of decades, but on new learning. The fact that this effect appears present in the first session (juggling blocks 1-3 in adolescents and blocks 1-5 in adults) makes this seem the more likely story here, since it has less to do with "preparing one to re-learn" and more to do with just learning and when that learning is optimal. But even if it does not reach statistical significance in the first session alone, it remains a concern and, in my opinion, should be considered a focus in the manuscript unless the authors can devise a reason to definitively rule it out.
  
  Here is how I recommend the authors proceed on this point: include all sessions from all subjects into a mixed effect model, predicting the slope of the learning curve with time of day and age group as fixed effects and subjects as random effects:
  
  learning curve slope ~ AM/PM [AM (0) or PM (1)] + age [adolescent (0) or adult (1)] + (1|subject)
  
  …or something similar with other regressors of interest. If this is significant for AM/PM status, they should re-try the analysis using only the first session. If this is significant, then a sleep-centric story cannot be defended here at all, in my opinion. If it is not (which could simply result from low power, but the authors could decide this), the authors should decide if they think they can rule out circadian effects and proceed accordingly. I should note that, while to many, a sleep story would be more interesting or compelling, that is not my opinion, and I would not solely opt to reject this paper if it centered a time-of-day story instead.
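  Written out, the random-intercept model proposed above could take the following form. This is only a sketch: the symbols are illustrative notation introduced here (i indexes subjects, j indexes sessions) and are not taken from the manuscript.

```latex
\text{slope}_{ij} = \beta_0 + \beta_1\,\text{PM}_{ij} + \beta_2\,\text{Adult}_{i} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

  Here PM is 0 for morning and 1 for evening sessions, Adult is 0 for adolescents and 1 for adults, and u_i is the per-subject random intercept. Under this coding, a time-of-day effect would show up as a non-zero estimate of \beta_1.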
  
  The authors need to work out precisely what is happening in the behavior here, and let the physiology follow that story. They should allow themselves to consider very major revisions (and drop the physiology) if that is most consistent with the data. As presented, I am very unclear of what to take away from the study.
  
  We thank the reviewer for the opportunity to further elaborate on our behavioral results. We agree that the interpretation of the behavior in the complex gross-motor task is not straightforward, which might be partly due to less controllability compared to, for example, finger-tapping tasks. The reviewer is correct that, initially, sleep seems to predict more forgetting on initial trials within the subsequent block, given the dip in task proficiency and the resulting increase in steepness of the learning curve after the sleep retention interval. Notably, this dip in performance after sleep has also been reported for finger-tapping tasks (cf. Eichenlaub et al, 2020). The performance dip is also present in the wake-first group (Figure 2) after the first interval. This observation suggests that picking up the task again after a period of time comes at a cost. Interestingly, this performance dip is no longer present after the second retention interval, indicating that the better the task proficiency, the easier it is to pick up juggling again. In other words, juggling has been better consolidated after additional training. Critically, our results show that participants with higher SO-spindle coupling strength have a smaller dip in performance after the retention interval, thus indicating a learning advantage.
  
  Figure 2
  
  (A) Number of successful three-ball cascades (mean ± standard error of the mean [SEM]) of adolescents (circles) for the sleep-first (blue) and wake-first group (green) per juggling block. Grand average learning curve (black lines) as computed in (C) are superimposed. Dashed lines indicate the timing of the respective retention intervals that separate the three performance tests. Note that adolescents improve their juggling performance across the blocks. (B) Same conventions as in (A) but for adults (diamonds). Similar to adolescents, adults improve their juggling performance across the blocks regardless of group.
  
  We discuss the sleep effect on juggling in the discussion section (page 22 – 23, lines 502 – 514):
  
  "How relevant is sleep for real-life gross-motor memory consolidation? We found that sleep impacts the learning curve but did not affect task proficiency in comparison to a wake retention interval (Figure 2DE). Two accounts might explain the absence of a sleep effect on task proficiency. (1) Sleep rather stabilizes than improves gross-motor memory, which is in line with previous gross-motor adaption studies (Bothe et al, 2019; Bothe et al, 2020). (2) Pre-sleep performance is critical for sleep to improve motor skills (Wilhelm et al, 2012). Participants commonly reach asymptotic pre-sleep performance levels in finger tapping tasks, which is most frequently used to probe sleep effects on motor memory. Here we found that using a complex juggling task, participants do not reach asymptotic ceiling performance levels in such a short time. Indeed, the learning progression for the sleep-first and wake-first groups followed a similar trend (Figure 2AB), suggesting that more training and not in particular sleep drove performance gains."
  
  If indeed the authors keep the sleep aspect of this story, here are some comments regarding the physiology. The authors present several nice analyses in Figure 3. However, given the lack of behavioral difference between adolescents and adults (Fig 2D), they combine the groups when investigating behavior-physiology relationships. In some ways, then, Figure 3 has extraneous details to the point of motor learning and retention, and I believe the paper would benefit from more focus. If the authors keep their sleep story, I believe Figure 3 and 4 should be combined and some current figure panels in Figure 3 should be removed or moved to the supplementary information.
  
  We thank the reviewers for their suggestion and we agree that the figures of our manuscript would benefit from more focus. Therefore, we combined Figure 3 and 4 from the original manuscript into a revised Figure 3 in the updated version of the manuscript. In more detail, subpanels that explain our methodological approach can now be found in Figure 3 – figure supplement 1, while the updated Figure 3 now focuses on developmental changes in oscillatory dynamics and SO-spindle coupling strength as well as their relationship to gross-motor learning.
  
  Updated Figure 3:
  
  (A) Left: topographical distribution of the 1/f corrected SO and spindle amplitude as extracted from the oscillatory residual (Figure 3 – figure supplement 1A, right). Note that adolescents and adults both display the expected topographical distribution of more pronounced frontal SO and centro-parietal spindles. Right: single subject data of the oscillatory residual for all subjects with sleep data color coded by age (darker colors indicate older subjects). SO and spindle frequency ranges are indicated by the dashed boxes. Importantly, subjects displayed high inter-individual variability in the sleep spindle range and a gradual spindle frequency increase by age that is critically underestimated by the group average of the oscillatory residuals (Figure 3 – figure supplement 1A, right). (B) Spindle peak locked epoch (NREM3, co-occurrence corrected) grand averages (mean ± SEM) for adolescents (red) and adults (black). Inset depicts the corresponding SO-filtered (2 Hz lowpass) signal. Grey-shaded areas indicate significant clusters. Note, we found no difference in amplitude after normalization. Significant differences are due to more precise SO-spindle coupling in adults. (C) Top: comparison of SO-spindle coupling strength between adolescents and adults. Adults displayed more precise coupling than adolescents in a centro-parietal cluster. T-scores are transformed to z-scores. Asterisks denote cluster-corrected two-sided p < 0.05. Bottom: Exemplary depiction of coupling strength (mean ± SEM) for adolescents (red) and adults (black) with single subject data points. Exemplary single electrode data (bottom) is shown for C4 instead of Cz to visualize the difference. (D) Cluster-corrected correlations between individual coupling strength and overnight task proficiency change (post – pre retention) for adolescents (red, circle) and adults (black, diamond) of the sleep-first group (left, data at C4). Asterisks indicate cluster-corrected two-sided p < 0.05. Grey-shaded area indicates 95% confidence intervals of the trend line. Participants with a more precise SO-spindle coordination show improved task proficiency after sleep. Note that the change in task proficiency was inversely related to the change in learning curve (cf. Figure 2D), indicating that a stronger improvement in task proficiency related to a flattening of the learning curve. Further note that the significant cluster formed over electrodes close to motor areas. (E) Cluster-corrected correlations between individual coupling strength and overnight learning curve change. Same conventions as in (D). Participants with more precise SO-spindle coupling over C4 showed attenuated learning curves after sleep.
  
  and
  
  Figure 3 - figure supplement 1
  
  (A) Left: Z-normalized EEG power spectra (mean ± SEM) for adolescents (red) and adults (black) during NREM sleep in semi-log space. Data is displayed for the representative electrode Cz unless specified otherwise. Note the overall power difference between adolescents and adults due to a broadband shift on the y-axis. Straight black line denotes cluster-corrected significant differences. Middle: 1/f fractal component that underlies the broadband shift. Right: Oscillatory residual after subtracting the fractal component (A, middle) from the power spectrum (A, left). Both groups show clearly delineated peaks in the SO (< 2 Hz) and spindle range (11 – 16 Hz), establishing the presence of the cardinal sleep oscillations in the signal. (B) Top: Spindle frequency peak development based on the oscillatory residuals. Spindle frequency is faster at all but occipital electrodes in adults than in adolescents. T-scores are transformed to z-scores. Asterisks denote cluster-corrected two-sided p < 0.05. Bottom: Exemplary depiction of the spindle frequency (mean ± SEM) for adolescents (red) and adults (black) with single subject data points at Cz. (C) SO-spindle co-occurrence rate (mean ± SEM) for adolescents (red) and adults (black) during NREM2 and NREM3 sleep. Event co-occurrence is higher in NREM3 (F(1, 51) = 1209.09, p < 0.001, partial eta² = 0.96) as well as in adults (F(1, 51) = 11.35, p = 0.001, partial eta² = 0.18). (D) Histogram of co-occurring SO-spindle events in NREM2 (blue) and NREM3 (purple) collapsed across all subjects and electrodes. Note the low co-occurring event count in NREM2 sleep. (E) Single subject (top) and group averages (bottom, mean ± SEM) for adolescents (red) and adults (black) of individually detected sleep spindles in NREM3, corrected for SO co-occurrence. Spindles were detected based on the information of the oscillatory residual. Note the underlying SO-component (grey) in the spindle detection for single subject data and group averages, indicating a spindle amplitude modulation depending on SO-phase. (F) Grand average time frequency plots (-2 to -1.5s baseline-corrected) of SO-trough-locked segments (corrected for spindle co-occurrence) in NREM3 for adolescents (left) and adults (right). Schematic SO is plotted superimposed in grey. Note the alternating power pattern in the spindle frequency range, showing that SO-phase modulates spindle activity in both age groups.
  
  Why did the authors use Spearman rather than Pearson correlations in Figure 4? Was it to reduce the influence of the outlier subject? They should minimally clarify and justify this, since it is less conventional in this line of research. And it would be useful to know if the relationship is significant with Pearson correlations when robust regression is applied. I see the authors are using MATLAB, and the robustfit function (https://www.mathworks.com/help/stats/robustfit.html) is a simple way to address this issue.
  
  We thank the reviewers for their suggestion. We agree that, when inspecting the scatter plots, it looks as if the correlations could be severely influenced by two outliers in the adult group. Because this is an important matter, we recalculated all previously reported correlations without the two outliers (Figure R4, left column) and followed the reviewer’s suggestion to also compute robust regression (Figure R4, right column), and found no substantial deviation from our original results.
  
  In more detail, an increase in task proficiency resulted in a flattening of the learning curve when removing outliers (Figure R4A, rhos = -0.70, p < 0.001) and when applying robust regression analysis (Figure R4B, b = -0.30, t(67) = -10.89, rho = -0.80, p < 0.001). Likewise, higher coupling strength still predicted better task proficiency (mean rho = 0.35, p = 0.029, cluster-corrected) and flatter learning curves after sleep (rho = -0.44, p = 0.047, cluster-corrected) when removing the outliers (Figure R4CE) and when calculating robust regression (Figure R4DF, task proficiency: b = 82.32, t(40) = 3.12, rho = 0.45, p = 0.003; learning curve: b = -26.84, t(40) = -2.96, rho = -0.43, p = 0.005). Furthermore, in our original manuscript we calculated Spearman rank correlations and cluster-corrected Spearman rank correlations specifically to mitigate the impact of outliers, even though Pearson correlations are more widely used in the field. Therefore, we still report Spearman rank correlations for single electrodes instead of robust correlations, as this is more consistent with the cluster-correlation analyses.
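  To illustrate why rank correlations are less sensitive to outliers than Pearson correlations, the following minimal sketch compares the two on synthetic values that include one extreme point. The function names and the data are hypothetical and chosen only for illustration. The conversion to a t-value uses the standard formula for a correlation coefficient, which is assumed here to be the kind of transform applied before cluster correction; it is not the authors' MATLAB code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rho_to_t(rho, n):
    """t statistic (n - 2 degrees of freedom) for a correlation coefficient."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


# Synthetic values only (not study data): the last point is an extreme outlier.
coupling = [1, 2, 3, 4, 5, 6]
change = [2, 1, 4, 3, 6, 50]
print(pearson(coupling, change), spearman(coupling, change))
```

  Because the outlier only contributes its rank, it cannot dominate the Spearman estimate the way it dominates the Pearson estimate.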
  
  We now use robust trend lines instead of linear trend lines in our scatter plots. Further, we added the correlations without outliers (Figure R4ACE) to the supplements as Figure 2 – figure supplement 1D and Figure 3 – figure supplement 2 FG. These additional analyses are now reported in the results section of the revised manuscript (page 9, lines 186 – 191):
  
  "[…] we confirmed a strong negative correlation between the change (post retention values – pre retention values) in task proficiency and the change in learning curve after the retention interval (Figure 2F; rhos = -0.71, p < 0.001), which also remained strong after outlier removal (Figure 2 – figure supplement 1D). This result indicates that participants who consolidate their juggling performance after a retention interval show slower gains in performance."
  
  And (page 16, lines 343 – 346):
  
  "[…] Furthermore, our results remained consistent when including coupled spindle events in NREM2 (Figure 3 – figure supplement 2E) and after outlier removal (Figure 3 – figure supplement 2FG)."
  
  Furthermore, we now state in the method section that we specifically utilized Spearman rank correlations to mitigate the impact of outliers in our analyses (page 35, lines 808 – 813):
  
  "For correlational analyses we utilized spearman rank correlations (rhos; Figure 2F & Figure 3DE) to mitigate the impact of possible outliers as well as cluster-corrected spearman rank correlations by transforming the correlation coefficients to t-values (p < 0.05) and clustering in the space domain (Figure 3DE). Linear trend lines were calculated using robust regression."
  
  Figure R4
  
  (A) Spearman rank correlation between task proficiency change and learning curve change collapsed across adolescents (red dot) and adults (black diamonds) after removing two outlier subjects in the adult age group. Grey-shaded area indicates 95% confidence intervals of the robust trend line. (B) Robust regression of task proficiency change and learning curve change of the original sample. (C) Cluster-corrected correlations (right) between individual coupling strength and overnight task proficiency change (post – pre retention) after outlier removal (left, spearman correlation at C4, uncorrected). Asterisks indicate cluster-corrected two-sided p < 0.05. (D) Robust regression of coupling strength at C4 and task proficiency of the original sample. (E) Same conventions as in (C) but for overnight learning curve change. (F) Same conventions as in (D) but for overnight learning curve change.
  
  Additionally, with only a single night of recording data, it is impossible to disentangle possible trait-based sleep characteristics (e.g., Subject 1 has high SO-spindle coupling in general and retains motor memories well, but these are independent of each other) from a specific, state-based account (e.g., Subject 1's high SO-spindle coupling on night 1 specifically led to their improved retention or change in learning, etc., and this is unrelated to their general SO-spindle coupling or motor performance abilities). Clearly, many studies face this limitation, but this should be acknowledged.
  
  We thank the reviewers for their important remark. We agree that it is impossible to make a sound statement about whether our reported correlations represent trait- or state-based aspects of the sleep and learning relationship with the data that we have reported in the manuscript. However, while we are lacking a proper baseline condition without any task engagement, we still recorded polysomnography for all subjects during an adaptation night. Given the expected pronounced differences in sleep architecture between the adaptation nights and learning nights (see Table R3 for an overview collapsed across both age groups), we initially refrained from entering data from the adaptation nights into our original analyses, but we now fully report the data below. Note that the differences are driven by the adaptation night, where subjects first have to adjust to sleeping with attached EEG electrodes in a sleep laboratory.
  
  Table R3. Sleep architecture (mean ± standard deviation) for the adaptation and learning night collapsed across both age groups. Nights were compared using paired t-tests
  
  To further clarify whether subjects with high coupling strength have a motor learning advantage (i.e. trait-effect) or whether a learning-induced enhancement of coupling strength is indicative of improved overnight memory change (i.e. state-effect), we ran additional analyses using the data from the adaptation night. Note that the coupling strength metric was not impacted by differences in event number and our correlations with behavior were not influenced by sleep architecture (please refer to our answer to issue #7 for the results). Therefore, we considered it appropriate to also utilize data from the adaptation night.
  
  First, we correlated SO-spindle coupling strength obtained from the adaptation night with the coupling strength in the learning night. We found that overall, coupling strength is highly correlated between the two measurements (mean rho across all channels = 0.55, Figure R5A), supporting the notion that coupling strength remains rather stable within the individual (i.e. trait), similar to what has been reported about the stable nature of sleep spindles as a “neural finger-print” (De Gennaro & Ferrara, 2003; De Gennaro et al, 2005; Purcell et al, 2017).
  
  To investigate a possible state-effect for coupling strength and motor learning, we calculated the difference in coupling strength between the two nights (learning night – adaptation night) and correlated these values with the overnight change in task proficiency and learning curve. We identified no significant correlations with a learning induced coupling strength change; neither for task proficiency nor learning curve change (Figure R5B). Note that there was a positive correlation of coupling strength change with overnight task proficiency change at Cz (Figure R5B, left), however it did not survive cluster-corrected correlational analysis (rhos = 0.34, p = 0.15). Combined, these results favor the conclusion that our correlations between coupling strength and learning rather reflect a trait-like relationship than a state-like relationship. This is in line with the interpretation of our previous studies that SO-spindle coupling strength reflects the efficiency and integrity of the neuronal pathway between neocortex and hippocampus that is paramount for memory networks and the information transfer during sleep (Hahn et al, 2020; Helfrich et al, 2019; Helfrich et al, 2018; Winer et al, 2019). For a comprehensive review please see Helfrich et al (2021), which argued that SO-spindle coupling predicts the integrity of memory pathways and therefore correlates with various metrics of behavioral performance or structural integrity.
  
  Figure R5
  
  (A) Topographical plot of spearman rank correlations of coupling strength in the adaptation night and learning night across all subjects. Overall coupling strength was highly correlated between the two measurements. (B) Cluster-corrected correlation between learning induced coupling strength changes (learning night – adaptation night) and overnight change in task proficiency (left) as well as learning curve (right). We found no significant clusters, although correlations showed similar trends as our original analyses, with more learning induced changes in coupling strength resulting in better overnight task proficiency and flattened learning curves.
  
  We have now added the additional state-trait analyses (Figure R5) to the updated manuscript as Figure 3 – figure supplement 2HI and report them in the results section (page 17, lines 361 – 375):
  
  "Finally, we investigated whether subjects with high coupling strength have a gross-motor learning advantage (i.e. trait-effect) or a learning induced enhancement of coupling strength is indicative for improved overnight memory change (i.e. state-effect). First, we correlated SO-spindle coupling strength obtained from the adaptation night with the coupling strength in the learning night. We found that overall, coupling strength is highly correlated between the two measurements (mean rho across all channels = 0.55, Figure 3 – figure supplement 2H), supporting the notion that coupling strength remains rather stable within the individual (i.e. trait). Second, we calculated the difference in coupling strength between the learning night and the adaptation night to investigate a possible state-effect. We found no significant cluster-corrected correlations between coupling strength change and task proficiency- as well as learning curve change (Figure 3 – figure supplement 2I).
  
  Collectively, these results indicate the regionally specific SO-spindle coupling over central EEG sensors encompassing sensorimotor areas precisely indexes learning of a challenging motor task."
  
  We further refer to these new results in the discussion section (page 23, lines 521 – 528):
  
  "Moreover, we found that SO-spindle coupling strength remains remarkably stable between two nights, which also explains why a learning-induced change in coupling strength did not relate to behavior (Figure 3 – figure supplement 2I). Thus, our results primarily suggest that strength of SO-spindle coupling correlates with the ability to learn (trait), but does not solely convey the recently learned information. This set of findings is in line with recent ideas that strong coupling indexes individuals with highly efficient subcortical-cortical network communication (Helfrich et al, 2021)."
  
  Additionally, we now provide descriptive data of the adaptation and learning night (Table R3) in the Supplementary file – table 1 and explicitly mention the adaptation night in the results section, which was previously only mentioned in the method section (page 6, lines 101 – 105):
  
  "Polysomnography (PSG) was recorded during an adaptation night and during the respective sleep retention interval (i.e. learning night) except for the adult wake-first group (for sleep architecture descriptive parameters of the adaptation night and learning night as well as for adolescents and adults see Supplementary file – table 1 & 2)."
  
  Reviewer #2 (Public Review):
  
  In this study, Hahn and colleagues investigate the role of slow oscillation-spindle coupling in motor memory consolidation and the impact of brain maturation on these interactions. The authors employed a real-life gross-motor task, where adolescents and adults learned to juggle. They demonstrate that during post-learning sleep SO-spindles are more strongly coupled in adults than in adolescents. The authors further show that the strength of SO-spindle coupling correlates with overnight changes in the learning curve and task proficiency, indicating a role of SO-spindle coupling in motor memory consolidation.
  
  Overall, the topic and the results of the present study are interesting and timely. The authors employed state-of-the-art analyses, carefully taking the general variability of oscillatory features into account. It also has to be acknowledged that the authors moved away from using rather artificial lab tasks to study the consolidation of motor memories (as is standard in the field), adding ecological validity to their findings. However, some features of their analyses need further clarification.
  
  We thank the reviewer for their positive assessment of our manuscript. Incorporating the encouraging and helpful feedback, we believe that we substantially improved the clarity and robustness of our analyses.
  
  1) Supporting and extending previous work of the authors (Hahn et al, 2020), SO-spindle coupling over centro-parietal areas was stronger in adults as compared to adolescents. Despite these differences in the EEG results the authors collapsed the data of adults and adolescents for their correlational analyses (Fig. 4a and 4b). Why would the authors think that this procedure is viable (also given the fact that different EEG systems were used to record the data)?
  
  We thank the reviewers for the opportunity to clarify why we think it is viable to collapse the data of adolescents and adults for our correlational analyses. In the following we split our answers based on the two points raised by the reviewers: (1) electrophysiological differences (i.e. coupling strength) between the groups and (2) potential signal differences due to different EEG systems.
  
  Electrophysiological differences
  
  First, upon inspecting the original Figure 4, it is apparent that the coupling strength of the combined sample does not form isolated clusters for each age group. In other words, while adult coupling strength is on the higher and adolescent coupling on the lower end due to the developmental increase in coupling strength we reported in the original Figure 3F, both samples overlap, forming a linear trend. Second, when running the correlational analyses between coupling strength and task proficiency as well as learning curve separately for each age group, we found that they follow the same direction (Figure R3). Adolescents with higher coupling strength show better task proficiency (Figure R3A, rhos = 0.66, p = 0.005). This effect was also present when using robust regression (b = 109.97, t(15) = 3.13, rho = 0.63, p = 0.007). Like adolescents, adults with higher coupling strength at C4 displayed better task proficiency after sleep (Figure R3B, rhos = 0.39, p = 0.053). This relationship was stronger when using robust regression (b = 151.36, t(23) = 3.17, rho = 0.56, p = 0.004). For learning curves, we found the expected negative correlation at C4 for adolescents (Figure R3C, rhos = -0.57, p = 0.020) and adults (Figure R3D, rhos = -0.44, p = 0.031). Results were comparable when using robust regression (adolescents: b = -59.58, t(15) = -2.94, rho = -0.60, p = 0.010; adults: b = -21.99, t(23) = -1.71, rho = -0.37, p = 0.101).
  
  Taken together, these results demonstrate that adolescents and adults show effects in the same direction at the same electrode, making it highly unlikely that our results arose by chance or that our initial correlation analyses were driven by just one group.
  
  Additionally, we already controlled for age in our original analyses using partial correlations (also refer to our answer to issue #6). Hence, these additional analyses further support that it is viable to collapse the analyses across both age groups even though they differ in coupling strength.
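  For reference, the general form of a first-order partial correlation is given below. The symbols are illustrative notation introduced here, with x standing for coupling strength, y for the behavioral change score and z for age. Whether the coefficients were computed on ranks is not stated in this response, so this is the textbook formula rather than a description of the exact analysis.

```latex
r_{xy \cdot z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{\left(1 - r_{xz}^{2}\right)\left(1 - r_{yz}^{2}\right)}}
```

  If age alone explained the association between coupling strength and behavior, r_{xy \cdot z} would shrink toward zero even when r_{xy} is large.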
  
  Different EEG-systems
  
  The reviewers also raise the question whether our analyses might be impacted by the different EEG systems we used to record our data. This is an important concern, especially when considering that cross-frequency coupling analyses can be severely confounded by differences in signal properties (Aru et al, 2015). In our sample, the strongest impact factor on signal properties is most likely age, given the broadband power differences in the power spectrum we found between the groups (original Figure 3A). Importantly, we also found a similar systematic power difference in our longitudinal study using the same ambulatory EEG system for both data recordings (Hahn et al, 2020). This is in line with numerous other studies demonstrating age-related EEG power changes in broadband as well as SO and sleep spindle frequency ranges (Campbell & Feinberg, 2016; Feinberg & Campbell, 2013; Helfrich et al, 2018; Kurth et al, 2010; Muehlroth et al, 2019; Muehlroth & Werkle-Bergner, 2020; Purcell et al, 2017). Therefore, we already had to take differences in signal properties into account for our cross-frequency analyses, regardless of whether the underlying cause is an age difference or different signal-to-noise ratios of different EEG systems.
  
  To mitigate confounds in the signal, we used a data-driven and individualized approach detecting SO and sleep spindle events based on individualized frequency bands and a 75-percentile amplitude criterion relative to the underlying signal. Additionally we z-normalized all spindle events prior to the cross-frequency coupling analyses (Figure R3E). We found no amplitude differences around the spindle peak (point of SO-phase readout) between adolescents that were recorded with an ambulatory amplifier system (alphatrace) and adults that were recorded with a stationary amplifier system (neuroscan) using cluster-based random permutation testing. This was also the case for the SO-filtered (< 2 Hz) signal (Figure R3E, inset). Critically, the significant differences in amplitude from -1.4 to -0.8 s (p = 0.023, d = -0.73) and 0.4 to 1.5 s (p < 0.001, d = 1.1) are not caused by age related differences in power or different EEG-systems but instead by the increased coupling strength (i.e. higher coupling precision of spindles to SOs) in adults giving rise to a more pronounced SO-wave shape when averaging across spindle peak locked epochs.
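
  The z-normalization step described above can be illustrated with a minimal sketch. It assumes that each spindle-locked epoch is normalized by its own mean and standard deviation; the function name `z_normalize_epochs` and the array layout (events × samples) are hypothetical and are not taken from our analysis code.

```python
import numpy as np

def z_normalize_epochs(epochs):
    """Z-score every epoch (row) by its own mean and standard deviation.

    `epochs` is assumed to be an (n_events, n_samples) array of
    spindle-peak-locked EEG segments; this layout is an assumption.
    """
    epochs = np.asarray(epochs, dtype=float)
    mean = epochs.mean(axis=1, keepdims=True)
    std = epochs.std(axis=1, ddof=1, keepdims=True)
    return (epochs - mean) / std
```

  Under this assumption, a pure amplitude scaling of the raw signal (for example, from overall power differences between groups or amplifiers) leaves the normalized epochs unchanged.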
  
  Consequently, our analysis pipeline already controlled for possible differences in signal property introduced through different amplifier systems. Nonetheless, we also wanted to directly compare the signal-to-noise ratio of the ambulatory and stationary amplifier systems. However, we only obtained data from both amplifier systems in the adult sleep first group, because we recorded EEG during the juggling learning phase with the ambulatory system in addition to the PSG with the stationary system. First, we computed the power spectra in the 1 to 49 Hz frequency range during the juggling learning phase (ambulatory) and during quiet wakefulness (stationary) for every subject in the adult sleep first group in 10-second segments. Next, we computed the signal-to-noise ratio (mean/standard deviation) of the power spectra per frequency across all segments. We only found a small negative cluster from 21.9 to 22.5 Hz (p = 0.042, d = 0.53; Figure R3F), which did not pertain to our frequency bands of interest. Critically, the signal-to-noise ratio of both amplifiers converged in the upper frequency bands approaching the noise floor, strongly supporting the notion that both systems in fact provided highly comparable estimates.
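
  The signal-to-noise computation described above (mean divided by standard deviation of the power spectra, per frequency, across 10-second segments) can be sketched as follows. This is an illustration only: the function names, the Hann taper, the plain FFT periodogram and the `sfreq` parameter are assumptions and do not reproduce our actual pipeline.

```python
import numpy as np

def segment_power_spectra(signal, sfreq, seg_seconds=10.0, fmin=1.0, fmax=49.0):
    """Cut a 1-D signal into non-overlapping segments and return their
    power spectra restricted to [fmin, fmax] Hz (assumed periodogram)."""
    signal = np.asarray(signal, dtype=float)
    seg_len = int(round(seg_seconds * sfreq))
    n_segments = len(signal) // seg_len
    segments = signal[: n_segments * seg_len].reshape(n_segments, seg_len)
    window = np.hanning(seg_len)  # taper choice is an assumption
    spectra = np.abs(np.fft.rfft(segments * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / sfreq)
    keep = (freqs >= fmin) & (freqs <= fmax)
    return freqs[keep], spectra[:, keep]

def snr_per_frequency(spectra):
    """Signal-to-noise ratio per frequency: mean / standard deviation
    of power across segments."""
    return spectra.mean(axis=0) / spectra.std(axis=0, ddof=1)
```

  The resulting per-frequency values for the two recordings of each subject could then be compared with a cluster-based permutation test, as reported above.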
  
  In conclusion, both age groups display highly similar effects in the same direction when correlating coupling strength with behavior. Further, after individualization and normalization of the analytical signal, we found no differences in signal properties that would confound the cross-frequency analysis. Lastly, we did not find systematic differences in signal-to-noise ratio between the different EEG-systems. Thus, we believe it is justified to collapse the data across all participants for the correlational analyses, as it combines both the developmental aspect of enhanced coupling precision from adolescence to adulthood and the behavioral relevance for motor learning, which we deem a critical research advance from our previous study.
  
  Figure R3
  
  (A) Cluster-corrected correlations (right) between individual coupling strength and overnight task proficiency change (post – pre retention) for adolescents of the sleep-first group (left, Spearman correlation at C4, uncorrected). Asterisks indicate cluster-corrected two-sided p < 0.05. Grey-shaded area indicates 95% confidence intervals of the robust trend line. Participants with a more precise SO-spindle coordination show improved task proficiency after sleep. (B) Cluster-corrected correlation of coupling strength and overnight task proficiency change for adults. Same conventions as in (A). Similar trend of higher coupling strength predicting better task proficiency after sleep. (C) Cluster-corrected correlation of coupling strength and overnight learning curve change for adolescents. Same conventions as in (A). Higher coupling strength related to a flatter learning curve after sleep. (D) Cluster-corrected correlation of coupling strength and overnight learning curve change for adults. Same conventions as in (A). Higher coupling strength related to a flatter learning curve after sleep. (E) Spindle peak locked epoch (NREM3, co-occurrence corrected) grand averages (mean ± SEM) for adolescents (red) and adults (black). Inset depicts the corresponding SO-filtered (2 Hz lowpass) signal. Black lines indicate significant clusters. Note, we found no difference in amplitude after normalization. Significant differences are due to more precise SO-spindle coupling in adults. Spindle frequency is blurred due to individualized spindle detection. (F) Signal-to-noise ratio for the stationary EEG amplifier (green) during quiet wakefulness and for the ambulatory EEG amplifier (purple) during juggling training. Grey shaded area denotes cluster-corrected p < 0.05. Note that signal-to-noise ratio converges in the higher frequency ranges.
  
  We have now added Figure R3E as Figure 3B to the revised version of the manuscript to demonstrate that there were no systematic differences between the two age groups in the analytical signal due to the expected age related power differences or EEG-systems. Specifically, we now state in the results section (page 13 – 14, lines 282 – 294):
  
  "We assessed the cross frequency coupling based on z-normalized spindle epochs (Figure 3B) to alleviate potential power differences due to age (Figure 3 – figure supplement 1A) or different EEG-amplifier systems that could potentially confound our analyses (Aru et al, 2015). Importantly, we found no amplitude differences around the spindle peak (point of SO-phase readout) between adolescents and adults using cluster-based random permutation testing (Figure 3B), indicating an unbiased analytical signal. This was also the case for the SO-filtered (< 2 Hz) signal (Figure 3B, inset). Critically, the significant differences in amplitude from -1.4 to -0.8 s (p = 0.023, d = -0.73) and 0.4 to 1.5 s (p < 0.001, d = 1.1) are not caused by age related differences in power or different EEG-systems but instead by the increased coupling strength (i.e. higher coupling precision of spindles to SOs) in adults giving rise to a more pronounced SO-wave shape when averaging across spindle peak locked epochs."
  
  Further, we added the correlational analyses that we computed separately for the age groups (Figure R3A-D) to the revised manuscript (Figure 3 – figure supplement 2CD) as they further substantiate our claims about the relationship between SO-spindle coupling and gross-motor learning.
  
  We now refer to these analyses in the results section (page 16, lines 338 – 343):
  
  "Critically, when computing the correlational analyses separately for adolescents and adults, we identified highly similar effects at electrode C4 for task proficiency (Figure 3 – figure supplement 2C) and learning curve (Figure 3 – figure supplement 2D) in each group. These complementary results demonstrate that coupling strength predicts gross-motor learning dynamics in both, adolescents as well as adults, and further show that this effect is not solely driven by one group."
  
  2) The authors might want to explicitly show that the reported correlations (with regards to both learning curve and task proficiency change) are not driven by any outliers.
  
  We thank the reviewers for their suggestion. We agree that, when inspecting the scatter plots, it looks as though the correlations could be severely influenced by two outliers in the adult group. Because this is an important matter, we recalculated all previously reported correlations without the two outliers (Figure R4, left column) and followed the reviewer’s suggestion to also compute robust regression (Figure R4, right column). We found no substantial deviation from our original results.
  
  In more detail, increase in task proficiency resulted in flattening of the learning curve when removing outliers (Figure R4A, rhos = -0.70, p < 0.001) and when applying robust regression analysis (Figure R4B, b = -0.30, t(67) = -10.89, rho = -0.80, p < 0.001). Likewise, higher coupling strength still predicted better task proficiency (mean rho = 0.35, p = 0.029, cluster-corrected) and flatter learning curves after sleep (rho = -0.44, p = 0.047, cluster-corrected) when removing the outliers (Figure R4CE) and when calculating robust regression (Figure R4DF, task proficiency: b = 82.32, t(40) = 3.12, rho = 0.45, p = 0.003; learning curve: b = -26.84, t(40) = -2.96, rho = -0.43, p = 0.005). Furthermore, we calculated spearman rank correlations and cluster-corrected spearman rank correlations in our original manuscript, to mitigate the impact of outliers, even though Pearson correlations are more widely used in the field. Therefore, we still report spearman rank correlations for single electrodes instead of robust correlations as it is more consistent with the cluster-correlation analyses.
  
  We now use robust trend lines instead of linear trend lines in our scatter plots. Further, we added the correlations without outliers (Figure R4ACE) to the supplements as Figure 2 – figure supplement 1D and Figure 3 – figure supplement 2 FG. These additional analyses are now reported in the results section of the revised manuscript (page 9, lines 186 – 191):
  
  "[…] we confirmed a strong negative correlation between the change (post retention values – pre retention values) in task proficiency and the change in learning curve after the retention interval (Figure 2F; rhos = -0.71, p < 0.001), which also remained strong after outlier removal (Figure 2 – figure supplement 1D). This result indicates that participants who consolidate their juggling performance after a retention interval show slower gains in performance."
  
  And (page 16, lines 343 – 346):
  
  "[…] Furthermore, our results remained consistent when including coupled spindle events in NREM2 (Figure 3 – figure supplement 2E) and after outlier removal (Figure 3 – figure supplement 2FG)."
  
  Furthermore, we now state that we specifically utilized spearman rank correlations to mitigate the impact of outliers in our analyses in the method section (page 35, lines 808 – 813):
  
  "For correlational analyses we utilized spearman rank correlations (rhos; Figure 2F & Figure 3DE) to mitigate the impact of possible outliers as well as cluster-corrected spearman rank correlations by transforming the correlation coefficients to t-values (p < 0.05) and clustering in the space domain (Figure 3DE). Linear trend lines were calculated using robust regression."
  
  Figure R4:
  
  (A) Spearman rank correlation between task proficiency change and learning curve change collapsed across adolescents (red dot) and adults (black diamonds) after removing two outlier subjects in the adult age group. Grey-shaded area indicates 95% confidence intervals of the robust trend line. (B) Robust regression of task proficiency change and learning curve change of the original sample. (C) Cluster-corrected correlations (right) between individual coupling strength and overnight task proficiency change (post – pre retention) after outlier removal (left, spearman correlation at C4, uncorrected). Asterisks indicate cluster-corrected two-sided p < 0.05. (D) Robust regression of coupling strength at C4 and task proficiency of the original sample. (E) Same conventions as in (C) but for overnight learning curve change. (F) Same conventions as in (D) but for overnight learning curve change.
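
  The methods statement quoted above mentions transforming correlation coefficients to t-values before clustering. A minimal sketch of the standard conversion, t = rho · sqrt((n − 2) / (1 − rho²)) with n − 2 degrees of freedom, is given below. That the cluster test uses exactly this formula is an assumption, and the function name `rho_to_t` is hypothetical.

```python
import math

def rho_to_t(rho, n):
    """Convert a correlation coefficient from n observations to a t-value
    with n - 2 degrees of freedom (standard conversion; assumed here)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if n < 3:
        raise ValueError("at least 3 observations are required")
    return rho * math.sqrt((n - 2) / (1.0 - rho ** 2))
```

  Electrodes whose t-values exceed the critical value for p < 0.05 would then be grouped with their spatial neighbours to form candidate clusters.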
  
  3) The sleep data of all participants (thus from both sleep first and wake first) were used to determine the features of SO-spindle coupling in adolescents and adults. Were there any differences between groups (sleep first vs. wake first)? This might be interesting in general but especially because only data of the sleep first group entered the subsequent correlational analyses.
  
  We thank the reviewers for their remark. We agree that adding additional information about possible differences between the sleep first and wake first groups would allow for a more comprehensive assessment of the reported data. We did not explain our reasoning to include only the sleep first groups for the correlation analyses clearly enough in the original manuscript. Unfortunately, we can only report data for the adolescents in our sample, because we did not record polysomnography (PSG) for the adult wake first group. This is also one of the two reasons why we focused on the sleep first groups for our correlational analyses.
  
  Adolescents in the sleep first group did not differ from adolescents in the wake first group in terms of sleep architecture (except REM (%), which did not correlate with behavior [task proficiency: rho = -0.17, p = 0.28; learning curve: -0.02, p = 0.90]) as well as SO and sleep spindle event descriptive measures (see Table R2). Importantly, we found no differences in coupling strength between the two groups (Figure R2A).
  
  Table R2. Summary of sleep architecture and SO/spindle event descriptive measures (at electrode C4) of adolescents in the sleep first and wake first group (mean ± standard deviation). Independent t-tests were used for comparisons
  
  The second reason why we focused our analyses on sleep first was that adolescents in the wake first group had higher task proficiency after the sleep retention interval than the sleep first group (Figure R2B; t(23) = -2.24, p = 0.034). This difference in performance is directly explained by the additional juggling test that the wake first group performed at the time point of their learning night, which should be considered as additional training. Therefore, we excluded the wake first group from our correlational analyses because the sleep first and wake first groups are not comparable in terms of juggling training during the night when we assessed SO-spindle coupling strength.
  
  Figure R2
  
  (A) Comparison of SO-spindle coupling strength in the adolescent sleep first (blue) and wake first (green) group using cluster-based random permutation testing (Monte-Carlo method, cluster alpha 0.05, max size criterion, 1000 iterations, critical alpha level 0.05, two-sided). Left: exemplary depiction of coupling strength at electrode C4 (mean ± SEM). Right: z-transformed t-values plotted for all electrodes obtained from the cluster test. No significant clusters emerged. (B) Comparison of task proficiency between sleep first and wake first group after the sleep retention interval (mean ± SEM). Adolescents in the wake first group had higher task proficiency given the additional juggling performance test, which also reflects additional training.
  
  These additional analyses (Figure R2) and the summary statistics of sleep architecture and SO/spindle event descriptives of adolescents in the sleep first and wake first group (Table R2), are now reported in the revised version of the manuscript as Figure 3 – figure supplement 2AB and Supplementary file – table 7. We now explicitly explain our rationale of why we only considered participants in the sleep first group for our correlational analyses in the results section (page 6, lines 101 – 105):
  
  "Polysomnography (PSG) was recorded during an adaptation night and during the respective sleep retention interval (i.e. learning night) except for the adult wake-first group (for sleep architecture descriptive parameters of the adaptation night and learning night as well as for adolescents and adults see Supplementary file – table 1 & 2)"
  
  And (page 15, lines 311 – 320):
  
  "[…] Furthermore, given that we only recorded polysomnography for the adults in the sleep first group and that adolescents in the wake first group showed enhanced task proficiency at the time point of the sleep retention interval due to additional training (Figure 3 – figure supplement 2A), we only considered adolescents and adults of the sleep-first group to ensure a similar level of juggling experience (for summary statistics of sleep architecture and SO and spindle events of subjects that entered the correlational analyses see Supplementary file – table 6). Notably, we found no differences in electrophysiological parameters (i.e. coupling strength, event detection) between the adolescents of the wake first and sleep first group (Figure 3 – figure supplement 2B & Supplementary file – table 7)."
  
  4) To allow a more comprehensive assessment of the underlying data information with regards to general sleep descriptives (minutes, per cent of time spent in different sleep stages, overall sleep time etc.) as well as related to SOs, spindles and coupled events (e.g. number, density etc.) would be needed.
  
  We agree with the reviewers that additional information about sleep architecture and SO as well as sleep spindle characteristics is needed for a more comprehensive assessment of our data. We have now added summary tables for sleep architecture and SO/spindle event descriptive measures for the whole sample (Table R4) and for the sleep first groups that we used for our correlational analyses (Table R5) to the supplementary material in the updated manuscript. It is important to note that, due to the longer sleep opportunity that we provided to adolescents to accommodate the overall higher sleep need in younger participants, adolescents and adults differed in most general sleep architecture markers and SO as well as sleep spindle descriptive measures. In addition, changes in sleep architecture are prominent during the maturational phase from adolescence to adulthood, which might introduce additional variance between the two age groups.
  
  Table R4. Summary of sleep architecture and SO/spindle event descriptive measures (at electrode C4) of adolescents and adults across the whole sample (mean ± standard deviation) in the learning night. Independent t-tests were used for comparisons
  
  Table R5. Summary of sleep architecture and SO/spindle event descriptive measures (at electrode C4) of adolescents and adults in the sleep first group (mean ± standard deviation) in the learning night. Independent t-tests were used for comparisons
  
  In order to ensure that our correlational analyses are not driven by these systematic differences between the two age groups, we used cluster-corrected partial correlations to control for sleep architecture markers (Figure R7) and SO/spindle descriptive measurements (Figure R8A). Critically, none of these possible confounders changed the pattern of our initial correlational analyses of coupling strength and task proficiency/learning curve. Additionally, we controlled for differences in spindle event number by using a bootstrapped resampling approach. We randomly drew 200 spindle events in 100 iterations and subsequently recalculated the coupling strength for each subject. We found that resampled values and our original observation of coupling strength are almost perfectly correlated, indicating that differences in event number are unlikely to have an impact on coupling strength as long as there are at least 200 events (Figure R8B). Combined, these analyses demonstrate that our correlations between coupling strength and behavior are not influenced by the reported differences in sleep architecture and SO/spindle descriptive measures.
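
  The resampling control described above can be sketched as follows. The sketch assumes that coupling strength is the mean resultant vector length of the SO phases at the spindle peaks, that events are drawn without replacement, and that the per-iteration values are averaged; these choices, as well as the function names, are assumptions made for illustration.

```python
import numpy as np

def coupling_strength(phases):
    """Mean resultant vector length of SO phases (radians) at spindle peaks.
    Ranges from 0 (no phase preference) to 1 (perfect phase locking)."""
    return float(np.abs(np.mean(np.exp(1j * np.asarray(phases, dtype=float)))))

def resampled_coupling_strength(phases, n_events=200, n_iter=100,
                                replace=False, rng=None):
    """Recompute coupling strength on `n_iter` random draws of `n_events`
    events and return the average across draws."""
    rng = np.random.default_rng(rng)
    phases = np.asarray(phases, dtype=float)
    if not replace and len(phases) < n_events:
        raise ValueError("fewer events available than requested per draw")
    values = [
        coupling_strength(rng.choice(phases, size=n_events, replace=replace))
        for _ in range(n_iter)
    ]
    return float(np.mean(values))
```

  Correlating the resampled values with the original per-subject values across subjects then indicates whether unequal event counts bias the estimate.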
  
  Figure R7
  
  Summary of cluster-corrected partial correlations of coupling strength with task proficiency (left) and learning curve (right) controlling for possible confounding factors. Asterisks indicate location of the detected cluster. The pattern of initial results remained highly stable.
  
  Figure R8
  
  (A) Summary of cluster-corrected partial correlations of coupling strength with task proficiency (left) and learning curve (right) controlling for SO/spindle descriptive measures at critical electrode C4. Asterisks indicate location of the detected cluster. The pattern of initial results remained highly stable. (B) Spearman correlation between resampled coupling strength (N = 200, 100 iterations) and original observation of coupling strength for adolescents (red circles) and adults (black diamonds), indicating that coupling strength is not influenced by spindle event number if at least 200 events are present. Grey-shaded area indicates 95% confidence intervals of the robust trend line.
  
  We now provide general sleep descriptives (Table R4 & R5) in the revised version of the manuscript as Supplementary file – table 2 & table 6. These data are referred to in the results section (page 6, lines 101 – 105):
  
  "Polysomnography (PSG) was recorded during an adaptation night and during the respective sleep retention interval (i.e. learning night) except for the adult wake-first group (for sleep architecture descriptive parameters of the adaptation night and learning night as well as for adolescents and adults see Supplementary file – table 1 & 2)."
  
  And (page 15, lines 311 – 318):
  
  "Furthermore, given that we only recorded polysomnography for the adults in the sleep first group and that adolescents in the wake first group showed enhanced task proficiency at the time point of the sleep retention interval due to additional training (Figure 3 – figure supplement 2A), we only considered adolescents and adults of the sleep-first group to ensure a similar level of juggling experience (for summary statistics of sleep architecture and SO and spindle events of subjects that entered the correlational analyses see Supplementary file – table 6)."
  
  The additional control analyses (Figure R7 & R8) are also now added to the revised manuscript as Figure 3 – figure supplement 3 & 4 in the results section (page 16, lines 356 – 360):
  
  "For a summary of the reported cluster-corrected partial correlations as well as analyses controlling for differences in sleep architecture see Figure 3 – figure supplement 3. Further, we also confirmed that our correlations are not influenced by individual differences in SO and spindle event parameters (Figure 3 – figure supplement 4)."
  
  5) The authors used partial correlations to rule out that age drove the relationship between coupling strength, learning curve and task proficiency. It seems like this analysis was done specifically for electrode C4, after having already established that coupling strength at electrode C4 correlates in general with changes in the learning curve and task proficiency. I think the claim that results were not driven by age as confounding factor would be stronger if the authors used a cluster-corrected partial correlation in the first place (just as in the main analysis).
  
  The reviewers are correct that initially we only conducted the partial correlation for electrode C4. Following the reviewers' suggestion, we now additionally computed cluster-corrected partial correlations similar to our main analysis. As in our original analyses, we found a significant positive central cluster (Figure R6A, mean rho = 0.40, p = 0.017) showing that higher coupling strength related to better task proficiency after sleep and a negative cluster-corrected correlation at C4 showing that higher coupling strength was related to flatter learning curves after sleep (Figure R6B, rho = -0.47, p = 0.049), also when controlling for age.
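
  For a single electrode, a partial rank correlation controlling for age can be sketched as below. The sketch uses the first-order partial correlation formula applied to rank-transformed data; the helper names are hypothetical, and the cluster correction across electrodes is not shown.

```python
import numpy as np

def average_ranks(x):
    """Rank data from 1..n, assigning tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def partial_spearman(x, y, covariate):
    """Rank correlation between x and y after partialling out one covariate,
    e.g. coupling strength and behavior controlling for age."""
    r_xy = spearman(x, y)
    r_xz = spearman(x, covariate)
    r_yz = spearman(y, covariate)
    return (r_xy - r_xz * r_yz) / np.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
```

  If two measures are related only through a shared dependence on the covariate, the plain rank correlation is clearly positive while the partial rank correlation is close to zero.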
  
  Figure R6
  
  (A) Cluster-corrected partial correlation of individual coupling strength in the learning night and overnight change in task proficiency (post – pre retention) collapsed across adolescents and adults, controlling for age. Asterisks indicate cluster-corrected two-sided p < 0.05. A similar significant cluster to the original analysis (Figure 4A) emerged comprising electrodes Cz and C4. (B) Same conventions as in A. Like in the original analysis (Figure 4B) a negative correlation between coupling strength at C4 and learning curve change survived cluster-corrected partial correlations when controlling for age.
  
  We now always report cluster-corrected partial correlations when controlling for possible confounding variables in the updated version of the manuscript (also see answer to issue #7). A summary of all computed partial correlations including Figure R6 can now be found as Figure 3 – figure supplement 3 & 4 in the revised manuscript.
  
  Specifically we now state in the results section (page 16 – 17, lines 347 – 360):
  
  "To rule out age as a confounding factor that could drive the relationship between coupling strength, learning curve and task proficiency in the mixed sample, we used cluster-corrected partial correlations to confirm their independence of age differences (task proficiency: mean rho = 0.40, p = 0.017; learning curve: rhos = -0.47, p = 0.049). Additionally, given that we found that juggling performance could underlie a circadian modulation we controlled for individual differences in alertness between subjects due to having just slept. We partialed out the mean PVT reaction time before the juggling performance test after sleep from the original analyses and found that our results remained stable (task proficiency: mean rho = 0.37, p = 0.025; learning curve: rhos = -0.49, p = 0.040). For a summary of the reported cluster-corrected partial correlations as well as analyses controlling for differences in sleep architecture see Figure 3 – figure supplement 3. Further, we also confirmed that our correlations are not influenced by individual differences in SO and spindle event parameters (Figure 3 – figure supplement 4)."
  
  And in the methods section (page 35, lines 813 – 814):
  
  "To control for possible confounding factors we computed cluster-corrected partial rank correlations (Figure 3 – figure supplement 3 and 4)."
  
  References
  
  Aru, J., Aru, J., Priesemann, V., Wibral, M., Lana, L., Pipa, G., Singer, W. & Vicente, R. (2015) Untangling cross-frequency coupling in neuroscience. Curr Opin Neurobiol, 31, 51-61.
  
  Bothe, K., Hirschauer, F., Wiesinger, H. P., Edfelder, J., Gruber, G., Birklbauer, J. & Hoedlmoser, K. (2019) The impact of sleep on complex gross-motor adaptation in adolescents. Journal of Sleep Research, 28(4).
  
  Bothe, K., Hirschauer, F., Wiesinger, H. P., Edfelder, J. M., Gruber, G., Hoedlmoser, K. & Birklbauer, J. (2020) Gross motor adaptation benefits from sleep after training. J Sleep Res, 29(5), e12961.
  
  Campbell, I. G. & Feinberg, I. (2016) Maturational Patterns of Sigma Frequency Power Across Childhood and Adolescence: A Longitudinal Study. Sleep, 39(1), 193-201.
  
  Dayan, E. & Cohen, L. G. (2011) Neuroplasticity subserving motor skill learning. Neuron, 72(3), 443-54.
  
  De Gennaro, L. & Ferrara, M. (2003) Sleep spindles: an overview. Sleep Med Rev, 7(5), 423-40.
  
  De Gennaro, L., Ferrara, M., Vecchio, F., Curcio, G. & Bertini, M. (2005) An electroencephalographic fingerprint of human sleep. Neuroimage, 26(1), 114-22.
  
  Dinges, D. F., Pack, F., Williams, K., Gillen, K. A., Powell, J. W., Ott, G. E., Aptowicz, C. & Pack, A. I. (1997) Cumulative sleepiness, mood disturbance, and psychomotor vigilance performance decrements during a week of sleep restricted to 4-5 hours per night. Sleep, 20(4), 267-77.
  
  Dinges, D. F. & Powell, J. W. (1985) Microcomputer Analyses of Performance on a Portable, Simple Visual Rt Task during Sustained Operations. Behavior Research Methods Instruments & Computers, 17(6), 652-655.
  
  Eichenlaub, J. B., Biswal, S., Peled, N., Rivilis, N., Golby, A. J., Lee, J. W., Westover, M. B., Halgren, E. & Cash, S. S. (2020) Reactivation of Motor-Related Gamma Activity in Human NREM Sleep. Front Neurosci, 14, 449.
  
  Feinberg, I. & Campbell, I. G. (2013) Longitudinal sleep EEG trajectories indicate complex patterns of adolescent brain maturation. American Journal of Physiology - Regulatory, Integrative and Comparative Physiology, 304(4), R296-303.
  
  Hahn, M., Heib, D., Schabus, M., Hoedlmoser, K. & Helfrich, R. F. (2020) Slow oscillation-spindle coupling predicts enhanced memory formation from childhood to adolescence. Elife, 9.
  
  Helfrich, R. F., Lendner, J. D. & Knight, R. T. (2021) Aperiodic sleep networks promote memory consolidation. Trends Cogn Sci.
  
  Helfrich, R. F., Lendner, J. D., Mander, B. A., Guillen, H., Paff, M., Mnatsakanyan, L., Vadera, S., Walker, M. P., Lin, J. J. & Knight, R. T. (2019) Bidirectional prefrontal-hippocampal dynamics organize information transfer during sleep in humans. Nature Communications, 10(1), 3572.
  
  Helfrich, R. F., Mander, B. A., Jagust, W. J., Knight, R. T. & Walker, M. P. (2018) Old Brains Come Uncoupled in Sleep: Slow Wave-Spindle Synchrony, Brain Atrophy, and Forgetting. Neuron, 97(1), 221-230 e4.
  
  Killgore, W. D. (2010) Effects of sleep deprivation on cognition. Prog Brain Res, 185, 105-29.
  
  Kurth, S., Jenni, O. G., Riedner, B. A., Tononi, G., Carskadon, M. A. & Huber, R. (2010) Characteristics of sleep slow waves in children and adolescents. Sleep, 33(4), 475-80.
  
  Maris, E. & Oostenveld, R. (2007) Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods, 164(1), 177-90.
  
  Muehlroth, B. E., Sander, M. C., Fandakova, Y., Grandy, T. H., Rasch, B., Shing, Y. L. & Werkle-Bergner, M. (2019) Precise Slow Oscillation-Spindle Coupling Promotes Memory Consolidation in Younger and Older Adults. Sci Rep, 9(1), 1940.
  
  Muehlroth, B. E. & Werkle-Bergner, M. (2020) Understanding the interplay of sleep and aging: Methodological challenges. Psychophysiology, 57(3), e13523.
  
  Niethard, N., Ngo, H. V. V., Ehrlich, I. & Born, J. (2018) Cortical circuit activity underlying sleep slow oscillations and spindles. Proceedings of the National Academy of Sciences of the United States of America, 115(39), E9220-E9229.
  
  Purcell, S. M., Manoach, D. S., Demanuele, C., Cade, B. E., Mariani, S., Cox, R., Panagiotaropoulou, G., Saxena, R., Pan, J. Q., Smoller, J. W., Redline, S. & Stickgold, R. (2017) Characterizing sleep spindles in 11,630 individuals from the National Sleep Research Resource. Nature Communications, 8, 15930.
  
  Van Dongen, H. P., Maislin, G., Mullington, J. M. & Dinges, D. F. (2003) The cumulative cost of additional wakefulness: dose-response effects on neurobehavioral functions and sleep physiology from chronic sleep restriction and total sleep deprivation. Sleep, 26(2), 117-26.
  
  Wilhelm, I., Metzkow-Meszaros, M., Knapp, S. & Born, J. (2012) Sleep-dependent consolidation of procedural motor memories in children and adults: the pre-sleep level of performance matters. Developmental Science, 15(4), 506-15.
  
  Winer, J. R., Mander, B. A., Helfrich, R. F., Maass, A., Harrison, T. M., Baker, S. L., Knight, R. T., Jagust, W. J. & Walker, M. P. (2019) Sleep as a potential biomarker of tau and beta-amyloid burden in the human brain. J Neurosci.
  
  scietyType:AuthorResponse
Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.01.21.427606v1
www.biorxiv.org www.biorxiv.org

Early life experience sets hard limits on motor learning as evidenced from artificial arm use

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  Maimon-Mor et al. examined the control of reaching movement of one-handers, who were born with a partial arm, and amputees, who lost their arm in adulthood. The authors hypothesized that since one-handers started using their artificial arm earlier in life than amputees, they are expected to exhibit better motor control, as measured by point-to-point reaching accuracy. Surprisingly, they found the opposite, that the reaching accuracy of one-handers is worse than that of amputees (and controls with their non-dominant hand). This deficit in motor control was reflected in an increase in motor noise rather than consistent motor biases.
  
  Strengths:
  
  I found the paper in general very well and clearly written.
  
  The authors provide detailed analyses to examine various possible factors underlying deficits in reaching movements in one-handers and amputees, including age at which participants first used an artificial arm, current usage of the arm, performance in hand localization tasks, and statistical methods that control for potential confounding factors.
  
  The results that one-handers, who start using the artificial arm at an early age, show worse motor control than amputees, who typically start using the arm during adulthood, are surprising and interesting. Also intriguing are the results that reaching accuracy is negatively correlated with the time of limbless experience in both groups. These results suggest that there is a plasticity window that is not anchored to a certain age, but rather to some interference (perhaps) from the time without the use of an artificial arm. In one-handers these two time intervals are confounded by one another, but the amputee data allow them to be separated. I think that the results have implications for understanding plasticity aspects of acquiring skills for using artificial limbs.
  
  Weaknesses:
  
  While I found that one of the main conclusions of the paper is that the main factor related to increased motor noise is the time spent without the artificial arm, it felt that this was not emphasized as such. These results are not mentioned in the abstract and the correlation for amputees is not shown in a figure.
  
  We thank the reviewer for their comment. While it is true that motor noise correlated with time of limbless experience in both groups, we were hesitant to highlight the results found in amputees, considering the small number of participants, and lack of converging evidence (e.g., contrary to the congenital group, we did not find a strong main effect). For these reasons, we have chosen to include it in the manuscript but not highlight it or base our main conclusions on it. Following the reviewer’s comment, the correlation of the amputees’ data is now visualised in Figure 3. Moreover, while the behavioural correlation might be similar in both groups, from a neural standpoint, the limbless experience of a toddler with a developing brain is qualitatively different to that of an adult, with a fully developed brain, who has lost a limb. As such, we were hesitant to link these two findings into a single framework, however in the revised manuscript we highlight this tentative link.
  
  Discussion (4th paragraph):
  
  “In both the congenital and acquired groups, artificial arm reaching motor noise correlated with the amount of time they spent using only their residual limb. It is therefore tempting to link these two results under a unifying interpretation; however, this requires further research, considering the neural differences between the two groups.”
  
  Figure 3. Years of limbless experience before first artificial arm use in the acquired group. Relationship between years of limbless experience and (A) artificial arm reaching errors or (B) artificial arm motor noise in the acquired group.
  
  The suggested mechanism of a deficit in visuomotor integration is not clear, and whether the results indeed point to this hypothesis. The results of the reaching task show that the one-handers exhibit higher motor noise and initial error direction than amputees. The results of the 2D localization task (the same as the standard reaching task but without visual feedback) show no difference in errors between the groups. First, it is not clear how the findings of the 2D localization task are in line with the results that one-handers show larger initial directional errors.
  
  We fully take on the reviewer’s comment regarding the vague use of the term visuomotor integration. In the revised manuscript, we have opted instead for a much broader term, suggesting a deficit in visual-based corrective movements, considering we are limited in our ability to infer the specific underlying mechanism from our result. We have also made changes to the abstract based on the reviewer’s comment (see below).
  
  With regards to discussing how the various results fit together, in the revised manuscript, these are now discussed more at length. In short, in the 2D localisation task (reaching without visual feedback), participants were not instructed to perform fast ballistic movements. Instead, participants were instructed that they could perform movements to correct for their initial aiming error (using proprioception). Together with the similar performance observed for the proprioceptive task, this strengthens our suggestion that the deficit in the congenital group is triggered by visual-driven corrections. These various considerations are now detailed as follows:
  
  Abstract:
  
  “Since we found no group differences when reaching without visual feedback, we suggest that the ability to perform efficient visually-based corrective movements, is highly dependent on either biological or artificial arm experience at a very young age.”
  
  Result (section 7, 1st paragraph):
  
  “From these results, we infer that early-life experience relates to a suboptimal ability to reduce the system’s inherent noise, and that this is possibly not related to the noise generated by the execution of the initial motor plan. Early life experience might therefore relate to better use of visual feedback in performing corrective movements. The continuous integration of visual and sensory input is at the heart of visually-driven corrective movements. Therefore, one possibility is that limited early life experience, results in suboptimal integration of information within the sensorimotor system.”
  
  Discussion (2nd paragraph):
  
  “When performing reaching movements without visual feedback (2D localisation task), the congenital group did not differ from the acquired or control group. This begs the question, if the congenital group has a deficit in motor planning why was it not evident in this task as well? In the 2D localisation task, unlike the main task, participants were allowed to make corrective movements. While they did not receive visual feedback, the proprioceptive and somatosensory feedback from the residual limb appears to be enough to allow them to correct for initial reaching errors and perform at the same level as the acquired and control group. Moreover, we did not find strong evidence for an impaired sense of localisation of either the residual or the artificial arm in the congenital group. As such, by elimination, our evidence suggests that the process of using visual information to perform corrective movements isn’t as efficient in the congenital group.”
  
  Discussion (2nd paragraph):
  
  “Lack of concurrent visual and motor experience during development might therefore cause a deficit in the ability to form the computational substrates and thus to efficiently use visual information in performing corrective movements.”
  
  Discussion (last paragraph):
  
  “By the process of elimination, we have nominated suboptimal visual feedback-based corrections to be the most likely cause underlying this motor deficit.”
  
  Second, I think that these results suggest that the deficiency in one-handers is with feedback responses rather than feedforward. This may also be supported by the correlation with age: early age is correlated with less end-point motor noise, rather than initial directional error. Analyses of feedback correction might help shedding more light on the mechanism. The authors mention that the participants were asked to avoid doing corrective movement and imposed a limit of 1 sec per reach to encourage that. But it is not clear whether participants actually followed these instructions. 1 sec could be enough time to allow feedback responses, especially for small amplitude movements (e.g., <10 cm).
  
  Please see below our response to the feedback correction analysis suggestion. Regarding corrective movements, we had the same concern as the reviewer which led us to use hand velocity data to identify first movement termination. We apologise if the experimental design and pre-processing procedures were not clear.
  
  In short, a 1 sec trial duration was imposed on all trials to generate a sense of time pressure and encourage participants to perform fast ballistic movements. As we were worried that participants might still perform secondary corrective movements within this 1 sec window, for each trial, we used the hand velocity profile to identify the end of the first movement. Below, we have plotted the arm velocity from a single trial to illustrate this procedure. For this trial, the timepoint indicated by the circular marker has been identified as the time of the end of the first movement (See Methods for further information). For each trial, endpoint location was defined as the location of the arm at the movement termination timepoint defined by the kinematic data and not the endpoint at the 1 sec timepoint. It is worth noting that performing the same analysis using the endpoints recorded at the 1 sec timepoint did not generate different statistical results.
  
  This has now been further clarified in the text.
  
  Results (section 1, 1st paragraph):
  
  “Reaching performance was evaluated by measuring the mean absolute error participants made across all targets (see Figure 1C). The absolute error refers to the distance from the cursor’s position at the end of the first reach (endpoint) to the centre of the target in each trial. The endpoint of each trial was set as the arm location at the end of the first reaching movement, identified using the trial’s kinematic data (See Methods).”
  
  Methods (section: Data processing and analysis – main task):
  
  “Within the 1 sec movement time constraint, in some trials, participants still performed secondary corrective movements. We therefore used the tangential arm velocities to identify the end of the first reach in each trial (i.e., movement termination).”
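  The velocity-based endpoint detection described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the 5% cutoff, and the rule "first sample after movement onset at which speed falls below a fraction of the peak" are all assumptions made for illustration, and the manuscript's Methods remain the authoritative description.

```python
import math


def tangential_speed(xs, ys, dt):
    """Speed between consecutive position samples (position units per second)."""
    return [
        math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) / dt
        for i in range(len(xs) - 1)
    ]


def first_movement_end(speed, fraction=0.05):
    """Index at which the first movement is taken to end.

    Assumed rule (hypothetical): the movement starts when speed first exceeds
    `fraction` of the trial's peak speed and ends at the first later sample
    where speed drops back below that cutoff.
    """
    cutoff = fraction * max(speed)
    onset = next((i for i, s in enumerate(speed) if s > cutoff), None)
    if onset is None:
        return 0  # no movement detected in this trial
    for i in range(onset + 1, len(speed)):
        if speed[i] < cutoff:
            return i
    return len(speed) - 1  # speed never dropped below the cutoff
```

  In a trial whose speed profile rises, falls almost to zero, and then rises again for a small correction, this rule returns the sample at the dip, so the endpoint is read before the secondary movement rather than at the 1 sec mark. A rule based on a fixed cutoff would miss corrections that blend into the first movement without a clear pause.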
  
  Reviewer #2:
  
  This is a broad and ambitious study that is fairly unique in scope - the questions it seeks to answer are difficult to answer scientifically, and yet the depth of the questions it seeks to answer and the framework in which it is founded seem out of place in a clinical journal.
  
  And yet, as a scientist and clinician, I found myself objecting to the claims of the authors, only to have them address my objection in the very next section. The results are surprising, but compelling - the authors have done an excellent job of untangling a very complicated question, and they have tested (for our field) a large number of subjects.
  
  The main two results of the paper, from my perspective, are as follows:
  
  1) Persons with an amputation can form better models of new environments, such as manipulandums, than can those with congenital deficiencies. This result is interesting because a) the task did not depend on significant use of the device (they were able to use their intact musculature for the reaching-based task), and b) the results were not influenced by the devices used by the subjects (cosmetic, body-powered, or myoelectric).
  
  2) Persons with congenital deficiency fit earlier in life had less error than those fit later in life.
  
  Taken together, these results suggest that during early childhood the brain is better able to develop the foundation necessary to develop internal models and that if this is deprived early in childhood, it cannot be regained later in life - even if subjects have MORE experience. (E.g., those with congenital deficiencies had more experience using their prosthetic arm than those with amputation, and yet scored worse).
  
  The questions analyzed by the researchers are excellent and the statistical methods are generally appropriate. My only minor concern is that the authors occasionally infer that two groups are the same when a large p-value is reported, whereas large p-values do not convey that the groups are the same; only that they cannot be proven to be different. The authors would need to use a technique such as ICC or analysis of similarities to prove the groups are the same.
  
  We appreciate the reviewer’s concern about inferring the null from classical frequentist statistics. In this manuscript, we have opted to use Bayesian statistics to test for similarity across groups (See Methods: Statistical analysis) as opposed to the frequentist methods suggested by the reviewer. This approach is equivalent to the ones proposed by the reviewer and is widely used in our field. A Bayes Factor (BF) smaller than 0.33 is regarded as sufficient evidence for supporting the null hypothesis, that is, that there are no differences between the groups.
  
  This approach is described in detail in the methods and is introduced in the first section of the results as well.
  
  Results (1st section 2nd paragraph):
  
  “To further explore the non-significant performance difference between amputees and controls, we used a Bayesian approach (Rouder et al., 2009), that allows for testing of similarities between groups (the null hypothesis). In this analysis, the smaller effect size of the two reported here (1.39) was inputted as the Cauchy prior width. The resulting Bayesian Factor (BF10=0.28) provided moderate support to the null hypothesis (i.e., smaller than 0.33).”
  
  Methods (Statistical analysis section):
  
  “In parametric analyses (ANCOVA, ANOVA, Pearson correlations), where the frequentist approach yielded a non-significant p-value, a parallel Bayesian approach was used and Bayes Factors (BF) were reported (Morey & Rouder, 2015; Rouder et al., 2009, 2012, 2016). A BF<0.33 is interpreted as support for the null-hypothesis, BF > 3 is interpreted as support for the alternative hypothesis (Dienes, 2014). In Bayesian ANOVAs and ANCOVA’s, the inclusion Bayes Factor of an effect (BFIncl) is reported, reflecting that the data is X (BF) times more likely under the models that include the effect than under the models without this predictor. When using a Bayesian t-test, a Cauchy prior width of 1.39 was used, this was based on the effect size of the main task, when comparing artificial arm reaches of amputees and one-handers. Therefore, the null hypothesis in these cases would be there is no effect as large as the effect observed in the main task.”
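  The decision rule quoted above (BF < 0.33 supports the null, BF > 3 supports the alternative, anything in between is inconclusive) can be written as a small Python helper. This is only a sketch of the thresholding step; the function name and return labels are hypothetical, and it does not compute the Bayes factor itself.

```python
def interpret_bayes_factor(bf10, null_cutoff=0.33, alt_cutoff=3.0):
    """Classify a Bayes factor (alternative over null) using the quoted cutoffs."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive")
    if bf10 < null_cutoff:
        return "supports null"
    if bf10 > alt_cutoff:
        return "supports alternative"
    return "inconclusive"
```

  Applied to the two values reported in this response, `interpret_bayes_factor(0.28)` falls in the "supports null" band and `interpret_bayes_factor(0.72)` falls in the "inconclusive" band, which matches how the authors describe them.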
  
  Following the reviewer’s comment, we have carefully scanned through the manuscript to make sure no equivalence claims are made without the support of a significant BF. In one instance that has been the case and has been rectified.
  
  Results (3rd section, 2nd paragraph):
  
  “We compared artificial arm and nondominant arm biases (distance from the centre of the endpoint to the target) across groups, using intact arm biases as a covariate. The ANCOVA resulted in no significant (inconclusive) group differences (F(2,47)=2.40, p=0.1, BFIncl=0.72; see Figure 2A).”
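  The two accuracy measures quoted in this response differ in a way that a short sketch may make concrete: absolute error is the distance from each endpoint to the target, averaged over trials, while bias is the distance from the centre of the endpoints to the target. The Python below assumes 2D endpoints for a single target and uses hypothetical function names; the manuscript averages across all targets.

```python
import math


def mean_absolute_error(endpoints, target):
    """Mean distance from each trial endpoint to the target centre."""
    return sum(math.dist(p, target) for p in endpoints) / len(endpoints)


def bias(endpoints, target):
    """Distance from the centre (centroid) of the endpoints to the target."""
    n = len(endpoints)
    centroid = (
        sum(x for x, _ in endpoints) / n,
        sum(y for _, y in endpoints) / n,
    )
    return math.dist(centroid, target)
```

  For endpoints scattered symmetrically around the target, such as `[(1, 0), (-1, 0)]` with the target at `(0, 0)`, the mean absolute error is 1 while the bias is 0. Errors of this kind reflect trial-to-trial variability rather than a consistent offset.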
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.01.26.428281v1
www.biorxiv.org www.biorxiv.org

New submission 16/09/2022, 11:10:25

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Redox signaling is a dynamic and concerted orchestra of inter-connected cellular pathways. There is always a debate whether ROS (reactive oxygen species) could be a friend or foe. Continued research is needed to dissect out how ROS generation and progression could diverge in physiological versus pathophysiological states. Similarly, there are several paradoxical studies (both animal and human) wherein exercise health benefits were reported to be accompanied by increases in ROS generation. It is in this context, that the present manuscript deserves attention.
  
  Utilizing the in-vitro studies as well as mice model work, this manuscript illustrates the different regulatory mechanisms of exercise and antioxidant intervention on redox balance and blood glucose level in diabetes. The manuscript does have some limitations and might need additional experiments and explanation.
  
  The authors should consider addressing the following comments with additional experiments.
  
  1) Although hepatic AMPK activation appears to be a central signaling element for the benefits of moderate exercise and glucose control, additional signals (on hepatic tissue) related to hepatic gluconeogenesis such as Forkhead box O1 (FoxO1), phosphoenolpyruvate carboxykinase (PEPCK), and GLUT2 needs to be profiled to present a holistic approach. Authors should consider this and revise the manuscript.
  
  We appreciate the constructive suggestion. Besides glycolysis, gluconeogenesis and glucose uptake are critical in maintaining liver and blood glucose homeostasis.
  
  FoxO1 has been tightly linked with hepatic gluconeogenesis through promoting the transcription of the gluconeogenesis-related genes PEPCK and G6Pase (1, 2). Herein, we found that the expression of FoxO1 increased in the diabetic group but was reduced in the CE, IE and EE groups (Fig. X1A, Fig.5E-F in manuscript). Meanwhile, the mRNA levels of Pepck and G6PC (one of the three G6Pase catalytic-subunit-encoding genes) also decreased in the CE, IE, and EE groups (Fig. X1B-1C, Fig.5H-I in manuscript). These results indicate that all three modes of exercise inhibited gluconeogenesis by down-regulating FoxO1.
  
  For glucose uptake, we detected the protein expression of GLUT2 in the liver tissue. GLUT2 helps in the uptake of glucose by hepatocytes for glycolysis and glycogenesis. Accordingly, we found that GLUT2, a glucose sensor in the liver, was up-regulated in diabetic rats, but down-regulated by the CE and IE interventions. However, GLUT2 did not decrease in the EE group, which is consistent with the lack of improvement in blood glucose after EE intervention (Figure X1A, Fig.5E and 5G in manuscript).
  
  Taken together, moderate exercise could benefit glucose control by increasing glycolysis and decreasing gluconeogenesis. We added this part in Page 9 line 251-263 and Figure 5E-5I in this version.
  
  Figure X1. A. Representative protein level and quantitative analysis of FOXO1 (82 kDa), GLUT2 (60-70 kDa) and Actin (45 kDa) in the rats in the Ctl, T2D, T2D + CE, T2D + IE and T2D + EE groups. B-C. Expression of hepatic Pepck and G6PC mRNA in the Ctl, T2D, T2D + CE, T2D + IE and T2D + EE groups was evaluated by real-time PCR analysis. Values represent mean ratios of Pepck and G6PC transcripts normalized to GAPDH transcript levels.
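  The caption reports transcript levels as mean ratios normalized to GAPDH but does not say how the ratios were computed. One common way to obtain such a ratio from real-time PCR data is the delta-Ct calculation, sketched below in Python. This is an assumption for illustration only: it presumes perfect amplification efficiency (a doubling per cycle), and the function names and example Ct values are hypothetical.

```python
def normalized_ratio(ct_target, ct_reference):
    """Target-to-reference transcript ratio, assuming a doubling per PCR cycle."""
    return 2 ** (ct_reference - ct_target)


def mean_normalized_ratio(ct_pairs):
    """Mean ratio over replicates given (ct_target, ct_reference) pairs."""
    ratios = [normalized_ratio(t, r) for t, r in ct_pairs]
    return sum(ratios) / len(ratios)
```

  A target that crosses threshold five cycles after the reference gene gives a ratio of 2 to the power of -5. Group comparisons are then made on these ratios, or on their fold change relative to the control group.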
  
  2) Very recently, sestrin2 signaling has attracted significant attention in relation to exercise and antioxidant responses. Therefore, authors should profile the sestrin2 levels as it is linked to several targets such as mTOR, AMPK and Sirt1. Additionally, the levels of Nrf2 should be reported as this is the central regulator of the threshold mechanisms of oxidative stress and ROS generation.
  
  We appreciate the reviewer’s expert comments. Nrf2 is an important mediator of antioxidant signaling, playing a fundamental role in maintaining the redox homeostasis of the cell. Under unstressed conditions, Nrf2 activity is suppressed by its innate repressor Kelch-like ECH-associated protein 1 (Keap1) (3). With the increase of ROS levels during the development of diabetes, Nrf2 is activated to induce the transcription of several antioxidant enzymes (4, 5).
  
  Nrf2 expression level has been reported to increase in HFD mice or diabetic patients (6, 7). In vitro studies have found that Nrf2 activation is achieved with acute exposure to high glucose, whereas longer incubation times or oscillating glucose concentrations failed to activate Nrf2 (8, 9). These findings suggest that the increase of ROS in diabetes can cause compensatory upregulation of Nrf2. In our study, we found that Nrf2 increased in diabetic rats, which can further initiate the expression of antioxidant enzymes. As shown in Fig.X2A (Fig.2H-2K in manuscript), Grx and Trx, which are involved in thioredoxin metabolism, were up-regulated along with Nrf2. After CE intervention, the level of Nrf2 increased further (Fig.2E-2F), suggesting that CE intervention could activate the antioxidant system to achieve a high-level redox balance. We have added these new results into Figure 2.
  
  On the other hand, the expression levels of Sestrin2 and Nrf2 decreased after antioxidant supplementation. Our results suggest that the antioxidant treatment improved diabetes by lowering ROS levels to achieve a low-level redox balance, whereas moderate exercise enhanced ROS tolerance to achieve a high-level balance (Fig.X2D-F, Fig.3E-3G in manuscript).
  
  We added the new data in “Page 5 line 147-153 and Page 7 line 183-186” and Figure 2-3 in current version.
  
  Figure X2. A-C. Representative protein level and quantitative analysis of Nrf2 (97 kDa), Sestrin2 (57 kDa) and Actin (45 kDa) in the rats in the Ctl, T2D and T2D + CE groups. D-F. Representative protein level and quantitative analysis of Nrf2 (97 kDa), Sestrin2 (57 kDa) and HSP90 (90 kDa) in the rats in the Ctl, T2D and T2D + APO groups.
  
  3) Authors should discuss the exercise-associated hormesis curve. They should discuss whether moderate exercise could decrease the sensitivity to oxidative stress by altering the bell-shaped dose-response curve.
  
  We thank the reviewer for the valuable comments. Radak et al. proposed a bell-shaped dose-response curve between normal physiological function and level of ROS in healthy individuals, and suggested that moderate exercise can extend or stretch the levels of ROS while increasing physiological function (10). Our results validated this hypothesis and further proposed that moderate exercise could produce ROS while increasing antioxidant enzyme activity to maintain a high-level redox balance according to the bell-shaped curve, whereas excessive exercise would generate a higher level of ROS, leading to reduced physiological function. In this study, we found that the state of diabetic individuals is better described by an S-shaped curve, due to the high level of oxidative stress and decreased reduction level in diabetic individuals (Fig.8B). With the increase of ROS, the physiological function of diabetic individuals gradually decreases and enters a state of redox imbalance. Moderate exercise shifts the S-shaped curve into a bell-shaped dose-response curve, thus reducing the sensitivity to oxidative stress in diabetic individuals and restoring redox homeostasis. However, with excessive exercise, ROS production increases beyond the threshold range of redox balance, resulting in decreased physiological function (Fig.8B, see the decreasing portion of the bell curve to the right of the apex).
  
  By contrast, the antioxidant intervention increased physiological activity by reducing ROS levels in diabetic individuals, restoring a bell-shaped dose-response curve at a low level of ROS (Fig.8B). Redox balance could therefore be achieved either at a low level of ROS mediated by antioxidant intervention or at a high level of ROS mediated by moderate exercise, both of which were regulated by AMPK activation. Both high and low levels of redox balance can thus lead to high physiological function as long as they are within the redox balance threshold range. The activation of AMPK is then an important sign that exercise or antioxidant intervention has achieved a dynamic redox balance, which helps restore physiological function. Accordingly, we speculate that antioxidant intervention combined with moderate exercise might offset the effect of exercise, but antioxidants could be beneficial during excessive exercise. A human study also supports the view that supplementation with antioxidants may preclude the health-promoting effects of exercise (11). Therefore, personalized intervention with respect to redox balance will be crucial for the effective treatment of diabetes patients.
  
  We added this part into “Discussion” in this version (Page 13-14 line 389-418).
  
  4) It would not be ideal to single-out AMPK as a sole biomarker in this manuscript. Instead, authors should consider AMPK activation and associated signaling in relation to redox balance. This should also be presented in Fig 7.
  
  We thank the reviewer for the critical comments. Accordingly, we have discussed AMPK signaling in the Discussion (Page 13, line 373-384) and added AMPK signaling to Fig.8A.
  
  References:
  
  1. R. A. Haeusler, K. H. Kaestner, D. Accili, FoxOs function synergistically to promote glucose production. J Biol Chem 285, 35245-35248 (2010).
  
  2. J. Nakae, T. Kitamura, D. L. Silver, D. Accili, The forkhead transcription factor Foxo1 (Fkhr) confers insulin sensitivity onto glucose-6-phosphatase expression. J Clin Invest 108, 1359-1367 (2001).
  
  3. M. McMahon, K. Itoh, M. Yamamoto, J. D. Hayes, Keap1-dependent proteasomal degradation of transcription factor Nrf2 contributes to the negative regulation of antioxidant response element-driven gene expression. J Biol Chem 278, 21592-21600 (2003).
  
  4. R. S. Arnold et al., Hydrogen peroxide mediates the cell growth and transformation caused by the mitogenic oxidase Nox1. Proc Natl Acad Sci U S A 98, 5550-5555 (2001).
  
  5. J. M. Lee, M. J. Calkins, K. Chan, Y. W. Kan, J. A. Johnson, Identification of the NF-E2-related factor-2-dependent genes conferring protection against oxidative stress in primary cortical astrocytes using oligonucleotide microarray analysis. J Biol Chem 278, 12029-12038 (2003).
  
  6. T. Jiang et al., The protective role of Nrf2 in streptozotocin-induced diabetic nephropathy. Diabetes 59, 850-860 (2010).
  
  7. X. H. Wang et al., High Fat Diet-Induced Hepatic 18-Carbon Fatty Acids Accumulation Up-Regulates CYP2A5/CYP2A6 via NF-E2-Related Factor 2. Front Pharmacol 8, 233 (2017).
  
  8. T. S. Liu et al., Oscillating high glucose enhances oxidative stress and apoptosis in human coronary artery endothelial cells. J Endocrinol Invest 37, 645-651 (2014).
  
  9. Z. Ungvari et al., Adaptive induction of NF-E2-related factor-2-driven antioxidant genes in endothelial cells in response to hyperglycemia. Am J Physiol Heart Circ Physiol 300, H1133-1140 (2011).
  
  10. Z. Radak et al., Exercise, oxidants, and antioxidants change the shape of the bell-shaped hormesis curve. Redox Biol 12, 285-290 (2017).
  
  11. M. Ristow et al., Antioxidants prevent health-promoting effects of physical exercise in humans. Proc Natl Acad Sci U S A 106, 8665-8670 (2009).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.15.491995v1
www.biorxiv.org www.biorxiv.org

New submission 05/10/2022, 12:08:58

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  In one of the most creative eDNA studies I have had the pleasure to review, the authors have taken advantage of an existing program several decades old to address whether insect declines are indeed occurring - an active area of discussion and debate within ecology. Here, they extracted arthropod environmental DNA (eDNA) from pulverized leaf samples collected from different tree species across different habitats. Their aim was to assess the arthropod community composition within the canopies of these trees during the time of collection to assess whether arthropod richness, diversity, and biomass were declining. By utilizing these leaf samples, the greatest shortcoming of assessing arthropod declines - the lack of historical data to compare to - was overcome, and strong timeseries evidence can now be used to inform the discussion. Through their use of eDNA metabarcoding, they were able to determine that richness was not declining, but there was evidence of beta diversity loss due to biotic homogenization occurring across different habitats. Furthermore, their application of qPCR to assess changes in eDNA copy number temporally and associate those changes with changes to arthropod biomass provided support to the argument that arthropod biomass is indeed declining. Taken together, these data add substantial weight to the current discussion regarding how arthropods are being affected in the Anthropocene.
  
  Thank you very much for the positive assessment of our work.
  
  I find the conclusions of the paper to be sound and mostly defensible, though there are some issues to take note of that may undermine these findings.
  
  Firstly, I saw no explanation of the requisite controls for such an experiment. An experiment of this scale should have detailed explanations of the field/equipment controls, extraction controls, and PCR controls to ensure there are no contamination issues that would otherwise undermine the entirety of the study. The presence of controls is mentioned just once in the manuscript, so I surmise they must exist. The results should be treated with caution until such evidence is clearly outlined. Furthermore, the plate layout which includes these controls would help assess the extent of tag-jumping, should the plate plan proposed in Taberlet et al., 2018 be adopted.
  
  Second, without the presence of adequate controls, filtering schemes would be unable to determine whether there were contaminants and also be unable to remove them. This would also prevent samples from being filtered out should there be excessive levels of contamination present. Without such information, it makes it difficult to fully trust the data as presented.
  
  Finally, there is insufficient detail regarding the decontamination procedures of equipment used to prepare the samples (e.g., the cryomill). Without clear explanations of the steps the authors took to ensure samples were handled and prepared correctly, there is yet more concern that there may be unseen problems with the dataset.
  
  We are well aware of the potential issues and consequences of contamination in our work. However, we are also confident that our field and laboratory procedures adequately rule out these issues. We agree with the reviewer that we should expand more on our reasoning. Hence, we have now significantly expanded the Methods section outlining controls and sample purity, particularly under “Tree samples of the German Environmental Specimen Bank – Standardized time series samples stored at ultra-low temperatures” (lines 303-304), “Test for DNA carryover in the cryomill” (lines 448-464) and “Statistical analysis” (lines 570-575).
  
  We ran negative control extractions as well as negative control PCRs with all samples. These controls were sequenced along with all samples and used to explore the effect of experimental contamination. With the exception of a few reads of abundant taxa, these controls were mostly clean. We report this in more detail now in the Methods under “Sequence analysis” (lines 570-575). This suggests that our data are free of experimental contamination or tag jumping issues.
  
  We have also expanded on the avoidance of contamination in our field sampling protocols. The ESB has been set up for monitoring even the tiniest trace amounts of chemicals. Carryover between samples would render the samples useless. Hence, highly clean and standardized protocols are implemented. All samples are only collected with sterilized equipment under sterile conditions. Each piece of equipment is thoroughly decontaminated before sampling.
  
  The cryomill is another potential source of cross-contamination. The mill is disassembled after each sample and thoroughly cleaned. Milled samples have already been tested for chemical carryover, and none was found. We have now added an additional analysis to rule out DNA carryover. We received the milling schedule of samples for the past years. Assuming samples get contaminated by carryover between milling runs, two consecutive samples should show signatures of this carryover. We tested this for single-taxon carryover as well as community-wide beta diversity, but did not find any signal of contamination. This gives us confidence that our samples are very pure. The results of this test are now reported in the manuscript (Suppl. Fig 12 & Suppl. Table 3).
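  The logic of the carryover test can be sketched in Python: if DNA were carried over between milling runs, samples milled one after the other should share more taxa than samples milled further apart. The sketch below is an illustration under stated assumptions, not the authors' analysis. It uses Jaccard dissimilarity on taxon presence as a stand-in for beta diversity, and the function names and data layout are hypothetical.

```python
from itertools import combinations
from statistics import mean


def jaccard_dissimilarity(a, b):
    """1 - |intersection| / |union| for two collections of taxon labels."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def carryover_contrast(milling_order, taxa_by_sample):
    """Mean dissimilarity of consecutively milled pairs vs. all other pairs.

    `milling_order` lists sample IDs in the order they were milled (at least
    three samples); `taxa_by_sample` maps each ID to its detected taxa.
    """
    consecutive, other = [], []
    for i, j in combinations(range(len(milling_order)), 2):
        d = jaccard_dissimilarity(
            taxa_by_sample[milling_order[i]], taxa_by_sample[milling_order[j]]
        )
        (consecutive if j == i + 1 else other).append(d)
    return mean(consecutive), mean(other)
```

  Consecutive pairs that are clearly less dissimilar than the remaining pairs would be the signature of carryover. A real analysis would also need a significance test (for example, a permutation of the milling order) and would have to account for site, tree species, and sampling year, since samples milled together may be similar for reasons unrelated to contamination.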
  
  Reviewer #2 (Public Review):
  
  Krehenwinkel et al. investigated the long-term temporal dynamics of arthropod communities using environmental DNA (eDNA) remaining in archived leaf samples. The authors first developed a method to recover arthropod eDNA from archived leaf samples and carefully tested whether the developed method could reasonably reveal the dynamics of arthropod communities where the leaf samples originated. Then, using the eDNA method, the authors analyzed 30-year-long well-archived tree leaf samples in Germany and reconstructed the long-term temporal dynamics of arthropod communities associated with the tree species. The reconstructed time series includes several thousand arthropod species belonging to 23 orders, and the authors found interesting patterns in the time series. Contrary to some previous studies, the authors did not find widespread temporal α-diversity (OTU richness and haplotype diversity) declines. Instead, β-diversity among study sites gradually decreased, suggesting that the arthropod communities are more spatially homogenized in recent years. Overall, the authors suggested that the temporal dynamics of arthropod communities may be complex and involve changes in α- and β-diversity and demonstrated the usefulness of their unique eDNA-based approach.
  
  Strengths:
  
  The authors' idea of using eDNA remaining in archived leaf samples is unique and potentially applicable to other systems. For example, different types of specimens archived in museums may be utilized for reconstructing long-term community dynamics of other organisms, which would be beneficial for understanding and predicting ecosystem dynamics.
  
  A great strength of this work is that the authors very carefully tested their method. For example, the authors tested the effects of powdered leaves input weights, sampling methods, storing methods, PCR primers, and days from last precipitation to sampling on the eDNA metabarcoding results. The results showed that the tested variables did not significantly impact the eDNA metabarcoding results, which convinced me that the proposed method reasonably recovers arthropod eDNA from the archived leaf samples. Furthermore, the authors developed a method that can separately quantify 18S DNA copy numbers of arthropods and plants, which enables the estimations of relative arthropod eDNA copy numbers. While most eDNA studies provide relative abundance only, the DNA copy numbers measured in this study provide valuable information on arthropod community dynamics.
  
  Overall, the authors' idea is excellent, and I believe that the developed eDNA methodology reasonably reconstructed the long-term temporal dynamics of the target organisms, which are major strengths of this study.
  
  Thank you very much for the positive assessment of our work.
  
  Weaknesses:
  
  Although this work has major strengths in the eDNA experimental part, there are concerns in DNA sequence processing and statistical analyses.
  
  Statistical methods to analyze the temporal trend are too simplistic. The methods used in the study did not consider possible autocorrelation and other structures that the eDNA time series might have. It is well known that the applications of simple linear models to time series with autocorrelation structure incorrectly detect a "significant" temporal trend. For example, a linear model can often detect a significant trend even in a random walk time series.
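The reviewer's point about random walks can be illustrated with a small simulation. The sketch below is not part of the manuscript's analysis; the series length of 30 points (loosely mirroring a 30-year series), the number of simulations and all function names are assumptions made for illustration. It fits an ordinary least-squares trend to simulated series that contain no true trend and counts how often the slope appears "significant" at the 5% level.

```python
import math
import random


def slope_t_statistic(y):
    """t-statistic of the OLS slope when y is regressed on time 0..n-1."""
    n = len(y)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return slope / standard_error


def white_noise(rng, n):
    """Independent observations: the case a simple linear model assumes."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(rng, n):
    """Strongly autocorrelated series with no deterministic trend."""
    level, series = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        series.append(level)
    return series


def false_positive_rate(make_series, n_sims=500, n_points=30,
                        t_crit=2.048, seed=1):
    """Share of trend-free series whose slope exceeds the critical value.

    t_crit is the two-sided 5% critical value of Student's t with
    n_points - 2 = 28 degrees of freedom.
    """
    rng = random.Random(seed)
    hits = sum(
        abs(slope_t_statistic(make_series(rng, n_points))) > t_crit
        for _ in range(n_sims)
    )
    return hits / n_sims


wn_rate = false_positive_rate(white_noise)
rw_rate = false_positive_rate(random_walk)
print(f"white noise: {wn_rate:.2f}, random walk: {rw_rate:.2f}")
```

For independent noise the rejection rate should stay near the nominal 5%, whereas for random walks it is far higher. This is why models with an explicit autocorrelation structure are preferable for such time series.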
  
  We have now reanalyzed our data controlling for autocorrelation and for non-linear changes of abundance and recover no change to our results. We have added this information to the manuscript under “Statistical analysis” (lines 629-644).
  
  Also, there are some issues regarding the DNA sequence analysis and the subsequent use of the results. For example, read abundance was used in the statistical model, but the read abundance cannot be a proxy for species abundance/biomass. Because the total 18S DNA copy numbers of arthropods were quantified in the study, multiplying the sequence-based relative abundance by the total 18S DNA copy numbers may produce a better proxy of the abundance of arthropods, and the use of such a better proxy would be more appropriate here. In addition, a coverage-based rarefaction enables a more rigorous comparison of diversity (OTU diversity or haplotype diversity) than the read-based rarefaction does.
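The scaling suggested here can be written down in a few lines. The following is a minimal sketch, assuming per-sample read counts per OTU and a single qPCR-derived total copy number for the same sample; the function name, OTU labels and numbers are hypothetical.

```python
def absolute_abundance_proxy(read_counts, total_copy_number):
    """Scale sequence-based relative abundance by total 18S copy number.

    read_counts: mapping of OTU name to read count in one sample.
    total_copy_number: qPCR estimate of arthropod 18S copies in that sample.
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        return {otu: 0.0 for otu in read_counts}
    return {
        otu: total_copy_number * reads / total_reads
        for otu, reads in read_counts.items()
    }


# Hypothetical sample: three OTUs and a total of 2000 copies.
sample_reads = {"OTU_1": 600, "OTU_2": 300, "OTU_3": 100}
proxy = absolute_abundance_proxy(sample_reads, total_copy_number=2000.0)
print(proxy)
```

The proxy inherits all biases of both measurements (primer bias in the read proportions, copy-number variation among taxa in the qPCR), so it is better read as a rough index than as a count of individuals.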
  
  We did not use read abundance as a proxy for abundance, but used our qPCR approach to measure relative copy number of arthropods. While there are biases to this (see our explanations above), the assay proved very reliable and robust. We thus believe it should indeed provide a rough estimate of biomass. As biomass is very commonly discussed in insect decline (in fact the first study on insect decline entirely relies on biomass; Hallmann et al. 2017), we feel it is important to include a proxy for this as well. However, we also discuss the alternative option that a turnover of diversity is affecting the measured biomass. A pattern of abundance loss for common species has been described in other works on insect decline.
  
  We liked the reviewer’s suggestion to use copy number information to perform abundance-informed rarefaction. We have done this now and added an additional analysis rarefying by copy number/biomass. A parallel analysis using this newly rarefied table was done for the total diversity as well as single species abundance change. Details can be found in the Methods and Results section of the manuscript. However, the result essentially remains the same. Even abundance-informed rarefaction does not lead to a pattern of loss of species richness over time (see “Statistical analysis”).
  
  The overall results are supporting a scenario of no overall loss of species richness over time, but a loss of abundance for common species. And we indeed see the pattern of declining abundance for once-common species in our data, for example the loss of the Green Silver-Line moth, once a very common species in beech canopy (Suppl. Fig. 10). We have added details on this to the Discussion (lines 254-260).
  
  These points may significantly impact the conclusions of this work.
  
  Reviewer #3 (Public Review):
  
  The aim of Weber and colleagues' study was to generate arthropod environmental DNA extracted from a unique 30-year time series of deep-frozen leaf material sampled at 24 German sites that represent four different land use types. Using this dataset, they explore how the arthropod community has changed through time in these sites, using both conventional metabarcoding to reconstruct the OTUs present, and a new qPCR assay developed to estimate the overall arthropod biomass on the collected material. Overall their results show that while no clear changes in alpha diversity are found, the β-diversity dropped significantly over time in many sites, most notably in the beech forests. Overall I believe their data supports these findings, and thus their conclusion that diversity is becoming homogenized through time is valid.
  
  Thank you for the positive assessment.
  
  While overall I do not doubt the general findings, I have a number of comments. Firstly while I agree this is a very nice study on a unique dataset - other temporal datasets of insects that were used for eDNA studies do exist, and perhaps it would be relevant to put the findings into context (or even the study design) of other work that has been done on such datasets. One example that jumps to my mind is Thomsen et al. 2015 https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2656.12452 but I am sure there are others.
  
  We have expanded the introduction and discussion on this citing this among other studies now (lines 71-72, 276-278).
  
  From a technical point of view, the conclusions of course rely on several assumptions, including (1) that the biomass assay is effective and (2) that the reconstructed levels of OTU diversity are accurate.
  
  With regard to biomass, although it is stated in the manuscript that "Relative eDNA copy number should be a predictor for relative biomass", this is in fact only true if one assumes a number of things, e.g. there is a similar copy number of 18S rDNA per species, similar numbers of mtDNA per cell, a similar number of cells per individual species etc. In this regard, on the positive side, it is gratifying to see that the authors perform a validation assay on 7 mock controls, and these seem to indicate the assay works well. Given how critical this is, I recommend discussing the details of this a bit more, and why the authors are convinced the assay is effective, in the main text so that the reader is able to fully decide if they are in agreement. However, on the negative side, I am concerned that the strategy taken to perform the qPCR may not have been ideal. Specifically, the assay is based on nested PCR, where the authors first perform a 15-cycle amplification; this product is purified, then put into a subsequent qPCR. Given both that PCR is notorious for introducing amplification biases in general (especially when performed on low levels of DNA) and that nested PCRs are notoriously contamination prone, this approach seems to be asking for trouble. This raises the question: why not just do the qPCR directly on the extracts (one can still dilute the plant DNA 100x prior to qPCR if needed)? Further, given the qPCRs were run in triplicate, I think the full data (Ct values) for this should be released (as opposed to just stating in the paper that the average values were used). In this way, the readers will be able to judge how replicable the assay was, something I think is critical given how noisy the patterns in Fig S10 seem to be.
  
  We agree with this point, and this is why we do not want to overstate the decline in copy number. This is an additional source of data next to genetic and species diversity. We have added to our discussion of turnover as another potential driver of copy number change (lines 257-260). We have also added text addressing the robustness of the mock community assay (lines 138-141).
  
  However, we are confident of the reliability and robustness of our qPCR assay for the detection of relative arthropod copy number. We performed several validations and optimizations before using the assay. We have added additional details to the manuscript on this (see “Detection of relative arthropod DNA copy number using quantitative PCR”, lines 548-556). We got the idea for the nested qPCR from a study (Tran et al.) showing its high accuracy and reproducibility. We show that our assay has a very high replicability using triplicates of each qPCR, which we will now include in the supplementary data on Dryad. The SD of Ct values is very low (~ 0.1 on average). NTC were run with all qPCRs to rule out contamination as an issue in the experiments. We also find a very high efficiency of the assay. At dilutions far outside the observed copy number in our actual leaf data, we still find the assay to be accurate. We found very comparable abundance changes across our highly taxonomically diverse mock communities. This also suggests that abundance changes are a more likely explanation than simple turnover for the observed drop in copy number. A biomass loss for common species is well in line with recent reports on insect decline. We can also rely on several other mock community studies (Krehenwinkel et al. 2017 & 2019) where we used read abundance of 18S and found it to be a relatively good predictor of relative biomass.
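For readers who want to check replicability and efficiency once the Ct values are released, the two quantities mentioned here can be computed as sketched below. The standard-curve relation E = 10^(-1/slope) - 1 is the conventional definition of qPCR amplification efficiency; the function names and all numbers are hypothetical and are not taken from the authors' data.

```python
import statistics


def ct_replicate_sd(ct_values):
    """Sample standard deviation of Ct values from technical replicates."""
    return statistics.stdev(ct_values)


def amplification_efficiency(log10_dilutions, mean_cts):
    """Efficiency from a standard curve of mean Ct against log10 dilution.

    Fits an OLS line and applies E = 10 ** (-1 / slope) - 1, so a slope of
    about -3.32 corresponds to an efficiency of about 1.0 (100%).
    """
    n = len(log10_dilutions)
    x_mean = sum(log10_dilutions) / n
    y_mean = sum(mean_cts) / n
    sxx = sum((x - x_mean) ** 2 for x in log10_dilutions)
    sxy = sum((x - x_mean) * (y - y_mean)
              for x, y in zip(log10_dilutions, mean_cts))
    slope = sxy / sxx
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical triplicate and a hypothetical ten-fold dilution series.
triplicate_sd = ct_replicate_sd([24.1, 24.2, 24.0])
efficiency = amplification_efficiency(
    [0.0, -1.0, -2.0, -3.0],
    [20.0, 23.32, 26.64, 29.96],
)
print(f"SD of triplicate: {triplicate_sd:.2f}, efficiency: {efficiency:.3f}")
```

A low replicate SD indicates technical repeatability only; it does not by itself address amplification bias introduced in the first, nested PCR step.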
  
  The pattern in Fig. S10 is not really noisy. It just reflects typical population fluctuations for arthropods. Most arthropod taxa undergo very pronounced temporal abundance fluctuations between years.
  
  Next, with regard to the observation that the results reveal an overall decrease in arthropod biomass over time: The authors suggest one alternative to their theory, that the dropping DNA copy number may reflect taxonomic turnover of species with different eDNA shedding rates. Could there be another potential explanation - could it simply be that leaves are getting denser/larger? Can this be ruled out in some way, e.g. via data on leaf mass through time for these trees? (From this dataset or indeed any other place).
  
  This is a very good point. However, we can rule out this hypothesis, as the ESB performs intensive biometric data analysis. The average leaf weight and water content have not significantly changed in our sites. We have addressed this in the Methods section (see ”Tree samples of the German Environmental Specimen Bank – Standardized time series samples stored at ultra-low temperatures”, lines 308-311).
  
  With regards to estimates of OTU/zOTU diversity. The authors state in the manuscript that zOTUs represent individual haplotypes, thus genetic variation within species. This is only true if they do not represent PCR and/or sequencing errors. Perhaps therefore they would be able to elaborate (for the non-computational/eDNA specialist reader) on why their sequence processing methods rule out this possibility? One very good bit of evidence would be that identical haplotypes for the individual species are found in the replicate PCRs. Or even between different extractions at single locations/timepoints.
  
  We have repeated the analysis of genetic variation with much more stringent filtering criteria (see “Statistical analysis”, lines 611-615). Among other filtering steps, this also includes the use of only those zOTUs that occur in both technical replicates, as suggested by the reviewer. Another reason to make us believe we are dealing with true haplotypic variation here is that haplotypes show geographic variation. E.g., some haplotypes are more abundant in some sites than in others. NUMTS would consistently show a simple correlation in their abundance with the most abundant true haplotype.
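The replicate-based filter described here can be sketched as a simple intersection. This is an illustration only, assuming two technical PCR replicates per sample stored as mappings from zOTU identifier to read count; the function name, the `min_reads` threshold and the example counts are hypothetical, and the manuscript's full pipeline includes further filtering steps.

```python
def zotus_in_both_replicates(replicate_a, replicate_b, min_reads=1):
    """Keep only zOTUs detected in both technical replicates.

    Returns the summed read count for each retained zOTU.
    """
    present_a = {z for z, reads in replicate_a.items() if reads >= min_reads}
    present_b = {z for z, reads in replicate_b.items() if reads >= min_reads}
    shared = present_a & present_b
    return {z: replicate_a[z] + replicate_b[z] for z in sorted(shared)}


# Hypothetical counts: zOTU_3 and zOTU_4 each appear in one replicate only,
# as a PCR or sequencing error typically would.
rep_a = {"zOTU_1": 120, "zOTU_2": 15, "zOTU_3": 2}
rep_b = {"zOTU_1": 98, "zOTU_2": 22, "zOTU_4": 1}
print(zotus_in_both_replicates(rep_a, rep_b))
```

Raising `min_reads` makes the filter stricter at the cost of discarding genuinely rare haplotypes.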
  
  With regard to the bigger picture, one thing I found very interesting from a technical point of view is that the authors explored how modifying the mass of plant material used in the extraction affects the overall results, and basically find that using more than 200mg provides no real advantage. In this regard, I draw the authors' and readers' attention to an excellent paper by Mata et al. (https://onlinelibrary.wiley.com/doi/full/10.1111/mec.14779) - where these authors compare the effect of increasing the amount of bat faeces used in a bat diet metabarcoding study on the OTUs generated. Essentially Mata and colleagues report that as the amount of faeces increases, the rare taxa (e.g. those found at a low level in a single faeces) get lost - they are simply diluted out by the common taxa (e.g. those in all faeces). In contrast, increasing biological replicates (in their case more individual faecal samples) increased diversity. I think these results are relevant in the context of the experiment described in this new manuscript, as they seem to show similar results - there is no benefit of considerably increasing the amount of leaf tissue used. And if so, this seems to point to a general principle of relevance to the design of metabarcoding studies, thus of likely wide interest.
  
  Thank you for this interesting study, which we were not aware of before. The cryomilling is an extremely efficient approach to equally disperse even traces of chemicals in a sample. This has been established for trace chemicals early during the operation of the ESB, but also seems to hold true for eDNA in the samples. We have recently done more replication experiments from different ESB samples (different terrestrial and marine samples for different taxonomic groups) and find that replication of extraction does not provide much more benefit than replication of PCR. Even after 2 replicates, diversity approaches saturation. This can be seen in the plot below, which shows recovered eDNA diversity for different ESB samples and different taxonomic groups from 1-4 replicates. A single extract of a small volume contains DNA from nearly all taxa in the community. Rare taxa can be enriched with more PCR replicates.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.27.489699v1
www.biorxiv.org www.biorxiv.org

New submission 06/08/2023, 14:14:29

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author response
  
  Reviewer #1 (Public Review):
  
  This careful study reports the importance of Rab12 for Parkinson's disease associated LRRK2 kinase activity in cells. The authors carried out a targeted siRNA screen of Rab substrates and found lower pRab10 levels in cells depleted of Rab12. It has previously been reported that LLOMe treatment of cells breaks lysosomes and with time, leads to major activation of LRRK2 kinase. Here they show that LLOMe-induced kinase activation requires Rab12 and does not require Rab12 phosphorylation to show the effect.
  
  We thank the reviewer for their comments regarding the carefulness and importance of our work and for their specific feedback which has substantially improved our revised manuscript.
  
  1) Throughout the text, the authors claim that "Rab12 is required for LRRK2 dependent phosphorylation" (Page 4 line 78; Page 9 line 153; Page 22 line 421). This is not correct according to Figure 1 Figure Supp 1B - there is still pRab10. It is correct only in relation to the LLOMe activation. Please correct this error.
  
  We appreciate the reviewer’s comment around the requirement of Rab12 for LRRK2-dependent phosphorylation of Rab10 and question regarding whether this is relevant under baseline conditions or only in relation to LLOMe activation. Using our MSD-based assay to quantify pT73 Rab10 levels under basal conditions, we observed a similar reduction in Rab10 phosphorylation when we knock down Rab12 as we observed with LRRK2 knockdown (Figure 1A). Further, we see a comparable reduction in Rab10 phosphorylation in RAB12 KO cells as that observed in LRRK2 KO cells using our MSD-based assay (Figure 2A and B). Based on these data, we believe Rab12 is a key regulator of LRRK2 activation under basal conditions without additional lysosomal damage. However, as the reviewer noted, we do observe some residual Rab10 phosphorylation upon Rab12 knockdown when assessed by western blot analysis (Figure 1D and Figure 1- figure supplement 1). A similar signal is observed upon LRRK2 knockdown, which may suggest that some small amount of Rab10 phosphorylation may be mediated by another kinase in this cell model. Nevertheless, we appreciate this reviewer’s point and have therefore modified the text to remove any reference to Rab12 being required for LRRK2-dependent Rab phosphorylation and now instead refer to Rab12 as a regulator of LRRK2 activity.
  
  As noted by the reviewer, our data does suggest that Rab12 is required for the increase in Rab10 phosphorylation observed following LLOMe treatment to elicit lysosomal damage, and we now refer to this appropriately throughout the text.
  
  2) The authors conclude that Rab12 recruitment precedes that of LRRK2 but the rate of recruitment (slopes of curves in 3F and G) is actually faster for LRRK2 than for Rab12, with no proof that Rab12 is faster. Please modify the text; it looks more like coordinated recruitment.
  
  The reviewer raises an excellent point regarding our ability to delineate whether Rab12 recruitment precedes that of LRRK2 on lysosomes following LLOMe treatment. As noted by the reviewer, we do see both the recruitment of Rab12 and LRRK2 to lysosomes increase on a similar timescale, so we cannot truly resolve whether Rab12 recruitment precedes LRRK2 recruitment in our studies. Based on this, we have modified the text to emphasize that this data supports coordinated recruitment, as suggested, and we have further removed any mention of Rab12 preceding LRRK2. The specific change is as follows “Rab12 colocalization with LRRK2 increased over time following LLOMe treatment, supporting potential coordinated recruitment of these proteins to lysosomes upon damage (Figure 3I). Together, these data demonstrate that Rab12 and LRRK2 both associate with lysosomes following membrane rupture.” and can be found on lines 460-463 of the updated manuscript.
  
  3) The title is misleading because the authors do not show that Rab12 promotes LRRK2 membrane association. This would require Rab12 to be sufficient to localize LRRK2 to a mislocalized Rab12. The authors DO show that Rab12 is needed for the massive LLOMe activation at lysosomes. Please re-word the title.
  
  To address the reviewer’s concern regarding the title of our manuscript, we have modified the title from “Rab12 regulates LRRK2 activity by promoting its localization to lysosomes” to “Rab12 regulates LRRK2 activity by facilitating its localization to lysosomes” to soften the language around the sufficiency of Rab12 in regulating the localization of LRRK2 to lysosomes. We show that Rab12 deletion significantly reduces LRRK2 activity (as assessed by Rab10 phosphorylation on lysosomes) and significantly reduces the localization of LRRK2 to lysosomes upon lysosomal damage. The updated title better reflects the regulatory role of Rab12 in modulating LRRK2 activity, and we thank the reviewer for their suggestion to modify this accordingly.
  
  Reviewer #2 (Public Review):
  
  This study shows that rab12 has a role in the phosphorylation of rab10 by LRRK2. Many publications have previously focused on the phosphorylation targets of LRRK2 and the significance of many remains unclear, but the study of LRRK2 activation has mostly focused on the role of disease-associated mutations (in LRRK2 and VPS35) and rab29. The work is performed entirely in an alveolar lung cell line, limiting relevance for the nervous system. Nonetheless, the authors take advantage of this simplified system to explore the mechanism by which rab12 activates LRRK2. In general, the work is performed very carefully with appropriate controls, excluding trivial explanations for the results, but there are several serious problems with the experiments and in particular the interpretation.
  
  We appreciate the reviewer’s comments regarding the rigor of our work and the potential impact of our studies to address a key unanswered question in the field regarding the mechanisms by which LRRK2 activation is mediated. Our studies focused on the A549 cell model given its high endogenous expression of LRRK2 and Rab10, and this cell line provided a simple system to investigate the mechanism and impact of Rab12-dependent regulation of LRRK2 activity. We agree with the reviewer that future studies are warranted to understand whether similar Rab12-dependent regulation of LRRK2 occurs in relevant CNS cell types.
  
  First, the authors note that rab29 appears to have a smaller or no effect when knocked down in these cells. However, the quantitation (Fig1-S1A) shows a much less significant knockdown of rab29 than rab12, so it would be important to repeat this with better knockdown or preferably a KO (by CRISPR) before making this conclusion. And the relationship to rab29 is important, so if a better KD or KO shows an effect, it would be important to assess by knocking down rab12 in the rab29 KO background.
  
  The reviewer raises a good point regarding the importance of confirming that loss of Rab29 has no effect on Rab10 phosphorylation. To address potential concerns about insufficient Rab29 knockdown, we measured the levels of pT73 Rab10 in RAB29 KO A549 cells by MSD-based analysis. RAB29 deletion had no effect on Rab10 phosphorylation, confirming findings from our RAB siRNA screen and the observations of Dario Alessi’s group reported previously (Kalogeropulou et al Biochem J 2020; PMID: 33135724). We have included this new data into our updated manuscript in Figure 1- figure supplement 1 and comment on it on page 6 in the updated Results section.
  
  Secondly, the knockdown of rab12 generally has a strong effect on the phosphorylation of the LRRK2 substrate rab10 but I could not find an experiment that shows whether rab12 has any effect on the residual phosphorylation of rab10 in the LRRK2 KO. There is not much phosphorylation left in the absence of LRRK2 but maybe this depends on rab12 just as much as in cells with LRRK2 and rab12 is operating independently of LRRK2, either through a different kinase or simply by making rab10 more available for phosphorylation. The epistasis experiment is crucial to address this possibility. To establish the connection to LRRK2, it would also help to compare the effect of rab12 KD on the phosphorylation of selected rabs that do or do not depend on LRRK2.
  
  The reviewer raises an interesting question regarding whether Rab12 knockdown can further reduce Rab10 phosphorylation independently of LRRK2. Using our quantitative MSD-based assay, we observe that pRab10 levels are at the lower limits of detection of the assay in LRRK2 KO A549 cells. Unfortunately, this means that we are unable to detect whether there might be any additional minor reduction in Rab10 phosphorylation with Rab12 knockdown in LRRK2 KO cells. We cannot rule out that Rab12 may play a LRRK2-independent role in regulating Rab10 phosphorylation in other cell lines, and future studies are warranted to explore whether Rab12 knockdown can further reduce Rab10 phosphorylation in other systems, including in CNS cells.
  
  Regarding exploring the effects of RAB12 knockdown on the phosphorylation of other Rabs, we also assessed the impact of RAB12 KO on phosphorylation of another LRRK2-Rab substrate, Rab8a. We observed a strong reduction in pT72 Rab8a levels in RAB12 KO cells compared to wildtype cells, suggesting the impact of RAB12 deletion extends beyond Rab10 (see representative western blot in Author response image 1). Due to potential concerns with the selectivity of the pT72 Rab8a antibody (potentially detecting the phosphorylation of other LRRK2-Rabs), we cannot definitively demonstrate that Rab12 mediates the phosphorylation of other Rabs. This question should be revisited when additional phospho-Rab antibodies become available that enable us to selectively detect LRRK2-dependent phosphorylation of additional Rab substrates under endogenous expression conditions.
  
  Author response image 1.
  
  A strength of the work is the demonstration of p-rab10 recruitment to lysosomes by biochemistry and imaging. The demonstration that LRRK2 is required for this by biochemistry (Fig 4A) is very important but it would also be good to determine whether the requirement for LRRK2 extends to imaging. In support of a causal relationship, the authors also state that lysosomal accumulation of rab12 precedes LRRK2 but the data do not show this. Imaging with and without LRRK2 would provide more compelling evidence for a causative role.
  
  We thank the reviewer for their suggestion to assess Rab12 recruitment to damaged lysosomes with and without LRRK2 using imaging-based analyses to add confidence to our findings from biochemical approaches. To address this comment, we have imaged the recruitment of mCherry-tagged Rab12 to lysosomes (as assessed using an antibody against endogenous LAMP1) and observed a significant increase in Rab12 levels on lysosomes following LLOMe treatment. This occurs to a similar extent in LRRK2 KO A549 cells, suggesting that Rab12 is an upstream regulator of LRRK2 activity. This new data has been incorporated into the revised manuscript (Figure 3E) and is presented on page 20 of the updated manuscript.
  
  Our conclusions on this are further strengthened by new data assessing Rab12 recruitment to lysosomes using orthogonal analysis of isolated lysosomes biochemically. Using the Lyso-IP method, we observed a strong increase in the levels of Rab12 on lysosomes following LLOMe treatment that was maintained in LRRK2 KO cells. These data have been added to the updated manuscript (new data added to Figure 3- figure supplement 1).
  
  Together, these data support our hypothesis that Rab12 recruitment to damaged lysosomes is upstream, and independent, of LRRK2.
  
  The authors also touch base with PD mutations, showing that loss of rab12 reduces the phosphorylation of rab10. However, it is interesting that loss of rab12 has the same effect with R1441G LRRK2 and D620N VPS35 as it does in controls. This suggests that the effect of rab12 does not depend on the extent of LRRK2 activation. It is also surprising that R1441G LRRK2 does not increase p-rab10 phosphorylation (Fig 2G) as suggested in the literature and stated in the text.
  
  We agree with the reviewer that it is quite interesting that RAB12 knockdown significantly attenuates Rab10 phosphorylation in the context of PD-linked variants in addition to that observed in wildtype cells basally and after LLOMe treatment. As noted by the reviewer, we did not observe increased levels of phospho-Rab10 in LRRK2 R1441G KI A549 cells at the whole cell level (Figure 2G). However, we observed a significant increase in Rab10 phosphorylation on isolated lysosomes from LRRK2 R1441G KI cells compared to WT cells (Figure 4B). This may suggest that the LRRK2 R1441G variant leads to a more modest increase in LRRK2 activity in this cell model. Previous studies in MEFs from LRRK2 R1441G KI mice or neutrophils from human subjects that carry the LRRK2 R1441G variant showed a 3-4 fold increase in Rab10 phosphorylation (Fan et al Acta Neuropathol 2021 PMID: 34125248 and Karayel et al Mol Cell Proteomics 2020 PMID: 32601174), supporting that this variant does lead to increased Rab10 phosphorylation and that the extent of LRRK2 activation may vary across different cell types.
  
  Most important, the final figure suggests that PD-associated mutations in LRRK2 and VPS35 occlude the effect of lysosomal disruption on lysosomal recruitment of LRRK2 (Fig 4D) but do not impair the phosphorylation of rab10 also triggered by lysosomal disruption (4A-C). Phosphorylation of this target thus appears to be regulated independently of LRRK2 recruitment to the lysosome, suggesting another level of control (perhaps of kinase activity rather than localization) that has not been considered.
  
  The reviewer suggests an interesting hypothesis around the existence of additional levels of control beyond the lysosomal levels of LRRK2 to lead to increased Rab10 phosphorylation on lysosomes. Given the variability we have observed in measuring endogenous LRRK2 levels on lysosomes, we performed two additional replicates to assess lysosomal LRRK2 levels in LRRK2 R1441G KI and VPS35 D620N KI cells at baseline and after treatment with LLOMe. We observed a significant increase in LRRK2 levels on lysosomes in cells expressing either PD-linked variant and a trend toward a further increase in the levels of LRRK2 on lysosomes after LLOMe treatment in these cells (Figure 4D in the updated manuscript). We have updated the text on page 24 to reflect this change, suggesting that the PD-linked variants do not fully occlude the effect of lysosomal disruption on the lysosomal recruitment of LRRK2.
  
  LLOMe treatment leads to a stronger increase in Rab10 phosphorylation on lysosomes from LRRK2 R1441G and VPS35 D620N cells compared to the modest increase in LRRK2 levels observed. This could suggest that, as the reviewer noted, additional mechanisms beyond increased lysosomal localization of LRRK2 may be driving the robust increase in Rab10 phosphorylation observed. We have modified the results section on lines 548-551 to highlight this possibility: “Rab10 phosphorylation showed a more significant increase in response to LLOMe treatment than LRRK2 on lysosomes from LRRK2 R1441G and VPS35 D620N KI cells, suggesting that there may be more regulation beyond the enhanced proximity between LRRK2 and Rab that contribute to LRRK2 activation in response to lysosomal damage.”
  
  Reviewer #3 (Public Review):
  
  Increased LRRK2 kinase activity is known to confer Parkinson's disease risk. While much is known about disease-causing LRRK2 mutations that increase LRRK2 kinase activity, the normal cellular mechanisms of LRRK2 activation are less well understood. Rab GTPases are known to play a role in LRRK2 activation and to be substrates for the kinase activity of LRRK2. However, much of the data on Rabs in LRRK2 activation comes from over-expression studies and the contributions of endogenously expressed Rabs to LRRK2 activation are less clear. To address this problem, Bondar and colleagues tested the impact of systematically depleting candidate Rab GTPases on LRRK2 activity as measured by its ability to phosphorylate Rab10 in the human A549 type 2 pneumocyte cell line. This resulted in the identification of a major role for Rab12 in controlling LRRK2 activity towards Rab10 in this model system. Follow-up studies show that this role for Rab12 is of particular importance for the phosphorylation of Rab10 by LRRK2 at damaged lysosomes. Increases in LRRK2 activity in cells harboring disease-causing mutants of LRRK2 and VPS35 also depend (at least partially) on Rab12. Confidence in the role of Rab12 in supporting LRRK2 activity is strengthened by parallel experiments showing that siRNA-mediated depletion of Rab12 and CRISPR-mediated Rab12 KO have similar effects on LRRK2 activity. Collectively, these results demonstrate a novel role for Rab12 in supporting LRRK2 activation in A549 cells. It is likely that this effect is generalizable to other cell types. However, this remains to be established. It is also likely that lysosomes are the subcellular site where Rab12-dependent activation of LRRK2 occurs. Independent validation with additional experiments would strengthen these conclusions and help to address some concerns that much of the data supporting a lysosome localization for Rab12-dependent activation of LRRK2 comes from a single method (LysoIP). Furthermore, there is a discrepancy between panel 4A versus 4D in the effect of LLOMe-induced lysosome damage on LRRK2 recruitment to lysosomes that will need to be addressed to strengthen confidence in conclusions about lysosomes as sites of LRRK2 activation by Rab12.
  
  We thank the reviewer for their comments regarding our work that identifies Rab12 as a novel regulator of LRRK2 activation and the appreciation of the parallel approaches we employed to add confidence in this effect.
  
  As suggested by the reviewer, we have updated our manuscript to now include independent validation of our conclusions using imaging-based analyses to complement our data from biochemical analyses using the Lyso-IP method. Specifically, we have included new imaging data that confirms that Rab12 levels are increased on lysosomes following membrane permeabilization with LLOMe treatment and demonstrates that this occurs independent of LRRK2, providing additional support that Rab12 is an upstream regulator of LRRK2 activity (Figure 3E in the updated manuscript).
  
  Regarding the reviewer’s comment on a discrepancy between our findings in Figure 4A and Figure 4D, we have performed additional independent replicates in Figure 4D to assess the impact of lysosomal damage on the lysosomal levels of LRRK2 at baseline or upon the expression of genetic variants. We observed a significant increase in LRRK2 levels on lysosomes following LLOMe treatment in our set of experiments included in Figure 4A and a non-significant trend toward an increase in LRRK2 levels on isolated lysosomes in Figure 4D. As described in more detail below (in response to the second point raised by this reviewer), we think this variability arises because of a combination of low levels of LRRK2 on lysosomes with endogenous expression and variability across experiments in the efficiency of lysosomal isolation. Our observations of increased recruitment of LRRK2 to lysosomes upon damage are further supported by parallel imaging-based studies (Figure 3F-I) and are consistent with previous studies using overexpression systems.
  
  We thank the reviewer for all of the suggestions which have added further confidence to our conclusions and substantially improved the manuscript.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.02.21.529466v1
www.biorxiv.org

Handling of intracellular K+ determines voltage dependence of plasmalemmal monoamine transporter function

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  This work is aiming at the characterization of the molecular and kinetic mechanism of how three members of the SLC6 family of transporters, namely for dopamine (DAT), norepinephrine (NET), and serotonin (SERT), transport substrate across the membrane, and how the transport process is affected by cations. The authors use electrophysiology and sophisticated rapid solution exchange methods, in conjunction with fluorescence recordings from single cells, to correlate flux (from fluorescence) with electrical activity (from currents).
  
  The strength of the methods is based on the application of a kinetic method with high time resolution, allowing the isolation of fast processes in the transport mechanism, and their modeling using a kinetic multistep scheme. Particularly useful is the combination with fluorescence recording from single cells, which allows the authors to measure flux and current in the same cell under voltage clamp conditions. This is an elegant approach to get information on the voltage dependence of substrate flux, which is difficult to obtain with other methods. As to the strength of the results, the data are generally of high quality, showing the kinetic and mechanistic similarities and differences between the three transporters under observation. Another strength is that the results are quantitatively represented by kinetic simulations, which appear to fit the experimental data well.
  
  The major weakness of the research is related to interpretation of the experimental results. While the authors propose a unified K+ interaction mechanism for the three transporters, DAT, NET and SERT, the proposed K+ association/dissociation mechanism is 1) highly unusual, and 2) not unique in the ability to explain the experimental data. As to point 1), the DAT mechanism (Fig. 7A) proposes a sequence of intracellular K+ association and dissociation steps. Since the intracellular [K+] remains constant, such a sequence requires a change of affinity for K+, which is initially high when K+ associates (33 microM according to the provided rate constants) and then has to be low for K+ dissociation (3.3 mM). Such an affinity change requires input of free energy, to promote K+ dissociation. From the provided rate constants and at room temperature this free energy change can be approximated as 11.4 kJ/mol. This is a large energy amount, in fact larger than what is stored in the physiological concentration gradient for one Na+ ion as a driving force for transport. It appears that the transporter would waste a lot of energy for no apparent benefit, with a futile K+ association/dissociation cycle, that would just generate heat.
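
  The reviewer's estimate can be reproduced with a short calculation. The sketch below assumes that the free energy needed to shift a dissociation constant from Kd1 to Kd2 is RT ln(Kd2/Kd1), and that "room temperature" means 298.15 K. The extracellular Na+ concentration of 150 mM is an illustrative assumption that does not come from the text; the intracellular value of about 10 mM is the one quoted later in the author response. Only the chemical (concentration) part of the Na+ gradient is computed, and the function name is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
T = 298.15        # assumed room temperature in kelvin (25 degrees C)

def affinity_shift_energy(kd_initial_molar, kd_final_molar, temperature=T):
    """Free energy (J/mol) required to change a dissociation constant
    from kd_initial_molar to kd_final_molar at the given temperature."""
    return R * temperature * math.log(kd_final_molar / kd_initial_molar)

# K+ affinity quoted by the reviewer: 33 microM on binding, 3.3 mM on release
dg_k_shift = affinity_shift_energy(33e-6, 3.3e-3)

# Chemical energy of the Na+ concentration gradient (no voltage term).
# 150 mM outside is an assumption; ~10 mM inside is taken from the response.
na_out_molar = 150e-3
na_in_molar = 10e-3
dg_na_gradient = R * T * math.log(na_out_molar / na_in_molar)

print(f"K+ affinity shift:   {dg_k_shift / 1000:.1f} kJ/mol")
print(f"Na+ gradient (chem): {dg_na_gradient / 1000:.1f} kJ/mol")
```

  Under these assumptions the affinity shift comes out at about 11.4 kJ/mol, the figure given by the reviewer, and exceeds the concentration-gradient term for one Na+ ion.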
  
  Therefore, while the authors have achieved their aim of quantitatively assessing transporter function and thoroughly describing it with a kinetic mechanism, their final proposed mechanism does not support all of the conclusions because it is far from unique in being able to explain the data (point 2 above). While this may be true for other transport mechanisms proposed in the past, the mechanism proposed here is somewhat odd with respect to energy requirements. Thus, it would require extraordinary experimental proof to propose it to the exclusion of other, maybe more plausible mechanisms.
  
  Despite these shortcomings, the potential impact of the work is high, because a unifying theory of cation interaction and stoichiometry of the monoamine transporter members of the SLC6 family has been missing in the literature. In addition, the elegant method of combining single cell electrophysiology and fluorescence flux measurements is impactful, especially in the whole cell recording method, allowing the control of intracellular ionic composition.
  
  We thank reviewer 1 for his comments on the kinetic modelling. We do not claim that the mechanism, which we propose, is unique in its ability to explain the data. However, we should like to argue that the proposed mechanism is plausible and parsimonious. We, much like reviewer 1, initially asked the question, whether a mechanism requiring an ion such as potassium to associate and subsequently dissociate from the same side of the transporter was energetically feasible. In fact, one of the main reasons for employing kinetic models was to address this specific issue.
  
  If detailed balance in a kinetic model is maintained (i.e., the product of the rates in the forward direction of a loop equals the product of the rates in the reverse direction), the model is energetically sound (i.e., such a model does not violate the laws of thermodynamics). It is true that for a spontaneous reaction to occur, the Gibbs free energy has to be negative. In a multistep process, however, this consideration only pertains to the “initial” and the “final” state. As long as the Gibbs free energy between these two states is negative the reaction will proceed, even if the Gibbs free energy between “intermediate” states is positive. This point is illustrated in the schemes below.
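
  The detailed-balance condition described above can be checked numerically for any closed loop of a kinetic scheme. The following is a minimal sketch with made-up rate constants for a hypothetical four-state loop; none of the numbers are taken from the manuscript's model. It assumes that concentration-dependent steps are expressed as pseudo-first-order rates, so that the net free energy for one forward turn of the loop is -RT ln(product of forward rates / product of reverse rates).

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
T = 298.15        # assumed temperature in kelvin

def loop_rate_products(forward_rates, reverse_rates):
    """Return the products of the forward and reverse rates around a loop."""
    return math.prod(forward_rates), math.prod(reverse_rates)

def satisfies_detailed_balance(forward_rates, reverse_rates, rel_tol=1e-9):
    """True if the forward and reverse rate products agree within rel_tol."""
    fwd, rev = loop_rate_products(forward_rates, reverse_rates)
    return math.isclose(fwd, rev, rel_tol=rel_tol)

def loop_free_energy(forward_rates, reverse_rates, temperature=T):
    """Net Gibbs free energy (J/mol) for one forward turn of the loop."""
    fwd, rev = loop_rate_products(forward_rates, reverse_rates)
    return -R * temperature * math.log(fwd / rev)

# Hypothetical four-state loop (rates in 1/s); values are illustrative only.
forward_eq = [100.0, 50.0, 20.0, 10.0]
reverse_eq = [200.0, 25.0, 40.0, 5.0]

# Same loop with the first step made ten times faster, as would happen if a
# ligand concentration on the binding side were raised tenfold.
forward_driven = [1000.0, 50.0, 20.0, 10.0]

print(satisfies_detailed_balance(forward_eq, reverse_eq))
print(loop_free_energy(forward_driven, reverse_eq))
```

  In the first case the two products are equal and the loop is at equilibrium. In the second the net free energy is negative, so the loop runs forward even if individual steps within it are uphill.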
  
  Scheme (A) maps out the Gibbs free energy of the outer loop of the kinetic model of DAT (i.e., this path describes the conformational trajectory that the transporter takes in the presence of intracellular K+; see the scheme in Fig. 7A of the manuscript). For calculating the Gibbs free energy of this loop, we assumed a pre-equilibrium condition (i.e., an extracellular and intracellular substrate concentration that we arbitrarily set to 10 μM and 100 nM, respectively) and the membrane voltage as 0 mV. As shown in the scheme, the Gibbs free energy between the “initial-left” and the “final-right” state is negative. Accordingly, the multistep reaction can proceed spontaneously.
  
  In scheme (B), we mapped out the Gibbs free energy for the same path and the same pre-equilibrium condition as shown in scheme (A); the only difference is that the membrane potential was now assumed to be -60 mV. This is to show that voltage is also a determining factor of the extent by which the Gibbs free energy changes.
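
  The size of the voltage contribution can be illustrated with the standard expression for electrical work, z·F·V, for moving a charge z from the outside (taken as 0 V) to the inside at membrane potential V. The sketch below assumes, purely for illustration, that one net positive elementary charge moves inward per cycle; the text does not state how the voltage term enters the authors' model, and the function name is hypothetical.

```python
F = 96485.33212   # Faraday constant, C/mol

def electrical_work(charge_number, membrane_potential_volts):
    """Electrical free energy (J/mol) for moving charge_number elementary
    charges from outside (0 V) to inside (membrane_potential_volts)."""
    return charge_number * F * membrane_potential_volts

# Assumed: one net positive charge carried inward per transport cycle.
dg_at_0_mV = electrical_work(+1, 0.0)
dg_at_minus_60_mV = electrical_work(+1, -0.060)

print(f"0 mV:   {dg_at_0_mV / 1000:.2f} kJ/mol")
print(f"-60 mV: {dg_at_minus_60_mV / 1000:.2f} kJ/mol")
```

  Under this assumption, hyperpolarising the membrane from 0 mV to -60 mV makes inward movement of each net positive charge more favourable by several kJ/mol, which is of the same order as the concentration-gradient terms discussed by the reviewer.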
  
  In Scheme (C), we mapped out the Gibbs free energy at equilibrium (the difference in Gibbs free energy between the “initial” and the “final” state is zero). This condition is met when the intracellular substrate concentration is 155 μM. At this intracellular substrate concentration, the energy stored in the substrate gradient notably matches exactly the energy of the Na+ gradient. The model therefore predicts that no energy is dissipated as heat, an observation that is in contrast to the concern raised by reviewer 1. We admit that the model can be criticized on this ground, because arguably, a realistic process is expected to dissipate energy as heat even if it involves a microscopic system (as is the case here). Determination of how much heat is generated in a transport cycle is, however, beyond the scope of the present manuscript and warrants a detailed study. In such a study, one could investigate if any heat loss generated can be compensated by, for instance, the occasional antiport of K+ by DAT, which, as we point out in the discussion, is possible. In this context, we stress that the energetic costs would have been much higher, if we had assumed non-obligatory antiport of K+ through DAT. Such a mechanism predicts that the K+ gradient is constantly dissipated in the absence of the substrate, which would indeed create the futile heat loss reviewer 1 is concerned about.
  
  An alternative hypothesis for the actions of intracellular K+ on the DAT transport cycle would be to propose the presence of a regulatory K+ binding site. We are reluctant to assume this mechanism for the simple reason that there is little evidence for such sites from the available crystal structures. The view that K+ binds to the Na2 site in DAT, NET and SERT is consistent with our data (see Fig. 5). These observations are supported by a previous study showing that K+ can bind to the Na2 site in DAT, as determined by extensive molecular dynamics simulations (Razavi et al., 2017, cited in the manuscript). By its very nature, the Na2 site cannot serve as a regulatory K+ binding site; for the transporter to proceed in the transport cycle, K+ must at some point dissociate from the Na2 site.
  
  On further scrutiny of our model for DAT, NET and SERT, we noticed that the extracellular and intracellular affinities for Na+ were set too high. We regret this oversight, which arose because we had only simulated experiments in which the intracellular Na+ concentration had been zero. The selected Na+ affinities would not have allowed the transporter to function properly at a physiological intracellular Na+ concentration (which is ~10 mM). We have now rectified this problem by lowering the inner and outer Na+ affinity by a factor of 10. In Fig. 7 of the main manuscript and supplementary figure 6, we have now replaced all previous simulations of the three transporters with the predictions of the newly amended model. As seen, the model with the changed Na+ binding parameters can still account for the key findings of this study.
  
  Reviewer #2:
  
  Bhat et al. study transport mechanism of three members of the SLC6 family, i.e. DAT, NET and SERT, using a combination of cellular electrophysiology, fluorescence measurements - taking advantage of a fluorescent substrate (APP+) that can be transported by each of these different transporters - and kinetic modelling. They find that DAT, NET and SERT differ in intracellular K+ binding. In DAT and NET, intracellular K+ binding is transient, resulting in voltage-dependent transport. In contrast, SERT transports K+, and the addition of a charged substrate to the transport cycle makes serotonin transport voltage-independent.
  
  This is an extremely nice and interesting manuscript, based on a series of beautifully designed and executed experiments that are convincingly analyzed via a kinetic model. I have only some suggestions:
  
  1) Fig. 4: I find the description of Fig. 4 extremely difficult to understand. In clear contrast to the introductory sentence "Previous studies showed that Kin+ was antiported by SERT, but not by NET or DAT (Rudnick & Nelson, 1978; Gu et al., 1996; Erreger et al., 2008)", SERT appears to be able to transport APP+ without K+ in Fig. 4. I was trying to understand this obvious discrepancy for a long time, until I found the authors coming back to this point in the discussion: "However steady-state assessment of transporter mediated substrate uptake is hindered by the fact that all three monoamine transporters can also transport substrate in the absence of Kin+". This is a little late, and the authors should address this point more explicitly in the Results section, close to the description of Fig. 4.
  
  We agree with reviewer 2’s comments pertaining to the SERT data represented in Fig.4C. The observations made from this dataset seem confusing in the absence of any relevant context. We have added the following statements to clarify any discrepancy arising from Fig. 4 (lines 266-273): “Owing to the instrumental role of Kin+ in the catalytic cycle of SERT, the observed lack of difference in APP+ uptake profiles by SERT-expressing cells in the presence or absence of Kin+ seem contradictory. This discrepancy can be explained as follows: 1) SERT can alternatively antiport protons to complete its catalytic cycle (Keyes and Rudnick, 1982; Hasenhuetl et al. 2016) and 2) APP+ is a poor SERT substrate (as determined by lack of APP+ induced steady state currents, Fig. 2F and 3F) that may be shuttled into SERT-expressing cells at rates slower than the rate limiting isomerization of SERT from inward open to outward open state.”
  
  2) Throughout the whole manuscript I am missing statistical details in comparisons.
  
  Statistical details for comparisons, which were done on some data sets in Fig. 4, Fig.5 and Fig.6, have now been incorporated in the manuscript text.
  
  3) Since APP+ might also only bind to the transporter or even only bind to the cell membrane, the authors might want to look at how the time course of the cellular APP+ signal depends on the size of the cells or on the ratio of transport currents and capacitance. It is of course possible that the tested cells do not differ sufficiently in size to permit such comparison. The authors should at least comment on this possibility.
  
  We are working with monoclonal lines. Thus, the differences in cell size are small (between 25 and 30 pF). In the new supplementary figure 1, we show that our (previously held) conjecture that the fast component represents membrane binding was wrong. In fact, analysis of the APP+ fluorescence in control cells (supplementary figure 1D) suggests that APP+ adherence to the plasma membrane does not contribute significantly to the fluorescence signal. We apologize for this misinterpretation; please refer to the responses to reviewer 1 for more details.
  
  4) Another set of results one might look at are the time courses of fluorescence decay after the end of the APP+ perfusion (Fig. 2 and 4). Substrate (APP+) outward transport should have a comparable voltage dependence as substrate uptake, moreover it should depend on the amount of substrate that entered to the cell before. Could the authors provide such result and use them to exclude specific/unspecific APP+ binding?
  
  In supplementary figure 1 (panels A and C) and video files 1 and 2, we show that APP+ adheres to intracellular membranes of organelles. This has also been shown previously by others (Solis Jr. et al., 2012; Karpowicz Jr et al., 2013; Wilson et al., 2014, cited in the manuscript). Because these structures serve as sinks, there is no (or only little) free APP+ available for outward transport.
  
  Reviewer #3:
  
  The sodium-coupled biogenic transporters DAT, NET and SERT terminate the synaptic actions of dopamine, norepinephrine and serotonin, respectively. They belong to the family of Neurotransmitter:sodium:symporters. These transporters have very similar sequences and this is reflected at the structural level as judged by similarity of the crystal structures of the outward-facing conformations of DAT and SERT. However, earlier functional studies indicated that transport by SERT is electroneutral because the charges of the sodium ions and substrate moving into the cell are compensated by the outward movement of potassium ions (or protons) to complete the transport cycle. On the other hand, DAT and NET are electrogenic. Moreover, potassium ions are not extruded by these transporters and the Authors set out to investigate if the electrogenicity is related to a difference in potassium handling between SERT and the two other biogenic transporters. This was done by analyzing the role of intracellular cations and voltage on substrate transport by the three biogenic amine transporters. This was achieved by the simultaneous recording of uptake of the fluorescent substrate APP+ and the current induced by this process under voltage-clamp conditions by single HEK293 cells expressing the transporters. The Authors found that even though uptake by NET and DAT did not require internal potassium, these transporters could actually interact with internal potassium as judged by the voltage dependence of the so-called peak current. This voltage dependence was very steep in the absence of both sodium and potassium. However, it became less steep when either of these cations was present in the internal milieu, indicating that not only sodium but also potassium could bind from the inside. The same result was obtained with SERT. However, uptake by SERT was found to be much less dependent on the membrane voltage than that by DAT and NET and was stimulated by internal potassium, consistent with the proposed electroneutrality of the former. The observations indicate that the structural similarity of the three biogenic amine transporters is also reflected in their ability to bind potassium, even though this cation can translocate to the outside only in SERT.
  
  Strengths:
  
  Development of a sophisticated technique to interrogate the mechanism of sodium coupled biogenic amine transport in single cells. Rigorous analysis of the data. Conclusions supported by the data. The methodology can be used to obtain novel insights into the mechanism of other transporters.
  
  Weaknesses:
  
  The presentation could be made more "user friendly" by explaining in more detail what is happening as we go through the data. For instance, peak and steady state currents are shown already in Figure 1, but a (too brief) explanation is only provided when describing Figure 5. A schematic in the first part of the Results would be useful. Some information on the structural background should be provided as well as a full description of the transport cycle, namely the number of sodium ions translocated per cycle and the argument why chloride remains bound to the transporter throughout the cycle. The control showing that, in contrast to potassium, lithium is inert should be performed not only for DAT, but also for the two other transporters.
  
  We thank Dr. Kanner for these recommendations. Regarding the role of Na+ and Cl- in the transport cycle of the monoamine transporters, we have briefly mentioned the same in the introduction as follows: “The crystal structure of both hSERT and dDAT show two bound Na+ ions. However, only one Na+ ion is thought to be released on the intracellular side in both transporters (Rudnick & Sandtner, 2019). Cl-, on the other hand, has been shown to play a modulatory role in the transport cycle of SERT and DAT, but Cl- is not essential for the transport stoichiometry (Erreger et al., 2008; Hasenhuetl et al., 2016).”
  
  As for the control experiments with Li+, we are very grateful to Dr. Kanner for his suggestions. En route to extending the observations, which we obtained with DAT in the presence of high intracellular Li+, to NET and SERT, we stumbled upon some unexpected results: while IV relations of peak currents with high intracellular Li+ or NMDG+ in NET were identical (similar to DAT), SERT gave us exactly the opposite profiles. IV relations of high intracellular Li+ in SERT were as shallow as those in the presence of high intracellular K+ or high intracellular Na+. This is indicative of intracellular Li+ binding to SERT, an observation not previously reported that further highlights the differences in DAT/NET and SERT in cation binding. We believe that our observations with Li+ and SERT could be expanded on in a separate story. We have accordingly changed the manuscript text in the Results and Discussion as follows:
  
  Results (lines 320-337):
  
  “Because the absence of Kin+ affected the slope of the IV-relation of the peak current, we surmised that potassium bound from the intracellular side not only to SERT but also possibly to DAT and NET. We explored this conjecture by determining the IV relation of peak currents through all three transporters in the presence of lithium (Liin+ = 163 mM) instead of Kin+. Li+ is believed to be an inert cation, because it does not support substrate translocation by SLC6 transporters. As expected, the IV relation of peak currents through DAT and NET were similar in the presence of 163 mM Liin+ to those recorded in the absence of Kin+ (cf., diamond and triangle symbol in Fig. 5J and 5K). These observations clearly indicate that Kin+ binds to both DAT and NET and rule out an alternative explanation, i.e. that the effect can be accounted for by water and monovalent cations briefly occupying a newly available space in the inner vestibule. SERT, on the other hand, shows shallow IV relations of peak currents with high Liin+ when compared to those acquired in the absence of Kin+ (cf., diamond and triangle symbol in Fig. 5L). This is indicative of Liin+ binding to SERT on the intracellular side. The exact nature of Liin+ binding to SERT has not been reported previously and warrants further investigation. The IV relations of peak currents are similar in the presence of 163 mM Kin+ (Fig. 5A-C) and of 163 mM Nain+ (Fig. 5G-I) in DAT, NET and SERT (cf. circle and square symbols in Fig. 5J-L). This is consistent with the idea that Nain+ and Kin+ bind to overlapping sites in these transporters.”
  
  Discussion (lines 524-527):
  
  “Interestingly, differences between DAT/NET and SERT are further substantiated by the ability of SERT to bind to intracellular Li+. The exact nature of this interaction is unknown and necessitates an in-depth investigation that is beyond the scope of this study.”
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.11.434931v1
www.biorxiv.org

De novo apical domain formation inside the Drosophila adult midgut epithelium

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This paper is a follow-up of the authors' previous paper (2018), in which they carefully described the organisation of the junctions between cells of the adult Drosophila midgut epithelium and their control from the basal side by integrin signalling. Here, the authors used state-of-the-art imaging and genetics to unravel step-by-step the events leading from an initially unpolarised cell to an epithelial cell that integrates into the existing epithelium. Many of the images are accompanied by cartoons, which help the reader to better understand the images and follow the conclusions. It would, however, have been helpful, in particular with respect to the mutant phenotypes described later, if they had named each of the steps/stages. In addition, mentioning the timescale would give an idea of the time frame over which this process takes place.
  
  We have used terms such as “unpolarised cells, polarised Actin/Cno” to label different stages in Figure 6, since this sequence of steps is inferred from results obtained from fixed samples with still images. We have illustrated the septate junction mutant phenotype in Figure 8I.
  
  We have also performed a new experiment to estimate the time taken for an activated EB to form a PAC and to become a mature enterocyte, by overexpressing Sox21a with esg[ts]>GFP to induce enteroblast differentiation. Counting the number of GFP+ve cells without a PAC, with a PAC and with a full apical domain at different time points suggests that activated EBs take about a day to form a PAC and another day to form a fully-integrated enterocyte. We have summarised the results in Figure 5-figure supplement 1C.
  
  We have also included this result in the main text as “To estimate the time taken for enteroblasts to progress to pre-enterocytes with a PAC, and for pre-enterocytes to become enterocytes, we induced enterocyte differentiation by over-expressing UAS-Sox21a under the control of esg[ts]-Gal4 and counted the number of GFP+ve cells without a PAC or apical domain, with a PAC and with a full apical domain at different time points after induction (Chen et al., 2016; Meng and Biteau, 2015; Zhai et al., 2017). 17 hours after shifting the flies to 25ºC to inactivate Gal80ts, almost no GFP+ve cells had progressed to pre-EC with a PAC (0.1%) or EC (1%), and these few cells probably started to differentiate before Sox21a induction. 24 hours later, 10% of the GFP+ve cells had developed into pre-ECs with a PAC and 20% had become ECs (Figure 5-figure supplement 1B-C). After an additional 24 hours, the number of cells with a PAC fell to 1%, whereas 50% were ECs. Assuming that it takes 12-17 hours to induce high levels of Sox21a expression, these results suggest that most activated EBs take about 24 hours to develop into a pre-EC with a PAC and a further 24 hours to differentiate into a mature EC, although some cells differentiate faster. This time frame is in agreement with a previous study using similar approaches to accelerate differentiation (Rojas Villa et al., 2019) and a recent live imaging study tracing the enteroblast to enterocyte transition (Tang et al., 2021). These results also indicate that down-regulation of Sox21a is not essential for enteroblast to pre-enterocyte differentiation, since enteroblasts overexpressing Sox21a still form a PAC (Figure 5-figure supplement 1B).”
  
  The authors convincingly show that septate junctions are instrumental for proper polarisation and integration of the enteroblast. However, while they nicely showed that Canoe is required neither in the enteroblast nor in the enterocytes for this process, it remains unclear whether septate junction proteins are required in enteroblasts or in enterocytes or in both, and at which particular step the process fails in the mutant.
  
  Early stage enteroblasts neither express nor require septate junction proteins, whereas late stage enteroblasts and pre-enterocytes do (Chen et al., 2020; Hung et al., 2020; Izumi et al., 2019; Xu et al., 2019). Since cells mutant for septate junction proteins do not develop into mature enterocytes with an apical domain facing the gut lumen, we cannot answer the reviewer’s question of whether septate junction proteins are required in enterocytes.
  
  As we discussed in the paper, we think that “differentiating enteroblasts only require a basal cue to establish their initial apical-basal polarity, whereas the formation of the pre-assembled apical compartment also requires a junctional cue. The septate junctions are not necessary for apical domain formation per se, however, as mesh mutant enteroblasts form a fully developed apical domain with a brush border inside the cell. This suggests that septate junctions define the site of apical domain formation by delimiting the region where apical membrane proteins are secreted to assemble the brush border, but do not control the process of apical domain formation directly.”
  
  Reviewer #2 (Public Review):
  
  The authors recently showed that the polarization of the cells of the adult Drosophila midgut does not require any of the canonical epithelial polarity factors, and instead depends on basal cues from adhesion to the ECM, as well as septate junction proteins (Chen et al, 2018). Here they extend this research to examine in greater detail precisely how midgut epithelial cells integrate into the pre-existing epithelium and become polarized. Surprisingly, they show that enteroblasts form an apical membrane initiation site prior to polarizing. Furthermore, they show that this develops into a pre-apical compartment containing a fully-formed brush border. This is a very interesting finding - it explains how enteroblasts can integrate into a pre-existing epithelium without disrupting barrier function. The conclusions of this paper are mostly well supported by data, but some aspects could do with being clarified and extended as outlined below.
  
  Model presented in Figure 6
  
  While the separation of membranes indicated in Figure 6 steps 3-5 can be seen in the image shown in Figure 3B, this is one of the only images which supports the idea that there is a separation of membranes between the enteroblast and overlying enterocytes during PAC formation. Is the model in Figure 6 supported by EM data - can you see a region where there is brush border and separation of cells? Supplementing Figure 3 with corresponding EM images would greatly aid the reader in interpreting the data and strengthen the model.
  
  We think that AJ clearing and membrane separation is a brief process that is quickly followed by the separation of the apical and junctional proteins and apical secretion at the AMIS to form the PAC. We have not captured this stage in our EM images, but have many other examples that show this step (e.g., Figure 4C and Figure 8F). Another example is shown below.
  
  A key step in the model is that the clearance of E-Cadherin from the apical membrane leads to a loss of adhesion between the enteroblast and the overlying enterocytes. This would need to be supported by functional data such as overexpression of E-Cad or E-CadDN in enteroblasts or by generating shg mutant clones. If the model is correct, perturbing E-Cad levels in enteroblasts should lead to defects in PAC formation, such as loss of de-adhesion/early de-adhesion/excessive de-adhesion.
  
  We think it is the local clearance of ECad from the apical membrane, not the downregulation of the total level of ECad, that is important for the local membrane separation and future PAC formation. The experiment of overexpressing ECad or ECad-DN proposed by the reviewer might be crucial to demonstrate the importance of the total amount of ECad, but might not be very helpful in determining the importance of membrane separation in PAC formation. Moreover, AJ formation in the fly midgut epithelium does not depend on ECad, suggesting that ECad and NCad act redundantly, which further complicates this approach (Choi et al., 2011; Liang et al., 2017).
  
  Role for the septate junction proteins
  
  Septate junction proteins were previously shown by these authors to be required for enteroblast polarization and integration into the midgut epithelium (Chen et al, 2018). Here they extend this by examining enteroblasts mutant for septate junction proteins, and conclude that septate junction proteins are required for normal PAC formation. However, it is not clear what aspect of the polarization of the enteroblasts is disrupted, because a number of mesh mutant cells (albeit a lower proportion than in wildtype) do form PACs. The main phenotype seems to be that cells fail to polarize (as previously reported) or have internalised PACs. It is hard to know what to conclude from this data about the role of the septate junction components in PAC formation.
  
  The major phenotype of the septate junction mutants is the loss of polarity, i.e. an inability to form an apical domain and integrate into the epithelial layer as shown in Figure 8. Neither mesh nor Tsp2a mutants can form a PAC, even though mesh mutant cells have a higher propensity to form an internal PAC-like structure (Figure 8B,C,E,G,H, Figure 8-figure supplement 1L). Thus, we think that septate junctions are required for AMIS and PAC formation. What complicates the interpretation is that some (6-20%) septate junction mutant cells do form an AMIS-like structure (Figure 8D-F, Figure 8-figure supplement 1F&K). The simplest explanation for this result is that this is due to perdurance of the wild-type proteins after clone induction, with the weaker phenotype of ssk mutants being due to longer perdurance of this protein. However, we cannot rule out the alternative explanation that AMIS and PAC formation is facilitated by the septate junction proteins, but that they can still form very inefficiently in their absence.
  
  We realise that this section was quite confusing in the original version of the manuscript and have now re-written it to make this interpretation clearer.
  
  Coracle is used as a readout for the localization of septate junction components, yet the staining for Cora in Figure S3B looks quite different to Mesh in S3D. If Cora is to be used as a readout for the localization of septate junction components, then staining for Cora/Mesh and/or Cora/SSk or Tsp2a should be shown.
  
  When discussing the requirement for septate junctions for enteroblast integration - Coracle and Mesh are used interchangeably - but as mentioned before, it is not clear if they colocalize, or if their localization is interdependent (as demonstrated for Mesh, Tsp2a and Ssk in Figure 7). What is the phenotype of enteroblasts mutant for cora?
  
  Following from the previous point - while it is clear that Coracle is apical early during AMIS formation, it is not clear if Mesh, Tsp2a and Ssk also are, yet these are the mutants that are examined for a role in AMIS/PAC formation. It would be good to know whether the loss of cora would lead to defects in AMIS formation.
  
  The reason we used mainly Coracle as a marker for the septate junctions is that Mesh and Tsp2A localise to the basal labyrinth as well as to the septate junctions which could confuse the reader. We have now added new panels to Figure 3-figure supplement 3E&F showing the colocalization of Cora with Mesh/Tsp2a at the septate junctions and during the crucial stages of PAC formation.
  
  Additional Results:
  
  "Coracle is a peripheral septate junction protein whose localisation depends on the structural septate junction components such as Mesh/Ssk/Tsp2a (Chen et al., 2018; Izumi et al., 2016, 2012). Cora antibody staining provides a clearer marker for the septate junctions than Mesh or Tsp2a antibody staining, because the latter also label the basal labyrinth (Figure 3-figure supplement 1E&F). To determine whether Cora is required for PAC formation or epithelial polarity in the adult midgut, we generated a null mutant allele with a premature stop codon in the FERM domain using CRISPR. Cells mutant for this allele, corajc, or a second cora null allele, cora5, can form a PAC, septate junctions and a full apical domain, indicating that Cora is also not required for enteroblast integration or enterocyte polarity (Figure 7F&G, Figure 7-figure supplement 1E-H)."
  
  Additional Materials and Methods:
  
  We used the CRISPR/Cas9 method (Bassett and Liu, 2014) to generate null alleles of canoe and coracle. sgRNA was in vitro transcribed from a DNA template created by PCR from two partially complementary primers:
  
  forward primer:
  
  For coracle:
  
  5′-GAAATTAATACGACTCACTATAGAAGCTGGCCATGTACGGCGGTTTTAGAGCTAGAAATAGC-3′;
  
  The sgRNA was injected into…Act5c-Cas9 embryos to generate coracle null alleles (Port et al., 2014). Putative…coracle mutants in the progeny of the injected embryos were recovered, balanced, and sequenced. …The coraclejc allele contains a 2 bp deletion around the CRISPR site, resulting in a frameshift that leads to a stop codon at amino acid 225 in the middle of the FERM domain, which is shared by all isoforms. No Coracle protein was detectable by antibody (DSHB C615.16) staining in either midgut or follicle cell clones. The coraclejc allele was recombined with FRT G13 to make the FRTG13 coraclejc flies.
  
  It is unclear what is happening in Figure 8A,C,E, S7D. Is that a detachment phenotype or an integration phenotype? Are the majority of cells unpolarised due to loss of integrin attachment rather than failure to form an AMIS/PAC?
  
  Cells mutant for septate junction proteins do not detach from the basement membrane and still localise Talin basally, as illustrated by the new panel we have added (Figure 8-figure supplement 1N), showing Talin localisation in a Tsp2a mutant cell.
  
  However, because the mutant cells cannot integrate and remain stuck beneath the septate junctions between the enterocytes, they sometimes become displaced from a portion of the basement membrane by younger EBs that derive from the same mutant ISC, leading to a pile up of cells in the basal region of the epithelium (e.g. Figure 8A, E and H).
  
  We have added the following sentences to the Results, explaining these points:
  
  "Because the mutant cells remain trapped beneath enterocyte-enterocyte septate junctions, they accumulate in the basal region of the epithelium, with new EBs derived from the same mutant ISC forming beneath them and reducing their contact with the basement membrane (Figure 8A)."
  
  "The majority of cells mutant for septate junction components fail to polarise or form an AMIS, although they form normal lateral and basal domains, as the basal integrin signalling component, Talin, localises normally (Figure 8-figure supplement 1N)."
  
  It is unclear whether enteroblasts really pass through an 'unpolarized stage'. In Figure 6, when they are described as 'unpolarised', they clearly have distinct basal and AJ domains. In septate junction mutants, when cells are classified as unpolarized, do they still have distinct regions of integrin/E-Cad expression?
  
  This is a semantic question. We agree that they have distinct lateral and basal domains, but they do not have an apical domain. In this respect, these "unpolarised" cells are similar to a mesenchymal fibroblast migrating on a substrate, which has a distinct basal side contacting the substrate that is different from the non-contacting regions of the cell surface. They also match the description of the migratory, "mesenchymal" enteroblasts (Antonello et al., 2015). To make this clearer, we have added the following notes to the legend for Figure 6: "'Unpolarised' in the second panel of this figure indicates that the enteroblast has not formed a distinct apical domain. At this stage, no marker is clearly apically localised. 'Unpolarised' or 'polarised' in the third and fourth panels describe the localisation of marker proteins, such as Actin and Cno."
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.12.10.472136v2
www.biorxiv.org www.biorxiv.org

New submission 18/05/2023, 13:15:53

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  eLife assessment
  
  This important paper exploits new cryo-EM tomography tools to examine the state of chromatin in situ. The experimental work is meticulously performed and convincing, with a vast amount of data collected. The main findings are interpreted by the authors to suggest that the majority of yeast nucleosomes lack a stable octameric conformation. Despite the possibly controversial nature of this report, it is our hope that such work will spark thought-provoking debate, and further the development of exciting new tools that can interrogate native chromatin shape and associated function in vivo.
  
  We thank the Editors and Reviewers for their thoughtful and helpful comments. We also appreciate the extraordinary amount of effort needed to assess both the lengthy manuscript and the previous reviews. Below, we provide our provisional responses in bold blue font. The majority of the comments are straightforward to address. We have taken a more conservative approach with the subset of comments that would require us to speculate because we either lack key information or we lack technical expertise. Instead of adding the speculative replies to the main text, we think it will be better to leave them in the rebuttal for posterity. Readers will therefore have access to our speculation and know that we did not feel confident enough to include these thoughts in the Version of Record.
  
  Reviewer #1 (Public Review):
  
  This manuscript by Tan et al uses cryo-electron tomography to investigate the structure of yeast nucleosomes both ex vivo (nuclear lysates) and in situ (lamellae and cryosections). The sheer number of experiments and results is astounding and comparable with an entire PhD thesis. However, as is always the case, it is hard to prove that something is not there. In this case, canonical nucleosomes. In their path to find the nucleosomes, the authors also stumble over new insights into nucleosome arrangement that indicate that the positions of the histones are more flexible than previously believed.
  
  We want to point out that canonical nucleosomes are there in wild-type cells in situ, albeit rarer than what’s expected based on our HeLa cell analysis. The negative result (absence of any canonical nucleosome classes in situ) was found in the histone-GFP mutants.
  
  Major strengths and weaknesses:
  
  Personally, I am not ready to agree with their conclusion that heterogeneous non-canonical nucleosomes predominate in yeast cells, but this reviewer is not an expert in the field of nucleosomes and can't judge how well these results fit into previous results in the field. As a technological expert though, I think the authors have done everything possible to test that hypothesis with today's available methods. One can debate whether it is necessary to have 35 supplementary figures, but after working through them all, I see that the nature of the argument needs all that support, precisely because it is so hard to show what is not there. The massive amount of work that has gone into this manuscript and the state-of-the-art nature of the technology should be warmly commended. I also think the authors have done a really great job with including all their results to the benefit of the scientific community. Yet, I am left with some questions and comments:
  
  Could the nucleosomes change into other shapes that were predetermined in situ? Could the authors expand on if there was a structure or two that was more common than the others of the classes they found? Or would this not have been found because of the template matching and later reference particle used?
  
  Our best guess (speculation) is that one of the class averages that is smaller than the canonical nucleosome contains one or more non-canonical nucleosome classes. We do not feel confident enough to single out any of these classes precisely because we do not yet know if they arise from one non-canonical nucleosome structure or from multiple – and therefore mis-classified – non-canonical nucleosome structures (potentially with other non-nucleosome complexes mixed in). We feel it is better to leave this discussion out of the manuscript, or risk sending the community on wild goose chases.
  
  Our template-matching workflow uses a low-enough cross-correlation threshold that any nucleosome-sized particle (plus or minus a few nanometers) would be picked, which is why the number of hits is so large. So unless the non-canonical nucleosomes quadrupled in size or lost most of their histones, they should be grouped with one or more of the other 99 class averages (WT cells) or any of the 100 class averages (cells with GFP-tagged histones). As to whether the later reference particle could have prevented us from detecting one of the non-canonical nucleosome structures, we are unable to tell because we’d really have to know what an in situ non-canonical nucleosome looks like first.
  
  Could it simply be that the yeast nucleoplasm is differently structured than that of HeLa cells and it was harder to find nucleosomes by template matching in these cells? The authors argue against crowding in the discussion, but maybe it is just a nucleoplasm texture that side-tracks the programs?
  
  Presumably, the nucleoplasmic “side-tracking” texture would come from some molecules in the yeast nucleus. These molecules would be too small to visualize as discrete particles in the tomographic slices, but they would contribute textures that can be “seen” by the programs – in particular RELION, which does the discrimination between structural states. We do not know the inner workings of RELION well enough to say what kinds of density textures would side-track its classification routines.
  
  The title of the paper is not well reflected in the main figures. The title of Figure 2 says "Canonical nucleosomes are rare in wild-type cells", but that is not shown/quantified in that figure. Rare in comparison to what? I suggest adding a comparative view from the HeLa cells, like the text does in lines 195-199. A measure of nucleosomes detected per volume nucleoplasm would also facilitate a comparison.
  
  Figure 2’s title is indeed unclear and does not align with the paper’s title and key conclusion. The rarity here is relative to the expected number of nucleosomes (canonical plus non-canonical). We have changed the title to “Canonical nucleosomes are a minority of the expected total in wild-type cells”. We would prefer to leave the reference to HeLa cells to the main text instead of as a figure panel because the comparison is not straightforward for a graphical presentation. Instead, we will report the total number of nucleosomes estimated for this particular tomogram (~7,600) versus the number of canonical nucleosomes classified (297; 594 if we assume we missed half of them).
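  The proportions implied by the counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the `detection_rate` parameter are hypothetical, and the total of 7,600 is the approximate estimate given in the response, so the resulting percentages are approximate as well.

```python
def canonical_fraction(n_canonical, n_total, detection_rate=1.0):
    """Fraction of the expected nucleosomes that are canonical.

    detection_rate is a hypothetical correction factor: 0.5 means we assume
    only half of the canonical nucleosomes present were actually classified.
    """
    if not 0.0 < detection_rate <= 1.0:
        raise ValueError("detection_rate must be in (0, 1]")
    return (n_canonical / detection_rate) / n_total


n_total = 7600       # approximate estimate quoted in the response
n_classified = 297   # canonical nucleosomes classified in this tomogram

observed = canonical_fraction(n_classified, n_total)
corrected = canonical_fraction(n_classified, n_total, detection_rate=0.5)

print(f"as classified:        {observed:.1%}")
print(f"assuming half missed: {corrected:.1%}")
```

  Under these assumptions both values stay below 10%, in line with the "minority" wording of the revised figure title.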
  
  If the cell contains mostly non-canonical nucleosomes, are they really non-canonical? Maybe a change of language is required once this is somewhat sure (say, after line 303).
  
  This is an interesting semantic and philosophical point. From the yeast cell’s “perspective”, the canonical nucleosome structure would be the form that is in the majority. That being said, we do not know if there is one structure that is the majority. From the chromatin field’s point of view, the canonical nucleosome is the form that is most commonly seen in all the historical – and most contemporary – literature, namely something that resembles the crystal structure of Luger et al, 1997. Given these two lines of thinking, we will add the following clarification after line 303:
  
  “At present, we do not know what the non-canonical nucleosome structures are, meaning that we cannot even determine if one non-canonical structure is the majority. Until we know what the family of non-canonical nucleosome structures are, we will use the term non-canonical to describe the nucleosomes that do not have the canonical (crystal) structure”.
  
  The authors could explain more why they sometimes use the conventional 2D followed by 3D classification approach and sometimes "direct 3-D classification". Why, for example, do they do 2D followed by 3D in Figure S5A? This Figure could be considered a regular figure since it shows the main message of the paper.
  
  Because the classification of subtomograms in situ is still a work in progress, we felt it would be better to show one instance of 2-D classification for lysates and one for lamellae. While it is true that we could have presented direct 3-D classification for the entire paper, we anticipate that readers will be interested to see what the in situ 2-D class averages look like.
  
  The main message is that there are canonical nucleosomes in situ (at least in wild-type cells), but they are a minority. Therefore, the conventional classification for Figure S5A should not be a main figure because it does not show any canonical nucleosome class averages in situ.
  
  Figure 1: Why is there a gap in the middle of the nucleosome in panel B? The authors write that this is a higher resolution structure (18Å), but in the even higher resolution crystallography structure (3Å resolution), there is no gap in the middle.
  
  There is a lower concentration of amino acids at the middle in the disc view; unfortunately, the space-filling model in Figure 1A hides this feature. The gap exists in experimental cryo-EM density maps. See below for an example. The size of the gap depends on the contour level and probably the contrast mechanism, as the gap is less visible in the VPP subtomogram averages. To clarify this confusing phenomenon, we will add the following lines to the figure legend:
  
  “The gap in the disc view of the nuclear-lysate-based average is due to the lower concentration of amino acids there, which is not visible in panel A due to space-filling rendering. This gap’s size may depend on the contrast mechanism because it is not visible in the VPP averages.”
  
  Reviewer #2 (Public Review):
  
  Nucleosome structures inside cells remain unclear. Tan et al. tackled this problem using cryo-ET and 3-D classification analysis of yeast cells. The authors found that the fraction of canonical nucleosomes in the cell could be less than 10% of total nucleosomes. The finding is consistent with the unstable property of yeast nucleosomes and the high proportion of the actively transcribed yeast genome. The authors made an important point in understanding chromatin structure in situ. Overall, the paper is well-written and informative to the chromatin/chromosome field.
  
  We thank Reviewer 2 for their positive assessment.
  
  Reviewer #3 (Public Review):
  
  Several labs in the 1970s published fundamental work revealing that almost all eukaryotes organize their DNA into repeating units called nucleosomes, which form the chromatin fiber. Decades of elegant biochemical and structural work indicated a primarily octameric organization of the nucleosome with 2 copies of each histone H2A, H2B, H3 and H4, wrapping 147 bp of DNA in a left-handed toroid, to which linker histone would bind.
  
  This was true for most species studied (except, yeast lack linker histone) and was recapitulated in stunning detail by in vitro reconstitutions by salt dialysis or chaperone-mediated assembly of nucleosomes. Thus, these landmark studies set the stage for an exploding number of papers on the topic of chromatin in the past 45 years.
  
  An emerging counterpoint to the prevailing idea of static particles is that nucleosomes are much more dynamic and can undergo spontaneous transformation. Such dynamics could arise from intrinsic instability due to DNA structural deformation, specific histone variants or their mutations, post-translational histone modifications which weaken the main contacts, protein partners, and predominantly, from active processes like ATP-dependent chromatin remodeling, transcription, repair and replication.
  
  This paper is important because it tests this idea whole-scale, applying novel cryo-EM tomography tools to examine the state of chromatin in yeast lysates or cryo-sections. The experimental work is meticulously performed, with a vast amount of data collected. The main findings are interpreted by the authors to suggest that the majority of yeast nucleosomes lack a stable octameric conformation. The findings are not surprising in that alternative conformations of nucleosomes might exist in vivo, but rather in the sheer scale of such particles reported, relative to the traditional form expected from decades of biochemical, biophysical and structural data. Thus, it is likely that this work will be perceived as controversial. Nonetheless, we believe these kinds of tools represent an important advance for in situ analysis of chromatin. We also think the field should have the opportunity to carefully evaluate the data and assess whether the claims are supported, or consider what additional experiments could be done to further test the conceptual claims made. It is our hope that such work will spark thought-provoking debate in a collegial fashion, and lead to the development of exciting new tools which can interrogate native chromatin shape in vivo. Most importantly, it will be critical to assess biological implications associated with more dynamic, or static, forms of nucleosomes, the associated chromatin fiber, and its three-dimensional organization, for nuclear or mitotic function.
  
  Thank you for putting our work in the context of the field’s trajectory. We hope our EMPIAR entry, which includes all the raw data used in this paper, will be useful for the community. As more labs (hopefully) upload their raw data and as image-processing continues to advance, the field will be able to revisit the question of non-canonical nucleosomes in budding yeast and other organisms.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.04.04.438362v3
www.biorxiv.org www.biorxiv.org

Increasing stimulus similarity drives nonmonotonic representational change in hippocampus

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  In this paper, Wammes et al. used fMRI to investigate changes in representational similarity of temporally paired images in hippocampal subfields. The stimuli were designed to parametrically vary in their visual similarity so that individual pairs covered the entire range of visual overlap, which was behaviourally validated by a separate sample of participants. The authors compared the neural patterns evoked by these pairs of stimuli before and after participants completed a statistical learning task. The findings showed that pre- to post-learning, representations in the dentate gyrus reconfigured to fit a cubic model, consistent with the non-monotonic plasticity hypothesis (NMPH).
  
  This is an interesting, novel approach with a clever stimulus manipulation which addresses a gap in the current literature. The study is well-motivated by theory, the analyses are appropriate and clearly described, the implemented controls are carefully designed, and the manuscript is well-written. However, it is unclear whether the same principles necessarily generalize beyond visual similarity, and whether these neural patterns meaningfully relate to behaviour.
  
  1) The analytic approach is well-designed and the results clearly address the hypotheses. However, it seems like the conclusions might be dependent on this learning paradigm, which should be discussed in a bit more detail and made clearer. The present statistical learning approach is somewhat implicit in its nature and relies on the participants gradually recognizing the temporal links between stimuli. In contrast, in most prior studies cited in the present manuscript, participants were explicitly instructed to make associations between stimuli that either occurred on the screen simultaneously, or relatively far apart in time (i.e., not successively). This top-down influence likely plays an important role. Even beyond experimental paradigms - we often make connections between similar experiences that occurred far apart in time, and cannot always rely on temporal contingencies. The step between previous work and statistical learning needs to be made clearer and more explicit.
  
  Although our current approach involves a more implicit statistical learning task, the hypothesized non-monotonic plasticity is a general mechanism that has been and can be applied across tasks. We used temporal contingency to create a situation where representations were concurrently active. However, prior work has used other manipulations, such as linking to a shared associate. We have modified and expanded both the Introduction and Conclusion to emphasize this broader context and highlight directions for future work.
  
  See Introduction (p. 4, lines 60-74): “The NMPH has been put forward as a learning mechanism that applies broadly across tasks in which memories compete, whether they have been linked based on incidental co-occurrence in time or through more intentional associative learning (Ritvo et al., 2019). The NMPH can explain findings of differentiation in diverse paradigms (e.g., linking to a shared associate: Chanales et al., 2017; Favila et al., 2016; Schlichting et al., 2015; Molitor et al., 2020; retrieval practice: Hulbert & Norman, 2015; statistical learning: Kim, Norman, & Turk-Browne, 2017) by positing that these paradigms induced moderate coactivation of competing memories. Likewise, relying on the same parameter of coactivation, the NMPH can explain seemingly contradictory findings showing that shared associates (Collin et al., 2015; Milivojevic et al., 2015; Schlichting et al., 2015; Molitor et al., 2020) and co-occurring items (Schapiro et al., 2012; Schapiro, Turk-Browne, Norman, & Botvinick, 2016) can lead to integration by positing that, in these cases, the paradigms induced strong coactivation. Importantly, although the NMPH is compatible with findings of both differentiation and integration across several paradigms with diverse task demands, the explanations above are post hoc and do not provide a principled test of the NMPH’s core claim that there is a continuous, U-shaped function relating the level of coactivation to representational change.”
  
  See Introduction (p. 5, lines 83-86): “No existing study has demonstrated the full U-shaped pattern for representational change; that is what we set out to do here, using a visual statistical learning paradigm. Specifically, we brought about coactivation using temporal co-occurrence between paired items, and we manipulated the degree of coactivation by varying the visual similarity of the items in a pair.”
  
  See Conclusion (p. 18, lines 370-374): “From a theoretical perspective, these results provide the strongest evidence to date for the NMPH account of hippocampal plasticity. We expect that a similar U-shaped function relating coactivation and representational change will manifest in paradigms with different task demands and stimuli, but additional work is needed to provide empirical support for this claim about generality.”
  
  2) Related to the point above - the timecourse over which such statistical learning occurs should be discussed. If I understood correctly, all of the learning occurred in the 6 scanned blocks between the two templating runs. Does the NMPH predict that the hippocampal patterns should immediately reconfigure depending on visual input, or only reconfigure once the participants encode the links between paired stimuli? If the pattern consistent with the NMPH is immediately evident, this would suggest that the present findings, while very convincing, might not be governed by the same mechanisms as integration/differentiation in memory. It seems unlikely that participants would immediately attempt to link these complex visual stimuli, especially as the cover task was orthogonal. To this end, it would be helpful to see any kind of analysis evaluating representations across the 6 statistical learning runs.
  
  The reviewer correctly describes that learning took place over the six blocks between templating runs. We agree that observing the emergence of representational change across those runs would be ideal. Unfortunately, however, our design is not compatible with this analysis. Because the pairs were learned from deterministic transition probabilities, the onsets of the paired stimuli were correlated in time. When these correlated events are convolved with the slow hemodynamic response, the responses to the paired stimuli cannot be reliably distinguished. Also, the response to the second stimulus in a pair would be affected by visual similarity to its preceding stimulus as a result of adaptation/repetition suppression, confounding comparisons across conditions. These problems are precisely why we employed a pre/post design in which to-be/previously paired stimuli are presented independently in a random order. This allows for the assessment of representational similarity unconfounded with correlated onsets or adaptation.
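  The collinearity argument above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' analysis: the double-gamma response shape, the 2 s gap between pairmates, the 4 s onset grid, the 0.5 s sampling step and the event counts are all hypothetical values chosen for illustration. It compares the correlation between the predicted responses to two stimuli when the second always follows the first with the correlation obtained when the two are presented at independent times, as in a pre/post templating run.

```python
import math

import numpy as np


def hrf(t):
    """Double-gamma haemodynamic response (shape parameters are assumptions)."""
    def gamma_pdf(x, shape):
        return x ** (shape - 1) * np.exp(-x) / math.gamma(shape)
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def regressor(onsets, n_samples, dt):
    """Stick function at the given onsets (in seconds) convolved with the HRF."""
    sticks = np.zeros(n_samples)
    sticks[np.round(np.asarray(onsets) / dt).astype(int)] = 1.0
    kernel = hrf(np.arange(0.0, 32.0, dt))
    return np.convolve(sticks, kernel)[:n_samples]


def regressor_correlation(onsets_a, onsets_b, n_samples, dt):
    """Pearson correlation between the two predicted response time courses."""
    a = regressor(onsets_a, n_samples, dt)
    b = regressor(onsets_b, n_samples, dt)
    return float(np.corrcoef(a, b)[0, 1])


rng = np.random.default_rng(0)
dt = 0.5                      # sampling step in seconds (hypothetical)
duration = 600.0              # run length in seconds (hypothetical)
n_samples = int(duration / dt)
n_events = 40                 # presentations per stimulus (hypothetical)
gap = 2.0                     # delay between pairmates in seconds (hypothetical)

# Stimulus A appears at random slots on a 4 s grid.
slots_a = np.arange(0.0, duration - 40.0, 4.0)
onsets_a = np.sort(rng.choice(slots_a, size=n_events, replace=False))

# Deterministic pairing: B always follows A after a fixed gap.
onsets_b_paired = onsets_a + gap

# Independent presentation: B appears at random slots unrelated to A.
slots_b = np.arange(2.0, duration - 40.0, 4.0)
onsets_b_independent = np.sort(rng.choice(slots_b, size=n_events, replace=False))

r_paired = regressor_correlation(onsets_a, onsets_b_paired, n_samples, dt)
r_independent = regressor_correlation(onsets_a, onsets_b_independent, n_samples, dt)

print(f"paired onsets:      r = {r_paired:.2f}")
print(f"independent onsets: r = {r_independent:.2f}")
```

  With these settings the paired regressors should be strongly correlated, whereas the independently timed ones should not be, which is why responses to pairmates are hard to separate during the learning runs themselves.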
  
  Although we cannot provide a sense of the learning trajectory, we now highlight this design decision, acknowledge the limitation, and highlight this as an opportunity for future work with other more time-resolved modalities or with (random) representational assessments interdigitated with the learning blocks.
  
  See Discussion (p. 17, lines 358-366): “Finally, although analyzing representational overlap in templating runs before and after statistical learning afforded us the ability to quantify pre-to-post changes, our design precluded analysis of the emergence of representational change over time. That is, we could not establish whether integration or differentiation occurred early or late in statistical learning. This is because, during statistical learning runs, the onsets of paired images were almost perfectly correlated, meaning that it was not possible to distinguish the representation of one image from its pairmate. Future work could monitor the time course of representational change, either by interleaving additional templating runs throughout statistical learning (although this could interfere with the statistical learning process), or by exploiting methods with higher temporal resolution where the responses to stimuli presented close in time can more readily be disentangled.”
  
  3) In the Introduction and Discussion, the authors focus on learning and discuss the integration/differentiation of memories. To establish a link between the reported hippocampal representations and behaviour, it would be helpful to show evidence of a link between neural differentiation and measures of statistical learning such as priming.
  
  As the reviewer alluded to earlier, our behavioral task is orthogonal to the manipulation of temporal co-occurrence. Accordingly, we do not have any behavioral data on which we could conduct such an analysis. We fully acknowledge the value of this suggestion and now describe this as a limitation and area for future research.
  
  See Discussion (p. 17, lines 350-357): “Prior work in this area has demonstrated brain-behavior relationships (Favila et al., 2016; Molitor et al., 2020), so it is clear that changes in representational overlap (i.e., either integration or differentiation) can bear on later behavioral performance. However, in the current work, our behavioral task was intentionally orthogonal to the dimensions of interest (i.e., unrelated to temporal co-occurrence and visual similarity), limiting our ability to draw conclusions about potential downstream effects on behavior. We believe that this presents a compelling target for follow-up research. Establishing a behavioral signature of both integration and differentiation in the context of nonmonotonic plasticity will not only clarify the brain-behavior relationship, but also allow for investigations in this domain without requiring brain data.”
  
  4) From the authors' predictions (and Fig 1), it might follow that participants who show steeper slopes in early visual regions (i.e., higher correspondence to stimulus similarity) pre-learning might also show a stronger cubic trend in the hippocampus. It would be useful to show within-participant analyses to link visual processing regions to hippocampal representations.
  
  What a fantastic suggestion! To test this prediction, we extracted the linear coefficients in the visual similarity analysis from cortical ROIs (V1, V2, LO, IT, FG, PHC, PRC, and EC) and the cubic model fit in the representational change analysis from the key hippocampal ROI (DG). Linearity during the initial templating run in PRC was associated with stronger non-monotonicity in DG. The full reporting of these analyses is now included in the figure supplements and referenced in the main text.
  
  See Results, subsection Representational Change (p. 12, lines 228-229): “Interestingly, in an exploratory analysis, we found that the degree of model fit in DG was predicted by the extent to which visual representations in PRC tracked model similarity (see Figure 4—figure supplement 2).”
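
The across-participant logic of this exploratory analysis can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the subject labels, similarity values, and DG fit values are invented, and the helper names `ols_slope` and `pearson_r` are hypothetical. It assumes that each participant contributes one linear slope (PRC pattern similarity regressed on model similarity in the initial templating run) and one DG cubic-fit value, and that the two are then correlated across participants.

```python
from math import sqrt


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical inputs, invented for illustration only.
model_similarity = [0.1, 0.3, 0.5, 0.7, 0.9]  # similarity level of each image pair
prc_similarity = {  # pre-learning PRC pattern similarity per pair level
    "sub-01": [0.02, 0.05, 0.09, 0.12, 0.16],
    "sub-02": [0.04, 0.04, 0.06, 0.05, 0.07],
    "sub-03": [0.01, 0.06, 0.08, 0.13, 0.15],
    "sub-04": [0.05, 0.03, 0.06, 0.04, 0.06],
}
dg_cubic_fit = {"sub-01": 0.31, "sub-02": 0.05, "sub-03": 0.27, "sub-04": 0.02}

subjects = sorted(prc_similarity)
prc_slopes = [ols_slope(model_similarity, prc_similarity[s]) for s in subjects]
dg_fits = [dg_cubic_fit[s] for s in subjects]
r = pearson_r(prc_slopes, dg_fits)
print(f"across-participant r = {r:.2f}")
```

A rank-based correlation could be substituted for `pearson_r` if the per-participant values are not approximately normal; the response does not state which statistic was used.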
  
  Reviewer #2:
  
  The authors apply neural network modeling and representational analysis of fMRI data to testing the ability of the theoretical framework under the "non-monotonic-plasticity hypothesis" to explain how hippocampal subdivisions represent similarity and distinctiveness between events. They suggest that the dentate gyrus subfield, in particular, was sensitive to the degree of overlap between experiences, and changes how it favored distinctiveness or similarity in its representation of associated stimuli in a non-monotonic manner.
  
  Overall, the work builds logically on prior evidence from this group focused on how cortical representations influence memory, and leverages a compelling theoretical framework to reconcile some conflict in the literature on how hippocampal representations respond to overlap.
  
  The primary confusion and concern with the current manuscript was on the theoretical side. It was not wholly clear from the literature review why DG was the predicted locus of the non-monotonic representational relationship observed, and how the findings fit with extant data from rodent work.
  
  Thank you for providing an opportunity to better motivate our work. We have updated the paragraph justifying our focus on the hippocampus and on DG in particular.
  
  See Introduction (p. 8, lines 122-147): “We and others have previously hypothesized that nonmonotonic plasticity applies widely throughout the brain (Ritvo et al., 2019), including sensory regions (e.g., Bear, 2003). In this study, we focused on the hippocampus because of its well-established role in supporting learning effects over relatively short timescales (e.g., Favila et al., 2016; Kim et al., 2017; Schapiro et al., 2012). Importantly, we hypothesized that, even if nonmonotonic plasticity occurs throughout the entire hippocampus, it might be easier to trace out the full predicted U-shape in some hippocampal subfields than in others. As discussed above, our hypothesis is that representational change is determined by the level of coactivation — detecting the U-shape requires sweeping across the full range of coactivation values, and it is particularly important to sample from the low-to-moderate range of coactivation values associated with the differentiation ‘dip’ in the U-shaped curve (i.e., the leftmost side of the inset in Fig. 1). Prior work has shown that there is extensive variation in overall activity (sparsity) levels across hippocampal subfields, with CA2/3 and DG showing much sparser codes than CA1 (Barnes, McNaughton, Mizumori, Leonard, & Lin, 1990; Duncan & Schlichting, 2018). We hypothesized that regions with sparser levels of overall activity (DG, CA2/3) would show lower overall levels of coactivation and thus do a better job of sampling this differentiation dip, leading to a more robust estimate of the U-shape, compared to regions like CA1 that are less sparse and thus should show higher levels of coactivation (Ritvo et al., 2019). Consistent with this idea, human fMRI studies have found that CA1 is relatively biased toward integration and CA2/3/DG are relatively biased toward differentiation (Dimsdale-Zucker et al., 2018; Kim et al., 2017; Molitor et al., 2020). 
Zooming in on the regions that have shown differentiation in human fMRI (CA2/3/DG), we hypothesized that the U-shape would be most visible in DG, for two reasons: First, DG shows sparser activity than CA3 (Barnes et al., 1990; GoodSmith et al., 2017; West, Slomianka, & Gundersen, 1991) and thus will do a better job of sampling the left side of the coactivation curve. Second, CA3 is known to show strong attractor dynamics (‘pattern completion’; McNaughton & Morris, 1987; Rolls & Treves, 1998; Guzowski, Knierim, & Moser, 2004) that might make it difficult to observe moderate levels of coactivation. For example, rodent studies have demonstrated that, rather than coactivating representations of different locations, CA3 patterns tend to sharply flip between one pattern and the other (e.g., Leutgeb, Leutgeb, Moser, & Moser, 2007; Vazdarjanova & Guzowski, 2004).”
  
  Additionally, the theoretical model (nicely illustrated in the manuscript) is considered in a somewhat biological-network-agnostic level. Some assumption for how context changes over time, how prior representations are maintained over time, etc., are important for non-monotonic relationships between representations and memory to manifest in the model, but the manuscript does not provide much discussion of their plausibility. This was particularly notable in terms of the emphasis given in the fMRI data to different hippocampal subfields, but not much discussion given on whether/why the model framework is static across subfields (in terms of how context and item information are represented and connected).
  
  We appreciate this nudge to discuss these additional subfield-specific factors; we have added a paragraph to the Discussion that addresses these issues.
  
  See Discussion (p. 16, lines 318-336): “Although we focused above on differences in sparsity when motivating our predictions about subfield-specific learning effects, there are numerous other factors besides sparsity that could affect coactivation and (through this) modulate learning. For example, the degree of coactivation during statistical learning will be affected by the amount of residual activity of the A item during the B item’s presentation in the statistical learning phase. In Figure 1, this residual activity is driven by sustained firing in cortex, but this could also be driven by sustained firing in hippocampus; subfields might differ in the degree to which activation of stimulus information is sustained over time (see, e.g., the literature on hippocampal time cells: Eichenbaum, 2014; Howard & Eichenbaum, 2013), and activation could be influenced by differences in the strength of attractor dynamics within subfields (e.g., Neunuebel & Knierim, 2014; Leutgeb et al., 2007). Also, in Figure 1, the learning responsible for differentiation was shown as happening between ‘perceptual conjunction’ neurons and ‘context’ neurons in the hippocampus. Subfields may vary in how strongly these item and context features are represented, in the stability/drift of the context representations (DuBrow, Rouhani, Niv, & Norman, 2017), and in the interconnectivity between item and context features (Witter, Wouterlood, Naber, & Van Haeften, 2000). It is also likely that some of the relevant plasticity between item and context features happens across, in addition to within, subfields (Hasselmo & Eichenbaum, 2005). For these reasons, exploring the predictions of the NMPH in the context of biologically detailed computational models of the hippocampus (e.g., Schapiro, Turk-Browne, Botvinick, & Norman, 2017; Frank, Montemurro, & Montaldi, 2020; Hasselmo & Wyble, 1997) will help to sharpen predictions about what kinds of learning should occur in different parts of the hippocampus.”
  
  As such, this review was very positive, and found the methods to be sound and the conclusions to be solid. There was some room for improvement in how the theoretical foundation was presented for the hippocampal subregion fMRI predictions and for the conceptualization of the neural network memory model.
  
  We agree with the reviewer that more justification of our specific hippocampal predictions was required and we are grateful for their suggestions.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.13.435275v1
www.biorxiv.org www.biorxiv.org

New submission 15/11/2022, 14:34:59

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This study explores the mechanisms responsible for reduced steroidogenesis of adrenocortical cells in a mouse model of systemic inflammation induced by LPS administration. Working from RNA and protein profiling data sets in adrenocortical tissue from LPS-treated mice they report that LPS perturbs the TCA cycle at the level of succinate dehydrogenase B (SDHB) impairing oxidative phosphorylation. Additional studies indicate these events are coupled to increased IL-1β levels which inhibit SDHB expression through DNA methyltransferase-dependent DNA methylation of the SDHB promoter.
  
  In general, these are interesting studies with some novel implications. I do, however, have concerns with some of the author's rather broad conclusions given the limitations of their experimental approach. The paper could be improved by addressing the following points:
  
  1) The limitations of using LPS as the model for systemic inflammation need to be explicitly described.
  
  We thank the Reviewer for this suggestion. Indeed, the LPS model has several limitations as a preclinical model of sepsis, which are outlined in the revised Discussion. Despite these limitations, we chose it over other models of sepsis, such as the cecal slurry model, because of its high reproducibility, which enabled the mechanistic studies presented here.
  
  2) The initial in vivo findings, which support the proposed metabolic perturbation, are based on descriptive profiling data obtained at one time point following a single dose of LPS. The authors' conclusion about the ultimate transcriptional pathway identified hinges critically on knowledge of the time course of this effect following LPS, which is not adequately addressed in the paper. How were this time and dose of LPS established, and are there data from different doses and time points?
  
  We thank the Reviewer for raising this question, which we indeed addressed at the beginning of our studies in order to determine a suitable time point and dose of LPS treatment. We chose 6 h as a suitable starting time point to perform transcriptional analyses, based on the fact that LPS triggers transcriptional changes in the adrenal gland and other tissues within a few hours (1-3). Confirming our expectations, we found 2,609 differentially expressed genes (Figure 1a) in the adrenal cortex of LPS-treated mice, many of which were involved in cellular metabolism (Figure 1d,e, 2a-e, Table 1, Table 2). Acute transcriptional changes, which are more likely to reflect direct effects of inflammatory signals than changes occurring at later time points (for instance in the range of days), would allow us to mechanistically investigate the effects of inflammation in the adrenal gland, which was the purpose of our studies. Hence, we were guided by the transcriptional changes observed at 6 h of LPS treatment and established the hypothesis that disruption of the TCA cycle in adrenocortical cells is key to the impact of inflammation on adrenal function. Along this line, we analyzed the metabolomic profile of the adrenal gland at 6 and 24 h of LPS treatment. At 6 h, succinate levels as well as the succinate / fumarate ratio remained unchanged (Author response image 1A), while at 24 h post-injection these were increased by LPS (Author response image 1B, Figure 2l,o,q). The delay between downregulation of Sdhb mRNA expression (at 6 h) and the increase in succinate levels (observed at 24 h) can be explained by the time required for reduction of SDHB protein levels, which depends on the protein turnover, suggested to be approximately 12 h in HeLa cells (4). Based on these findings, all further metabolomic analyses were performed at 24 h of LPS treatment.
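
The timing argument can be made concrete with a rough worked example. This is only a sketch under strong assumptions that the response does not state: that the approximately 12 h turnover figure behaves like a first-order half-life $t_{1/2}$, that it transfers from HeLa cells to adrenocortical cells, and that new SDHB synthesis stops entirely at $t = 0$ (an upper bound on protein loss, since Sdhb mRNA is reduced rather than abolished). Here $P(t)$ is the remaining SDHB protein and $P_0$ its starting level; both symbols are introduced for illustration.

```latex
\begin{align}
\frac{P(t)}{P_0} &= 2^{-t/t_{1/2}}, \qquad t_{1/2} \approx 12\ \mathrm{h} \\
\frac{P(6\ \mathrm{h})}{P_0} &= 2^{-6/12} \approx 0.71 \\
\frac{P(24\ \mathrm{h})}{P_0} &= 2^{-24/12} = 0.25
\end{align}
```

Under these assumptions most of the protein would still be present a few hours after transcription drops, whereas by 24 h the pool could be substantially depleted. That is qualitatively in line with succinate rising at 24 h but not at 6 h.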
  
  Author response image 1. LPS increases the succinate/fumarate ratio at 24 h but not at 6 h. Mice were injected i.p. with 1 mg/kg LPS, and succinate and fumarate levels in the adrenal gland were determined by LC-MS/MS at 6 h (A) and 24 h (B) post-injection. n=8-10; data are presented as mean ± s.e.m. Statistical analysis was done with two-tailed Mann-Whitney test. *p < 0.05.
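
For readers unfamiliar with the two-tailed Mann-Whitney test named in the captions, the sketch below shows one way to compute it exactly for small groups, using only the Python standard library. The function names `rank_with_ties` and `mann_whitney_exact` and the ratio values are hypothetical and invented for illustration; they are not the authors' data or code. The p-value is obtained by enumerating every way of assigning the pooled ranks to the first group, which is practical only for small samples.

```python
from itertools import combinations


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mid
        i = j + 1
    return ranks


def mann_whitney_exact(a, b):
    """Return (U for group a, exact two-sided p-value)."""
    pooled = list(a) + list(b)
    ranks = rank_with_ties(pooled)
    n_a, n = len(a), len(pooled)
    expected = n_a * (n + 1) / 2  # rank sum of group a under the null
    rank_sum_a = sum(ranks[:n_a])
    observed = abs(rank_sum_a - expected)
    hits = total = 0
    # Count labelings at least as extreme as the observed one.
    for idx in combinations(range(n), n_a):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-12:
            hits += 1
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, hits / total


# Hypothetical succinate/fumarate ratios, invented for illustration only.
control = [0.8, 1.1, 0.9, 1.0, 1.2]
lps_24h = [1.6, 1.9, 1.4, 2.1, 1.7]
u, p = mann_whitney_exact(control, lps_24h)
print(f"U = {u}, two-sided exact p = {p:.4f}")
```

With group sizes such as those in the captions, statistical packages typically switch to a normal approximation or a faster exact algorithm; the enumeration here is chosen for transparency rather than speed.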
  
  Having established the most suitable time points of LPS treatment for observing the induced transcriptional and metabolic changes, we set out to define the LPS dose to be used in subsequent experiments. The data shown in Author response image 1 were acquired after treatment with 1 mg/kg LPS, a dose previously reported to cause transcriptional re-profiling of the adrenal gland (1, 2). Like 1 mg/kg LPS, 5 mg/kg LPS also reduced Sdhb, Idh1 and Idh2 expression at 4 h (Author response image 2A) and increased succinate and isocitrate levels at 24 h (Author response image 2B) in the adrenal gland. Given that the effects of 1 and 5 mg/kg LPS were similar, we continued our studies with the lower dose for animal welfare reasons.
  
  Author response image 2. Five mg/kg LPS downregulates Sdhb, Idh1 and Idh2 expression and increases succinate and isocitrate levels in the adrenal gland of mice. Sdhb, Idh1 and Idh2 expression (A) and succinate and isocitrate levels (B) were assessed in the adrenal gland of mice treated with 5 mg/kg LPS for 4 h (A) and 24 h (B). n=5; data are presented as mean ± s.d. Statistical analysis was done with two-tailed Mann-Whitney test. *p < 0.05, **p < 0.01.
  
  3) Related to the point above, the authors' data supporting a break in the TCA cycle would be strengthened by direct biochemical assessment (metabolic flux analysis) of the step in the TCA cycle process impacted.
  
  We entirely agree with the Reviewer and considered performing TCA cycle metabolic flux analyses in adrenocortical cells. Unfortunately, the low yield of adrenocortical cells per mouse (approx. 3,000-6,000) does not allow metabolic flux experiments, which require higher cell numbers per sample, several time points per condition and an adequate number of replicates per experiment. Moreover, NCI-H295R cells, being adrenocortical carcinoma cells, are expected to have substantially altered metabolic fluxes compared to normal cells. Since we would not be able to confirm findings from metabolic flux experiments in NCI-H295R cells in primary adrenocortical cells, as we did for the rest of the experiments, we decided not to perform metabolic flux experiments in NCI-H295R cells. However, performing metabolic flux analyses in adrenocortical cells under inflammatory or other stress conditions remains an important future task that we will pursue once a more suitable cell culture system is established.
  
  4) The proposed connection of DNMT and IL1 signaling to systemic inflammation and reduced steroidogenesis could be more firmly established by additional studies in adrenal cortical cells lacking these genes.
  
  We thank the Reviewer for this excellent suggestion. In the revised manuscript we strengthened the evidence for an IL-1β–DNMT1 link and show that DNMT1 deficiency blocks the effects of IL-1β on SDHB promoter methylation (Figure 6k), the succinate / fumarate ratio (Figure 6m), the oxygen consumption rate (Figure 6n) and steroidogenesis (Figure 6o-q) in adrenocortical cells. In order to validate the role of IL-1β in vivo, mice were simultaneously treated with LPS and Raleukin, an IL-1R antagonist. Treatment with Raleukin increased SDH activity (Figure 6r), reduced succinate levels and the succinate / fumarate ratio (Figure 6s,t) and increased corticosterone production in LPS-treated mice (Figure 6u).
  
  Reviewer #2 (Public Review):
  
  The present manuscript provides a mechanistic explanation for an event in adrenal endocrinology: the resistance which develops during excessive inflammation relative to acute inflammation. The authors identify disturbances in adrenal mitochondrial function that differentiate excessive inflammation. During severe inflammation the TCA cycle in the adrenal is disrupted at the level of succinate production, producing an accumulation of succinate in the adrenal cortex. The authors also provide a mechanistic explanation for the accumulation of succinate: they demonstrate that IL1b decreases expression of SDH, the enzyme that degrades succinate, through a methylation event in the SDH promoter. This work presents a solid explanation for an important phenomenon. Below are a few questions that should be resolved experimentally.
  
  1) The authors should confirm through direct biochemical assays of enzymatic activity that steroidogenesis enzyme activity is not impaired. Many of these enzymes are located in the mitochondria and their activity may be diminished due to the disturbed, high succinate environment of the cortical cell as opposed to the low ATP production.
  
  We thank the Reviewer for this question. The activity of the first and rate-limiting steroidogenic enzyme, cytochrome P450-side-chain-cleavage (SCC, CYP11A1) which generates pregnenolone from cholesterol, was recently shown to require intact SDH function (5). In agreement with this report we show that production of progesterone, the direct derivative of pregnenolone, is impaired upon SDH inhibition (Figure 5b,e,h). In addition, we assessed the activity of CYP11B1 (steroid 11β-hydroxylase), the enzyme catalyzing the conversion of 11-deoxycorticosterone to corticosterone, i.e. the last step of glucocorticoid synthesis, by determining the corticosterone and 11-deoxycorticosterone levels by LC-MS/MS and calculating the ratio of corticosterone to 11-deoxycorticosterone in ACTH-stimulated adrenocortical cells and explants. The corticosterone / 11-deoxycorticosterone ratio was not affected by Sdhb silencing in adrenocortical cells (Figure 5- Supplement 2g) nor did it change upon LPS treatment in adrenal explants (Figure 5- Supplement 2h), suggesting that CYP11B1 activity may not be altered upon SDH blockage. Hence, we propose that upon inflammation impairment of SDH function may disrupt at least the first steps of steroidogenesis (producing pregnenolone/progesterone), thereby diminishing production of all downstream adrenocortical steroids. This is now discussed in the revised manuscript.
  
  2) What is the effect of high ROS production? Is steroidogenesis resolved if ROS is pharmacologically decreased even if the reduction of ATP is not resolved?
  
  We thank the Reviewer for this suggestion, which helped us to broaden our findings. Indeed, ROS scavenging by the vitamin E analog Trolox (Figure 5n) partially reversed the inhibitory effect of DMM on steroidogenesis (Figure 5o,p), suggesting that impairment of SDH function impacts steroidogenesis also via enhanced ROS production (Figure 4g).
  
  3) Does increased intracellular succinate (through cell permeable succinate treatment) inhibit steroidogenesis even if there is not a blockage of OXPHOS?
  
  We suggest that SDH inhibition and succinate accumulation lead to reduced steroidogenesis due to impaired oxidative phosphorylation (Figure 4c,e, 5i), reduced ATP synthesis (Figure 4d, 5j-m) and increased ROS production (Figure 4g, 5o,p). Since SDH is part of the electron transport chain (complex II), it cannot be decoupled from oxidative phosphorylation, which limits the experimental means for addressing this question.
  
  4) It should be demonstrated that the genetic loss of IL1 signaling in adrenal cortical cells results in a loss of the effect of LPS on reduced steroidogenesis and increased succinate accumulation.
  
  We thank the Reviewer for this suggestion. Developing a mouse line with genetic loss of Il-1r in adrenocortical cells was not feasible within the short revision period. Instead, mice under LPS treatment were treated with the IL-1R antagonist, Raleukin, to study the in vivo effects of IL-1β in the adrenal gland. IL-1R antagonism increased SDH activity in the adrenal cortex (Figure 6r), decreased succinate levels and the succinate/fumarate ratio in the adrenal gland (Figure 6s,t) and enhanced corticosterone production (Figure 6u) in LPS-treated mice, supporting our hypothesis that IL-1β mediates the effects of systemic inflammation in the adrenal cortex.
  
  5) It should be demonstrated that the genetic loss of IL1 signaling in adrenal cortical cells results in a loss of the effect of LPS on SDH activity, ATP production and SDH promoter methylation.
  
  As outlined above, Raleukin treatment increased SDH activity in the adrenal cortex (Figure 6r) and decreased succinate levels and the succinate/fumarate ratio in the adrenal gland (Figure 6s,t) of mice treated with LPS. Furthermore, IL-1β reduced the ATP/ADP ratio (Figure 6e) and enhanced SDHB promoter methylation in NCI-H295R cells (Figure 6k).
  
  6) It should be shown that the silencing of DNMT eliminates or diminishes the effect of LPS on reduced steroidogenesis and increased succinate accumulation.
  
  We thank the Reviewer for this suggestion, which prompted us to strengthen the evidence for the implication of DNMT1 in the effects of LPS on adrenocortical cell metabolism and function. As mentioned above, developing a new mouse line, in this case bearing genetic loss of DNMT1 in adrenocortical cells, was not feasible within the short revision period. Therefore, we assessed the role of DNMT1 by silencing it via siRNA transfections in primary adrenocortical cells and NCI-H295R cells. We show that DNMT1 silencing inhibits the effect of IL-1β on SDHB promoter methylation (Figure 6k), restores Sdhb expression (Figure 6l) and reduces the succinate/fumarate ratio in IL-1β-treated adrenocortical cells (Figure 6m). Accordingly, DNMT1 silencing restores ACTH-induced production of corticosterone, 11-deoxycorticosterone and progesterone in IL-1β-treated adrenocortical cells (Figure 6o-q). We chose to stimulate adrenocortical cells with IL-1β instead of LPS because in vitro the effects of IL-1β were more robust than those of LPS (possibly due to a reduction of TLR4 expression or function in cultured adrenocortical cells) and in order to show the link between IL-1β and DNMT1.
  
  7) Does silencing of DNMT reduce OXPHOS in adrenal cortical cells?
  
  We measured the oxygen consumption rate in NCI-H295R cells, which were transfected with siRNA against DNMT1 and treated or not with IL-1β. IL-1β reduced the OCR in cells transfected with control siRNA, while DNMT1 silencing blunted the effect of IL-1β (Figure 6n).
  
  8) The effects of LPS on reduced adrenal steroidogenesis are not elaborated at the physiological level. The manuscript should demonstrate the ramifications of the adrenal function decreasing after LPS. Does CORT release become less pronounced after subsequent challenges? Does baseline CORT decrease at some point? No physiological consequences are shown. Similarly, these physiological consequences of decreased adrenal function should be dependent on decreased SDH activity and OXPHOS in adrenal cells and this should be demonstrated experimentally.
  
  We thank the Reviewer for raising this excellent question. Inflammation is a potent inducer of the Hypothalamus-Pituitary-Adrenal gland (HPA) axis, causing increased glucocorticoid production, a stress response leading to vital immune and metabolic adaptations. Accordingly, LPS treatment rapidly increases glucocorticoid production in mice (1, 6, 7). Reduced adrenal gland responsiveness to ACTH is associated with decreased survival of septic mice (8). These preclinical findings are in accordance with observations in septic patients, in whom impairment of adrenal function correlates with a high risk of death (9). Along this line, the ACTH test was suggested to have prognostic value for identifying septic patients with high mortality risk (9, 10).
  
  In order to confirm impairment of the adrenal gland function in septic mice, animals were subjected to sepsis via administration of a high LPS dose (10 mg / kg) and treated with ACTH 24 h later. Indeed, the ACTH-induced increase in corticosterone levels was diminished in LPS-treated mice (Author response image 3). This finding was further confirmed in adrenal explants, in which LPS pre-treatment also blunted ACTH-stimulated corticosterone production (Figure 5s).
  
  Author response image 3. High LPS dose blunts the ACTH response in mice. C57BL/6J mice were i.p. injected with 10 mg/kg LPS or PBS and 24 h later they were i.p. injected with 1 mg/kg ACTH. One hour after ACTH administration blood was retroorbitally collected and corticosterone plasma levels were determined by LC-MS/MS. n=4-5; data are presented as mean ± s.d. Statistical analysis was done with two-tailed Mann-Whitney test. *p < 0.05.
  
  Given that the purpose of our studies was to dissect the mechanisms underlying adrenal gland dysfunction in inflammation rather than to analyze its physiological consequences, we chose not to follow these lines of investigation and to concentrate on the role of cell metabolism in adrenocortical cells in the context of inflammation.
  
  References
  
  1. W. Kanczkowski, A. Chatzigeorgiou, M. Samus, N. Tran, K. Zacharowski, T. Chavakis, S. R. Bornstein, Characterization of the LPS-induced inflammation of the adrenal gland in mice. Mol Cell Endocrinol 371, 228-235 (2013).
  
  2. L. S. Chen, S. P. Singh, M. Schuster, T. Grinenko, S. R. Bornstein, W. Kanczkowski, RNA-seq analysis of LPS-induced transcriptional changes and its possible implications for the adrenal gland dysregulation during sepsis. J Steroid Biochem Mol Biol 191, 105360 (2019).
  
  3. V. I. Alexaki, G. Fodelianaki, A. Neuwirth, C. Mund, A. Kourgiantaki, E. Ieronimaki, K. Lyroni, M. Troullinaki, C. Fujii, W. Kanczkowski, A. Ziogas, M. Peitzsch, S. Grossklaus, B. Sonnichsen, A. Gravanis, S. R. Bornstein, I. Charalampopoulos, C. Tsatsanis, T. Chavakis, DHEA inhibits acute microglia-mediated inflammation through activation of the TrkA-Akt1/2-CREB-Jmjd3 pathway. Mol Psychiatry 23, 1410-1420 (2018).
  
  4. C. Yang, J. C. Matro, K. M. Huntoon, D. Y. Ye, T. T. Huynh, S. M. Fliedner, J. Breza, Z. Zhuang, K. Pacak, Missense mutations in the human SDHB gene increase protein degradation without altering intrinsic enzymatic function. FASEB J 26, 4506-4516 (2012).
  
  5. H. S. Bose, B. Marshall, D. K. Debnath, E. W. Perry, R. M. Whittal, Electron Transport Chain Complex II Regulates Steroid Metabolism. iScience 23, 101295 (2020).
  
  6. W. Kanczkowski, V. I. Alexaki, N. Tran, S. Grossklaus, K. Zacharowski, A. Martinez, P. Popovics, N. L. Block, T. Chavakis, A. V. Schally, S. R. Bornstein, Hypothalamo-pituitary and immune-dependent adrenal regulation during systemic inflammation. Proc Natl Acad Sci U S A 110, 14801-14806 (2013).
  
  7. W. Kanczkowski, A. Chatzigeorgiou, S. Grossklaus, D. Sprott, S. R. Bornstein, T. Chavakis, Role of the endothelial-derived endogenous anti-inflammatory factor Del-1 in inflammation-mediated adrenal gland dysfunction. Endocrinology 154, 1181-1189 (2013).
  
  8. C. Jennewein, N. Tran, W. Kanczkowski, L. Heerdegen, A. Kantharajah, S. Drose, S. Bornstein, B. Scheller, K. Zacharowski, Mortality of Septic Mice Strongly Correlates With Adrenal Gland Inflammation. Crit Care Med 44, e190-199 (2016).
  
  9. D. Annane, V. Sebille, G. Troche, J. C. Raphael, P. Gajdos, E. Bellissant, A 3-level prognostic classification in septic shock based on cortisol levels and cortisol response to corticotropin. JAMA 283, 1038-1045 (2000).
  
  10. E. Boonen, S. R. Bornstein, G. Van den Berghe, New insights into the controversy of adrenal function during critical illness. Lancet Diabetes Endocrinol 3, 805-815 (2015).
  
  11. C. C. Huang, Y. Kang, The transient cortical zone in the adrenal gland: the mystery of the adrenal X-zone. J Endocrinol 241, R51-R63 (2019).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.29.490066v2
www.biorxiv.org www.biorxiv.org

The photosensitive phase acts as a sensitive window for seasonal multisensory neuroplasticity in male and female starlings

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  The manuscript by Jasmien Orije and colleagues has used advanced Diffusion Tensor and Fixel-Based brain imaging methods to examine brain plasticity in male and female European starlings. Songbirds provide a unique animal model to interrogate how the brain controls a complex, learned behaviour: song. The authors used DT imaging to identify known and uncover new structural changes in grey and white matter in male and female brains. The choice of the European starling as a model songbird was smart, as this bird has a larger brain to facilitate anatomical localization, clear sex differences in song behavior and well-characterized photoperiod-induced changes in reproductive state. The authors are commended for using both male and female starlings. The photoperiodic treatment used was optimal to capture the key changes in physiological state. The high sampling frequency provides the capability to monitor key changes in physiology, behaviour and brain anatomy. Two exciting findings were the increased roles of the cerebellum and hippocampal recruitment in female birds engaged in singing behaviour. The development of non-invasive, multi-sampling brain imaging in songbirds provides a major advancement for studies that seek to understand the mechanisms that control the motivation and production of singing behavior. The methods described herein set the foundation to develop targeted hypotheses to study how vocal learning, such as language, is processed in discrete brain regions. Overall, the data presented in the study are extensive and include a comprehensive analysis of regulated changes in brain microstructural plasticity in male and female songbirds.
  
  Reviewer #2:
  
  Orije et al. employed diffusion weighted imaging to longitudinally monitor the plasticity of the song control system during multiple photoperiods in male and female starlings. The authors found that both sexes experience similar seasonal neuroplasticity in multisensory systems and cerebellum during the photosensitive phase. The authors' findings are convincing and rely on a set of well-designed longitudinal investigations encompassing previously validated imaging methods. The authors' identification of a putative sensitive window during which sensory and motor systems can be seasonally re-shaped in both sexes is an interesting finding that advances our understanding of the neural basis of seasonal structural neuroplasticity in songbirds.
  
  Overall, this is a strong paper whose major strengths are:
  
  1) The longitudinal and non-invasive measure of plasticity employed
  
  2) The use of two complementary MR assays of white matter microplasticity
  
  3) The careful experimental design
  
  4) The sound and balanced interpretation of the imaging findings
  
  I do not have any major criticism but just a few minor suggestions:
  
  1) Pp 6-7. While the comparative description of canonical DTI with respect to fixel-based analysis is well written and of interest to readers with formal training in MR imaging, I found this entire section (and especially the paragraphs in page 7) too technical and out of context in a manuscript that is otherwise fundamentally about neuroplasticity in song birds. The accessibility of this manuscript to non-MR experts could be improved by moving this paragraph into the methods section, or by including it as supplemental material.
  
  The main purpose of this section was to introduce and explain the diffusion parameters which are used throughout the rest of the paper. Furthermore, we wanted to familiarize the reader with the concept of the population based template and the different structures that can be visualized by them. We agree that the technical details might have distracted from this main message. Therefore, we have trimmed the technical details out of this section and left a short explanation of the biological relevance of the different diffusion parameters and the anatomical structures visible on the population template. The technical details that were taken out are now a part of the material and methods section.
  
  The section now reads as follows:
  
  In the current study, we analyzed the DWI scans in two distinct ways: 1) using the common approach of diffusion tensor derived metrics such as fractional anisotropy (FA) and; 2) using a novel method of fiber orientation distribution (FOD) derived fixel-based analysis. Both techniques infer the microstructural information based on the diffusion of water molecules, but they are conceptually different (table 1). Common DTI analysis extracts for each voxel several diffusion parameters, which are sensitive to various microstructural changes in both grey and white matter specified in table 1. Fixel-based analysis on the other hand explores both microscopic changes in apparent fiber density (FD) or macroscopic changes in fiber-bundle cross-section (log FC) (table 1). Positive fiber-bundle cross-section values indicate expansion, whereas negative values reflect shrinkage of a fiber bundle relative to the template (Raffelt, Tournier et al. 2017).
  
  A population-based template created for the fixel-based analysis can be used as a study based atlas in which many of the avian anatomical structures can be identified (figure 2). We recognize many of the white matter structures such as the different lamina, occipito-mesencephalic tract (OM) and optic tract (TrO) among others. Interestingly, many of the nuclei within the song control system (i.e. HVC, robust nucleus of the arcopallium (RA), lateral magnocellular nucleus of the anterior nidopallium (LMAN), and Area X), auditory system (i.e. intercollicular nucleus complex, nucleus ovoidalis) and visual system (i.e. entopallium, nucleus rotundus) are identified by the empty spaces between tracts. The applied fixel-based approach is inherently sensitive to changes in white matter and cannot report on the microstructure within grey matter like brain nuclei; but rather sheds light on the fiber tracts surrounding and interconnecting them. As such, it provides an excellent tool to investigate neuroplasticity of different brain networks, and in the case of a nodular song control system focusing on changes in the fibers surrounding the song control nuclei, referred to as HVC surr, RA surr and Area X surr.
  
  2) Similarly, many sections, especially results, are in my opinion too detailed and analytical. While the employed description has the benefit of being systematic and rigorous, the ensuing narrative tends to be very technical and not easily interpretable by non experts. I think the manuscript may be substantially shortened (by at least 20% e.g. by removing overly technical or analytical descriptions of all results and regions affected) without losing its appeal and impact, but instead gaining in strength and focus especially if the new result narrative were aimed to more directly address the interesting set of questions the authors define in the introductory sections.
  
  We rewrote the results section, taking out the statistical reporting where it was also reported in a figure, to reduce the bulk of this section and make it more readable. We made some of the descriptions of the affected regions more approachable by replacing them with parts of the discussion. This way we incorporated some of the explanations of why certain findings are unexpected or relevant, as suggested by reviewer #3. Parts of the text that were originally in the discussion are indicated in purple.
  
  3) The possible effect of brain size has been elegantly controlled by using a medial split approach. Have the authors considered using tensor-based morphometry (i.e. using the 3D RARE scans they acquired) to account for where in the brain the small differences in brain size occur? That could be more informative and sensitive than a whole-brain volume quantification.
  
  We considered adding tensor-based morphometry, but we feel that log FC calculated with MRtrix3 can provide a similar account of the localization of these brain differences. Both methods are based on the Jacobian of the warps created between the individual images and the population template. They differ only in the starting images they use (3D RARE images in tensor-based morphometry or diffusion weighted images in the log FC metric of MRtrix3) and in the fact that MRtrix3 limits itself to the volume changes along a certain tract.
  
  The log FC difference in figure 4 gives a similar account of the differences in brain size between both sexes. Additionally, figure 6 indicates the log FC differences between small and large brain birds.
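  
  As a rough illustration of the quantity both approaches share, the sketch below computes the log of the local volume change from a 3×3 warp Jacobian, which is the basic measure in tensor-based morphometry. It assumes the Jacobian describes the template-to-subject mapping, so positive values mean local expansion relative to the template. The matrices are hypothetical, and the restriction to the plane perpendicular to a fiber bundle that the fixel-based log FC metric applies is not attempted here.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows of three numbers."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def log_volume_change(jacobian):
    """Natural log of the local volume change implied by a warp Jacobian.

    Assumption: the Jacobian maps template space to subject space, so a
    positive result means expansion and a negative result means shrinkage.
    """
    det = det3(jacobian)
    if det <= 0:
        raise ValueError("Jacobian determinant must be positive for a valid warp")
    return math.log(det)


# Hypothetical Jacobians: no change, 10% stretch per axis, 10% shrink per axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
expansion = [[1.1, 0.0, 0.0], [0.0, 1.1, 0.0], [0.0, 0.0, 1.1]]
shrinkage = [[0.9, 0.0, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 0.9]]

for name, jac in [("identity", identity), ("expansion", expansion), ("shrinkage", shrinkage)]:
    print(f"{name}: log volume change = {log_volume_change(jac):+.3f}")
```

  The sign convention matches the description of log FC earlier in the response: values above zero indicate expansion and values below zero indicate shrinkage relative to the template.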
  
  4) I think Fig. 3 and Fig. 4 may benefit from a ROI-based quantification of parameters of interest across groups (similar to what has been done for Fig. 7 and its related Fig. 8). This could help readers assess the biological relevance of the parameter mapped. For instance, in Fig. 3, most FA differences are taking place in low FA (i.e. gray matter dense?) regions.
  
  We added supplementary figures showing the ROI-based parameters extracted for figure 3 and figure 4. In line with this reasoning, we also added the same kind of supplementary figures for figures 5 and 6.
  
  Figure 3 - figure supplement 1: Overview of the fractional anisotropy (FA) changes over time extracted from the relevant ROI-based clusters with significant sex differences. The grey area indicates the entire photosensitive period of short days (8L:16D). Significant sex differences are reported with their p-value under the respective ROI-based cluster. Different letters denote significant differences by comparison with each other in post-hoc t-tests with p < 0.05 (Tukey’s HSD correction for multiple comparisons) comparing the different time points to each other. If two time points share the same letter, the fractional anisotropy values are not significantly different from each other.
  
  Figure 4 – figure supplement 2: Overview of the fiber density (FD) changes over time extracted from the relevant ROI-based clusters with significant sex differences. The grey area indicates the entire photosensitive period of short days (8L:16D). Significant sex differences are reported with their p-value under the respective ROI-based cluster. Different letters denote significant differences by comparison with each other in post-hoc t-tests with p < 0.05 (Tukey’s HSD correction for multiple comparisons) comparing the different time points to each other. If two time points share the same letter, the FD values are not significantly different from each other. Abbreviations: surr, surroundings.
  
  Figure 4 – figure supplement 3: Overview of the fiber-bundle cross-section (log FC) changes over time extracted from the relevant ROI-based clusters with significant sex differences. The grey area indicates the entire photosensitive period of short days (8L:16D). Significant sex differences are reported with their p-value under the respective ROI-based cluster. Different letters denote significant differences by comparison with each other in post-hoc t-tests with p < 0.05 (Tukey’s HSD correction for multiple comparisons) comparing the different time points to each other. If two time points share the same letter, the log FC values are not significantly different from each other. Abbreviations: surr, surroundings.
  
  Figure 5 – figure supplement 1: Overview of the fractional anisotropy (FA) changes over time extracted from the relevant ROI-based clusters with significant differences in brain size. The grey area indicates the entire photosensitive period of short days (8L:16D). Significant brain size differences are reported with their p-value under the respective ROI-based cluster. Different letters denote significant differences by comparison with each other in post-hoc t-tests with p < 0.05 (Tukey’s HSD correction for multiple comparisons) comparing the different time points to each other. If two time points share the same letter, the fractional anisotropy values are not significantly different from each other. Abbreviations: C, caudal; surr, surroundings.
  
  Figure 6 – figure supplement 2: Overview of the fiber density (FD) changes over time extracted from the relevant ROI-based clusters with significant differences in brain size. The grey area indicates the entire photosensitive period of short days (8L:16D). Significant brain size differences are reported with their p-value under the respective ROI-based cluster. Different letters denote significant differences by comparison with each other in post-hoc t-tests with p < 0.05 (Tukey’s HSD correction for multiple comparisons) comparing the different time points to each other. If two time points share the same letter, the FD values are not significantly different from each other. Abbreviations: C, caudal; surr, surroundings.
  
  Figure 6 – figure supplement 3: Overview of the fiber-bundle cross-section (log FC) changes over time extracted from the relevant ROI-based clusters with significant differences in brain size. The grey area indicates the entire photosensitive period of short days (8L:16D). Significant brain size differences are reported with their p-value under the respective ROI-based cluster. Different letters denote significant differences by comparison with each other in post-hoc t-tests with p < 0.05 (Tukey’s HSD correction for multiple comparisons) comparing the different time points to each other. If two time points share the same letter, the log FC values are not significantly different from each other. Abbreviations: C, caudal; surr, surroundings.
  
  5) In Abstract: "We longitudinally monitored the song and neuroplasticity in male.." Perhaps something should be specified after the "the song"? Did the authors mean "the neuroplasticity of song system"?
  
  No, this is not what we meant, we monitor song behavior and neuroplasticity independently. In our study, we do not limit ourselves to the neuroplasticity of the song system, but instead use a whole brain approach. The monitoring of the song behavior in itself might be useful for other songbird researchers.
  
  We clarified this in the abstract as follows:
  
  We longitudinally monitored the song behavior and neuroplasticity in male and female starlings during multiple photoperiods using Diffusion Tensor and Fixel-Based techniques.
  
  Reviewer #3:
  
  In their paper, Orije et al used MRI imaging to study sexual dimorphisms in brains of European starlings during multiple photoperiods and how this seasonal neuroplasticity is dependent on brain size, song rates and hormonal levels. The authors' main findings include differences in hemispheric asymmetries between the sexes, multisensory neuroplasticity in the song control system and beyond it in both sexes and some dependence of singing behavior in females with large brains. The authors use different methods to quantify the changes in the MRI data to support various possible mechanisms that could be the basis of the differences they see. They also record the birds' song rates and hormonal levels to correlate the neural findings with biologically relevant variables.
  
  The analysis is very impressive, taking into account the massive data set that was recorded and processed. Whole-brain data driven analysis prevented the authors from being biased to well-known sexually dimorphic brain areas. Sampling of a large number of subjects across many time points allowed for averaging in cases where individual measurements could not show statistical significance. The conclusions of the paper are mostly well supported by data (except for some confounds that the authors mention in the text). However, the extensive statistically significant results that are described in the paper make it hard to follow at times.
  
  1) In the introduction the authors mention the pre-optic area as a mediator for increased singing and therefore seasonal neuroplasticity. Did the authors find any differences in that area or other well-known nuclei that are involved in courtship (PAG for example)?
  
  Interestingly, we did not detect any seasonal changes in the pre-optic area or PAG, whereas prior studies reported volume changes in the POM within 1-2 days after testosterone administration in canaries (Shevchouk, Ball et al. 2019). In male European starlings, POM volumes changed seasonally, although this seems to depend on whether or not the males possessed a nest box (Riters, Eens et al. 2000). In our setup, the starlings were not provided with nest boxes. The lack of seasonal change in POM could therefore have a biological reason, besides the limitations of our methodology: since these regions are small, grey-matter-like structures, they are less likely to be picked up with our diffusion MRI methods.
  
  2) Following the first comment, what is the minimum volume of an area of interest that could be detected using the voxel analysis?
  
  The up-sampled voxel size is (0.175 × 0.175 × 0.175) mm³. In the voxel-based statistical analysis, the significance threshold is set at a minimum cluster size of 10 voxels, i.e. approximately 0.05 mm³.
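  
  The arithmetic behind the stated detection limit can be checked with a short sketch. The two constants come from the response above; the function names are illustrative only.

```python
# Values stated in the author response.
VOXEL_EDGE_MM = 0.175     # edge length of the isotropic up-sampled voxel, in mm
MIN_CLUSTER_VOXELS = 10   # minimum cluster size used as significance threshold


def voxel_volume_mm3(edge_mm):
    """Volume of an isotropic voxel in cubic millimetres."""
    return edge_mm ** 3


def min_cluster_volume_mm3(edge_mm, n_voxels):
    """Smallest detectable cluster volume for a given voxel size and threshold."""
    return n_voxels * voxel_volume_mm3(edge_mm)


single = voxel_volume_mm3(VOXEL_EDGE_MM)
cluster = min_cluster_volume_mm3(VOXEL_EDGE_MM, MIN_CLUSTER_VOXELS)
print(f"single voxel: {single:.6f} mm^3")      # single voxel: 0.005359 mm^3
print(f"minimum cluster: {cluster:.4f} mm^3")  # minimum cluster: 0.0536 mm^3
```

  Rounded to two decimals, the minimum cluster volume is the 0.05 mm³ quoted in the response.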
  
  3) It would be useful to have a figure describing the song system in European starlings and how the auditory areas, the cerebellum and the hippocampus are connected to it, before describing the results. It would make it easier for the broader community to make a better sense of the results.
  
  An additional figure was added to the introduction to give a schematic overview of the song control system, the auditory system and the proposed cerebellar and hippocampal projections. This scheme includes both a 2D, and a 3D representation as well as a movie of the 3D representation of the different nuclei and the tractography.
  
  Figure 1: Simplified overview of the experimental setup (A), schematic overview of the song control and auditory system of the songbird brain and the cerebellar and hippocampal connections to the rest of the brain (B) and unilateral DWI-based 3D representation of the different nuclei and the interconnecting tracts as deduced from the tractogram (C). Male and female starlings were measured repeatedly as they went through different photoperiods. At each time point, their songs were recorded, blood samples were collected and T2-weighted 3D anatomical and diffusion weighted images (DWI) were acquired. The 3D anatomical images were used to extract whole brain volume (A). The song control system is subdivided in the anterior forebrain pathway (blue arrows) and the song motor pathway (red arrows). The auditory pathway is indicated by green arrows. The orange arrows indicate the connection of the lateral cerebellar nucleus (CbL) to the dorsal thalamic region further connecting to the song control system as suggested by (Person, Gale et al. 2008, Pidoux, Le Blanc et al. 2018) (B,C). Nuclei in (C) are indicated in grey, the tractogram is color-coded according to the standard red-green-blue code (red = left-right orientation (L-R), blue = dorso-ventral (D-V) and green = rostro-caudal (R-C)). For abbreviations see abbreviation list.
  
  Figure 1 – figure supplement 1: Movie of the unilateral 3D representation of the different nuclei and the interconnecting tracts rotating along the vertical axis.
  
  4) In the results section the authors clearly describe which brain areas are sexually dimorphic or change during the photoperiod and what is the underlying reason for the difference. However, only in the discussion section it is clearer why some of those differences are expected or surprising. It would be useful to incorporate some of those explanations in the results section other than just having a long list of brain areas and metrics. For example, I found the involvement of visual and auditory areas in the female brain in the mating season very interesting.
  
  Next to the reductions in technical explanation suggested by reviewer #2, we replaced some of the description of significant regions with parts of the discussion and vice versa (indicated in purple). This way we incorporated some of the explanations of why certain findings are unexpected or relevant. Furthermore, we added some extra information on why these changes are relevant for the visual system and the cerebellum.
  
  In line 420: Neuroplasticity of the visual system could be relevant to prepare the birds for the breeding season, where visual cues like ultraviolet plumage colors are important for mate selection (Bennett, Cuthill et al. 1997).
  
  In line 424: This shows that multisensory neuroplasticity is not limited to the cerebrum, but also involves the cerebellum, something that has not yet been observed in songbirds.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.01.21.427111v1
www.biorxiv.org www.biorxiv.org

New submission 27/01/2023, 09:21:11

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  […] Overall, the results from these analyses are convincing and valuable, but still do not seem to be a big leap from their Unger 2021 paper […]. The methodology that they established should be described more clearly so that it can be shared with the research community. For example, they say cells how many donors were recruited for this experiment? are there differences in efficiency in B cell differentiation by individual?
  
  Also, it would be important to assay for antibodies in the culture media. How would you suggest to improve the culture system to be used to model diseases?
  
  We appreciate the reviewer's queries and the points raised. In response to the first set of comments, the reviewer has correctly observed that the methodology of the assay itself as employed in this paper is not new or superior to our previously published data in (Unger et al., Cells 2021), where we described a minimalistic in vitro system for efficient differentiation of human naive B cells into antibody-secreting cells (ASCs). However, the current study aims to provide a comprehensive evaluation of the phenotype of the cells in the in vitro system and their relationships in potential differentiation pathways. In addition, we aimed to elucidate how the detailed gene expression profiles of the differentiating cells in vitro compare to their counterparts observed in vivo. In this way, we were able to uncover an antibody-secreting cell phenotype in vivo that was not observed before and could only be uncovered due to our full transcriptome knowledge of these cells. In addition, we present novel findings that demonstrate that this culture system not only enables efficient ASC generation but also recapitulates the entire in vivo B cell differentiation pathway, as evidenced by the presence of germinal-center (GC)-like and pre-memory B cells in the culture. These results have not been previously reported in the literature for human B cells in culture and represent a significant contribution to the field of human B cell biology.
  
  In regard to the reviewer's inquiry about the cell culture protocol, its reproducibility, donor variability, and additional experimental applications, we refer to three additional recent publications from our group that have adopted the same in vitro B cell differentiation system and have provided extensive analysis of the immunoglobulin production, intracellular signaling pathways, as well as comparison with other culture systems in the field (Marsman et al., Cells 2020; Marsman et al., Eur. J. Immunol. 2022; Marsman et al., Front. Immunol. 2022). On top of this, we now realize that the section that describes the culture system (MATERIAL AND METHODS - “In vitro naive B cell differentiation cultures”) was a bit too concise and we thank the reviewer for mentioning it. We have now extended it and corrected an inconsistency at lines 125-127: “After six days, activated B cells were collected and co-cultured with 1 × 10⁴ 9:1 wild type (WT) to CD40L-expressing 3T3 cells that were irradiated and seeded one day in advance (as described above), together with IL-4 (100 ng/ml) and IL-21 (50 ng/ml; Invitrogen) for five days.”
  
  As for the application of our in vitro system in disease modeling, as requested by the reviewer, this would require modifying the culture conditions to mimic the disease-specific biological background (if known), for instance by inhibiting or enhancing specific transcriptional pathways that are known to be associated with the disease in question. However, it would also require the presence of antigen-specific B cells in the pool of naive B cells included in the culture, which can be difficult to achieve due to their low frequency. Alternatively, the system could be used to study antigen-specific recall responses using antigen-specific memory B cells as starting material. Our group has evaluated this approach in a recent publication (Marsman et al., Front. Immunol. 2022).
  
  [..] B cell differentiation may also influence cell cycle regulation. Rather than normalizing its effect, can the authors analyze the effect of the cell cycle on B cell differentiation? [...]
  
  We very much agree with the reviewer and know that the cell cycle plays a significant role in B cell differentiation output trajectories (Zhou et al., Front. Immunol. 2018; Duffy et al., Science 2012). While preparing the manuscript, we in fact performed a parallel analysis in which we compared clustering and marker gene selection with and without cell cycle regression. Concerning the clustering, additional clusters were obtained using the not-cell-cycle-regressed dataset compared to the cell-cycle-regressed dataset (figure below). However, when overlaying the clusters obtained with the cell-cycle-regressed dataset, the extra clusters were the same cell populations, now split into cycling and not-cycling cells: cluster 2 is now divided into the cycling cluster “c” and the not-cycling cluster “d”, while clusters 4 and 5 are now divided into the cycling cluster “e” and the not-cycling cluster “f”. A comprehensive examination of the expression of the top 50 genes associated with antibody-secreting cells in the (non)cycling clusters 4 and 5 reveals that these genes are expressed at a higher level in (non)cycling cluster 5 than in cluster 4. This suggests that the cells within cluster 5 are more advanced in their differentiation, regardless of their cell cycle state. This finding has led us to the decision to present the data that has undergone cell cycle regression in the manuscript. Should the reviewer so desire, we are very willing to include additional supplementary figures in the manuscript that show the un-regressed representation.
  
  Figure legend: A-C) UMAP projection of single-cell transcriptomes of in vitro differentiated human naive B cells without cell cycle regression. Each point represents one cell, and colors indicate graph-based cluster assignments identified without cell cycle regression (A), with cell cycle regression (B) or with cell cycle regression and additional subdivision into cycling and not-cycling cells (C). D) Dotplot showing the top 50 differentially expressed genes in cycling and not-cycling cells from clusters 4 and 5. Point size indicates the percentage of cells in the cluster expressing the gene; color indicates average expression.
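  
  To illustrate what cell cycle regression does in principle, the minimal sketch below removes the linear effect of a single cell cycle score from one gene's expression values and keeps the residuals. It is a simplification under stated assumptions: the response does not say which software or scores were used, the function `regress_out`, the score and the expression values are all hypothetical, and typical single-cell pipelines fit separate S and G2/M phase scores across all genes.

```python
from statistics import fmean


def regress_out(expression, covariate):
    """Residuals of `expression` after an ordinary least-squares fit on `covariate`.

    Both arguments are equal-length lists with one value per cell.
    """
    if len(expression) != len(covariate):
        raise ValueError("expression and covariate must have the same length")
    mean_x = fmean(covariate)
    mean_y = fmean(expression)
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        # Constant covariate: nothing to regress, only centre the data.
        return [y - mean_y for y in expression]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, expression)]


# Hypothetical data: two non-cycling cells (score 0) and two cycling cells (score 1).
cycle_score = [0.0, 0.0, 1.0, 1.0]
gene_expression = [2.0, 2.2, 5.0, 5.2]

residuals = regress_out(gene_expression, cycle_score)
print([round(r, 3) for r in residuals])
```

  Before regression the cycling and not-cycling cells differ by about three units for this gene. Afterwards the residuals no longer track the cycle score, which is why clusters that differ only in cycling state tend to merge once the cell cycle effect is regressed out.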
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.10.03.510595v1
www.biorxiv.org www.biorxiv.org

New submission 06/12/2022, 13:46:45

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  The manuscript by the Qiu and Lu labs investigates the mechanism of desensitization of the acid-activated Cl- channel, PAC. These trimeric channels reside in the plasma membrane of cells as well as in organelles and play important roles in human physiology. PAC channels, like many other ion channels, undergo a process known as desensitization, where the channel adopts a non-conductive conformation in the presence of a prolonged physiological stimulus. For PAC the molecular mechanisms regulating this process are not well understood. Here the authors use a combination of electrophysiological recordings and MD simulations to identify several acidic residues and a conserved histidine side chain as important players in PAC desensitization. The results are overall interesting and clearly indicate a role for these residues in this process. However, there are several weaknesses in the experimental design, inconsistencies between the mutagenesis data and the MD results, as well as in the interpretation of the data. For these reasons I do not think the authors have made a convincing mechanistic case.
  
  We thank the reviewer for the constructive comments and address the concerns point-by-point below.
  
  Major weaknesses:
  
  The underlying assumption in the interpretation of all the data is that the mutations stabilize or destabilize the desensitized conformation of the channel. However, none of the functional measurements provide direct evidence supporting this key assumption. Without direct evidence supporting the notion that the mutations specifically impact the rate of recovery from desensitization, I do not think the authors have made a convincing mechanistic case.
  
  We agree with the reviewer that our functional data measure the degree and rate of the PAC channel entering desensitization from the activated state upon prolonged acid treatment. This is a common experimental procedure for research on desensitization/inactivation of ion channels. Following the reviewer’s suggestion, we also sought to capture the kinetics from the desensitized state to the activated state by switching from more acidic pH to less acidic pH (for example 4.0 to 5.0) or neutral pH. However, we found that such experiments are not feasible, partly because the kinetics of PAC desensitization is much slower compared to other channels, such as ASIC channels (see a recent study we cited: https://elifesciences.org/articles/51111). For the mutants with strong desensitization (E94R and D91R), it’s unclear whether the currents we recorded at pH 5.0 right after pH 4.0 represent the activated state or the desensitized state at pH 5.0. In other words, we don’t know if the PAC channel transitions from the desensitized state at a lower pH back to the activated state or rather directly to the desensitized state at a higher pH. For the mutants with reduced desensitization, the current amplitudes at pH 4.0 were often similar to those at pH 5.0, which makes the recovery/transition variable. We also tried to switch from acidic pH to neutral pH. We found that the PAC channels (both WT and mutants) go back to the closed state from the desensitized state in seconds, as limited by our perfusion speed. These data suggest that the desensitized state of PAC is no longer maintained after switching buffer from low pH to neutral pH. In summary, it’s technically infeasible, in our opinion, to measure the rate of recovery from desensitization to activation for the PAC channel. However, our data do support the conclusion that the rates of entering desensitization from the activated state, a standard measurement of desensitization, change for the various channel mutants we studied.
  
  Overall, the agreement between the MD simulations, functional data, and interpretation are often weak and some issues should be acknowledged and addressed.
  
  For example:
  
  1) The experimental data suggests that H98, E107, and D109 play analogous roles in PAC desensitization. However, the MD simulations suggest that the H98-D109 interaction energy is ~4 times larger than that of H98-E107. This should lead to a much greater effect of the D109 mutation. How is this rationalized?
  
  The purpose of quantifying the interaction between H/R98 with E107 and D109 is to better dissect the mechanism by which H/R98 interacts with the acidic pocket residues. The result suggests that R98 has a reduced association with E107/D109 when compared to H98. It also suggests that D109 makes a more direct interaction with H/R98 when compared to E107. We acknowledge that this was not clear in our initial manuscript and we have updated the text to better describe this result. However, this doesn’t imply that the desensitization phenotype of E107R should be less pronounced than that of D109R. Both E107R and D109R are expected to disrupt the integrity of the acidic pocket, thus resulting in diminished channel desensitization. It is worth pointing out that E107 plays a more complex role, as it was identified in our previous papers as one of the major proton sensors. The E107R mutant could allow the PAC channel to become more sensitive to acid-induced activation (Figure 4d-e in Ruan et al, Nature, 2020), further complicating its effect on desensitization. Taken together, we don’t think the E107/D109 and H/R98 interaction strength would correlate quantitatively with the desensitization phenotype of E107R and D109R.
  
  2) The experimental data shows that E94 plays a key role in desensitization and the authors argue that this is due to the interactions of this residue with the β10-11 linker. However, the MD simulations show that these interactions happen for a small fraction, ~10%, of the time and with interaction energies comparable to those of the H98-E107-D109 cluster. It is not clear how these sparse and transient interactions can play such a critical role in desensitization. Also, if the interaction energies are of the same sign, how come one set of mutants favors desensitization and one does not?
  
  The 10% value is the fraction of time during which at least one hydrogen bond forms between E94/R94 and the β10–β11 loop. It is NOT the fraction of time during which they interact, as there can be other types of non-bonded interactions, such as van der Waals and Coulombic interactions. In fact, our non-bonded energy calculation clearly suggests that R94 interacts with the β10–β11 loop much more favorably than E94 (Figure 4C). The impact of E94R on the β10–β11 loop is also reflected in the root-mean-square-fluctuation analysis, where the β10–β11 loop shows reduced flexibility when R94 is present (Figure 4B).
  
  Our central hypothesis is that PAC becomes more prone to desensitization when the desensitized conformation is stabilized. Two critical interactions are characteristic of the desensitized structure of PAC: the association of E94 with the β10–β11 loop, and that of H98 with E107/D109. Therefore, we expect mutations that alter these interactions to affect PAC channel desensitization. Based on the MD simulations, we observed that the root-mean-square fluctuation of the β10–β11 loop is reduced for E94R when compared to WT (Figure 4B), suggesting that the β10–β11 loop is stabilized when E94 is replaced by an arginine. The non-bonded interaction energy between residue 94 and the β10–β11 loop is also more negative for E94R when compared to WT (Figure 4C), another indicator of conformational stabilization. As a result, the E94R mutant favors desensitization. This is in sharp contrast with the H98R data, in which H98R interacts less favorably with E107/D109 (Figure 2F, G, H, I) when compared to WT. Although the interaction energies are of the same sign, it is the difference between WT and the mutants that will ultimately determine whether a certain mutation will favor desensitization or not.
  
  The authors' MD analysis critically depends on assumptions on the protonation states of multiple residues, that are often located in close proximity to each other. In the methods, the authors state they use PropKa to estimate the pKa of residues and assigned the protonation states based on this. I have several questions about this procedure:
  
  What pH was considered in the simulations? I imagine pH 4.0 to match that of the electrophysiological experiments.
  
  The exact pH environment cannot be explicitly modeled in standard MD, as the protonation state of an ionizable group is not allowed to change during the simulation. Therefore, we prepared the MD system by first predicting the pKa of the titratable residues of PAC in the desensitized state, and then assigning the protonation state of these residues based on the pKa values. We acknowledge that the description of this part was not very clear in our original manuscript. We have revised the methods to better describe how the protonation state is assigned.
  
  Was the propKa analysis run considering how choices in the protonation state of neighboring residues affect the pKa of the other residues? This is critical because the interaction energies will greatly depend on the protonation state chosen.
  
  The pKa analysis was done based on the WT structure and the residue protonation status was assigned based on the predicted value. It is possible that mutations on certain residues could change the pKa of neighboring residues. To evaluate this impact, we carried out pKa prediction for all the mutant structures that we used as input for simulation. This is summarized in the table below:
  
  As shown in the table, although mutations will affect the pKa of neighboring residues, the impact is generally within 0.3 units. As our simulation is carried out based on a pH of 4.0, this variability will not affect how we assign the residue protonation status.
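
  The assignment rule described above can be sketched in a few lines. This is a minimal illustration, assuming the common convention that a titratable group is modeled as protonated when its predicted pKa lies above the simulation pH; the function names and the example pKa values are hypothetical and are not taken from the authors' table or code.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of the protonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def is_protonated(pka: float, ph: float) -> bool:
    """Fixed-state rule (assumed): model the residue as protonated if pKa > pH."""
    return pka > ph


def assignment_is_robust(pka: float, ph: float, shift: float = 0.3) -> bool:
    """True if a pKa shift of +/- `shift` units does not flip the assigned state."""
    return is_protonated(pka - shift, ph) == is_protonated(pka + shift, ph)


if __name__ == "__main__":
    simulation_ph = 4.0
    # Hypothetical pKa values, for illustration only.
    for label, pka in [("example_His", 6.5), ("example_Asp", 3.2), ("example_Glu", 4.1)]:
        print(
            label,
            is_protonated(pka, simulation_ph),
            round(protonated_fraction(pka, simulation_ph), 3),
            assignment_is_robust(pka, simulation_ph),
        )
```

  Under this rule, a 0.3-unit shift only matters for residues whose predicted pKa lies within 0.3 units of the simulation pH, which is why the last hypothetical entry is flagged as not robust.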
  
  Was the pKa for the mutant constructs re-evaluated? For example, does having a Gln or Arg in place of a His affect the pKa of nearby acidic residues?
  
  We didn’t re-evaluate the pKa for each mutant in our initial manuscript. We have now conducted such an analysis, as indicated in the above table. The result suggests that arginine substitutions of H98/E94/D91 could have an impact on the pKa values of nearby residues. However, the difference is relatively small and does not alter the predominant protonation state of these residues at pH 4.0.
  
  H98R and Q have the same functional effect. The MD partially rationalizes the effect of H98R, however, it is not clear how Q would have the same effect as R on the interaction energies.
  
  Our analyses of H98R and H98Q serve two different purposes. H98 is expected to be protonated at pH 4.0. The fact that the H98Q mutant reduced PAC desensitization suggests that a positive charge at this position is critical for PAC desensitization, which we attribute to the loss of the favorable interaction between H98 and E107/D109. This is different from the H98R mutant, as arginine bears the same amount of charge as a protonated histidine. Our data suggest that the exact biochemical properties of H98, including its charge and side-chain flexibility, are crucial for PAC desensitization.
  
  Are 600 ns sufficient to evaluate sampling of the different conformations?
  
  Our MD analysis does not aim to sample large conformational transitions between different functional states. Instead, our analysis focused on local dynamics, which allowed us to correlate the observations with the electrophysiology data. During the revision, we extended our simulations to 1 μs for each mutant. It is worth pointing out that, because the PAC protein is a trimer and we performed all the calculations across the three subunits, the effective sampling time is 3 μs in total. The new results remain the same as in our initial analysis, suggesting that the sampling time is sufficient to evaluate the metrics reported in the study. We have also acknowledged this limitation of our study in the discussion.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.30.505880v1
www.biorxiv.org www.biorxiv.org

Endothelial SIRPα signaling controls thymic progenitor homing for T cell regeneration and antitumor immunity

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #2 (Public Review):
  
  The molecular mechanisms as well as the cellular players of colonization of the adult thymus are incompletely understood. In this manuscript, the authors investigate the role of the SIRPa-CD47 ligand pair in seeding of bone-marrow derived progenitors to the adult murine thymus. The study is based on the authors' earlier characterization of thymic portal endothelial cells, which have a role in mediating progenitor homing to the thymus (Shi et al., 2016). The authors show that loss of SIRPa or CD47 results in reduced frequencies and numbers of early T lineage progenitors (ETPs), but no substantial alterations in thymocyte numbers at later developmental stages and of bone-marrow precursors. Short-term homing assays suggest impaired colonization of the thymus. The authors further characterize cell biology and biochemistry of the SIRPa-CD47 system using peripheral lymphocyte co-cultures with genetically engineered MS1 endothelial cells. Finally, they assess the role of SIRPa-CD47 in thymus regeneration in combination with growth of a model tumor.
  
  Strengths:
  
  The authors describe a clear phenotype, consistent with the moderate effect size in ETP loss upon deletion of other homing mediators, such as PSGL-1 or individual chemokine receptors, such as CCR7, CCR9 or CXCR4.
  
  The authors use multiple genetic models, including both, SIRPa and CD47 deficient mouse strains, to support their findings. Using the Tie2Cre model for endothelial cell-specific deletion is particularly informative and could have been used more extensively. Some data are further strengthened by the complementary use of inhibitory SIRPa-Ig fusion proteins.
  
  In vitro analysis of the molecular mechanism and the role of signaling mediators using MS1 cells is well executed and conclusive.
  
  Weaknesses:
  
  Short-term homing assays suffer from the problem that the system is overwhelmed by an excessive number of donor cells (millions), whereas at steady state only a few hundred HPCs capable of colonizing the thymus circulate in peripheral blood, questioning the physiological relevance of this approach. The short-term nature of the experiments also precludes analysis, whether homed cells do in fact constitute T cell progenitors. More suitable experiments comprise mixed competitive bone marrow chimeras using congenically discernible donor cells or, even better, transfers into non-irradiated recipients of defined age as pioneered by the Goldschneider and Petrie labs. Thus, the conclusion that the SIRPa-CD47 system mediates homing of thymus seeding progenitors is not fully justified.
  
  a) Thank you for the comments. To overcome the disadvantage of total bone marrow transfer, we sorted progenitor-containing lineage^- bone marrow cells, which account for about 3% of total bone marrow cells, by MACS enrichment followed by FACS. The number of donor cells needed for transfer was therefore reduced from 5×10^7 total bone marrow cells per mouse to less than 1×10^6 lineage^- cells per mouse. This should prevent the overwhelming effect seen with the previous method. The short-term homing assay with 1×10^6 lineage^- bone marrow cells confirmed the homing defect in the thymus of Sirpα^-/- mice (new Figure 2I), but not in the spleen (new Figure 2—figure supplement 2J).
  
  b) To track whether immigrated lineage^- progenitors actually develop into thymocytes, we conducted adoptive transfer of congenically marked (CD45.1) WT lineage^- cells into naïve non-irradiated WT or Sirpα^-/- (CD45.2) recipients. Three weeks later, donor-derived cell subsets were analyzed. A significant defect in donor-derived thymocyte development, particularly at the DN and DP stages, was found in Sirpα^-/- mice, as shown in new Figure 2J,K. Therefore, the defective thymic homing of progenitor cells in Sirpα^-/- mice indeed influences subsequent T cell development.
  
  c) Mixed bone marrow chimeras, or transfer of mixed congenically discernible WT and CD47KO progenitor cells into non-irradiated WT recipients, are not applicable, as explained in detail in the response to the 2nd point of the Summary of Essential Revisions. This is probably due to rapid clearance of CD47-null cells from the system by phagocytosis (Jaiswal et al., 2009). Therefore, it currently remains technically difficult to address the role of CD47 on progenitor cells in thymic homing using mixed competitive bone marrow chimeras or mixed progenitor cell transfer into non-irradiated hosts. Instead, we have used a cleaner in vitro transwell assay to confirm the role of CD47 on progenitor cells during TEM (new Figure 4F), as explained in more detail just below.
  
  While technically elegant and mechanistically conclusive, the in vitro studies using MS1 cells and peripheral lymphocytes are somewhat isolated from the original focus of the paper addressing the role of SIRPa-CD47 specifically in thymus seeding. It should be considered devising similar assays replacing lymphocytes with bone-marrow derived progenitors.
  
  The major in vitro transendothelial migration assays have been repeated with FACS-sorted lineage^- bone marrow progenitor cells (Lin^- BMCs). Lin^- BMCs showed a significant defect in TEM on Sirpα^-/- ECs compared to that on WT ECs (new Figure 3F); Cd47^-/- Lin^- BMCs also showed a significant defect in TEM compared with WT Lin^- BMCs (new Figure 4F). Therefore, the conclusion that progenitor CD47 - endothelial SIRPα signaling is required for TEM remains unchanged.
  
  Analysis of thymus regeneration is interesting, but a number of open questions remain for this experimental setup, also in part raised by the authors in the discussion section. Most notably, during regeneration, the reduction in ETPs is accompanied by reduced numbers in more mature thymocyte subsets and peripheral T cells. Such a reduction was not observed at steady-state in KO models and it cannot be concluded from this experiment, that these observations are caused by a defect in thymus colonization. Notably, SL-TBI is associated with massive cell death and alterations in phagocytosis and many other factors may come into play here as well.
  
  We agree with these comments. CV1 treatment during SL-TBI-induced thymic injury and regeneration is a complicated scenario. To make it cleaner, we performed SL-TBI directly on Sirpα^-/- mice and control mice. Congenically marked bone marrow cells were also adoptively transferred for better monitoring. At 4 weeks after transfer, the donor-derived DN thymocyte subset was found to be defective in Sirpα^-/- recipients compared to that in control hosts (Figure R1). However, the DP and SP subsets did not show a difference, probably due to a compensation effect.
  
  Figure R1. Reconstitution of bone marrow-derived progenitors in Sirpα^-/- mice. (A) Schematic view of the experiment. (B,C) Statistics of proportion (B) and cell number (C) of donor-derived cells in the thymus 4 weeks after SL-TBI and adoptive transfer. n=6 in each group, unpaired t-test applied. *: p<0.01
  
  As the reviewer indicated, SL-TBI is associated with massive changes in many respects. Therefore, we also tested the role of SIRPα in thymic homing and thymocyte development at steady state. First, we conducted a short-term homing assay using sorted lineage^- bone marrow progenitor cells instead of total bone marrow cells, to avoid the overwhelming effect of the massive number of cells used. The short-term homing assay with 1×10^6 lineage^- bone marrow progenitor cells showed a similarly significant defect in the Sirpα^-/- recipient thymus (new Figure 2I), but not in the spleen (new Figure 2—figure supplement 2J). Second, we also examined subsequent T cell development in this scenario. At 3 weeks after adoptive transfer of lineage^- bone marrow progenitor cells, a significantly reduced population of donor-derived thymocytes (mainly the DP subset) was found in Sirpα^-/- mice (new Figure 2J,K). However, it should be noted that later stages of thymocyte development, such as SP, were not significantly impaired, although there was a trend toward reduction in Sirpα^-/- mice.
  
  Thus, our data suggest that while SIRPα deficiency results in impaired thymic homing of progenitor cells and is accompanied by a reduced ETP population and impaired early thymocyte development, later thymocyte development is less affected, probably due to a compensation effect. Whether this effect might be amplified in certain scenarios remains an intriguing open question.
  
  Taken together, the study in its present form contains the description of an interesting new phenotype, consistent with a role of the CD47-SIRPa interaction in colonization of the thymus by bone-marrow derived progenitors. However, at present, homing experiments lack sufficient rigor and experiments on thymus regeneration, while showing an interesting additional finding, do not justify concluding that homing is the mechanistic explanation.
  
  Thank you for the comment. With these new data, we hope that the role of SIRPα in thymic progenitor homing, in T cell development at steady state, and in T cell regeneration in the SL-TBI scenario has been made clearer. We agree that the causal relationship between thymic progenitor homing and thymus regeneration is still indirect and inconclusive, and may require further investigation in the future. In this study, we would like to place more emphasis on the novel role of CD47-SIRPα in controlling thymic progenitor homing, and on the underlying molecular and biochemical mechanism. We hope these have been validated.
  
  Reviewer #3 (Public Review):
  
  The manuscript by Ren et al. seeks to describe a role for endothelial cell (EC) expression of Sirpα in the importation of hematopoietic progenitors from the circulation into the thymus. Specifically, the authors demonstrate that there is a reduction in the number of the earliest T lineage progenitors (ETPs) in the thymus in mice deficient for Sirpa or CD47 (its ligand), and through a series of elegant in vitro transendothelial migration studies, identify that intracellular Sirpα signaling mediates this process by regulating VE-Cadherin expression and thus EC tight junctions. In particular, the use of transwell assays modified to study TEM is particularly well utilized to tease apart the mechanisms. Overall, I found this to be an excellent manuscript. In fact, every time I had a critique developing in my head, the authors quickly dispensed with it by producing some follow-up data that addressed my concern! My biggest concern with the manuscript is that it was difficult to determine exactly how many repeats of each experiment have been performed and what data is being presented in the figures (and being statistically analyzed). This should not change the conclusions of the manuscript but will make reading the figures and matching them with the legends easier. The following are some major and minor concerns that should be addressed to strengthen the manuscript:
  
  Major:
  
  • My main concern is that there needs to be greater care taken with highlighting the number of repeats done for each individual study as it is not always clear. For instance, in Figure 2 the data are presented as being representative of three independent experiments with an n of 3 in each experiment but in 2B, D, and F there are 4 data points for the Sirpa-/- group. This is likely explained by there being 4 mice in that particular experiment, but that is why the numbers should be presented for each experiment rather than a general statement at the end. Another example of this is that in Figure 2 S1 the authors would like to claim that the only differences are in the DN1 subsets which contains the ETPs. However, it is likely this is just due to low numbers as it seems like there is a real decrease in the number of DN2, DN3, DN4 and even DP thymocytes (as well as total cellularity).
  
  1. This should not change any conclusions of the paper but will aid in reader interpretation.
  
  Thank you for your advice. We apologize for the oversight; we have rechecked all figure legends and now report the sample size for each panel individually. Furthermore, we repeated the experiments that had too few samples per group. For mouse experiments, we used littermates, so the groups did not always contain equal numbers of mice; the number of mice used is now stated specifically for each experiment. For thymic subset detection in Sirpα^-/- mice, we have increased the sample size (n=5 for both the Sirpα^-/- and control groups, as shown in Figure 2—figure supplement 1AE) and indeed found a significant decrease in the DN2, DN3 and DN4 subsets in Sirpα KO mice, though total cellularity was still not significantly changed. Overall, the conclusion of defective early thymocyte development in Sirpα^-/- mice remains valid.
  
  2. In this manuscript the authors show that Sirpa expression by TPECs is critical for their capacity to guide the importation of HPCs, and in their previous work they have shown that lymphotoxin can regulate the importation capacity of these same TPECs. Therefore, it would be extremely interesting to know if LT signaling is regulating the expression of Sirpa. Furthermore, it would be important to at least comment on what may be influencing Sirpa expression. For instance, we know from the work of Petrie and others that DN niche availability can influence the ability of the thymus to import progenitors. Similarly, after TBI the "gates" are let open and the capacity of the thymus to import progenitors increases. Do the authors know (or could they comment on) what happens to Sirpa expression in ECs after TBI?
  
  Thank you for your suggestion. How SIRPα expression is regulated on TPECs is an interesting and important question. As the reviewer suggested, we examined SIRPα expression in different settings. Given the important role of LT-LTβR signaling in TPEC development and maintenance, we first tested whether the LT-LTβR signal is required for SIRPα expression. However, the remaining TPECs in Ltbr^-/- mice showed a similar level of SIRPα expression compared to that in WT mice (new Figure 1—figure supplement 1C). The thymic stromal niche is another factor regulating thymic settling of progenitor cells (Krueger, 2018; Prockop and Petrie, 2004). An increased thymic stromal niche was found during irradiation (Zlotoff et al., 2011). We also measured SIRPα expression on TPECs at day 14 after 5.5 Gy sublethal total body irradiation and found no significant change in SIRPα expression (new Figure 1—figure supplement 1D). Whether SIRPα expression on TPECs is constitutive or can be regulated by changes in the thymic microenvironment remains to be tested in the future.
  
  3. The use of the in vitro TEM assays in transwell plates are a nifty way of interrogating and manipulating the effect of Sirpa in these conditions, however, the caveat is that these all use EC cell lines that do not correspond to the TPECs being described in vivo. This caveat should be acknowledged in the text.
  
  Thank you for the advice. The EC cell line we used is a pancreatic islet endothelial cell line (MS1), which is not derived from, and does not correspond to, TPECs. We have mentioned this caveat in the text.
  
  4. I am a little confused as to the interpretation of the final experiment looking at tumor clearance. The authors show that this could be clinically relevant as blockade of the CD47-Sirpa axis is becoming an increasingly attractive immunotherapy option but its use could preclude thymic recovery after damage and thus contribute toward poorer T cell responses against tumors. This last study is very interesting but also very hard to interpret given the likely positive effect of Sirpa-CD47 blockade on tumor clearance, in opposition to its potential effects hindering thymic repair. While it is notable that there is reduced clearance of tumor in mice treated with CV1, it is unclear why there does not seem to be any positive effect of CV1 on tumor clearance (is this because there are fewer T cells in the periphery as it is still early after damage?). On the thymic repair and reconstitution front, perhaps a cleaner way would be to look in Sirpa or CD47 deficient mice and without tumors.
  
  We agree that the findings regarding tumor immunotherapy need further explanation of the detailed mechanism; therefore, this part of the results was removed from this project. CV1 treatment in our approach precedes tumor inoculation; therefore, CV1-mediated blockade of CD47 on tumor cells (which is the case in CV1-mediated tumor clearance) would not occur. However, we did not test the underlying mechanism, which is quite interesting and will be addressed in a future study.
  
  As to the suggestion of testing thymic regeneration directly in Sirpα or CD47 deficient mice, we have done this in Sirpα deficient mice. We conducted SL-TBI directly on Sirpα^-/- mice and control mice. Congenically marked bone marrow cells were also adoptively transferred for better monitoring. At 4 weeks after transfer, the donor-derived DN thymocyte subset was found to be defective in Sirpα^-/- recipients compared to that in control hosts (Figure R1). However, the DP and SP subsets did not show a difference, probably due to a compensation effect.
  
  Minor Comments:
  
  • In Fig. 2I (and Fig. 2S2I-J), it is difficult to determine how long after the chimera transplant the homing assays were performed. However, this approach has limitations as the process of creating those chimeras (conditioning such as irradiation etc.) will change the function and possibly the mechanisms of progenitor entry into the thymus. There is clearly still an effect of Sirpa in this context but it is possible (even likely) that the importation mechanisms in the thymus change after damage such as that caused by the conditioning required in the initial chimera generation.
  
  For the study of short-term homing in bone marrow chimeric mice, we have updated the legend of the related figure (now Figure 2G in the article). The homing assays were performed 8 weeks after chimeric reconstitution. Meanwhile, it is indeed possible that changes in the thymic homing mechanisms could give rise to abnormal progenitor cell entry. To exclude this potential effect, we conducted homing assays without irradiation. In this experiment, we also observed impaired short-term homing (new Figure 2I) and impaired subsequent T cell development (new Figure 2J,K).
  
  Furthermore, although using the Tie2-Cre strain will distinguish Sirpa on ECs and TECs, it will not distinguish between expression on other cells such as DCs (Tie2 will delete expression in both endothelial and hematopoietic lineages). Although the optimal experiment to address these concerns would be to delete Sirpa from ECs specifically (such as with Cdh5-CreERT2 mice), I am convinced by the preponderance of in vitro data that there is an EC-specific effect and therefore it is not necessary to perform this time-consuming, albeit interesting, potential experiment. However, these limitations should be acknowledged in the discussion or text.
  
  Thank you for your kind suggestion, we have discussed this limitation in the text.
  
  • As a technical note I am surprised that there was considerable reconstitution of naive T cells at day 21 after TBI (Fig.7G-H). In our experience that is very early for naïve T cells in the periphery which generally take about 4 weeks to start reconstituting in a real sense. Is it possible there are direct effects of this treatment on residual radio-resistant peripheral T cell numbers?
  
  Thank you very much for sharing this information. Indeed, we cannot exclude the possibility of residual radio-resistant peripheral T cells. To clarify this, we performed SL-TBI (6 Gy) followed by adoptive transfer of congenically marked WT (CD45.1) total bone marrow cells into Sirpα^-/- or control mice (CD45.2) for better monitoring. In this situation, we found that at day 28, more than 97% of thymocytes were donor-derived in both groups and the thymus had been completely reconstituted (Figure R2). In addition, as shown in Figure R1, the donor-derived DN thymocyte subset was significantly reduced in Sirpα^-/- mice compared to that in control mice. However, no defect was found at later developmental stages of thymocytes.
  
  Given the complexity of the original experimental design, and as suggested by the reviewers, the original Fig. 7 was removed. We hope the new data described above are informative for understanding the role of SIRPα in a thymic regeneration scenario.
  
  Figure R4. Chimerism detection at day 28 in hosts transferred with bone marrow cells. (A) Chimerism of thymic subsets; chimerism = CD45.1^+ % / (CD45.1^+ % + CD45.2^+ %). (B) Representative FACS of donor (CD45.1) and host (CD45.2) cells in total thymocytes (single and live cell gated). n=6 in each group, unpaired t-test applied. **: p<0.01
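
  As a worked illustration of the chimerism definition in the legend above, the following minimal sketch computes the donor fraction from the percentages of CD45.1^+ (donor) and CD45.2^+ (host) cells. The function name and the input values are hypothetical and are not data from the study.

```python
def chimerism(cd45_1_pct: float, cd45_2_pct: float) -> float:
    """Donor chimerism = CD45.1+ % / (CD45.1+ % + CD45.2+ %)."""
    total = cd45_1_pct + cd45_2_pct
    if total <= 0:
        raise ValueError("at least one population must have a positive percentage")
    return cd45_1_pct / total


if __name__ == "__main__":
    # Hypothetical gate percentages, for illustration only.
    print(f"{chimerism(97.0, 3.0):.2%}")
```

  Normalizing by the sum of the two gated populations, rather than by all live cells, keeps the value between 0 and 1 even when some events fall outside both gates.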
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.04.19.440464v1
www.biorxiv.org www.biorxiv.org

FMRP regulates mRNAs encoding distinct functions in the cell body and dendrites of CA1 pyramidal neurons

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This manuscript integrates conditional mouse models for TRAP, PAPERCLIP and FMRP-CLIP together with compartment specific profiling of mRNA in hippocampal CA1 neurons. Previously, similar approaches have been used to interrogate mRNA localization, differential regulation of 3'UTR isoforms, their local translation, and FMRP-dependent mRNA regulation. This study builds on these previous findings by combining all three approaches, together with analysis of mRNA dysregulation in Fmr1 KO neuron model of FXS. The strengths of the paper are the rich data sets and innovative integration of methods that will provide a valuable technical resource for the field. The weakness of the paper is the limited conceptual advance as well as lack of deeper mechanistic insights on FMRP biology over previous studies, although the present study validates and integrates past studies, adding some new information on 3'UTR isoforms.
  
  We appreciate the Reviewer’s recognition that “the present study validates and integrates past studies, adding some new information on 3'UTR isoforms”. We also appreciate the Reviewer’s recognition that “The strengths of the paper are the rich data sets and innovative integration of methods that will provide a valuable technical resource for the field.”
  
  We differ, however, with the concern that the work presents a “limited conceptual advance.” Specifically, we find, for the first time, that FMRP regulates two different biologically coherent sets of mRNAs in CA1 neuronal cell bodies and neurites. This provides a profound new insight into FMRP-RNA regulation, including the fact that these two different sets of mRNA targets (encoding chromatin-associated proteins and synaptic proteins, respectively) are both translationally regulated by FMRP and transcribed from genes implicated in autism.
  
  We recognize that FMRP was known, by our own work and that of others (as noted by the Reviewer) to regulate specific targets “in bulk” in neuronal cell types, brain and even in CA1 neurons. What is most unexpected here? Among directly bound FMRP mRNAs in brain CA1 neurons, there is subcellular compartmentalization of this regulation. This is new for FMRP, and in fact is new for RNA binding proteins more generally (recognizing of course the extensive work on RNA localization in different compartments previously discovered by others, beginning with Rob Singer’s work on actin localization and up to the present in work on neurons).
  
  We also think it is important for readers to understand up-front the novelty in the “combining approaches” referred to. We use cell-specific (cTag) CLIP to define direct FMRP interactions in subcompartments (dendrites vs cell bodies) of CA1 neurons within mouse brain hippocampus. We also normalize these data to ribosome-bound mRNAs in CA1 neurons, and validate observations by studying WT and FMRP-null brains. This set of complex mouse models and methods is completely new, and its application is what allowed us to make robust conclusions about FMRP translational regulation of different mRNAs in different cellular compartments.
  
  We strongly disagree with the Reviewer’s comment that the finding that FMRP directly interacts with functional classes of mRNAs in different cellular compartments “has previously been shown in the field.” To our knowledge, compartment-specific FMRP-CLIP has not been reported, much less in a cell-type specific manner. Our previous cell-type specific FMRP-CLIP experiments were performed on bulk neuronal material (Sawicka et al. 2019; Van Driesche et al., n.d.). Although cell-type specific TRAP-seq has been performed on microdissected CA1 compartments (Ainsley et al. 2014), the investigators were unable to isolate significant amounts of RNA from resting neurons, and degradation of the isolated RNAs did not allow the types of 3’UTR and alternative splicing analyses that were performed here. The Schuman group has performed extensive analysis of mRNAs from microdissected CA1 compartments (Cajigas et al. 2012a; Tushev et al. 2018), but has not performed FMRP-CLIP or any experiments using cell-type specific or direct protein-RNA regulatory methods. In vitro systems have been used to analyze mRNA localization in FMRP KO systems (e.g., Goering et al. 2020), but in vitro systems are unable to fully recapitulate the complexities of in vivo brain regions, and those studies did not analyze direct RNA-protein interactions. As our work is on in vivo brain slices, is cell-type specific, and integrates TRAP-seq, PAPERCLIP and CLIP-seq datasets, we believe that our work is novel and will be of great interest to the field.
  
  Despite the fact that FMRP targets are overrepresented in the dendritic transcriptome, it does not appear from this study that FMRP plays an active role in the mechanism of dendritic mRNA localization, at least under steady state conditions. One goal of the manuscript is to address a major question in the mRNA localization field, which is how FMRP may differentially modulate "localization" of functional classes of mRNAs such as those encoding transcriptional regulators and synaptic plasticity genes (Line 78-90). The data here indicate that FMRP directly interacts with functional classes of mRNAs in different cellular compartments, which has previously been shown in the field. However, no evidence is provided that mechanistically reveal a role for FMRP to promote subcellular localization of different functional classes of mRNAs. The correlative evidence presented in this manner does not add mechanistic insight.
  
  We do recognize that the question of what localizes FMRP mRNA targets differentially in the dendrite (and cell body) is of great interest, and remains unanswered. We also appreciate that, despite the Reviewer’s comment above, they also recognize “it does not appear from this study that FMRP plays an active role in the mechanism of dendritic mRNA localization, at least under steady state conditions.”
  
  We believe that some of the confusion here lies in the Reviewer’s comment “One goal of the manuscript is to address a major question in the mRNA localization field, which is how FMRP may differentially modulate "localization" of functional classes of mRNAs such as those encoding transcriptional regulators and synaptic plasticity genes (Line 78-90).” While this is a question of interest that has been studied, we think there is a major disconnect here in the Reviewer’s comments and our findings. To be clear, in the original manuscript, we did not find evidence, in WT vs KO CA1 neurons, that FMRP was acting to differentially localize mRNAs, including those mentioned by the Reviewer.
  
  Nonetheless, to further address the issue of a possible role for FMRP in localizing the transcripts it regulates, we have now performed quantitative analysis of FMRP target mRNA localization in dendrites from WT vs. Fmr1 KO mice. These results are now presented in Supplemental Figures 9 and 10 of the manuscript, and we also present and summarize them below.
  
  Supplemental Figure 9. FMRP is not required for localization of its targets into the dendrites of CA1 neurons. A) Dendrite-enriched mRNAs were defined in FMRP KO mice (red) in the same manner as in Figure 1 for FMRP WT animals using bulk RNA-seq and TRAP-seq data. Overlap with dendrite-enriched mRNAs in WT (Figure 1, shown here in green) and CA1 FMRP targets (blue) is shown. 95.6% of dendrite-enriched FMRP targets in the WT were also found to be enriched in the dendrites of FMRP KO animals. B) Dendrite-present mRNAs were defined in FMRP KO. Overlap with dendrite-present mRNAs in WT (Figure 1) and CA1 FMRP targets is shown. 95.7% of dendrite-present FMRP targets in WT were also found to be dendrite-present in KO animals. C-E) FISH was performed to assess FMRP target localization (Kmt2d (C), Lrrc7 (D) and Map2 (E)) in FMRP KO mouse brain slices. Left panel shows the proportion of detected mRNAs that were detected in the neuropil (> 10 um from the predicted cell body layer) in WT and KO animals. A Wilcoxon rank-sum test was performed to assess significance. Middle panel shows densitometry of 1000 spots sampled from each picture analyzed. Distance from the CB was determined as described in methods and Figure 1. In the right panel, spots were binned into 15 groups according to the distance traveled from the CB, and the fraction of spots in each genotype in this range was analyzed by t-test to determine differences in the fraction of spots at each location in FMRP WT and KO animals (* indicates p-value < .05, ** is < .01).
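The binning step described for the right panel can be sketched as follows. This is a minimal illustration only: the equal-width bins, the `max_distance` cutoff, and the function name `fraction_per_bin` are assumptions made for the example and are not taken from the manuscript's methods.

```python
def fraction_per_bin(distances, max_distance, n_bins=15):
    """Fraction of FISH spots in each of n_bins equal-width distance bins.

    distances    -- distance of each spot from the cell body layer (same unit as max_distance)
    max_distance -- hypothetical upper edge of the last bin
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    counts = [0] * n_bins
    for d in distances:
        if d < 0 or d > max_distance:
            continue  # spots outside the analyzed range are ignored
        # A spot exactly at max_distance is placed in the last bin.
        idx = min(int(d * n_bins / max_distance), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Toy distances (in um) for one slice; real inputs would be per-spot measurements.
fractions = fraction_per_bin([0, 5, 10, 20, 150], max_distance=150)
```

Per-bin fractions computed this way for each slice could then be compared between genotypes bin by bin, which is the comparison the legend describes.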
  
  Supplemental Figure 10. FMRP is not required for differential localization of 3’UTR isoforms of its targets. A) Differential 3’UTR usage was analyzed using DEXseq as described in Figure 2 to identify 3’UTRs whose ratio of usage between neuropil and CB was altered between FMRP WT and KO animals. Shown are results from DEXseq analysis showing the log2foldChange (neuropil vs cell bodies, KO vs WT) and -log10(p-value) of each 3’UTR. Gray spots indicate that all 3’UTRs analyzed have an FDR > .05, indicating no significant change in usage between FMRP KO and WT animals. B and C) FISH analysis of localization of 3’UTR isoforms of Cnksr2 (B) and Anks1b (C) in FMRP WT and KO animals. These genes were found in Figure 2 to express 3’UTR isoforms that are differentially localized to dendrites. Sequestered isoforms are those that are significantly localized to cell bodies in FMRP WT, and Localized are those that are significantly used in the dendrites of WT CA1 neurons. Left panel, the fraction of spots that are found to be localized to the neuropil (> 10 um from the cell body layer) is shown for each isoform in FMRP WT and KO animals. Differences were assessed by Wilcoxon rank-sum tests. Middle panel, densitometry of the distance traveled from the cell bodies for a representative 1000 spots from each picture that was analyzed. Right panel, as described in Supplemental Figure 9, detected mRNAs were binned into 15 bins according to the distance traveled from the cell bodies, and differences in the fractions of spots in each bin in FMRP WT and KO slices were analyzed. Significance indicates results of t-tests (* indicates p-value < .05).
  
  In summary, we characterized the dendritic transcriptome in FMRP KO animals, and compared it to the FMRP WT results presented in Figures 1 and 2, as suggested by the Reviewers. We find that the dendritic transcriptome of FMRP KO animals is extremely similar to that of FMRP WT animals, with ~95% of mRNAs found to be dendrite-present or dendrite-enriched in WT also being found in FMRP KO animals (Figure S9). We validated these results with FISH and found no evidence for significant disruption in the localization of FMRP targets Kmt2d (Figure S9C), Lrrc7 (Figure S9D) or Map2 (Figure S9E) to the CA1 neuropil.
  
  To detect FMRP-dependent changes in distribution of 3’UTR isoforms of FMRP targets, we first performed global analysis of 3’UTR usage in TRAP from FMRP KO animals, using the expressed 3’UTR isoforms that were found in Figure 2. DEXseq analysis on 3’UTR expression in CA1 neuropil vs cell bodies TRAP showed no significant instances of altered 3’UTR usage ratios in FMRP KO animals (Figure S10A). We validated these results by performing FISH on the sequestered and localized 3’UTR isoforms of Cnksr2 and Anks1b genes and show no significant changes in the localization of the 3’UTR isoforms in FMRP KO animals (Figure S10B-C). Taken together, this data suggests that FMRP is not significantly involved in localization of its targets in resting CA1 neurons, but rather shows remarkable selection for localized mRNA isoforms. Instead, we find evidence that FMRP regulates the ribosome association of its targets in a compartment-specific manner by showing an increase in ribosome association of a subset of FMRP targets in the dendrites of CA1 neurons (see Figure 7E).
  
  Besides the addition of the figures described above, we have also now made corrections to the text of the manuscript, enumerated below, to address this.
  
  First, we have, as much as possible, reduced our emphasis throughout the manuscript on the “localization” of mRNAs and rather point out that the study seeks to characterize the differences between the regulated transcriptomes in CA1 cell bodies and dendrites. For example, for Figure 4, instead of characterizing the log2FoldChange (neuropil vs CA1 cell bodies) as “dendritic localization”, we changed the wording to “relative dendritic abundance” to focus on changes in the abundance of these transcripts in the dendrite vs the cell bodies. We also changed the section heading in the results that describes analysis in the FMRP KO animal from “Dysregulation of mRNA localization in FMRP KO animals” to “FMRP regulates the ribosome association of its targets in dendrites”. We believe that these changes will help to clear up this confusion for the reader.
  
  Second, we reformatted the model in Figure 7F. The new version of the model (shown here) emphasizes the point that our study reveals compartment-specific FMRP regulation of a subset of its targets without implying a role for FMRP in the mRNA localization of these transcripts. The text of the manuscript and figure legends have been updated accordingly.
  
  Figure 7F Distinct, compartment-specific FMRP regulation of functionally distinct subsets of mRNAs in CA1 cell bodies and dendrites. In dendrites, the absence of FMRP increases the ribosome association of its targets; this finding is consistent with a model in which FMRP inhibits ribosomal elongation and thereby translation (J. C. Darnell et al. 2011). In resting neurons, the translation of FMRP-bound mRNAs encoding synaptic regulators (FM2 and FM3 mRNAs) is repressed. When FMRP is absent, due to either genetic alteration (FMRP KO or FXS) or neuronal activity-dependent regulation (e.g. FMRP calcium-dependent dephosphorylation (Lee et al. 2011; Bear, Huber, and Warren 2004)), ribosome association and translation of targets are increased. In cell bodies, FMRP binds mRNAs that encode chromatin regulators (the FM1 cluster of FMRP targets), as well as FM2/3 mRNAs (consistent with synapses forming on the cell soma). FM1 targets show patterns of mRNA regulation similar to what our group observed in bulk CA1 neurons: FMRP target abundance is decreased in FMRP KO cells, perhaps due to loss of FMRP-mediated block of degradation of mRNAs with stalled ribosomes (Sawicka et al. 2019; R. B. Darnell 2020).
  
  Third, we have revised the Discussion in order to more completely discuss the model above and also emphasize the finding that FMRP was not found to be involved in the localization of its mRNA targets, but rather in the regulation of the local translation of its targets in a compartment-specific manner. We further speculate on the roles of FMRP in regulation of mRNA abundance and translation in these compartments.
  
  We hope that these changes better reflect the interpretation and novelty of our findings for both the Reviewers and the readers.
  
  Further related to a role of FMRP in mRNA localization, a recent paper in eLife reports that FMRP RGG box promotes mRNA localization of a set of FMRP targets through G-quadruplexes (Goering et al 2020). This relevant paper needs to be cited and discussed.
  
  We apologize for this omission, and have now cited and discussed this paper in the Results and Discussion of the manuscript. Importantly, we find that dendrite-enriched mRNAs have high GC content (see figure below, which is now Supplemental Figure 5). This complicates the discovery of potential G-quadruplexes; put another way, G-rich mRNAs will therefore be enriched when compared to not-localized mRNAs, and this is also true for C-rich mRNAs. Dendrite-enriched FMRP directly-bound CA1 neuronal targets (defined by CLIP) are actually G-poor when compared to dendrite-enriched FMRP non-targets (see new Figure S5 and below).
  
  Supplemental Figure 5A-D: Dendrite-enriched mRNAs are GC rich and dendrite-enriched FMRP targets are GC poor compared to dendrite-enriched non-FMRP targets. A) Schematic of the overlap between CA1 FMRP targets and dendrite-enriched mRNAs (defined in Main Figure 1). B) GC content, as defined by percent G + C for all CA1 mRNAs, dendrite-enriched mRNAs (1211), dendrite-enriched FMRP targets (413), and dendrite-enriched non-FMRP targets (798, see A). Stars indicate significance in Wilcoxon rank-sum tests (* is p < .05, ** is p < .0001). C) G content, as defined by percent G. D) C content, as defined by percent C.
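The three composition measures defined in this legend (percent G + C, percent G, percent C) can be computed per transcript as in the sketch below. The function name `base_percentages` and the choice to skip ambiguous bases such as N are assumptions made for illustration, not details from the manuscript.

```python
def base_percentages(seq):
    """Return percent G, percent C and percent G + C for a nucleotide sequence."""
    seq = seq.upper()
    # Count only unambiguous bases; N and other ambiguity codes are skipped (an assumption).
    counted = [base for base in seq if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence has no unambiguous bases")
    n = len(counted)
    g = 100.0 * counted.count("G") / n
    c = 100.0 * counted.count("C") / n
    return {"G": g, "C": c, "GC": g + c}


# Toy 8-nt sequence with two G and two C.
example = base_percentages("GGCCAUAU")
```

Applying such a function to every mRNA in each group gives the per-group distributions that a rank-sum test can then compare.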
  
  In light of these observations, analysis of G- or C- containing motifs needs to be examined in this context. To this end, we performed the experiments suggested here, but did so by searching for the prevalence of G-quadruplexes in dendrite-enriched FMRP targets versus dendrite-enriched FMRP non-targets (Figure S5A). To do this, we used both experimentally-defined G-quadruplexes (described in (Guo and Bartel 2016), Figure S5E), as well as motifs (described in (Goering et al. 2020), Figure S5F). We include the results below, and in a new Figure S5 in the paper.
  
  Supplemental Figure 5E-F: mRNAs containing G-quadruplexes are not enriched in dendritic FMRP targets vs dendrite-enriched non-FMRP targets. E) The percent of all CA1 mRNAs, all dendrite-enriched mRNAs, dendrite-enriched FMRP-bound targets (413), and dendrite-enriched non-FMRP targets (798) that contain experimentally-defined G-quadruplexes is plotted. Shown are the results of chi-squared analysis comparing the enrichment of G-quadruplex containing mRNAs in dendrite-enriched FMRP targets vs dendrite-enriched non-FMRP targets. F) As in E, except looking for the presence of mRNAs with G-quadruplex motifs in 3’UTRs as described in (Goering et al. 2020)
  
  Interestingly, we found no difference in the presence of G-quadruplex motifs in the 3’UTRs of these two sets (above and new Supplemental Figure 5). For example, of 413 dendrite-enriched FMRP targets, 100 (24%) had experimentally defined G-quadruplexes in the 3’UTRs, while 159 (22.5%) dendrite-enriched non-FMRP targets had experimentally defined G-quadruplexes. These differences were not significant (by chi-square test).
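For readers who want to see how such a comparison works, a chi-square test on a 2x2 table can be sketched with the counts quoted above (100 of 413 targets and 159 of 798 non-targets). The use of a Pearson test without continuity correction is an assumption here, since the exact variant is not stated, and `chi_square_2x2` is a hypothetical helper name.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for the table [[a, b], [c, d]].

    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square tail probability equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts taken from the text: mRNAs with / without experimentally defined G-quadruplexes.
targets_with, targets_total = 100, 413
non_targets_with, non_targets_total = 159, 798

stat, p_value = chi_square_2x2(
    targets_with, targets_total - targets_with,
    non_targets_with, non_targets_total - non_targets_with,
)
```

Under these assumptions the p-value stays above .05 for the quoted counts, in line with the statement that the difference was not significant.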
  
  Searching the 3’UTR sequences of the 413 dendrite-enriched FMRP targets above for G-quadruplex motifs (as described in (Goering et al. 2020), which searched for an empirically derived specific motif: GW--G, separated by 7nt), we found only 3 instances in dendrite-enriched FMRP-bound target mRNAs. Similarly, out of 798 non-FMRP targets, only a small subset (6) contained this specific motif in their 3’UTRs. This difference was not significant (chi-square test).
  
  In summary, we do not find evidence in our data of G-quadruplexes playing a role in determination of FMRP binding in CA1 dendrites. This data is now included in the results and discussed in the Discussion of the paper.
  
  Reviewer #2 (Public Review):
  
  The authors performed transcriptomic analyses from compartment-specific, micro-dissected hippocampal CA1 region tissue from transgenic mice. One feature that distinguishes this work from previous studies is the use of conditional knock-in of tags (GFP or HA) and tissue specific expression of the Cre recombinase to target a very specific population of pyramidal neurons in the CA1 region--as well as the combined use of TRAPseq, PAPERCLIP and FMRP-CLIP. Also, central to this work are the analysis pipelines that look at large populations of mRNA with the goal of finding features shared by those mRNA that bind FMRP.
  
  First, they established the identity of mRNAs that are dendritically enriched and/or alternatively polyadenylated (APA) by sequencing, followed by validation of a few candidates using smFISH. Next, the APA data was filtered through the rMATS statistical program to identify alternatively spliced (AS) mRNA variants within the APA population. The authors concluded that the majority of splicing events were of the exon-skipping type with NOVA2 as the likely culprit leading to this differential localization of AS isoforms. The authors then proceeded to perform FMRP-CLIP which was analyzed against the TRAP dataset. The (413) mRNAs that were shared by the two experiments (TRAP and FMRP-CLIP) exhibited two notable features: dendrite-enrichment and longer average transcript length. More importantly, they demonstrated that FMRP can preferentially bind to an AS isoform that is enriched in dendrites. Further analyses of FMRP CLIP targets showed that they shared a significant level of genes designated by gene set enrichment analysis (GSEA) as involved in ion transport and receptor signaling and similarly for ASD-related candidate genes.
  
  Strengths: -The combined use of tissue-specific Cre and conditional tags for RPL22, PABPC1 and FMRP helps make these pull-downs highly specific and robust. -RNA sequencing approach allows for identification and comparison of populations of ribosome-, PABPC1- and FMRP-associated mRNAs. -Preferential binding of FMRP to AS or APA isoforms in dendrites is an impactful and significant finding.
  
  Weaknesses: -A caution in interpreting comparative or differential RNA-sequencing results as some are correlative.
  
  We appreciate this concern, and agree that RNA-seq analysis alone can be difficult to interpret. However, we feel that our unique approach of combining multiple cell-type specific approaches, including CLIP-seq and PAPERCLIP along with TRAP-seq and RNA-seq, results in stronger conclusions that are supported by multiple lines of evidence.
  
  -Validation of FMRP interaction with AS or APA isoforms or ASD candidates by smFISH-IF is lacking.
  
  We find that smFISH-IF in the CA1 neuropil is difficult to interpret in mouse brain slices due to dense networks of processes in addition to contaminating cell types, making IF signals dense, noisy and difficult to quantitate. Although we could theoretically attempt these experiments using an in vitro cell culture model, we believe that the novelty of our work is in a) the cell-type specific nature of our analyses and in b) the fact that our analysis and validation is all performed in vivo. We do not feel confident that in vitro systems are similar enough to our in vivo system to be relevant for this work. This is due not only to differences in their transcriptomes, but also to the limited number of synapses that cells in vitro make with other neurons when compared to CA1 neurons in the brain. Instead, we validate the interactions between FMRP and AS and APA isoforms by isolating junction reads among FMRP-CLIP tags isolated in a cell-type specific manner from intact mouse brains (Figure 5). In this manner, we find direct evidence of FMRP selectively binding to dendritic mRNA isoforms in vivo.
  
  -Although hippocampal CA1 region is an excellent site to study FMRP-RNA interactome, are there other projection systems where altered FMRP-RNA interaction may lead to greater dysfunction?
  
  We appreciate this point and now include this in the revised Discussion.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.07.18.452839v1
www.biorxiv.org www.biorxiv.org

A novel immunopeptidomic-based pipeline for the generation of personalized oncolytic cancer vaccines

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Identifying private peptides for generating personalised cancer vaccines is a promising approach to launch robust anti-tumor response; however, the challenges remain in developing an effective process to achieve that. In this manuscript, the authors present an interesting and powerful pipeline (PeptiCRAD) to achieve this goal by examining CT26 model. Overall, this manuscript is well written and presented. Despite that this work presents interesting findings and pipeline, I have the following concerns. I do feel that this manuscript will improve if these concerns can be addressed.
  
  We thank the reviewer very much for having appreciated the quality and the originality of our work.
  
  It will be critical to confirm TILs and T cells in draining lymph nodes indeed recognise the peptide used in Figure 7-8 by ELISPOT of IFNg.
  
  We agree with the reviewer's comment regarding the need to confirm, by functional characterization in an IFN-γ ELISPOT assay, that TILs and T cells in draining lymph nodes recognise the peptide used in Figure 7-8. In our experience, the ELISPOT assay works best when fresh samples are used; additionally, splenocytes provide enough cells to test each mouse's reactivity to a single peptide. Because the samples from Figures 7 and 8 were frozen, we decided to repeat the animal experiment following the treatment schedule of Figure 7 and then perform the ELISPOT on splenocytes freshly harvested from the mice. Based on the previous results, we selected the best group (PeptiCRAd1) to further investigate the peptide response; untreated mice (Mock) and virus alone (VALO-mD901) were used as controls. Interestingly, the peptide deconvolution showed T cell reactivity to one peptide (RYLPAPTAL, peptide 2) (Figure 1A) in the PeptiCRAd1 group; in contrast, no T-cell reactivity was observed for SYLPPGTSL (peptide 1) (Figure 1B). These data highlight the role of an individual antigen in eliciting a specific anti-tumor T cell response, making it an interesting candidate for further proof of concept in an animal experimental setting.
  
  Figure 1 Interferon-γ ELISPOT results. Harvested splenocytes from the treatment groups (as indicated in the figure) were functionally characterized in an IFN-γ ELISPOT assay; the individual response to SYLPPGTSL A) and RYLPAPTAL B) for each mouse is reported as IFN-γ spot forming cells (SFC)/10^6 splenocytes. The data are depicted as single-dot plots and mean + SEM is shown. (Virus=VALO-mD901, PC=PeptiCRAd).
  
  It would be interesting to see if this pipeline can be used to identify human peptides in human melanomas.
  
  We thank the reviewer for pointing out that this pipeline could be used to identify human peptides in human melanomas. Indeed, the work described here is a proof of concept meant to be translated to the human setting. To this end, we have two on-going projects in the lab that exploit the same pipeline to investigate the human epithelioid and human mesothelioma ligandome landscape. Regarding the latter, we are investigating four different human cell lines (H2B, MSTO211H, H2452 and JL1). As shown in the picture below (Figure 2), the peptide length distribution showed an enrichment in 9mers in both replicates (Rep1 and Rep2), in line with a ligandome profile. The analysis of the binders revealed that most of them were good binders (according to EL-Rank score) for at least one of the alleles of each cell line. Following the pipeline reported in this manuscript, we applied two different approaches to select candidate peptides; the first approach relied on RNA-seq analysis to check which source proteins of the peptides isolated in the ligandome analysis were reported as upregulated or downregulated in resected tumor compared to normal tissues.
  
  (From https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE51024, GSE51024). The second approach (analysis still on-going) will be the use of HEX software.
  
  Regarding the epithelioid project, we have analysed the ligandome profile of two human cell lines: NEPS and VA-ES-BJ. Please find below an example of the ligandome analysis for the VA-ES-BJ cell line (Figure 3). Overall, the analysis outcome was similar to published datasets (amino acid length distribution, Gibbs clustering profile, number of binders), confirming the good quality of the ligandome landscape identified. Next, we applied HEX analysis to narrow down the list of peptide candidates for further testing. We are currently collecting additional epithelioid cell lines to expand our cohort of samples.
  
  Figure 2 Mesothelioma project (data not published)
  
  Figure 3 Epithelioid project (data not published)
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.08.447483v1
www.biorxiv.org www.biorxiv.org

A PX-BAR protein Mvp1/SNX8 and a dynamin-like GTPase Vps1 drive endosomal recycling

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #2:
  
  The SNX-BAR family of sorting nexin proteins is involved in the formation of tubular carriers at endosomes. The best characterized yeast sorting nexins form part of the retromer complex, which binds sorting signals on cargo proteins to direct their recycling. There is some debate as to the role of sorting nexins in mediating cargo recognition vs tubule formation, and it is unclear which (if any) other members of the sorting nexin family bind directly to cargo.
  
  In this manuscript, the authors investigate the function of the yeast sorting nexin Mvp1. This protein was previously proposed to cooperate with retromer in the formation of recycling tubules, and to recruit the dynamin-like protein Vps1 to promote their scission (Chi et al, JCB 2014). Here, Suzuki et al find that Mvp1 has a cargo-sorting role that is distinct from that of other sorting nexins. They show that Mvp1 (but not retromer) is required for the correct localization of the membrane protein Vps55, and identify a cytosolically-exposed sequence in Vps55 required for its sorting. Using structurally-guided mutagenesis, they find that dimerization and membrane binding is important for Mvp1 function. They use live cell imaging to show that Vps55 is largely sorted into different tubules compared to the retromer cargo protein Vps10, and use fractionation of vesicle fusion-deficient cells to show these cargo are present in different vesicle populations, suggesting that Mvp1 and retromer form different classes of retrograde carriers. By surveying the trafficking of other membrane proteins, they show that in some cases Mvp1 acts redundantly with two other sorting nexin complexes (Snx4 and/or retromer) to recycle cargo at endosomes. Moreover, they find that loss of all three sorting nexin complexes perturbs endosome function, lipid asymmetry, and the endosomal recruitment of the scission factor Vps1. Although Mvp1 was previously implicated in Vps1 recruitment (Chi et al, 2014), Suzuki et al use a GTPase-defective form of Vps1 to provide the first evidence that Mvp1 physically interacts with Vps1 in vivo and in vitro. Taken together, these data suggest that Mvp1, retromer and Snx4 recognize distinct sets of cargo proteins and mediate independent recycling pathways at endosomes, and imply that each sorting nexin recruits Vps1 to complete tubule scission.
  
  Overall, this manuscript presents a large number of experiments that are technically well executed and makes several novel observations. It should be noted that many experiments largely repeat previous work: this was not always clearly indicated in the manuscript. For the most novel observations, some weaknesses were noted. A key novel finding was that Mvp1 binds to and sorts the cargo protein Vps55 via recognition of a cytosolic motif. The supporting data do not provide the typical burden of proof for such experiments, because: (1) the identified sequence was shown to be necessary but not sufficient, thus the mutation could indirectly affect binding at another site, and (2) Mvp1 failed to coIP with the Vps55 mutant from cell lysates, but this could be an indirect effect of Vps55 missorting to the vacuole while Mvp1 remains at the endosome, and does not prove that Mvp1 binds directly to Vps55 via this motif.
  
  Thank you for pointing this out. As mentioned above, to address your point, we examined the Mvp1-Vps55 interaction in cells lacking Vam3, which is required for endosome fusion with the vacuole. In this mutant, both WT Vps55 and the recycling mutants localize at the endosome (Fig. Rev. 1C). We confirmed that mutations in the recycling sequence altered the Mvp1-Vps55 interaction even in vam3Δ cells (Figure 3-figure supplement 1C was added to the revised manuscript). To address whether the recycling signal is sufficient for Mvp1-mediated recycling, we tried to generate several chimera constructs, but we did not obtain a construct that was recycled in an Mvp1-dependent manner. Hence, we were not able to address this point.
  
  A second key finding is that Mvp1 and retromer form distinct classes of tubular carriers at endosomes. While the manuscript does provide data to support this conclusion, I was disappointed that there was no discussion of the work of Chi et al, who showed through careful quantitative analysis that Mvp1 and retromer frequently label the same population of tubules.
  
  Thank you for pointing this out. In the revised manuscript, we have also discussed the differences with Chi et al. in the text (Page 13, line 408).
  
  Moreover, the authors claim that mvp1 mutants secrete little CPY, yet the literature indicates these mutants secrete ~65% of newly synthesized CPY (Ekena and Stevens, MCB 1995), suggesting a functional link between Mvp1 and Vps10 recycling. In fact, vps55 mutants themselves have a significant CPY missorting defect (~50% secreted) suggesting that some mvp1 phenotypes could be a secondary consequence of Vps55 mislocalization.
  
  Thank you for pointing this out. We examined the CPY sorting in the recycling signal mutants. Strikingly, CPY was partially missorted to the extracellular space in vps55Y61A/T63A/F66A/M67A mutants (Fig. Rev. 6). Since Vps10 recycling was not altered in mvp1Δ cells (Figure 5A), we believe that the mislocalization of Vps55 causes the CPY sorting defect in mvp1Δ cells.
  
  It was not mentioned that Vps55 interacts with the transmembrane protein Vps68: these proteins are interdependent for their stability and loss of Vps68 slows traffic out of the endosome (Schluter et al MBOC 2008). This provides a simple explanation for the observed ubiquitination and degradation of overexpressed Vps55, which presumably saturates available Vps68.
  
  As suggested by the reviewer, we have revised the manuscript (Page 5, line 158). Also, as mentioned above, we observed that Vps55 missorting was suppressed by overexpression of Vps68 (Figure 3-supplement 1E was added to the revised manuscript), suggesting that Vps68 was saturated in this condition.
  
  Other experiments in this manuscript were not completely novel, including: the demonstration that Mvp1 tubules bud from endosomes and that Mvp1 is important for Vps1 recruitment to endosomes (Chi et al, JCB 2014); that Vps1 GTPase mutants accumulate Mvp1 at endosomes (Ekena and Stevens, MCB 1995); that Mvp1 plays a role in Vps55 localization (Bean et al, Traffic 2017); and that GFP-SNX8 is present on endosomal tubules when expressed in mammalian cells (van Weering et al, Traffic 2012). While in most cases the experiments presented in this manuscript build on and extend previous work, I would like to see the earlier work fully acknowledged, and any discrepancies appropriately discussed. The fact that many of the experiments presented in this manuscript are not entirely novel detracts from the overall impact of the work. Despite this, key original findings presented in this paper - including the discovery that Mvp1 is required for sorting specific cargo and binds directly to the dynamin-like protein Vps1 - will be of broad interest to the trafficking field.
  
  Thank you for pointing this out. We have carefully revised the manuscript to acknowledge the earlier work (Page 5, line 133; Page 8, line 236; Page 13, line 414; Page 12, line 377).
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.11.434991v1
www.biorxiv.org www.biorxiv.org

Direct Extraction of Signal and Noise Correlations from Two-Photon Calcium Imaging of Ensemble Neuronal Activity

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This study demonstrates with analytical methods and simulations a new approach to estimate pairwise noise and signal correlations in two-photon calcium imaging data. This approach compensates for biases introduced by the dynamics of calcium signals, without deconvolution and for low trial numbers. Simulations based on idealized calcium signals demonstrate the efficiency of the method, and application to auditory cortex imaging data leads to mild changes in the results shown in the past based on less accurate estimates. This study has the merit of identifying biases that can arise when evaluating noise and signal correlations across neurons with indirect signals. Moreover, the solution provided may become a useful addition to the neuroscientist's signal analysis toolbox. Noise and signal correlations are related to functional connectivity between neurons, and thereby give insights about the functional structure of the underlying network. They do not necessarily account for the full complexity of neural interactions but are used in numerous studies, which would be improved by this tool. A potential improvement of the study could be to indicate how this approach could be generalized to other neuron-to-neuron interaction measurements or data-driven neural network modeling.
  
  We would like to sincerely thank Reviewer 1 for his supportive stance towards our work, and for providing helpful feedback to improve our manuscript.
  
  The main weakness of the study is that the efficiency of the method is only assessed with simulated datasets. Finding real ground-truth data for a validation beyond that would be difficult if not impossible. However, the authors could further convince the reader by showing the effect of relaxing certain assumptions of their surrogate data generation model (e.g. absence of temporal correlation in measurement noise), and show the robustness and limits of the method.
  
  Thank you for this suggestion. Motivated by this comment, and a related comment by Reviewer 2, we have now substantially enhanced our performance analyses in the revised manuscript and compiled them in a new subsection titled “Analysis of Robustness with respect to Modeling Assumptions” for better clarity and consistency. In summary:
  
  1) We first examined the robustness of our proposed method with respect to model mismatch in the stimulus integration model. As suggested, we generated data according to a non-linear (i.e., quadratic sum of linear filters) receptive field model, but assumed a linear stimulus integration model in our inference procedure.
  
  The comparison of the correlations estimated under this setting by each method is shown in Figure 2 – Figure Supplement 3. While the performance of our proposed signal correlation estimates under this setting degrades compared to that in Figure 2 with no model mismatch, our proposed estimates still outperform the other methods and recover the ground truth signal correlation structure reasonably well.
  
  It is noteworthy that the model mismatch in the stimulus integration component does not affect the accuracy of noise correlation estimates in our method, as is evident from the noise correlation estimates in Figure 2 – Figure Supplement 3. In comparison, the biases induced in the other methods by model mismatch and various other factors, such as observation noise, temporal blurring, and neglecting the non-linear mappings between spikes and underlying covariates, result in significantly larger errors in both signal and noise correlation estimates.
  
  2) We incorporated our previous analysis of robustness with respect to calcium decay model mismatch in this subsection, which is shown in Figure 2 – Figure Supplement 4.
  
  3) In response to a related comment by Reviewer 2, we then performed extensive simulations to evaluate the effects of SNR and firing rate on the performance of our method. Overall, while the performance of all algorithms degrades at low SNR or firing rate values (SNR < 10 dB, firing rate < 0.5 Hz), our algorithm outperforms the existing methods in a wide range of SNR and firing rate values considered. The results are summarized in Figure 2 – Figure Supplement 5.
  
  4) Finally, we considered two observation noise model mismatch conditions, namely, white noise + low frequency drift and pink noise, similar to the treatment in Deneux et al. (2016). For each noise mismatch model, we also varied the SNR level and firing rate and compared the performance of the different algorithms as reported in Figure 2 – Figure Supplement 6. These new analyses demonstrate that our proposed estimates outperform the existing methods, under correlated generative noise models, and also with respect to varying levels of SNR and firing rate. As clearly evident in panels C and F of Figure 2 – Figure Supplement 6, even though the estimated calcium concentrations are contaminated by the temporally correlated fluctuations in observation noise, the putative spikes estimated as a byproduct of our iterative method closely match the ground truth spikes, which in turn results in accurate estimates of signal and noise correlations.
  
  To address this comment, we performed extensive simulations to evaluate the robustness of different algorithms under model mismatch conditions induced by 1) non-linearity in the stimulus integration model, 2) calcium decay, 3) SNR and firing rate, and 4) temporal correlation of observation noise. We have now compiled these results in a new subsection called “Analysis of Robustness with respect to Modeling Assumptions” (Pages 6-7).
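  
  As a minimal sketch of how observation-noise mismatch conditions of this kind could be simulated (this is not the code used in the manuscript), the Python snippet below generates white noise with a slow drift and pink (1/f) noise, and rescales a noise trace to a target SNR. The function names, the sinusoidal form of the drift, and the power-ratio definition of SNR in dB are all assumptions made for illustration.

```python
import numpy as np


def white_plus_drift(n, rng, drift_amplitude=1.0, drift_period=500):
    """White Gaussian noise plus a slow sinusoidal drift (assumed drift shape)."""
    t = np.arange(n)
    drift = drift_amplitude * np.sin(2 * np.pi * t / drift_period)
    return rng.standard_normal(n) + drift


def pink_noise(n, rng):
    """Pink noise obtained by shaping a white spectrum with 1/sqrt(f)."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    scale = np.zeros_like(freqs)          # the DC component is dropped
    scale[1:] = 1.0 / np.sqrt(freqs[1:])  # power then falls off as 1/f
    shaped = np.fft.irfft(spectrum * scale, n=n)
    return shaped / shaped.std()          # normalize to unit standard deviation


def snr_db(signal, noise):
    """SNR in dB, assumed here to be the ratio of mean signal and noise power."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))


def scale_to_snr(signal, noise, target_db):
    """Rescale `noise` so that snr_db(signal, noise) equals `target_db`."""
    target_power = np.mean(signal ** 2) / (10.0 ** (target_db / 10.0))
    return noise * np.sqrt(target_power / np.mean(noise ** 2))
```

  Sweeping `target_db` and the firing rate of the simulated spikes would then give a grid of conditions comparable in spirit to the one described above.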
  
  Also, further intuition about why this method outperforms others would be of great help to non-specialist readers.
  
  Thank you for this suggestion. There are two sources for the performance gap between our proposed method and existing approaches:
  
  1) Favorable soft decisions on the timing of spikes achieved by our method, as a byproduct of the iterative variational inference procedure: an accurate probabilistic decoding of spikes results in better estimates of the signal/noise correlations, and conversely having more accurate estimates of the signal/noise covariances improves the probabilistic characterization of spiking events. This is in contrast with both the Pearson and Two-Stage methods: in the Pearson method, spike timing is heavily blurred by the calcium decay; in the two-stage methods, erroneous hard (i.e., binary) decisions on the timing of spiking events result in biases that propagate to and contaminate the downstream signal and noise correlation estimation and thus result in significant errors.
  
  2) Explicit modeling of the non-linear mapping from stimulus and latent noise covariates to spiking through a canonical point process model (which is in turn tied to a two-photon observation model in a multi-tier Bayesian fashion) results in robust performance under a limited number of trials and observation duration. As we have shown in Appendix 1, as the number of trials L and trial duration T tend to infinity, conventional notions of signal and noise correlation indeed recover the ground truth signal and noise correlations, as the biases induced by non-linearities average out across trial repetitions. However, as shown in Figure 2 – Figure Supplement 2, in order to achieve comparable performance to our method using 20 trials, the conventional correlation estimates require ~1000 trials.
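  
  For readers less familiar with the conventional estimates referred to here, the following Python sketch shows one common way of computing them from a trials × frames × neurons response array: signal correlations from the trial-averaged responses, and noise correlations from the trial-to-trial residuals. This is an illustrative textbook-style definition under the stated array layout, not the authors' estimator; the function name and input format are assumptions.

```python
import numpy as np


def conventional_correlations(responses):
    """Conventional signal/noise correlations.

    responses: array of shape (L trials, T frames, N neurons), e.g. estimated
    spike counts or fluorescence values (assumed layout).
    Returns two (N, N) correlation matrices.
    """
    responses = np.asarray(responses, dtype=float)
    n_trials, n_frames, n_neurons = responses.shape

    # Stimulus-locked component: average over repeated trials.
    trial_mean = responses.mean(axis=0)                      # (T, N)
    signal_corr = np.corrcoef(trial_mean, rowvar=False)      # (N, N)

    # Trial-to-trial variability: what remains after removing the trial mean.
    residuals = responses - trial_mean                       # (L, T, N)
    pooled = residuals.reshape(n_trials * n_frames, n_neurons)
    noise_corr = np.corrcoef(pooled, rowvar=False)           # (N, N)

    return signal_corr, noise_corr
```

  With few trials the trial average still contains residual noise, so the two matrices computed this way leak into each other, which is the finite-sample bias discussed above.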
  
  To address this comment, we have now included the aforementioned items in the revised Discussion section, highlighting the key aspects of our method that make it outperform existing approaches (Pages 17-18).
  
  Reviewer #2 (Public Review):
  
  This manuscript describes a new method for estimating signal and noise correlations from two-photon recordings of calcium activity in large neuronal networks. Unlike existing methods that first require inferring spikes from calcium transients before estimating the correlations, the proposed method performs the correlation estimation directly from the fluorescence traces. It treats the different inputs to each neuron as latent variables to be inferred from its observed fluorescence activity, and divides these inputs according to whether they are provided by stimulus-dependent (signal) or stimulus-independent (noise) inputs. The authors showed with simulations that proper definitions of signal and noise correlations based on these inferred variables converge with trial repetition much faster to the true correlations than conventional estimates. They are not sensitive to blurring produced by inaccurate spike deconvolution and are less prone to erroneously mixing the signal and noise components of the correlations. By applying this new method to real optical recordings from the auditory cortex of awake mice, the authors shed new light on the structure of the circuitry underlying the processing of sound information in this brain region. Circuits processing sound-related and sound-independent information appear to be more orthogonal than previously thought, with a spatial signature that changes between thalamorecipient layer 4 and supragranular layers 2/3.
  
  This is a mathematical manuscript that introduces a promising new analysis approach. It is designed to be applied to two-photon experiments, which typically produce recordings of the calcium activity of several hundred neurons simultaneously. Because of their massively parallel recordings, which do not rely on spike sorting to identify single units, these optical techniques naturally provide access to correlations between units. They have given rise to a field of active research that attempts to link these correlations to elementary functional circuits in the brain. However, as the authors point out, the low efficiency of spike inference from calcium traces raises the need for correlation estimation approaches that circumvent this problem, as the method presented here does. As such, it could have a significant impact if the community succeeds in using it (see below).
  
  We would like to sincerely thank Reviewer 2 for his/her supportive stance towards our work, and for providing helpful feedback to improve our manuscript.
  
  Weaknesses and strengths
  
  1) Public availability of the code implementing the new method is clearly necessary for the two-photon microscopy community to adopt it, and this is indeed the case at https://github.com/Anuththara-Rupasinghe/Signal-Noise-Correlation. However, it is also crucial that any end-user be able to get a clear picture of the conditions under which the method can or cannot be applied before diving in. The fact that such an applicability domain is not well defined is a major concern. Notably, each Real Data Study presented in the paper uses a preliminary selection of "highly active cells" (1st study: N = 16; 2nd study: N = 10; 3rd study: N~20 per field), as the authors succinctly discuss that performance is expected to degrade "in the regime of extremely low spiking rate and high observation noise" (l. 518-519). But no precise criteria are provided to specify what is meant by "highly active cells". On the other hand, the authors also assume that there is at most one spiking event per time frame for each neuron, which seems to exclude bursting neurons. The latter assumption seems to be a challenge with respect to the example traces shown in Fig. 4C (∆F/F reaches 400%) and in Fig. 6C (∆F/F reaches 100%), considering that the GCaMP6s signal for a single spike is expected to peak below 10-20%. This forces the authors to take a scaling factor of the observations A = 1 x I (Real Data Studies 1 and 3) or A = 0.75 x I (Real Data Study 2) compared to the A = 0.1 x I taken in the Simulation Studies. Therefore, it looks as if the Real Data Studies were performed on mainly bursting cells and each burst was counted as one spiking event. A detailed discussion of the usable range of firing rates, whether in spike or burst units, as well as the usable range of SNR should be added to the main text to allow future users to assess the suitability of their data for this analysis.
  
  Thank you for pointing out the issues related to the applicability domain of our method. We agree that clarifying the rationale behind our model parameter choices is key to facilitating its usage by future users. In response to this comment, we have made three major revisions:
  
  1) Adding a new subsection to the Methods and Materials called “Guidelines for model parameter settings” that includes our rationale and criteria for choosing the number of neurons (N), stimulus integration window length (R), observation noise covariance (Σ_w), scaling matrix A, state transition parameter (α), and mean of the latent noise process (μ_x);
  
  2) Inspecting the capability of our proposed method in compensating for rapid increase of firing rate;
  
  3) Performing extensive new simulations to evaluate the effect of SNR level and firing rate on the performance of our proposed method, included in a new subsection in the Results section called “Analysis of robustness with respect to modeling assumptions”.
  
  We will next describe these changes in a point-by-point fashion.
  
  -Criterion for selecting the number of neurons. While our proposed method scales up well with the population size owing to the low-complexity update rules involved, including neurons with negligible spiking activity in the analysis would only increase the complexity and potentially contaminate the correlation estimates. Thus, we performed an initial pre-processing step to extract N neurons that exhibited at least one spiking event in at least half of the trials considered. This criterion is now clearly stated in the subsection “Guidelines for model parameter settings”. We have also reworded “highly active cells” to “responsive cells (according to the selection criterion described in Methods and Materials)” for clarity.
  
  -Evaluating the effects of SNR level and firing rate. We had previously noted that the performance degrades at low SNR and firing rate values, with little quantitative justification. In response to this comment, and a related comment by Reviewer 1, we performed extensive simulations to evaluate the robustness of the different methods under varying SNR levels, firing rates, and observation noise model mismatch (including white noise + drift and pink noise models). These results are included in a new subsection called “Analysis of robustness with respect to modeling assumptions” and shown in Figure 2 – Figure Supplement 5 and 6.
  
  While the performance of all methods (including ours) degrades at low SNR levels or firing rates (SNR < 10 dB, firing rate < 0.5 Hz), our proposed method outperforms the existing methods in a wide range of SNR and firing rate values and under the considered observation noise model mismatch conditions. To quantify this comparison, we have also indicated the mean and standard deviation of the relative performance gain of our proposed estimates across SNR levels and firing rates as insets in Figure 2 – Figure Supplement 5 and 6.
  
  -Choosing the scaling matrix A. In each case, we set A=aI, and estimated a by considering the average increase in fluorescence after the occurrence of isolated spiking events. Specifically, we derived the average fluorescence activity of multiple trials triggered to the spiking onset and set a as the increment in the magnitude of this average fluorescence immediately following the spiking event.
  
  -Compensation for rapid increase of firing rate. The comment of the reviewer regarding the sudden increase of ∆F/F in Fig. 4C prompted us to inspect the performance of the algorithm in such scenarios where the choice of A may underestimate the rapid increase of firing rate (e.g., A= I). In the new supplementary figure to Fig. 4, called Figure 4 – Figure Supplement 2, we show a zoomed-in view of the time-domain estimates of the latent processes obtained by our proposed method (replicated here for discussion):
  
  Notably, the fluorescence activity rises up to a magnitude of ∼ 14, while we have set a=1. Thus, as the reviewer pointed out, this activity is induced by a burst-like event due to successive closely-spaced spikes. Due to the low firing rate of A1 neurons, we believe this is not a bursting event (in the electrophysiological sense), but a rapid increase in firing rate that may result in the occurrence of more than one spike per frame. From the estimates of the latent calcium concentration (purple) and putative spikes (green), we clearly see that our proposed method is still capable of matching the observed fluorescence activity through two mitigatory mechanisms that we describe next:
  
  1) The proposed method predicts spiking events in adjacent time frames to compensate for rapid increase of firing rate (see the green trace following the vertical dashed line) and thus infers calcium concentration levels that match the observed fluorescence activity;
  
  2) Even though our generative model assumes that there is only one spiking event in a given time frame, this assumption is implicitly alleviated in our inference framework by relaxing the constraint, as explained in the section Methods and Materials - Low-complexity parameter updates (Page 23). While this relaxation was performed in order to make the inverse problem tractable, we see that it in fact leads to improved estimation results under such settings, by allowing the putative spike magnitudes to be greater than 1, as is also evident in the magnitude of the inferred spikes right after the rise of fluorescence activity (the horizontal dashed line corresponds to spiking magnitude equal to 1).
  
  We have now discussed this observation in the Results section (Page 10).
  
  To address this comment, we have added a new subsection to Methods called “Guidelines for model parameter settings” that includes our rationale and criteria for choosing key model parameters (Page 24), have performed new simulation studies to evaluate the effects of SNR and firing rate on the performance of the proposed method (Pages 6-7), and closely inspected the performance of our method under rapid increase of firing rate (Page 10).
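  
  The two data-driven choices described above, namely the selection of responsive cells and the estimate of the scalar a in A = aI, can be sketched in Python as follows. This is only an illustration under stated assumptions: the event array, the onset indices of isolated events, and both function names are hypothetical inputs and helpers, and how spiking events are detected beforehand is not specified here.

```python
import numpy as np


def select_responsive_cells(events, min_fraction=0.5):
    """Indices of neurons with at least one event in >= min_fraction of trials.

    events: array of shape (L trials, T frames, N neurons); non-zero entries
    mark detected spiking events (assumed to be available from pre-processing).
    """
    events = np.asarray(events) > 0
    active_per_trial = events.any(axis=1)             # (L, N): any event in trial
    fraction_active = active_per_trial.mean(axis=0)   # (N,)
    return np.flatnonzero(fraction_active >= min_fraction)


def estimate_scaling_factor(trace, onsets):
    """Estimate a as the jump of the event-triggered average fluorescence.

    trace: 1-D fluorescence trace of one neuron.
    onsets: frame indices of isolated spiking events (assumed known).
    The jump is taken between the frame before each onset and the onset frame.
    """
    trace = np.asarray(trace, dtype=float)
    onsets = np.asarray(onsets, dtype=int)
    onsets = onsets[(onsets >= 1) & (onsets < trace.size)]
    if onsets.size == 0:
        raise ValueError("no usable onsets")
    triggered = np.stack([trace[t - 1:t + 1] for t in onsets])  # (n_events, 2)
    average = triggered.mean(axis=0)
    return float(average[1] - average[0])
```

  Averaging over many isolated events before taking the difference reduces the influence of observation noise on the estimate of a.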
  
  2) Another parameter seems to be set by the authors on a criterion that is unclear to me: the number of time lags R to be included in the sound stimulus vector s_t. It seems to act as a memory of the past trajectory of the stimulus and probably serves to enhance the effect of stimulus onset/offset relative to the rest of the sound presentation. It is consistent with the known tendency of neurons in the primary auditory cortex to respond to these abrupt changes in sound power. However, this R is set to 2 in Simulation Study 1, whereas it is set to 25 in Real Data Studies 1 and 3, and to 40 in Real Data Study 2. What leads to these differences escapes me and should be explained more clearly.
  
  Thank you for pointing out this lack of clarity in explaining the rationale behind choosing R. In addressing this comment, we have now added an entry in the new subsection “Guidelines for model parameter settings”. Furthermore, we have unified our choice of R in the three real data studies. We will explain these changes in a point-by-point fashion next.
  
  -Choice of R in simulation studies. The stimulus used in the simulation was a 6th-order autoregressive process whose present and immediate past values contributed to spiking in our generative model (i.e., R=2). Given that the ground truth value of R was known in the simulations, we used R=2 for inference as well.
  
  -Choice of R for real data application. The number of lags R considered in stimulus integration is a key parameter that can be set through data-driven approaches or using prior domain knowledge. Examples of common data-driven criteria include cross-validation, Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), which balance the estimation accuracy and model complexity.
  
  To quantify the effect of R on model complexity, we first describe the stimulus encoding model in our framework. Suppose that the onset of the pth tone in the stimulus set (p=1,⋯,P , where P is the number of distinct tones) is given by a binary sequence
  
  The choice of R implies that the response at time t post-stimulus depends only on the R most recent time lags. As such, the effective stimulus at time t corresponding to tone p is given by
  
  By including all the P tones, the overall effective stimulus at the tth time frame is given by
  
  The stimulus modulation vector d_j would thus be RP-dimensional. As a result, the number of parameters (M=RP) to be estimated linearly increases with R. By using additional domain knowledge, we chose R to be large enough to capture the stimulus effects, and at the same time to be small enough to control the complexity of the algorithm.
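  
  The lagged stimulus construction described above can be sketched in Python as follows. The function name, the (T, P) layout of the binary onset sequences, the zero padding before the first frame, and the ordering of entries within the vector are assumptions made for illustration; only the dimension M = RP is taken from the text.

```python
import numpy as np


def effective_stimulus(onsets, n_lags):
    """Stack the n_lags most recent onset values of every tone.

    onsets: binary array of shape (T frames, P tones); onsets[t, p] = 1 if
    tone p starts at frame t.
    Returns an array of shape (T, n_lags * P). Row t holds, for each tone p,
    the values onsets[t, p], onsets[t-1, p], ..., onsets[t-n_lags+1, p],
    with zeros assumed before the first frame.
    """
    onsets = np.asarray(onsets, dtype=float)
    n_frames, n_tones = onsets.shape
    padded = np.vstack([np.zeros((n_lags - 1, n_tones)), onsets])
    out = np.empty((n_frames, n_lags * n_tones))
    for p in range(n_tones):
        for r in range(n_lags):
            start = n_lags - 1 - r                    # shift by r frames
            out[:, p * n_lags + r] = padded[start:start + n_frames, p]
    return out
```

  Because each row has R·P entries, doubling R doubles the number of stimulus modulation parameters per neuron, which is the complexity trade-off referred to above.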
  
  As an example, given that the typical response duration of mouse primary auditory neurons is < 1 s, with a sampling frequency of f_s=30 Hz, we surmised that a choice of R∼30 would suffice to capture the stimulus effects. We further examined the effect of varying R on the proposed correlation estimates in Figure 4 – Figure Supplement 1. As shown, small values of R (e.g., R = 1 or 10) may not be adequate to fully capture the effects of stimuli. By considering values of R in the range 25 − 50, we noticed that the correlation estimates remain stable. We thus chose R=25 for our real data analyses. Notably, the results of real data study 2 (that previously used R = 40) are nearly unchanged with the new choice of R=25, which is in accordance with our observation in Figure 4 – Figure Supplement 1.
  
  To address this comment, we have added a new subsection to Methods called “Guidelines for model parameter settings” (Page 24) that includes our rationale for choosing the stimulus integration window length R and have performed a new analysis to evaluate the effect of R on the performance of the proposed method in real data study 1 (Page 10).
  
  3) This memory of the past stimulus trajectory appears to be specific to the proposed method and is not accounted for in the 2-stage Pearson estimation, for example. Since it probably helps to reflect the common sensitivity of neurons to onset/offset, it alone provides an advantage to the proposed method over the 2-stage Pearson estimation. It would be instructive to also perform this comparison with R set to 1 to get an idea of the magnitude of this advantage.
  
  We agree that explicit modeling of stimulus integration is a key advantage of our proposed method in comparison to the conventional ones. We have now explained this virtue in the discussion of the role of R in real data study 1 (Page 10). Additionally, as explained in our responses to the previous comment, we have included a new analysis of the sensitivity of our proposed estimates to the choice of R as a supplementary figure to Figure 4. As the reviewer suggested, we see that R=1 indeed fails to capture the underlying structure in the signal correlations. However, when R is sufficiently large (R>20), the estimates become stable.
  
  To address this comment, we have now discussed the advantage of including the stimulus history in our model and probed the sensitivity of our estimates to the choice of R in Figure 4 – Figure Supplement 1 (Page 10).
  
  4) Finally, although the example of ground truth signal and noise correlation matrices taken to illustrate the method in the simulation study in Fig. 2A has been chosen to have almost no overlap in their non-zero coefficients, there is no fundamental reason why this separation should be the rule for real data. These coefficients reflect the patterns of stimulus-dependent and stimulus-independent functional connectivity in the recorded network. As such, these patterns could have different degrees of overlap, depending on the brain areas recorded. It is therefore particularly striking that the authors find in their data a strong dissimilarity and almost no covariance between signal and noise correlation coefficients, throughout all the different sets of experiments they present here (Fig. 4E, Table 1, 2, 3, and Fig. 6A&B). This makes a strong and compelling statement on the likely separation of the corresponding circuits in the primary auditory cortex of the mouse.
  
  We agree with the assessment of the reviewer. We suspect that some of the reported similarities between signal and noise correlations in existing literature could be due to leakage in estimating these two quantities, likely induced by a limited number of trials, short observation duration, and neglecting the effect of calcium dynamics and non-linearities.
  
  Likely impact on the field
  
  It is now well established that sound processing is modulated, even at the level of primary auditory cortex, by locomotion (Schneider et al. Nature 2018), task engagement (Fritz et al. Nat. Neurosci. 2003), or several other factors. Applying the proposed method to these situations could help understand how sound processing circuits are remodeled, without confounding other coexisting processes. In general, whenever a brain structure makes associations between multiple processes within the same network, the presence of multiple circuits makes the observation of correlations difficult to attribute to the signature of a single circuit. By significantly improving the estimation of signal and noise correlations, the proposed method should help distinguish the boundaries of these circuits as well as their intersections. The exploration of the role of many secondary sensory and associative cortical structures could be renewed by this work.
  
  We would like to thank Reviewer 2 again for his/her supportive stance towards our work and for fairly summarizing our contributions.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.11.434932v1
www.biorxiv.org www.biorxiv.org

Two different cell-cycle processes determine the timing of cell division in Escherichia coli

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  The manuscript "Two different cell-cycle processes determine the timing of cell division in Escherichia coli" by Colin et al. presents an experimental approach to investigate the role of two governing cell-cycle processes, namely, DNA replication-segregation and the cell division cycle, in size regulation. The authors tackle the problem by first decoupling these two cell-cycle processes via sub-lethal dosages of A22, and then analyzing the role of each process in the timing of cell division. Modern imaging and analysis techniques are used in this work to monitor cell division with single-cell resolution and chromosome replication with sub-cellular resolution. The large pool of data allows the authors to perform correlation analysis of cell size and the cell-cycle parameters, which led to the conclusion that the two processes have "balanced contributions in non-perturbed cells."
  
  The question studied in this manuscript is important and timely. The investigation of the two concurrent processes chosen by the authors is perhaps the right direction, which may eventually lead to a complete understanding of the E. coli cell cycle and size regulation. The high-resolution imaging and analysis accomplished in this work are also commendable. There is, however, a major concern about this manuscript, which is that the entire conclusion is based on the cell-cycle and size perturbations by A22. The caveat of the A22 perturbations is that an aberrant cell shape could affect both of the cellular processes simultaneously. Even though the C-period and initiation size are largely unchanged, a possible, but unknown, cross-talk between the two processes may be affected by A22. Therefore, additional evidence is necessary to show whether the two processes independently determine cell division.
  
  We agree that A22 treatment could possibly affect DNA replication or organization, e.g., indirectly through an effect of cell width on DNA organization. It would thus indeed be desirable to confirm our findings based on alternative perturbations. At the same time, our experiments clearly demonstrate that cell sizes at replication initiation and division are decreasingly correlated with increasing A22 concentration, which suggests that a process different from DNA replication is responsible for the timing of division.
  
  Additionally, DNA replication could depend on cell division, which could possibly complicate the relationship between replication and division. We have now addressed the possibility of an influence of division on replication initiation in the Discussion, where we write ‘The concurrent-cycles framework assumes that replication initiation is independent of cell division or cell size at birth, [...]. However, we note that this is not the only possibility, and DNA replication may not be entirely independent of cell division. A complementary hypothesis (Kleckner et al., 2018) posits a possible (additional or complementary) connection of initiation to the preceding division event. To test this hypothesis one could perturb specific division processes by titrating components involved in Z-ring assembly (e.g., titrating FtsZ (Zheng et al., 2016)).’
  
  Reviewer #2:
  
  This is an interesting paper which makes important contributions to an interesting and highly controversial topic: how does an E. coli cell decide when to divide?
  
  As the authors describe in clear and careful detail, two main camps have argued (often dogmatically) for "single process" models in which division is either a direct, downstream consequence of replication initiation (which is the regulated step) or of effects that act directly on division (irrespective of replication and, more generally, the chromosome cycle). The authors of this paper have, instead, proposed that both types of effects are important, in different proportions according to the circumstances. They refer to this idea as a "concurrent cycles" hypothesis. In previous work they have presented arguments and data which they interpret as being incompatible with any single process model and consistent with their alternative hypothesis.
  
  This work now investigates the consequences of treatment with A22, a drug which inhibits MreB, with the result that it increases cell width and, concomitantly, increases the length of time between completion of a given round of DNA replication and the immediately ensuing cell division (an interval known as the "D period"). The idea to analyze this situation was motivated by the authors' previous hypothesis: by the concurrent cycles idea, increasing the length of the D-period should prolong the replication-independent inter-division process such that it becomes rate limiting in determining the timing of division (relative to the replication-dependent process).
  
  The data presented confirm the authors' expectation. They first show that progressively increasing the amount of A22 does not (dramatically) alter either: (i) the basic "adder" behavior in which a fixed amount of cell length is added irrespective of the length of the cell at birth or (ii) the finding that a fixed amount of cell length is added per replication origin during the period from one round of replication initiation to the next, which is consistent with (and generally considered to be supportive of) a role for a replication-dependent process.
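  
  For readers unfamiliar with the term, the "adder" behavior mentioned here can be illustrated with a minimal Python simulation of a single lineage. This sketch only illustrates the definition (a noisy, size-independent increment added between birth and division, followed by symmetric division); the parameter values and the Gaussian noise are arbitrary assumptions and it is not a model fitted to the data in the manuscript.

```python
import numpy as np


def simulate_adder(n_generations, delta=1.0, noise_sd=0.1, seed=0):
    """Simulate birth lengths and added lengths along one lineage.

    Each cell adds delta (plus Gaussian noise) to its birth length before
    dividing symmetrically; the increment does not depend on birth length.
    """
    rng = np.random.default_rng(seed)
    birth = np.empty(n_generations)
    added = np.empty(n_generations)
    length = delta                      # start at the stationary mean
    for g in range(n_generations):
        birth[g] = length
        added[g] = delta + noise_sd * rng.standard_normal()
        length = (birth[g] + added[g]) / 2.0   # symmetric division
    return birth, added
```

  In such a lineage the added length is uncorrelated with birth length and the mean birth length settles at delta, since any deviation in birth size is halved at each division.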
  
  However, they also discover an interesting additional effect by examining the amount of cell length added (per origin) during the entire period comprising replication plus the immediately ensuing division ("C+D"). In the unperturbed case, cells that are longer at the time of initiation of replication also add more length during the ensuing (C+D) period. In contrast, in the presence of increasing amounts of A22, this effect is progressively reversed such that, finally, at high drug levels, cells which are longer (per origin) at the time of initiation of replication add much less length during the ensuing (C+D) period. Since the length of the C period is essentially constant in all conditions, the relevant effect is the variation in the length of the D period. And since the observed effect becomes more and more prominent with increasing A22 concentration, variation in the D period dominates more and more as the length of that period gets longer and longer. The authors interpret this effect to mean that, with increasing D-period length, division timing is decreasingly dependent on replication initiation. They go on to infer that "with increasing average D period, a process different from DNA replication is likely increasingly responsible for division control". This is a sensible, relatively formal restatement of the finding. This statement allows for diverse specific interpretations. The authors focus on one possible interpretation: they show that their previously proposed concurrent cycles hypothesis can quantitatively explain these data. In essence, given a replication-independent and a replication-dependent process, the observed findings are explained by an increased contribution of the replication-independent process. This scenario also does a better job of explaining the presented data, as well as other findings, than other recent "single process" models, for reasons that are discussed in straightforward detail in the Discussion. 
The authors also do an excellent job of laying out the assumptions upon which their model (and other existing models) are based, thus laying open the possibility for future studies to consider other possible scenarios.
  
  This work is important for four reasons. First, it provides interesting new data which must be accommodated by any synthetic explanation for cell division control. Second, it makes it abundantly clear that the validity of any proposed single process model remains to be further substantiated. Third, it suggests an interesting alternative model which can accommodate a diversity of data, including that presented in the current work, and which has the potentially attractive feature of combining the two existing single-process models. Fourth, and perhaps most importantly, the authors' discussion of the available data in this field is clear, thoughtful and thought-provoking and leaves open the possibility of some as-yet unimagined mechanism. Overall, this work provides an important counterpoint to other published work and is a very valuable contribution to thinking and discussion in this field.
  
  [It can also be noted specifically that this work provides an important counterpoint to the model proposed in a previous eLife paper on this topic by Witz et al., 2019 (eLife 2019;8:e48063 doi: 10.7554/eLife.48063).]
  
  We thank the reviewer for her careful assessment and appreciation of our work.
  
  Reviewer #3:
  
  Colin, Micali et al. investigated division and replication in slow-growing E. coli cells over cell cycles at the single-cell level under perturbed cellular dimensions. They found that the time between replication termination and division increased upon perturbing cell width, as recently reported, and that chromosome replication became decreasingly limiting for cell division. These results support the 'concurrent-processes model' previously proposed by some of the authors.
  
  1) Cell length can be used to represent the cell size (adder) only if the cell width remains constant. In the current form of the manuscript, it is unknown whether the cell width varies significantly at the single-cell level with A22 treatment (e.g., 1 µg/ml A22). In this case, cell volume might not be well correlated with cell length. The interpretation of Figure 3 would therefore be weakened.
  
  We now demonstrate in the new Figure 2–S2 that the coefficient of variation of cell width does not increase with A22 concentration (neither in snapshots from cells grown in liquid culture nor in the mother machine):
  
  Figure: Variation of width at the single-cell level. Coefficient of variation of cell width as a function of mean cell width. Squares and triangles represent measurements done on cells grown in the mother machine or in liquid culture, respectively. Blue represents wild-type cells. Grey represents cells treated with different amounts of A22.
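
  As a minimal sketch of the quantity plotted in this figure, the coefficient of variation (CV) is the standard deviation of single-cell widths divided by their mean. The function name and the width values below are hypothetical illustrations, not data or code from the study.

```python
import math

def coefficient_of_variation(widths):
    """Return the sample standard deviation divided by the mean."""
    n = len(widths)
    mean = sum(widths) / n
    variance = sum((w - mean) ** 2 for w in widths) / (n - 1)
    return math.sqrt(variance) / mean

# Hypothetical single-cell widths in micrometres (illustrative only).
untreated = [0.90, 1.00, 1.10]
treated = [1.35, 1.50, 1.65]  # wider on average, same relative spread

cv_untreated = coefficient_of_variation(untreated)
cv_treated = coefficient_of_variation(treated)
```

  In this toy example the mean width increases by 50% while the CV stays the same, which is the pattern the response describes for A22-treated cells.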
  
  We also reference this figure in the main text, writing: ‘Increasing A22 concentration leads to increasing steady-state cell width both in batch culture and in the mother machine (Figure \ref{fig2}B), without affecting cell-to-cell width fluctuations (Figure \ref{CV_width}), and without affecting doubling time (Figure \ref{fig2}C) or single-cell growth rate (Figure \ref{SI_Fig1}).’
  
  2) The negative value of 𝜁C+D in Figure 3F (treated group) indicates that the division length is negatively correlated with the cell length at replication initiation. It is not obvious that this can rule out the possible contribution of DNA replication/segregation in offsetting the length difference at initiation and thus contribute to cell division. Since Figure 3F is the key observation to validate the model, more explanations are required to help readers understand how a negative 𝜁C+D can lead to a conclusion that a process different from DNA replication is likely responsible for division control with A22-treatment.
  
  The negative value of zeta_CD actually corresponds to a lack of correlation between division size and size at initiation, typically predicted by the models where replication is never limiting for cell division (Micali et al. 2018, Si et al. 2019). We have commented more explicitly on this point in the text, writing: ‘Note that the negative value of $\zeta_{\rm CD}$ corresponds to a lack of correlation between division size and size at initiation (Figure \ref{fig3}G), typically predicted by the models where replication is never limiting for cell division~\cite{Micali2018,Micali2018b,Si2019}.’
  
  3) As an important input for the model, the QC+D' is assumed to be equal to QC+D in unperturbed conditions and to remain constant regardless of the A22 concentration (Line 548-554). This assumption is reasonable if the minimum time interval for segregation (D') is independent of the change in cell width. But how D' and QC+D' change with cell width is unknown. Earlier molecular studies revealed that the polymerization of MreB affects the activity of topoisomerase IV, an enzyme that mediates the dimerization of sister chromosomes, which implies that changing cell width may affect D'. Given the importance of QC+D' to the model, it is vital for the authors to make this assumption clear in the main text and explain why such an assumption is reasonable.
  
  Q_CD’ (related to average growth in the CD’ period) is a parameter that we cannot measure, or bypass in the model. We have made this assumption more explicit in the text. While this question deserves further investigation in future studies, we know that D’ cannot increase too strongly with width, because otherwise it would leave replication/segregation limiting for division under A22 perturbations, contrary to our observation. This is the main reason to assume D’ constant in the model. A posteriori we can say that the loss of correlation between size at division and size at initiation observed under A22 treatment is in line with the hypothesis that D’ does not increase so much that the segregation process interferes with cell division. We now write: ‘Note that neither the minimum completion time C+D' nor the coupling parameter $\zeta_{CD'}$ can be measured experimentally, or bypassed in the model. In principle these parameters could change under A22 perturbations, since MreB affects the activity of topoisomerase IV \citep{madabhushi2009actin,kruse2003dysfunctional}, an enzyme that mediates the dimerization of sister chromosomes. However, constancy of $\zeta_{CD'}$ is supported by the constancy of the C period, and the minimum D' period cannot increase too strongly with width in the model, because otherwise it would render replication/segregation limiting for division under A22 perturbations, contrary to our experimental observation. Hence, for simplicity, we assumed $\zeta_{CD'}$ and the D' period to stay constant.’
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.08.434443v1
www.biorxiv.org www.biorxiv.org

New submission 20/02/2023, 10:40:07

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  The authors push a fresh perspective with a sufficiently sophisticated and novel methodology. I have some remaining reservations that concern the actual make-up of the data basis and consistency of results between the two (N=16) samples, the statistical analysis, as well as the “travelling” part.
  
  I previously commented on the fact that findings from both datasets were difficult to discern and more effort should be made to highlight these. Also, a major conclusion “the directionality effect [effect of attention on forward waves] only occurs for visual stimulation” only rested on a qualitative comparison between studies. The authors have improved on this here, e.g., by toning down this conclusion. One thing that is still missing is a graphical representation of the data from Foster et al. (the second dataset analysed here) that would support the statistical results and allow the reader a visual comparison between the sets of findings.
  
  We are glad that the reviewer recognizes the improvement in the presentation of the conclusions. Following the suggestions, we have modified figure 2, not only by including a third dataset (see point below), but also in a way that allows a direct comparison between the three datasets. Specifically, the results from the three datasets are now shown in three columns next to each other. The first row shows the FW and BW waves in contra- and ipsilateral lines of electrodes for each dataset: our dataset and the one from Feldmann-Wüstefeld and colleagues (the first and the second column in the figure, both with visual stimulation) show a clear interaction between direction and laterality, as confirmed by the statistical analysis. The dataset from Foster and colleagues (the third column, no visual stimulation) shows a laterality effect only in the backward waves but not in the forward ones, in line with the hypothesis that FW waves are modulated only in the presence of visual stimulation. The second row shows a schematic representation of the task, and the third row illustrates the electrode lines used in each dataset. We hope the reviewer will be satisfied with the current data presentation.
  
  Also, for any naive reader, the concept of travelling waves may be hard to grasp in the way data are currently presented - only based on the results of the 2D-FFT. Can forward and backward-travelling waves be illustrated in a representative example to make this more intuitive?
  
  We thank the reviewer for the suggestion. We included in figure 1 an additional panel E that represents a schematic example of forward and backward waves in the temporal domain (i.e., in the EEG data). We hope this example will provide a better understanding of the data and the traveling wave concept.
  
  Finally, the way Bayes Factors from the Bayesian ANOVA are presented, especially with those close to the ‘meaningful boundaries’ ⅓ and 3, as defined in the ‘Statistical analysis’ section, requires some unification/revision. For example, here: “We found a positive correlation between contra- and ipsi- lateral backward waves, and occipital (all Pearson’s r~=0.4, all BFs 10 ~=3) and -to a smaller extent- frontal areas (all Pearson’s r~=0.3, all BFs 10 ~=2).”, where the second part should strictly be labelled as inconclusive evidence. In the same vein, there is occasional mention of “negative effects”, where it should say that evidence favours the absence of an effect.
  
  We agree with the reviewer and apologize for the inaccuracies in reporting the statistical analysis. We corrected as suggested (see below), replacing ‘negative effects’ with ‘evidence favors the absence of an effect’.
  
  From the updated manuscript :
  
  "We found moderate evidence of a positive correlation between contra- and ipsi- lateral backward waves, and occipital (all Pearson’s r~=0.4, all BFs10~=3) but inconclusive evidence in the frontal areas (all Pearson’s r~=0.3, all BFs10~=2)."
  
  From the revised ‘Results’ section, now it reads:
  
  […] whereas all other factors and their interactions revealed evidence in favor of the absence of an effect (BFs10<0.3).
  
  […] but not in the forward waves (BF10=0.231, error<0.01%, supporting evidence in favor of the absence of an effect).
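
  The labelling convention discussed in this exchange can be sketched as a small helper, assuming the boundaries ⅓ and 3 mentioned by the reviewer. The function name and the handling of values exactly on a boundary are assumptions made for illustration, not the authors' code.

```python
def classify_bf10(bf10, lower=1 / 3, upper=3.0):
    """Label a Bayes factor BF10 using the 1/3 and 3 boundaries.

    Treating values exactly on a boundary as conclusive is an assumption.
    """
    if bf10 >= upper:
        return "evidence for an effect"
    if bf10 <= lower:
        return "evidence for the absence of an effect"
    return "inconclusive"

# Values quoted in the response above.
labels = [classify_bf10(b) for b in (0.231, 2.0, 10.7)]
```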
  
  Reviewer #2 (Public Review):
  
  The present manuscript takes a new perspective and investigates the functional relevance of traveling alpha waves’ direction for visual spatial attention. While the modulation of alpha oscillatory power - and especially the lateralization of alpha power - has been associated with spatial attention in the literature, the present investigation offers a new perspective that helps understand and differentiate the functional roles of alpha oscillations in the ipsi- versus contralateral hemisphere for spatial attention.
  
  The present study uses a straightforward approach and provides an analysis of two EEG datasets, which are convergingly in line with the authors’ claim that two patterns of travelling alpha waves need to be differentiated in visual spatial attention. First, backward waves in the ipsilateral hemisphere, and second, forward waves in the contralateral hemisphere, which are only observed during visual stimulation. Importantly, the authors test the relation of these patterns of traveling waves to the overall power of alpha oscillations and to the hemispheric lateralization of alpha power. Furthermore, to test the functional significance, the authors demonstrate that the pattern of forward and backward waves around stimulus onset differentiates between hits and misses in task performance.
  
  Although the results are in line with the conclusions drawn, some questions remain. The authors investigate the relationship between traveling alpha waves and the hemispheric lateralization of alpha power, which is a well-established neural signature of spatial attention. Surprisingly, the lateralization of alpha power shown in Figure 3B appears relatively weak in the present dataset (by visual inspection), which raises the question of whether the investigation of a relation between lateralized alpha power and alpha traveling waves is warranted in the first place.
  
  We agree with the reviewer that the effect seems reduced compared to other studies, although the topography of alpha-band lateralization in our data is in line with the literature. In order to quantify the effect, we performed an analysis similar to (Thut et al., 2006), defining a laterality index as:
  
  We computed this index for occipital electrodes and their average (in red in figure R1). The results reveal that for most electrodes, including their average, the laterality index is significantly larger than 0, confirming the presence of alpha-band lateralization. However, we also note that the amplitude of the effect (~0.04) is reduced compared to the study by Thut and colleagues, which was between 0.05 and 0.10.
  
  Figure R1 – Laterality index for occipital electrodes, quantifying alpha-band lateralization during attention allocation. All electrodes go in the expected direction, revealing an increase of alpha-band power in the ipsilateral occipital hemisphere.
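
  The formula for the laterality index did not survive in the text above. As a hedged sketch only, a normalized difference of the following form would be consistent with the caption (positive values meaning more alpha power ipsilateral to the attended side). The exact normalization used by the authors is not shown, so the function below and its inputs are assumptions.

```python
def laterality_index(ipsi_power, contra_power):
    """Normalized ipsi-minus-contra difference in alpha power.

    Assumed form: (ipsi - contra) / (ipsi + contra). The normalization
    actually used in the response is not recoverable from the text.
    """
    return (ipsi_power - contra_power) / (ipsi_power + contra_power)

# Hypothetical power values chosen to give an index of about 0.04.
example_index = laterality_index(1.04, 0.96)
```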
  
  Furthermore, the authors employ between-subject correlations (with N = 16) to test the relationship between alpha traveling waves and (lateralized) alpha power. However, as inter-individual differences in patterns of travelling waves are not the main focus here, within-subject analyses of the same relations would be able to test the authors’ hypotheses much more directly.
  
  As suggested, we included the recommended within-subject analysis in the revised manuscript by computing a trial-by-trial correlation between alpha power and traveling waves for each participant. First, we obtained a correlation coefficient and a p-value for each subject. Then, we tested whether the correlation coefficients had an overall positive or negative distribution (i.e., according to our previous results, we expected a positive correlation between backward waves and alpha power). Additionally, we combined the p-values to test for overall significance (using the Fisher method, see Methods section below). Our results corroborate the between-subject correlation, supporting the conclusion that alpha-band power correlates mostly with backward waves (especially contralateral to the attended location). The other correlations (i.e., forward waves and alpha power) were statistically inconclusive. We included these new results in the revised manuscript, as shown in the following.
  
  From the Results section:
  
  “To further investigate the relation between alpha-band travelling waves and alpha power, we performed the same analysis focusing on the correlation within each participant. In particular, we correlated trial-by-trial forward and backward waves with alpha-band power for each subject, obtaining correlation coefficients ‘r’ and their respective p-values. As in the previous analysis, we correlated forward and backward waves with frontal and occipital electrodes in both contra- and ipsilateral hemispheres. We applied the Fisher method (Fisher, 1992, see Methods for details) to combine all subjects' p-values in every condition. Overall, we found a significant effect of all combined p-values (p<0.0001), except in the lateralization condition (contra- minus ipsilateral hemisphere), similar to our previous analysis. Additionally, we tested for a consistent positive or negative distribution of the correlation coefficients. As shown in figure 3C, the results support a significant correlation between backward waves and alpha power in the hemisphere contralateral to the attended location (BF10=10.7 and BF10=7.4 for occipital and frontal regions, respectively; all other BF10 were between 1 and 2, providing inconclusive evidence). Interestingly, this analysis also revealed a small but consistent effect in the correlation between lateralization effects, as we found a consistently positive correlation in the contra- minus ipsilateral difference between forward waves and alpha power (BF10~5 for both frontal and occipital electrodes). However, it is important to note that the combined p-values obtained using the Fisher method did not reach the significance threshold in the lateralization condition, reducing the relevance of this specific result.”
  
  From the Methods section:
  
  “Additionally, we computed trial-by-trial correlations between waves and alpha power for all participants. First, we tested the correlation coefficient against zero in all conditions. Then, we obtained a combined p-value per condition using the log/lin regress Fisher method (Fisher, 1992), as shown in (Zoefel et al., 2019). Specifically, we computed the T value of a chi-square distribution with 2*N degrees of freedom from the $p_i$ values of the N participants as:
  
  $T = -2 \sum_{i=1}^{N} \ln(p_i)$”
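
  A minimal sketch of Fisher's method for combining per-participant p-values is given here for illustration. It uses the standard statistic T = -2 Σ ln(p_i) and the closed-form upper tail of a chi-square distribution with an even number (2N) of degrees of freedom, so only the standard library is needed. The function name and example p-values are hypothetical, and this is not the authors' implementation.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (T, combined_p), where T = -2 * sum(ln p_i) follows a
    chi-square distribution with 2N degrees of freedom under the null.
    For even degrees of freedom the upper tail has the closed form
    exp(-T/2) * sum_{k=0}^{N-1} (T/2)^k / k!.
    """
    n = len(p_values)
    t = -2.0 * sum(math.log(p) for p in p_values)
    half = t / 2.0
    term = 1.0   # (T/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return t, math.exp(-half) * total

# Hypothetical p-values from two participants.
t_stat, p_combined = fisher_combined_p([0.05, 0.05])
```

  With a single participant the combined p-value reduces to the original p-value, which is a quick sanity check on the formula.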
  
  It needs to be appreciated that the authors analyze two datasets in the present study. However, the question remains whether the absence of the forward waves effect in paradigms without visual stimulation is a general one and would replicate in other datasets. Moreover, the manuscript would benefit from a discussion of the potential implications of traveling waves for functional connectivity between posterior and anterior regions.
  
  We have now included a third dataset in the paper. In this dataset, from (Feldmann-Wüstefeld & Vogel, 2019), participants performed a visual working memory task by attending either the left or the right side of the screen where a stimulus was displayed. We analyzed the amount of waves during stimulus presentation, and we found the same results as in our own dataset: very strong evidence in favor of an interaction between LATERALITY (contra- and ipsilateral) and DIRECTION (FW and BW). We now included the results in figure 2 (see point above) and in the results section of the manuscript. Unfortunately, we couldn't find any other publicly available EEG dataset in which participants attend to either side of the screen without ongoing visual stimulation.
  
  In addition, we re-analyzed our main findings (i.e. the interaction between LATERALITY and DIRECTION) in all three datasets using a classic ANOVA to report the effect size as 𝜂2 (see point above). Unlike the Bayesian ANOVA (which -in JASP- is based on linear mixed models), the classic one does not model the slope of the random effects. Yet, we observed that the LATERALITY x DIRECTION interaction in the Foster dataset proved very significant, with a large effect size (F(1,16)=9.81, p=0.003, 𝜂2=0.13). Supposedly, modeling the slope of the random effects in the Bayesian ANOVA lowered its statistical sensitivity. For the sake of completeness, we reported both results in the manuscript.
  
  Concerning the potential implications of traveling waves for functional connectivity, we consider the interpretation based on the Predictive Coding scheme in the penultimate paragraph of the discussion (reported below for the reviewer’s convenience). In this framework, top-down connections have inhibitory functions, suppressing the predicted activity in lower regions. These interpretations align with our findings, relating the inhibitory role of backward travelling waves to visual attention. Similarly, in the same paragraph, we refer to the work of Spratling, which extensively investigates the relationship between selective attention and Predictive Coding.
  
  From the Results section:
  
  "To confirm our previous results, we replicated the same traveling waves analysis on two publicly available EEG datasets in which participants performed similar attentional tasks (experiment 1 of Foster et al., 2017 and experiment 1 of Feldmann-Wüstefeld and Vogel, 2019). In the first experiment from the Feldmann-Wüstefeld and Vogel dataset, participants were instructed to perform a visual working memory task in which, while keeping a central fixation, they had to memorize a set of items while ignoring a group of distracting stimuli. We focused our analysis on those trials in which the visual items to remember were placed either to the right or the left side of the screen, while the distractors were either in the upper or lower part of the screen (we pooled the trials with either 2 or 4 distractors, as this factor was irrelevant for the purposes of our analysis). The stimuli were shown for 200ms, and we computed the amount of forward and backward waves in the 500ms following stimulus onset. As shown in figure 2 (central column), the analysis confirmed our previous results, demonstrating a strong interaction between the factors DIRECTION and LATERALITY (BF10=667, error~2%; independently, the factors DIRECTION and LATERALITY had BF10=0.2 and BF10=0.4, respectively). These results confirmed that, in the presence of visual stimulation, spatial attention modulates both forward and backward waves. Next, we analyzed another publicly available dataset from Foster et al., 2017. [...]"
  
  "Remarkably, as shown in figure 2 (right panel), our analysis demonstrated an effect of the lateralization (LATERALITY: BF10=3.571, error~1%), revealing more waves contralateral to the attended location, but inconclusive results regarding the interaction between DIRECTION and LATERALITY (BF10=2.056, error~1%). However, using a classical ANOVA (i.e., without modeling the slope of the random terms), the interaction between DIRECTION and LATERALITY proved significant (F(1,16)=9.81, p=0.003, 𝜂2=0.13)."
  
  From the Methods section:
  
  "We included two additional datasets in this study. In both studies, participants performed a visual attention task while keeping their fixation in the center of the screen. In the Feldmann-Wüstefeld and Vogel, 2019 study, participants were asked to memorize the colors of two stimuli while ignoring a set of distractor stimuli. We analyzed only those trials in which the visual stimuli were presented to the left or right side of the screen, while the distractors were placed above or below the fixation cross. After 500ms of the fixation cross, two colored 'target' stimuli were presented for 200ms. Participants were asked to memorize these stimuli, and a new 'probe' stimulus was shown after an additional second. Participants reported whether the probe matched the target stimuli or not. We analyzed the traveling waves in the 500ms following the target stimulus onset. In the second dataset, from Foster et al., 2017, participants performed a spatial attention task. First, the fixation cross cued participants to covertly attend one of eight possible spatial positions uniformly distributed around the center of the screen. After one second, a digit was displayed either in the cued location or in any other one. The remaining locations were filled with letters. Participants were instructed to report the only displayed digit. We analyzed the waves in the second before stimulus onset when participants attended to the locations cued to the left or right side of the screen (we discarded trials in which participants attended locations above or below the fixation cross). For additional details about both experimental procedures, we refer the reader to Foster et al., 2017 and Feldmann-Wüstefeld and Vogel, 2019."
  
  From the discussion:
  
  "Our previous work proposed an alternative cause for the generation of cortical waves (Alamia and VanRullen, 2019). We demonstrated that a simple multi-level hierarchical model based on Predictive Coding (PC) principles and implementing biologically plausible constraints (temporal delays between brain areas and neural time constants) gives rise to oscillatory traveling waves propagating both forward and backward. This model is also consistent with the 2-dipoles hypothesis (Zhigalov and Jensen, 2022), considering the interaction between the parietal and occipital areas (i.e., a model of 2 hierarchical levels). However, dipoles in parietal regions are unlikely to explain the observed pattern of top-down waves, suggesting that more frontal areas may be involved in generating the feedback. This hypothesis is in line with the PC framework, in which top-down connections have an inhibitory function, suppressing the activity predicted by higher-level regions (Huang and Rao, 2011). Interestingly, Spratling proposed a simple reformulation of the terms in the PC equations that could describe it as a model of biased competition in visual attention, thus corroborating the interpretation of our finding within the PC framework (Spratling, 2008, 2012)."
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.18.504422v1
www.biorxiv.org www.biorxiv.org

New submission 13/09/2022, 10:29:44

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Point 1) There is abundant evidence that the cortical activity in the waking brain, even in head-restrained mice, is not uniform but represents a spectrum of states ranging from complete desynchronization to strong synchronization, reminiscent of the up and down states observed during sleep (Luczak et al., 2013; McGinley et al., 2015; Petersen et al., 2003). Moreover, awake synchronization can be local, affecting selective cortical areas but not others (Vyazovskiy et al., 2011). State fluctuations can be estimated using multiple criteria (e.g., pupil diameter). The authors consider reduced glutamatergic drive or long-range inhibition as potential sources of the voltage decrease but do not attempt to address this cortical state continuum, which is also likely to play a role. For example: does the voltage inactivation following ripples reflect a local downstate? The authors could start by detecting peaks and troughs in the voltage signal and investigate how ripple power is modulated around those events.
  
  Our study is correlational, and hence, we cannot speak to any causal role that the awake hippocampal ripples may play in the post-ripple hyperpolarization observed in aRSC. It is indeed possible that the post-awake-ripple neocortical hyperpolarization is independent of ripples and reflects other mechanisms that our experiments have possibly been blind to. One such mechanism is neocortical synchronization in the awake state. As reviewer 1 pointed out, it is possible that a proportion of hippocampal ripples occur before neocortical awake down-states. To test this hypothesis, we triggered the ripple power signal by the troughs (as proxies of awake down-states) and peaks (as proxies of awake up-states) of the voltage signals, captured from different neocortical regions, during periods of high ripple activity when the probability of neocortical synchronization is highest (McGinley et al., 2015; Nitzan et al., 2020). According to this analysis (see the figure below), the ripple power was, on average, higher before troughs of the aRSC voltage signal than before those of other regions. On the other hand, the ripple power, on average, was not higher after the peaks of the aRSC voltage signal than after those of other regions. This observation supports the hypothesis that a local awake down-state could occur in aRSC after the occurrence of a portion of hippocampal ripples. However, a recent work whose preprint version was cited in our submission (Chambers et al., 2022, 2021) reported that, out of 33 aRSC neurons whose membrane potentials were recorded, only 1 showed up-/down-state transitions (bimodal membrane potential distribution). Still, a portion (10 out of 30) of the remaining neurons showed an abrupt post-ripple hyperpolarization. In addition, they reported a modest post-ripple modulation of aRSC neurons’ membrane potential (~20% of the up/down-state transition range). Hence, these results suggest that the post-ripple aRSC hyperpolarization is not necessarily the result of down-states in aRSC. A paragraph discussing this point was added to the discussion (lines 262-279).
  
  Mean ripple power triggered by troughs and peaks of voltage signal captured from aRSC, V1, and FLS1. Zero time represents the timestamp of neocortical troughs/peaks. The shading represents SEM (n = 6 animals).
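
  The trough-triggered analysis described above can be sketched as an event-triggered average. In this hedged example, troughs are taken as strict local minima of the voltage signal and the ripple power is averaged in a fixed window around each trough. The detection rule, function names, and toy signals are assumptions for illustration, and the authors' actual detection criteria are not given in the text.

```python
def find_troughs(signal):
    """Indices of strict local minima (a simple proxy for down-states)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]

def triggered_average(target, events, half_window):
    """Average `target` over +/- half_window samples around each event.

    Events too close to the edges for a full window are skipped.
    """
    windows = [target[e - half_window:e + half_window + 1]
               for e in events
               if e - half_window >= 0 and e + half_window < len(target)]
    if not windows:
        return []
    return [sum(col) / len(windows) for col in zip(*windows)]

# Toy signals (illustrative only).
voltage = [0, -1, 0, 1, 0, -2, 0, 1, 0]
ripple_power = [1, 2, 3, 4, 5, 6, 7, 8, 9]

troughs = find_troughs(voltage)
avg_power = triggered_average(ripple_power, troughs, half_window=1)
```

  Peaks can be handled with the same helper by negating the voltage signal before calling `find_troughs`.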
  
  Point 2) Ripples are known to be heterogeneous in multiple parameters (e.g., power, duration, isolated events/ ripple bursts, etc.), and this heterogeneity was shown to have functional significance on multiple occasions (e.g. Fernandez-Ruiz et al., 2019 for long-duration ripples; Nitzan et al., 2022 for ripple magnitude; Ramirez-Villegas et al., 2015 for different ripple sharp-wave alignments). Is it possible that the small effect size shown here (e.g. 0.3 SD in Fig. 2a) arises because ripples with different properties and downstream effects are averaged together? The authors should attempt to investigate whether ripples of different properties differ in their effects on the cortical signals.
  
  The seemingly small effect size (e.g. 0.3 SD in Fig. 2a) arises because the individual peri-ripple voltage/glutamate traces were z-scored against a peri-non-ripple distribution and then averaged. Alternatively, the peri-ripple traces could have been averaged first, and the averaged trace could have been z-scored against a sampling distribution constructed from the abovementioned peri-non-ripple distribution where the sample size would have been the number of ripples detected for a specific animal. In the latter case, the standard deviation of the sampling distribution would have been used as the divisor in the z-scoring process as opposed to the former case where the standard deviation of the original peri-non-ripple distribution would have been used. Since the standard deviation of the sampling distribution is smaller than the standard deviation of the original distribution by a factor of √(sample size), the final z-scored values in the latter would be higher than those in the former case by a factor of √(sample size). For instance, if the sample size in Fig. 2A (number of ripples) was 100, the mean z-scored value would be 0.3*10 = 3. In any case, it is of interest to investigate the relationship between the ripple and neocortical activity features.
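
  The √(sample size) relationship between the two z-scoring orders can be checked numerically. The sketch below assumes a baseline (peri-non-ripple) distribution with known mean `mu` and standard deviation `sigma`. The values are hypothetical and chosen to reproduce the 0.3 versus 3 example from the text.

```python
import math

def mean_of_zscores(values, mu, sigma):
    """Z-score each value against the baseline, then average."""
    return sum((v - mu) / sigma for v in values) / len(values)

def zscore_of_mean(values, mu, sigma):
    """Average first, then z-score against the sampling distribution,
    whose standard deviation is sigma / sqrt(n)."""
    n = len(values)
    mean = sum(values) / n
    return (mean - mu) / (sigma / math.sqrt(n))

# 100 hypothetical ripples, each deviating by 0.3 baseline SDs.
values = [0.3] * 100
z_each_then_mean = mean_of_zscores(values, mu=0.0, sigma=1.0)
z_mean_then_score = zscore_of_mean(values, mu=0.0, sigma=1.0)
```

  The second value is larger than the first by a factor of √100 = 10, matching the argument in the response.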
  
  To investigate the relationship between the hippocampal ripple power and the peri-ripple neocortical voltage activity, we focused on the agranular retrosplenial cortex (aRSC) as it showed the highest level of modulation around ripples. To get an idea of what features of the aRSC voltage activity might be correlated with the ripple power, the ripples were divided into 8 subgroups using 8-quantiles of their power distribution, and the corresponding aRSC voltage traces were averaged for each subgroup (similar to the work of Nitzan et al. (Nitzan et al., 2022)). The results of this analysis are summarized in the figure below.
  
  Left: peri-ripple aRSC voltage trace was triggered on ripples in the odd-numbered ripple power subgroups for each animal and then averaged across 6 animals. The standard errors of the mean were not shown for the sake of simplicity. Right: the same as the left panel but for only lowest and highest power subgroups. The shading represents the standard error of the mean.
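  The 8-quantile grouping used for this figure can be sketched as follows. This is a rank-based illustration under the assumption that each ripple has one scalar feature (here, power) and one trace of fixed length; the function names are hypothetical.

```python
def quantile_groups(values, n_groups=8):
    """Label each value with its rank-based quantile group (0 = lowest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_groups // len(values)
    return labels


def group_average_traces(traces, labels, n_groups=8):
    """Average equal-length traces within each group, sample by sample."""
    averages = []
    for g in range(n_groups):
        members = [t for t, lab in zip(traces, labels) if lab == g]
        averages.append([sum(col) / len(members) for col in zip(*members)])
    return averages


# Hypothetical data: 16 ripples, each with a power value and a 2-sample trace.
powers = [5, 12, 0, 9, 3, 15, 7, 1, 10, 14, 2, 8, 4, 13, 6, 11]
traces = [[p, 2 * p] for p in powers]
labels = quantile_groups(powers)
octile_means = group_average_traces(traces, labels)
```

  With 16 ripples each octile holds two ripples, so `octile_means[0]` averages the two lowest-power traces and `octile_means[7]` the two highest.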
  
  These results suggested that there might be a positive correlation between the ripple power and the pre-ripple and post-ripple aRSC voltage amplitude. To test this possibility, Pearson’s correlation between the ripple power and pre-/post-ripple aRSC amplitude was calculated for each animal separately. The ripple power for each detected ripple was defined as the average of the ripple-band-filtered, squared, and smoothed hippocampal LFP trace from -50 ms to +50 ms relative to the ripple's largest trough timestamp (ripple center). The pre- and post-ripple aRSC amplitude for each ripple was calculated as the average of the aRSC voltage trace over the intervals [-200 ms, 0] and [0, 200 ms], respectively. The results are as follows.
  
  Top: the scatter plots of the ripple power and pre-ripple aRSC voltage amplitude for individual animals. The black lines in each graph represent the linear regression line. The blue circles in each graph are associated with one ripple. The Pearson’s correlation values (ρ) and the p-value of their corresponding statistical significance are represented on top of each graph. Bottom: the same as top graphs but for post-ripple aRSC amplitude.
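  The per-ripple features and the per-animal correlation described above could be computed along these lines. This is a sketch assuming each trace is a list of samples with a matching list of times in ms relative to the ripple center; `window_mean` and `pearson_r` are hypothetical helper names, and no p-value is computed here.

```python
import math


def window_mean(trace, times_ms, start_ms, stop_ms):
    """Mean of the trace samples whose time lies in [start_ms, stop_ms]."""
    vals = [v for v, t in zip(trace, times_ms) if start_ms <= t <= stop_ms]
    return sum(vals) / len(vals)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical aRSC trace for one ripple, sampled every 100 ms.
times_ms = [-200, -100, 0, 100, 200]
voltage = [1.0, 2.0, 3.0, 4.0, 5.0]
pre_amp = window_mean(voltage, times_ms, -200, 0)   # pre-ripple amplitude
post_amp = window_mean(voltage, times_ms, 0, 200)   # post-ripple amplitude
```

  Applying `window_mean` to every ripple of one animal yields one amplitude per ripple, and `pearson_r` then relates those amplitudes to the ripple powers.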
  
  According to this analysis, 4 out of 6 animals showed a weak positive correlation (ρ = 0.0806 ± 0.0115; mean ± std), 1 animal showed a negative correlation (ρ = -0.20183), and 1 animal did not show a statistically significant correlation (p-value > 0.05) between ripple power and pre-ripple aRSC voltage amplitude. Moreover, 2 out of 6 animals showed a negative correlation (ρ = -0.1 and -0.14), and 4 animals did not show a statistically significant correlation (p-value > 0.05) between ripple power and post-ripple aRSC voltage amplitude.
  
  To check that the correlation results were not influenced by the extreme values of the ripple power and aRSC voltage, we repeated the same correlation analysis after removing the ripples associated with the top and bottom 5% of the ripple power and aRSC voltage values. According to this analysis, 1 out of 6 animals showed a negative correlation (ρ = -0.13), and 5 animals did not show a statistically significant correlation (p-value > 0.05) between ripple power and pre-ripple aRSC voltage amplitude. Moreover, 2 out of 6 animals showed a negative correlation (same animals that showed negative correlation before removing the extreme values; ρ = -0.12 and -0.14), 1 animal showed a positive correlation (ρ = 0.1), and 3 animals did not show a statistically significant correlation (p-value > 0.05) between ripple power and post-ripple aRSC voltage amplitude.
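  One way to implement the trimming step is sketched below. It assumes that a ripple is dropped if either its power or its aRSC amplitude falls in the top or bottom 5% of the respective distribution; the exact rule the authors used is not stated here, and the function names are hypothetical.

```python
def trim_bounds(values, frac):
    """Return the lowest and highest values kept after dropping frac at each end."""
    ordered = sorted(values)
    k = int(len(ordered) * frac)  # number of values dropped at each end
    return ordered[k], ordered[len(ordered) - 1 - k]


def trim_extremes(x, y, frac=0.05):
    """Keep only the (x, y) pairs where both coordinates lie inside their bounds."""
    x_lo, x_hi = trim_bounds(x, frac)
    y_lo, y_hi = trim_bounds(y, frac)
    return [(a, b) for a, b in zip(x, y)
            if x_lo <= a <= x_hi and y_lo <= b <= y_hi]


# Hypothetical data: 100 ripples with power and amplitude values 0..99.
power = list(range(100))
amplitude = list(range(100))
kept = trim_extremes(power, amplitude)
```

  The correlation analysis is then repeated on the retained pairs only.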
  
  Based on these results, we cannot conclude that there is a meaningful correlation between the ripple power and amplitude of aRSC voltage activity before and after the ripples. It is worth noting that Nitzan et al. (see Fig S6 in (Nitzan et al., 2022)) did not report a statistically significant correlation between ripple power octile number (by discretizing a continuous-valued random variable into 8 subgroups) and pre-ripple firing rate of the mouse visual cortex. However, they reported a statistically significant negative correlation (ρ = -0.13) between the ripple power octile number and post-ripple firing rate of the mouse visual cortex. It appears that their reported negative correlation was influenced by the disproportionately larger values of the firing rate associated with the first ripple power octile compared to the other octiles. Therefore, repeating their analysis after removing the first octile would probably lead to a weak correlation value close to 0.
  
  Next, we investigated the relationship between ripple duration and aRSC voltage activity. To get an idea of what features of the aRSC voltage activity might be correlated with the ripple duration, the ripples were divided into 8 subgroups using 8-quantiles of their duration distribution, and the corresponding aRSC voltage traces were averaged for each subgroup. The results of this analysis are summarized in the figure below.
  
  Left: peri-ripple aRSC voltage trace was triggered on ripples in the odd-numbered ripple duration subgroups for each animal and then averaged across 6 animals. The standard errors of the mean were not shown for the sake of simplicity. Right: the same as the left panel but for only the lowest and highest duration subgroups. The shading represents the standard error of the mean.
  
  These results do not reveal a qualitative difference between the patterns of aRSC peri-ripple voltage modulation and ripple duration. However, the same correlation analysis performed for the ripple power was also conducted for the ripple duration. Only 1 animal out of 6 showed a statistically significant correlation (ρ = 0.08) between pre-ripple aRSC voltage amplitude and ripple duration.
  
  Moreover, only 1 animal out of 6 showed a statistically significant correlation (ρ = -0.08) between post-ripple aRSC voltage amplitude and ripple duration. In conclusion, there does not seem to be a meaningful linear relationship between peri-ripple aRSC voltage amplitude and ripple duration.
  
  Next, we investigated whether the peri-ripple aRSC voltage modulation differs depending on whether a single or a bundled ripple occurs in the dorsal hippocampus. The bundled ripples were detected following the method described in our previous work (Karimi Abadchi et al., 2020). We found that 9.4 ± 3.5 (mean ± std across 6 animals) percent of the ripples occurred in bundles. Then, the aRSC voltage trace was triggered by the centers of the single as well as centers of the first/second ripples in the bundled ripples, averaged for each animal, and averaged across 6 animals. The results of this analysis are represented in the following figure.
  
  Left: animal-wise average of mean peri-ripple aRSC voltage trace triggered by centers of the single and centers of the first ripple in the bundled ripples. Right: the same as the left panel but triggered by the centers of the second ripple in the bundled ripples.
  
  These results suggest that the amplitude of aRSC voltage activity is larger before bundled than single ripples, and the timing of aRSC voltage activity is shifted to the later times for bundled versus single ripples. The pre-ripple larger depolarization might signal the occurrence of a bundled ripple (similar to larger pre-bundled- than pre-single-ripple deactivation observed during sleep (Karimi Abadchi et al., 2020)).
  
  Point 3) The differences between the voltage and glutamate signals are puzzling, especially in light of the fact that in the sleep state they went hand in hand (Karimi Abadchi et al., 2020, Fig. 2). It is also somewhat puzzling that the aRSC is the first area to show voltage inactivation but the last area to display an increase in glutamate signal, despite its anatomical proximity to hippocampal output (two synapses away). The SVD analysis hints that the glutamate signal is potentially multiplexed (although this analysis also requires more attention, see below), but does not provide a physiologically meaningful explanation. The authors speculate that feed-forward inhibition via the gRSC could be involved, but I note that the aRSC is among the two major targets of the gRSC pyramidal cells (the other being homotypical projections) (Van Groen and Wyss, 2003), i.e., glutamatergic signals are also at play. To meaningfully interpret the results in this paper, it would be instrumental to solve this discrepancy, e.g., by adding experiments monitoring the activity of inhibitory cells.
  
  Observing that glutamate and voltage signals do not go hand-in-hand in awake versus sleep states was surprising for us as well, and it was the main reason that SVD analysis was performed. It was especially surprising given that a portion of aRSC excitatory neurons showed elevated calcium activity despite the reduction of voltage and delayed elevation of glutamate signals in aRSC at the population level. At the time of initial submission, pre-ripple reduction and post-ripple elevation of calcium activity in a portion of three subclasses of the superficial aRSC inhibitory neurons were reported (Chambers et al., 2022, 2021), and it was the basis of our speculation on the potential involvement of feed-forward inhibition in the post-ripple voltage reduction. We speculated that the source of this potential feed-forward inhibition could stem from gRSC excitatory neurons, as reviewer 1 pointed out, or from other neocortical or subcortical regions projecting to aRSC. It is also possible that feedback inhibition would be involved where the principal aRSC neurons that are excited by gRSC (as reviewer 1 pointed out) or any other region, including aRSC itself, excite aRSC inhibitory neurons.
  
  Point 4) I am puzzled by the ensemble-wise correlation analysis of the voltage imaging data: the authors point to a period of enhanced positive correlation between cortex and hippocampus 0-100 ms after the ripple center but here the correlation is across ripple events, not in time. This analysis hints that there is a positive relationship between CA1 MUA (an indicator for ripple power) and the respective cortical voltage (again an incentive to separate ripples by power), i.e. the stronger the ripple the less negative the cortical voltage is, but this conclusion is contradictory to the statements made by the authors about inhibition.
  
  A closer look at Figure 2B iv reveals that elevation of the cross-correlation function between peri-ripple aRSC voltage and hippocampal MUA starts with a short delay (~20 ms) and peaks around 75 ms after the ripple centers. It means the maximum correlation between the two signals occurs at point (75 ms, 75 ms) on the MUA time-voltage time plane whose origin (i.e. the point (0, 0)) is the ripple centers in the hippocampal MUA and corresponding imaging frame in the voltage signal. Reviewer 1’s interpretation would be correct if the maximum correlation occurred at the point (0, 0), not at the point (75 ms, 75 ms). This is because the MUA value at the ripple center (t = 0), not at t = 75 ms, is the indicator of ripple power. Figure 2B iii shows that the amplitude of hippocampal MUA is more than 2 dB less at t = 75 ms than at t = 0, which is a reflection of the fact that ripples are often short-duration events. Instead, if the maximum correlation occurred at the point (0, 100 ms) where the ripples had maximum power and aRSC voltage was at its trough (Figure 2B iii), it could have been concluded that “the stronger the ripple the less negative the cortical voltage”.
  
  Point 5) Following my previous point, it is difficult to interpret the ensemble-wise correlation analysis in the absence of rigorous significance testing. The increased correlation between the HPC and RSC following ripples is equal in magnitude to the correlation between pre-ripple HPC MUA and post-ripple cortical activity. How should those results be interpreted? The authors could, for example, use cluster-based analysis (Pernet et al., 2015) with temporal shuffling to obtain significant regions in those plots. In addition, the authors should mark the diagonal of those plots, or even better compute the asymmetry in correlation (see Steinmetz et al., 2019 Extended Fig. 8 as an example), to make it easier for the reader to discern lead/lag relationships.
  
  The purpose of calculating the ensemble-wise correlation coefficient was to provide further information about the relationship between the two random processes peri-ripple HPC MUA and peri-ripple neocortical activity. In general, the correlation between the two random processes cannot be inferred from the temporal relationship between their mean functions. In other words, there are infinitely many options for the shape of the correlation function between two random processes with given mean functions. Moreover, the point was to compare the correlation of peri-ripple neocortical activity and HPC MUA across neocortical regions. The fact that mean peri-ripple activity in, for example, RSC and FLS1 are different does not necessarily mean their correlation functions with peri-ripple HPC MUA are also different.
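  To make the distinction between ensemble-wise and temporal correlation concrete, here is a small sketch. It assumes the peri-ripple data are stored as one row per ripple and one column per time sample; the variable and function names are hypothetical. Entry `C[i][j]` is the correlation, across ripples, between HPC MUA at time sample `i` and neocortical activity at time sample `j`.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def ensemble_correlation(mua, cortex):
    """Correlate across events (rows) for every pair of time samples (columns)."""
    n_i, n_j = len(mua[0]), len(cortex[0])
    return [[pearson_r([ev[i] for ev in mua], [ev[j] for ev in cortex])
             for j in range(n_j)]
            for i in range(n_i)]


# Hypothetical data: 3 ripples, 2 time samples per signal.
mua = [[1.0, 4.0], [2.0, 1.0], [3.0, 3.0]]
cortex = [[-4.0, 2.0], [-1.0, 4.0], [-3.0, 6.0]]
C = ensemble_correlation(mua, cortex)
```

  Because each entry is computed across ripples at fixed times, the map depends on trial-to-trial co-variability rather than on the shapes of the two mean traces.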
  
  As requested, we performed cluster-based significance testing via temporal shuffling for each individual VSFP (n = 6), iGluSnFR Ras (n = 4), and iGluSnFR EMX (n = 4) animal. The following figures summarize the number of animals showing significant regions in their correlation functions between peri-ripple HPC MUA and different neocortical regions. The diagonal of the correlation functions is marked; however, the temporal lead/lag should not be inferred from these results, mainly because the temporal resolution of the two signals, one electrophysiological and one optical, is not the same.
  
  Point 6) For the single cell 2-photon responses presented in Fig. 3, how should the reader interpret a modulation that is at most 1/20 of a standard deviation? Was there any attempt to test for the significance of modulation (e.g., by comparing to shuffle)? If yes, what is the proportion of non-modulated units? In addition, it is not clear from the averages whether those cells represent bona fide distinct groups or whether, for instance, some cells can be upmodulated by some ripples but downmodulated by others. Again, separation of ripples based on objective criteria would be useful to answer this question.
  
  As explained in response to point 2, the seemingly small modulation size (e.g. 0.05 SD in Fig. 3b) arises because the individual peri-ripple calcium traces were z-scored against a peri-non-ripple distribution and then averaged. Alternatively, the peri-ripple traces could have been averaged first, and the averaged trace could have been z-scored against a sampling distribution constructed from the abovementioned peri-non-ripple distribution, where the sample size would have been the number of ripples detected for a specific animal. In this latter case, the standard deviation of the sampling distribution would have been used as the divisor in the z-scoring process, as opposed to the former case, where the standard deviation of the original peri-non-ripple distribution would have been used. Since the standard deviation of the sampling distribution is smaller than that of the original distribution by a factor of √(sample size), the final z-scored values in the latter case would be higher than those in the former case by a factor of √(sample size).
  
  As suggested by the reviewer and to make our results more comparable with those of electrophysiological studies, we deconvolved the calcium traces and tested for the significance of the modulation of each neuron by comparing its mean peri-ripple deconvolved trace with a neuron-specific shuffled distribution (see the methods section for details). We found that 8.46 ± 3% (mean ± std across 11 mice) of neurons were significantly modulated over the interval [0, 200 ms], of which 81.08 ± 8.91% (mean ± std across 11 mice) were up-modulated. If the criterion of being distinct is being significantly up- or down-modulated, these two groups could be considered distinct groups. The following figures show mean peri-ripple calcium and deconvolved traces, averaged across up- or down-modulated neurons for each mouse and then averaged across 11 mice.
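  A neuron-specific shuffle test of this general kind can be sketched as below. This is not the procedure from the methods section, which is not reproduced here. The sketch assumes the null distribution is built from randomly placed surrogate event times within the same neuron's trace, and all names and parameters are hypothetical.

```python
import random


def mean_event_response(signal, starts, win):
    """Mean of signal over [start, start + win) samples, averaged across events."""
    per_event = [sum(signal[s:s + win]) / win for s in starts]
    return sum(per_event) / len(per_event)


def shuffle_test(signal, starts, win, n_shuffles=1000, seed=0):
    """Compare the observed post-event mean with means at random surrogate times."""
    rng = random.Random(seed)
    observed = mean_event_response(signal, starts, win)
    last_start = len(signal) - win  # latest start that still fits a full window
    null = []
    for _ in range(n_shuffles):
        surrogate = [rng.randint(0, last_start) for _ in starts]
        null.append(mean_event_response(signal, surrogate, win))
    # One-sided p-values with the usual +1 correction for permutation tests.
    p_up = (1 + sum(v >= observed for v in null)) / (1 + n_shuffles)
    p_down = (1 + sum(v <= observed for v in null)) / (1 + n_shuffles)
    return observed, p_up, p_down


# Hypothetical deconvolved trace: activity only in the 5 samples after each ripple.
ripple_starts = [100, 300, 500, 700]
trace = [0.0] * 1000
for s in ripple_starts:
    for k in range(5):
        trace[s + k] = 1.0
observed, p_up, p_down = shuffle_test(trace, ripple_starts, win=5, n_shuffles=200)
```

  A neuron would be labeled up-modulated when `p_up` falls below the chosen threshold and down-modulated when `p_down` does.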
  
  Point 7) Fig. 3: The decomposition-based analysis of glutamate imaging using SVD needs to be improved. First, it is not clear how much of the variance is captured by each component, and it seems like no attempt has been made to determine the number of significant components or to use a cross-validated approach. Second, the authors imply that reconstructing the glutamate imaging data using the 2nd-100th components 'matches' the voltage signal but this statement holds true only in the case of the aRSC and not for other regions, without providing an explanation, raising questions as to whether this similarity is genuine or merely incidental.
  
  The first 100 components explained about 99.9% of the variance in the concatenated stack of peri-ripple neocortical glutamate activity for each animal, which is practically equivalent to the entire variance in the data. Our goal was not to obtain a low-rank approximation of the data for which the number of significant components had to be determined. Instead, we decomposed the data into the activity along the first principal component, for which there was no noticeable topography among neocortical regions, and the activity along the rest of the components, for which there was a noticeable topography among neocortical regions. The first component explained 83.11 ± 6.75% (mean ± std across 4 iGluSnFR Ras mice) and 83.3 ± 5.07% (mean ± std across 4 iGluSnFR EMX mice) of variance in the concatenated stack of peri-ripple neocortical glutamate activity.
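  For reference, the variance fractions quoted above follow from the singular values in the standard way. The notation below (data matrix $X$, singular values $\sigma_k$) is introduced here for illustration and assumes the stack was mean-centered before decomposition, which is not stated in the response.

```latex
% X: concatenated peri-ripple stack (pixels x time), assumed mean-centered
X = U \Sigma V^{\top} = \sum_{k} \sigma_k \, u_k v_k^{\top},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge 0

% Fraction of variance captured by component k
\mathrm{EV}_k = \frac{\sigma_k^{2}}{\sum_{j} \sigma_j^{2}}

% Split used in the response: first component vs. components 2 to 100
X \approx \underbrace{\sigma_1 u_1 v_1^{\top}}_{\mathrm{EV}_1 \,\approx\, 0.83}
        + \underbrace{\sum_{k=2}^{100} \sigma_k \, u_k v_k^{\top}}_{\text{remaining} \,\approx\, 0.17},
\qquad \sum_{k=1}^{100} \mathrm{EV}_k \approx 0.999
```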
  
  As we discussed in the discussion section of the manuscript, SVD is agnostic about brain mechanisms and only cares about capturing maximum variance. Specifically, it is not designed to capture the maximum similarity between glutamate and voltage activity in the brain. Therefore, the only thing we can say with certainty is the following: when the activity along the axis with maximum co-variability (1st principal component) across the neocortical regions’ glutamate activity is removed, only aRSC, and no other region, shows a post-ripple down-modulation, whose timing matches that of aRSC post-ripple voltage down-modulation. Moreover, the timing of activity of the 1st principal component matches better with that of calcium activity among the up-modulated portion of aRSC neurons. Even though the genuineness of these results is not guaranteed, the similarity between the timing of SVD output in aRSC glutamatergic activity with that in two independently collected signals in aRSC, i.e. voltage and calcium, could support the idea that peri-ripple aRSC glutamatergic activity is likely a mixture of up- and down-modulated components.
  
  Point 8) The estimation of deep pyramidal cells' glutamate activity by subtracting the Ras group (Fig. 4d) is not very convincing. First, the efficiency of transgene expression can vary substantially across different mouse lines. Second, it is not clear to what extent the wide field signal reflects deep cells' somatic vs. dendritic activity due to non-linear scattering (Ma et al., 2016), and it is questionable whether a simple linear subtraction is appropriate. The quality of the manuscript would improve substantially if the authors probe this question directly, either by using deep layer specific line/ 2-P imaging of deep cells or employing available public datasets.
  
  Simulation studies have suggested that the signal, captured by wide-field imaging of voltage-sensitive dye, can be modeled as a weighted sum of voltage activity across neocortical layers (Chemla and Chavane, 2010; Newton et al., 2021). Hence, modeling the glutamate signal as a weighted sum of the glutamate activity across neocortical layers is a good starting point. Future studies would be needed to improve this starting point by imaging glutamate activity in a cohort of mice with iGluSnFR expression in only deep layers’ neurons. Moreover, Ma et al. (Ma et al. 2016) stated that “This means that signal detected at the cortical surface (in the form of a two-dimensional image) represents a superficially weighted sum of signals from shallow and deeper layers of the cortex”.
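  The weighted-sum model mentioned above can be written out as follows. The symbols are introduced here for illustration only. The sketch assumes, following the reviewers' description of the analysis, that the Ras signal mainly reflects the superficial contribution, so that a scaled Ras trace is subtracted from the EMX trace to estimate the deep-layer contribution.

```latex
% g_s(t), g_d(t): glutamate activity of superficial and deep neurons (hypothetical symbols)
% w_s, w_d, c:    unknown positive weights set by expression and optical scattering
G_{\mathrm{EMX}}(t) \approx w_s \, g_s(t) + w_d \, g_d(t),
\qquad
G_{\mathrm{Ras}}(t) \approx c \, g_s(t)

% Deep-layer estimate by scaled subtraction, with scale factor \alpha = w_s / c
w_d \, g_d(t) \approx G_{\mathrm{EMX}}(t) - \alpha \, G_{\mathrm{Ras}}(t)
```

  Under this model the estimate is only as good as the linearity assumption and the choice of $\alpha$, which is the concern raised in the reviewer's point.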
  
  Reviewer #2 (Public Review):
  
  Point 1) The authors throughout the manuscript compare the correlation between hippocampal MUA and the imaged cortical ensemble activity (Example: Lines 120-122). There is a potential time lag in signal detection with regard to the two detection methods. While the time lag using electrophysiological recording is at the scale of milliseconds, the glutamate-sensitive imaging might take several 100s of ms to be detected. It is not clear in the manuscript how the authors considered this problem during the analysis.
  
  The ensemble-wise correlation analysis characterizes the relationship between two random processes, peri-ripple HPC MUA and peri-ripple neocortical activity (please see the response to reviewer 1’s major point 5). Although it is a valid point that the temporal resolution of the two signals is not the same which could introduce an error in the exact timing of the relationship between the two processes, we did not draw any conclusion based on the exact timing of the elevated correlation between the two processes. Moreover, we smoothed (equivalent to low-pass filtering) and down-sampled the MUA signal (please see the methods section) to bring the temporal scale of the two processes closer to each other. We also want to clarify that the temporal resolution of voltage and glutamate imaging is in the range of 10s of ms (Xie et al., 2016).
  
  Point 2) In the results section "The peri-ripple glutamatergic activity is layer dependent", are the Ras and EMX expressed in two different experimental animal groups? If yes, and there was a time lag between the two groups, is it valid to estimate the deeper layer activity using a scaled version of the Ras from the EMX signal?
  
  This comment is addressed in response to reviewer 1’s major point 8.
  
  Point 3) The authors did not discuss the results adequately in the discussion section. Since there is no behavioral paradigm and no behavioral read-out to induce or correlate it with possible planning and future decision-making process, the significance of the paper will be enhanced by discussing the possible underlying circuitry mechanism that might cause the reported observations. With no planning periods in the task (instead just sitting on a platform), it is actually quite unclear what the purpose of wake ripples should be. For example, the authors discuss the superficial and deep layer responses and their relation to the memory index theory. However, the RSC possesses different groups of excitable neurons in different layers. Specifically, three excitable neurons are found within the different layers of the RSC; the intrinsically bursting neurons (IB), regular spiking (RS), and low-rheobase (LR) neurons. These neurons are distributed heterogeneously within the RSC cortical layer. Although the RS are abundant in the deeper layers of the RSC, they occupy 40% of the total amount of excitable neurons found in layers II/III. On the other hand, the LR is the dominant excitable neuron in the superficial layers. It will add to the significance of the work if the authors discussed the results in the context of the cellular structure of the RSC and how would that impact the observed inhibition in the peri-ripple time window. It would be helpful for the readers and the reviewers to add a schematic diagram to the discussion section.
  
  The goal of our study was to characterize the patterns of neocortical activity around hippocampal ripples in the awake state and not shed light on the function (purpose) of awake ripples. However, we speculated about what our results could mean in the discussion section. To address the reviewer’s comment on the differences across RSC layers, the following paragraph was added to the discussion section lines 342-353.
  
  “Our results suggest that dendrites of deep pyramidal neurons, arborized in the superficial layers of the neocortex, receive glutamatergic modulation earlier than those of the superficial ones. However, the results do not provide a mechanistic explanation of the phenomenon. It is possible that the observed layer-dependency of the glutamatergic modulation would partially result from the heterogeneity of the excitatory as well as inhibitory neurons across aRSC layers. But, the question is how this heterogeneity may lead to the above-mentioned layer-dependency to which our data does not provide an answer. It could be speculated that the difference in the dendritic morphology and firing type of different types of RSC excitatory neurons (Yousuf et al., 2020) or the difference in connectivity of different RSC layers with other brain regions would play a role (Sugar et al., 2011; van Groen and Wyss, 1992; Whitesell et al., 2021). This is a complicated problem and could only be resolved by conducting experiments specifically designed to address this problem.”
  
  Point 4. A general issue (in addition to the missing behaviour), is the mix of the methods. On one side this makes the article very interesting since it highlights that with different methods you actually observe different things. But on the other side, it makes it very difficult to follow the results. It would be a major improvement of the article if the authors could include (as mentioned above) a schematic of the results and their theory, especially highlighting how the different methods would capture different parts of the mechanism. Finally, the authors should not use calcium signals as a direct measure of neuronal firing. Calcium influx is only seen in bursts of firing, not with individual spikes. It is a plasticity signal and therefore should be treated and discussed as such. Just recently it was shown by Adamantidis lab that the calcium signal changes between wake and sleep and this change does not parallel changes in neuronal firing/spikes.
  
  We agree with the reviewer that the calcium signal is biased toward bursts of spikes (Huang et al., 2021). To address this concern, the term “spiking activity” was replaced with “calcium activity” throughout the manuscript. Moreover, the calcium signal was deconvolved to get a better estimate of the spiking activity (please refer to our response to reviewer 1’s point 6).
  
  Point 5. In the discussion section, the authors focus their discussion on the connectivity between the CA1 area and the RSC. Although it is an important point, since the authors are examining the peri-ripple cortical dynamics, it is critical to discuss other possible connectivity effects. Furthermore, the hippocampal input preferentially targets the granular RSC, how would that impact the results and the interpretation of the authors? Additionally, a previous study reported the suppression of the thalamic activity during hippocampal ripples (Yang et al., 2019). Importantly, the thalamic inputs to the RSC target the superficial layers. It will add to the value of the paper if the authors expanded the discussion section and elaborated further on the possible interpretation of the results.
  
  At the time of our initial submission, pre-ripple reduction and post-ripple elevation of calcium activity in a portion of three subclasses of the superficial aRSC inhibitory neurons were reported (Chambers et al., 2022, 2021), and it was the basis of our speculation on the potential involvement of feed-forward inhibition in the post-ripple voltage reduction. We speculated that the source of this potential feed-forward inhibition could stem from gRSC excitatory neurons or other neocortical or subcortical regions projecting to aRSC (please see the discussion section). However, the source being from the thalamus is less likely because multiple studies have observed the suppression of the majority of thalamic neurons during awake ripples (Logothetis et al., 2012; Nitzan et al., 2022; Yang et al., 2019). Moreover, peri-awake-ripple suppression of thalamic axons projecting to the first layer of aRSC is reported (Chambers et al., 2022). On the other hand, it is also possible that feedback inhibition would be involved where the excitatory aRSC neurons that are excited by gRSC (as reviewer 1 pointed out) or any other region, including aRSC itself, excite aRSC inhibitory neurons which in turn inhibit pyramidal cells. To address this comment, the following paragraph was added to the discussion section in lines 323-328.
  
  “Thalamus is another source of axonal projections to aRSC (Van Groen and Wyss, 1992). However, it is less likely that thalamic projections contribute to the peri-awake-ripple aRSC activity modulation because multiple studies have observed the suppression of the majority of thalamic neurons during awake ripples (Logothetis et al., 2012; Nitzan et al., 2022; Yang et al., 2019). Moreover, peri-awake-ripple suppression of thalamic axons projecting to the first layer of aRSC is reported (Chambers et al., 2022).”
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.02.482726v1
www.biorxiv.org www.biorxiv.org

Foveal vision predictively sensitizes to defining features of eye movement targets

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author response
  
  Reviewer #1 (Public Review):
  
  In their paper, Kroell and Rolfs use a set of sophisticated psychophysical experiments in visually-intact observers, to show that visual processing at the fovea within the 250ms or so before saccading to a peripheral target containing orientation information, is influenced by orientation signals at the target. Their approach straddles the boundary between enforcing fixation throughout stimulus presentation (a standard in the field) and leaving it totally unconstrained. As such, they move the field of saccade pre-processing towards active vision in order to answer key questions about whether the fovea predicts features at the gaze target, over what time frame, with what precision, and over what spatial extent around the foveal center. The results support the notion that there is feature-selective enhancement centered on the center of gaze, rather than on the predictively remapped location of the target. The results further show that this enhancement extends about 3 deg radially from the foveal center and that it starts ~ 200ms or so before saccade onset. They also show that this enhancement is reinforced if the target remains present throughout the saccade. The hypothesized implications of these findings are that they could enable continuity of perception trans-saccadically and potentially, improve post-saccadic gaze correction.
  
  Strengths:
  
  The findings appear solid and backed up by converging evidence from several experimental manipulations. These included several approaches to overcome current methodological constraints to the critical examination of foveal processing while being careful not to interfere with saccade planning and performance. The authors examined the spatial frequency characteristics of the foveal enhancement, as well as hit rates and false alarm rates for detecting a foveal probe that was congruent or incongruent in terms of orientation to the peripheral saccade target embedded in flickering, dynamic noise (1/f) images. While hit rates are relatively easy to interpret, the authors also reconstructed key features of the background noise to interpret false alarms as reflecting foveal enhancement that could be correlated with target orientation signals. The study also - in an extensive Supplementary Materials section - uses appropriate statistical analyses and controls for multiple factors impacting experimental/stimulus design and analysis. The approach, as well as the level of care towards experimental details provided in this manuscript, should prove welcome and useful for any other investigators interested in the questions posed.
  
  Weaknesses:
  
  I find no major weaknesses in the experiments, analyses or interpretations. The conclusions of the paper appear well supported by the data. My main suggestion would be to see a clearer discussion of the implications of the present findings for truly naturalistic, visually-guided performance and action. Please consider the implication of the phenomena and behaviors reported here when what is located at the gaze center (while peripheral targets are present), is not a noisy, relatively feature-poor, low-saliency background, but another high-saliency target, likely crowded by other nearby targets. As such, a key question that emerges and should be addressed in the Discussion at least is whether the fovea's role described in the present experiments is restricted to visual scenarios used here, or whether they generalize to the rather different visual environments of everyday life.
  
  This is a very interesting question. While we cannot provide a definite answer, we have added a paragraph discussing the role of foveal prediction in more naturalistic visual contexts to the Discussion section (‘Does foveal prediction transfer to other visual features and complex natural environments?’). We pasted this paragraph in response to another comment in the ‘Recommendations for the authors’ section below. We suggest that “the pre-saccadic decrease in foveal sensitivity demonstrated previously[9] as well as in our own data (Figure 2B) may boost the relative strength of fed-back signals by reducing the conspicuity of foveal feedforward input”, presumably allowing the foveal prediction mechanism to generalize to more naturalistic environments with salient foveal stimulation.
  
  Reviewer #2 (Public Review):
  
  Humans and other primates move their eyes with rapid saccades to reposition the high-resolution region of the retina, the fovea, over objects of interest. Thus, each saccade involves moving the fovea from a pre-saccadic location to a saccade target. Although it has long been known that saccades profoundly alter visual processing at the time of a saccade, scientists simply do not know how the brain combines information across saccades to support our normal perceptual experience. This paper addresses a piece of that puzzle by examining how eye movements affect processing at the fovea before it moves. Using a dynamic noise background and a dual psychophysical task, the authors probe both the performance and selectivity of visual processing for orientation at the fovea in the few hundred milliseconds preceding a saccade. They find that hit rates and false alarm rates are dynamically and automatically modulated by saccade planning. By taking advantage of the specific sequence of noise shown on each trial, they demonstrate that the tuning of foveal processing is affected by the orientation of the saccade target, suggesting fovea-specific feedback.
  
  A major strength of the paper is the experimental design. The use of dynamic filtered noise to probe perceptual processing is a clever way of measuring the dynamics of selectivity at the fovea during saccade preparation. The use of a dual-task allows the authors to evaluate the tuning of foveal processing as well and how it depends on the peripheral target orientation. They show compellingly that the orientation of the saccade target (the future location of the fovea) affects processing at the fovea before it moves.
  
  There are two weaknesses with the paper in its current form. The first is that the key claim of foveal "enhancement" relies on the tuning of the false alarms. A more standard measure of enhancement would be to look at the sensitivity, or d-prime, of the performance on the task. In this study, hits and false alarms increase together, which is traditionally interpreted as a criterion shift and not an enhancement. However, because of the external noise, false alarms are driven by real signals. The authors are aware of this and argue that the fact that the false alarms are tuned indicates enhancement. But it is unclear to me that a criterion shift wouldn't also explain this tuning and the change in the noise images. For example, in a task with 4 alternative choices (Present/Congruent, Present/Incongruent, Absent/Congruent, Absent/Incongruent), shifting the criterion towards the congruent target would increase hits and false alarms for that target and still result in a tuned template (because that template is presumably what drove the decision variable that the adjusted criterion operates on). I believe this weakness could be addressed with a computational model that shows that a criterion shift on the output of a tuned template cannot produce the pattern of hits and false alarms.
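  The distinction the reviewer draws between a sensitivity change and a criterion shift can be illustrated with the textbook equal-variance Gaussian signal detection model. The sketch below is not taken from the manuscript: the function names and the example values are hypothetical, and the model ignores the external-noise complication that the reviewer raises. It only shows that lowering the criterion at a fixed d-prime raises hits and false alarms together.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def rates_from(d_prime, criterion):
    """Predicted (hit rate, false alarm rate) for a given d' and criterion c.

    Equal-variance model: noise ~ N(0, 1), signal ~ N(d', 1), and the
    observer responds "present" when the decision variable exceeds
    c + d'/2.
    """
    hit_rate = _STD_NORMAL.cdf(d_prime / 2 - criterion)
    fa_rate = _STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit_rate, fa_rate


def sdt_measures(hit_rate, fa_rate):
    """Recover (d', c) from observed hit and false alarm rates."""
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(fa_rate)
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)
    return d_prime, criterion


# Hypothetical example: same sensitivity, more liberal criterion.
conservative = rates_from(d_prime=1.5, criterion=0.3)
liberal = rates_from(d_prime=1.5, criterion=-0.1)
print("conservative (HR, FAR):", conservative)
print("liberal      (HR, FAR):", liberal)
print("recovered (d', c):", sdt_measures(*liberal))
```

  Under this simple model, a joint rise in hits and false alarms leaves d-prime unchanged, which is why the reviewer asks for evidence beyond the hit and false alarm pattern itself.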
  
  We thank the reviewer for this comment. We will present three arguments, each of which suggests that our effects are perceptual in nature and cannot be explained by a shift in decision criterion: (1) the temporal specificity of the difference in Hit Rates (HRs), (2) the spatial specificity of the difference in HRs and (3) the phenomenological quality of the foveally predicted signal. In general, a criterion shift would indeed affect hits and false alarms alike. Nonetheless, the difference in HRs only manifested under specific and meaningful conditions:
  
  First, the increase in congruent as compared to incongruent HRs, i.e., enhancement, was temporally specific: congruent and incongruent HRs were virtually identical when the probe appeared in a baseline time bin or one (Figure 2B) or even two (Figure 4A) early pre-saccadic time bins. Based on another reviewer’s comment, we collected additional data to measure the time course and extent of foveal enhancement during fixation. While pre-saccadic enhancement developed rapidly, enhancement started to emerge 200 ms after target onset during fixation. Crucially, these time courses mirror the typical temporal development of visual sensitivity during pre-saccadic attention shifts and covert attentional allocation, respectively[8,33]. We are unaware of data demonstrating similar temporal specificity for a shift in decision criterion. One could argue that a template of the target orientation needs to build up before it can influence criterion. Nonetheless, this template would be expected to remain effective after this initial temporal threshold has been crossed. In contrast, we observe pronounced enhancement in medium but not late stages of saccade preparation in the PRE-only condition (Figure 4A).
  
  Second, it has been argued that a defining difference between innately perceptual effects and post-perceptual criterion shifts is their spatial specificity[53]: in opposition to perceptual effects, criterion shifts should manifest in a spatially global fashion. Due to a parafoveal control condition detailed in our reply to the next comment, we maintain the claim that enhancement is spatially specific: congruent HRs exceeded incongruent ones within a confined spatial region around the center of gaze. We did not observe enhancement for probes presented at 3 dva eccentricity even when we raised parafoveal performance to a foveal level by adaptively increasing probe contrast. The accuracy of saccade landing or, more specifically, the mean remapped target location (Figure 3B) influenced the spatial extent of the enhanced region in a fashion that is reconcilable with previous findings[30]. A criterion shift that is both spatially and temporally selective, follows the time course of pre-saccadic or covert attention depending on observers’ oculomotor behavior, does not remain effective throughout the entire trial after its onset, is sensitive to the mean remapped target location across trials, and does not apply to parafoveal probes even after their contrast has been increased to match foveal performance, would be unprecedented in the literature and, even if existent, appear just as functionally meaningful as sensitivity changes occurring under the same conditions.
  
  Lastly and on a more informal note, we would like to describe a phenomenological percept that was spontaneously reported by 6 out of 7 observers in Experiment 1 and experienced by the author L.M.K. many times. On a small subset of trials, participants in our paradigms have the strong phenomenological impression of perceiving the target in the pre-saccadic center of gaze. This percept is rare but so pronounced that some observers interrupt the experiment to ask which probe orientation they should report if they had perceived two on the same trial (“The orientation of the normal probe or of the one that looked exactly like the target”). Interestingly, the actual saccade target and its foveal equivalent are perceived simultaneously in two spatiotopically separate locations, suggesting that this percept cannot be ascribed to a temporal misjudgment of saccade execution (after which the target would have actually been foveated). We have no data to prove this observation but nonetheless wanted to share it. Experiencing it ourselves has left us with no doubt that the fed-back signal is truly – and almost eerily – perceptual in nature.
  
  The analysis suggested by the reviewer is very interesting. Yet for several reasons stated in the ‘Suggestions to the authors’ section, our dataset is not cut out for an analysis of noise properties at this level of complexity. We had always planned to resolve these concerns experimentally, i.e., by demonstrating specificity in HRs. We believe that our arguments above provide a strong case for a perceptual phenomenon and have incorporated them into the Discussion of our revised manuscript.
  
  The second weakness is that the authors' claim that feedback is spatially selective to the fovea is confounded by the fact that acuity and contrast sensitivity are higher in the fovea. Therefore, the subject's performance would already be spatially tuned. Even the very central degree, the foveola, is inhomogeneous. Thus, finding spatially-tuned sensitivity to the probes may simply indicate global feature gain on top of already spatially tuned processing in the fovea. Another possible explanation that is consistent with the "no enhancement" interpretation is that the fovea has increased. This is consistent with the observation that the congruency effects were aligned to the center of gaze and not the saccade endpoint. It looks from the Gaussian fits that a single gain parameter would explain the difference in the shape of the congruent and incongruent hit rates, but I could not figure out if this was explicitly tested from the existing methods. Additional experiments without prepared saccades would be an easy way to address this issue. Is the hit rate tuned when there is no saccade preparation? If so, it seems likely that the spatial selectivity is not tuned feedback, but inhomogeneous feedforward processing.
  
  We fully agree. We do not consider a fixation condition diagnostic to resolve this question since, as of now, correlates of foveal feedback have exclusively been observed during fixation. In those studies, it was suggested that the effect, i.e., a foveal representation of peripheral stimuli, reflects the automatic preparation of an eye movement that was simply not executed[11,12,14]. To address another reviewer’s comment, we collected additional data in a fixation experiment. The probe stimulus could exclusively appear in the screen center (as in Experiment 1) and observers maintained fixation throughout the trial. While pre-saccadic congruency effects were significantly more pronounced and developed faster, congruency effects did emerge during fixation when the probe appeared 200 ms after the target. If pre-saccadic processes indeed spill over to fixation tasks to some extent and trigger relevant neural mechanisms even when no saccade is executed, we could expect a similar feedback-induced spatial profile during fixation. Since this matches the reviewer’s prediction if the pre-saccadic profiles resulted from inhomogeneous feedforward processing, we do not consider a fixation condition suitable to distinguish between both hypotheses.
  
  To test whether the tuning of enhancement is effectively a consequence of declining visual performance in the parafovea/periphery, we instead raised parafoveal performance to a foveal level by adaptively increasing the opacity of the probe: while leaving all remaining experimental parameters unchanged, we presented the probe in one of two parafoveal locations, i.e., 3 dva to the left or right of the screen center. Observers were explicitly informed about the placement of the probe. We administered a staircase procedure to determine the probe opacity at which performance for parafoveal target-incongruent probes would be just as high as foveal performance had been in the preceding sessions. While the foveal probe was presented at a median opacity of 28.3±7.6%, a parafoveal opacity of 39.0±11.1% was required to achieve the same performance level. As a result, the gray dot at 0 dva in the figure below represents the incongruent HR in the center of gaze and ranges at 80% on the y-axis. The gray dots at ±3 dva represent incongruent parafoveal HRs and also range at ~80% on the y-axis. Using the reviewer’s terminology, we effectively removed the influence of acuity- (or contrast-sensitivity-) dependent spatial tuning. If the spatial profiles had indeed been the result of “global feature gain on top of already spatially tuned processing”, this manipulation should render parafoveal feature gain just as detectable as foveal feature gain. Instead, congruent and incongruent parafoveal HRs were statistically indistinguishable (away from the saccade target: p = .127, BF10 = 0.531; towards the saccade target: p = .336, BF10 = 0.352), inconsistent with the idea of a spatially global feature gain.
  
  We had included these data in our initial submission. They were collected in the same observers that contributed the spatial profiles (Experiment 2). The data points at 0 dva in the reduced figure above correspond to the foveal probe location in Figure 2D. The data points at ±3 dva had been plotted and discussed in our initial submission, yet only very briefly. Based on this and another reviewer’s comment, we realize that we should have explained this condition more extensively in the main text rather than in the Methods and have added a dedicated paragraph to the Results section.
  
  This paper is important because it compellingly demonstrates that visual processing in the fovea anticipates what is coming once the eyes move. The exact form of the modulation remains unclear and the authors could do more to support their interpretations. However, understanding this type of active and predictive processing is a part of the puzzle of how sensory systems work in concert with motor behavior to serve the goals of the organism.
  
  Reviewer #3 (Public Review):
  
  This manuscript examines an important and at the same time little-investigated question in vision science: what happens to the processing of the foveal input right before the onset of a saccade. This is clearly something of relevance as humans perform saccades about 3 times every second. Whereas what happens to visual perception in the visual periphery at the saccade goal is well characterized, little is known about what happens at the very center of gaze, which represents the future retinal location where the saccade target will be viewed at high resolution upon landing. To address this problem the authors implemented an elegant experiment in which they probed foveal vision at different times before the onset of the saccade by using a target, with the same or different orientation with respect to the stimulus at the saccade goal, embedded in dynamic noise. The authors show that foveal processing of the saccade target is initiated before saccade execution, resulting in the visual system being more sensitive to foveal stimuli whose features match those of the stimulus at the saccade goal. According to the authors, this process enables a smooth transition of visual perception before and after the saccade. The experiment is well designed and the results are solid. Overall, I think this work represents a valuable contribution to the field and its results have important implications. My comments below:
  
  The change in the overall performance between the baseline condition and when the probe is presented after the saccade target is large, but I wonder if there are other unrelated factors that contribute to this difference, for example, simply presenting the probe after vs before the onset of a peripheral stimulus, or the fact that in the baseline the probe is presented right after a fixation marker, but in the other condition there was a longer time interval between the presentation of the marker and the probe transient. The authors should discuss how these confounding factors have been accounted for.
  
  We thank the reviewer for this helpful comment. We would like to clarify that the probe was never presented right after the fixation dot. In the baseline condition, fixation dot and target were separated by 50 ms, i.e., the duration of one noise image. Since the fixation dot was an order of magnitude smaller than the probe (0.3 vs 3 dva in diameter) and since two large-field visual transients caused by the onset of a new background noise image occurred between fixation dot disappearance and probe appearance, we consider it unlikely that the performance difference was caused by any kind of stimulus interaction such as masking. Nonetheless, we had been puzzled by this difference already when inspecting preliminary results and wondered if it may reflect observers’ temporal expectations about the trial sequence. We therefore explicitly instructed and repeatedly reminded observers that the probe could appear before the peripheral target. Since the difference persisted, we ascribed it to a predictive remapping of attention to the fovea during saccade preparation, as we had stated in the Discussion.
  
  Another contributing factor may be that observers approached the oculomotor and perceptual detection tasks sequentially. In early trial phases, they may have prioritized localizing the target and programming the eye movement. After motor planning had been initiated, resources may have been freed up for the foveal detection task. Since on the majority of probe-present trials, the probe appeared after the saccade target, this strategy would have been mostly adaptive. Crucially, however, observers yielded similar incongruent Hit Rates in the baseline and last pre-saccadic time bin (70% vs 74%). While we observed pronounced enhancement in the last pre-saccadic bin, congruent and incongruent Hit Rates in the baseline bin were virtually identical. We therefore conclude that lower overall performance in the baseline bin did not prevent congruency effects from occurring. Instead, congruency effects started developing only after target appearance. We have added this potential explanation to the Results.
  
  Somewhat related to point 3, the authors conclude that the effects reported here are the result of saccade preparation/execution; however, a control condition in which the saccade is not performed is missing. This leaves me wondering whether the effect is only present during saccade preparation or if it may also be present to some extent or to its full extent when covert attention is engaged, i.e., when subjects perform the same task without making a saccade.
  
  Foveal feedback has, as of now, exclusively been demonstrated during fixation (see references in Introduction and Discussion). In most of these studies, it was suggested that these effects (i.e., the foveal representation of a peripheral stimulus) may reflect the automatic preparation of an eye movement that was simply not executed[11,12,14]. Since foveal feedback has been demonstrated during fixation, and since eye movement preparation may influence foveal processing even when the eyes remain stationary, we considered it likely that congruency effects would emerge during fixation. Nonetheless, we agree with the reviewer that an explicit comparison between saccade preparation and fixation would enrich our data set and allow for stronger conclusions. We therefore collected additional data from seven observers. While all remaining experimental parameters were identical to Experiment 1, observers maintained fixation throughout each trial. We found that pre-saccadic foveal enhancement was more pronounced and emerged earlier than foveal enhancement during fixation. We present these data in the Results section (Figure 5) and have updated the Methods section to incorporate this additional experiment. We have furthermore added a paragraph to the Discussion which addresses potential mechanisms of foveal enhancement during fixation and saccade preparation.
  
  Furthermore, the reviewer’s comment helped us realize that we never stated a crucial part of our motivation explicitly. We now do so in the Introduction:
  
  “Despite the theoretical usefulness of such a mechanism, there are reasons to assume that foveal feedback may break down while an eye movement is prepared to a different visual field location. First and foremost, saccade preparation is accompanied with an obligatory shift of attention to the saccade target[6-8] which in turn has been shown to decrease foveal sensitivity[9]. Moreover, the execution of a rapid eye movement induces brief motion signals on the retina[20] which may mask or in other ways interfere with the pre-saccadic prediction signal. On a more conceptual level, the recruitment of foveal processing as an ‘active blackboard’[21] may become obsolete in the face of an imminent foveation of relevant peripheral stimuli – unless, of course, foveal processing serves the establishment of trans-saccadic visual continuity.”
  
  We believe that the additional data and the revisions to the Introduction and Discussion have strengthened our manuscript and thank the reviewer for this comment.
  
  Unlike other tasks addressing pre-saccadic perception in the literature, here subjects do not have to discriminate the peripheral stimulus at the saccade goal, and most processing resources are presumably focused at the foveal location. Could this have influenced the results reported here?
  
  This is true. We intentionally made the features of the peripheral target as task-irrelevant as possible, contrary to previous investigations. We wanted to ensure that the enhancement we find would be automatic and not induced by a peripheral discrimination task, as we state in the Discussion and the Methods. We agree that the foveal detection task likely focused processing resources on the center of gaze in Experiment 1. In Experiment 2, however, we measured the spatial profile of enhancement which involved two different conditions:
  
  In each observer’s first six sessions, the probe could be presented anywhere on a horizontal axis of 9 dva length. On a given trial, an observer could not predict where it would appear, and therefore could not strategically allocate their attention. Nonetheless, enhancement of target-congruent orientation information was tuned to the fovea.
  
  In the final, seventh session, the probe appeared exclusively in one of two possible peripheral locations: 3 dva to the left or 3 dva to the right of the screen center. Observers were explicitly informed that the probe would never appear foveally, and processing resources should therefore have been allocated to the peripheral probe locations. The general performance level in this condition was comparable to performance in the fovea (see reply to the next comment). Nonetheless, we did not find peripheral enhancement of target-congruent information.
  
  Importantly, the magnitude of the foveal congruency effect in the PRE-only condition of Experiment 1 (i.e., when the target disappeared before the eyes landed on it) was comparable to the foveal congruency effect in Experiment 2 (PRE-only throughout), suggesting that the format of the task – i.e., purely foveal detection or foveal and peripheral detection – did not alter our findings.
  
  The spatial profile of the enhancement is very interesting and it clearly shows that the enhancement is limited to a central region. To what extent is this profile influenced by the fact that the probe was presented at larger eccentricities and was therefore less visible at 4.5 deg than at 0 deg? According to the caption, when the probe was presented more eccentrically the performance was raised to a foveal level by adaptively increasing probe transparency. This is not clear: was this done separately based on performance at baseline? Does this mean that the contrast of the stimulus was different for the points at ±3 dva but the performance was comparable at baseline? Please explain.
  
  Based on the previous comment and comments of Reviewer #2, we realize that we should have explained this condition more extensively in the main text rather than in the Methods and have adapted the manuscript accordingly. As stated in our reply to the previous comment, Experiment 2 involved one session in which we addressed whether the lack of parafoveal/peripheral enhancement could be due to a simple decrease in acuity as mentioned by the reviewer. Observers were explicitly informed that the to-be detected stimulus (the probe) would appear either 3 dva to the left or right but never in the screen center and were shown slowed-down example trials for illustration. Observers then performed a staircase procedure which was targeted at determining the probe contrast at which performance for parafoveal target-incongruent probes would be just as high as foveal performance for target-incongruent probes had been in the previous six sessions. While the foveal probe was presented at a median opacity of 28.3±7.6%, an opacity of 39.0±11.1% was required to achieve the same performance level at a 3 dva eccentricity. Therefore, the gray curve in Figure 2D that represents incongruent Hits reaches its peak just under 80% on the y-axis. The gray dots at ±3 dva also range at ~80% on the y-axis. The performance level for target-incongruent probes (‘baseline’ here) in the parafovea is thus equal to foveal performance for target-incongruent probes. Target-congruent parafoveal feature information had the same “chance” to be enhanced as foveal information in the preceding sessions. Despite an equation of performance, we found no parafoveal enhancement. This suggests that enhancement is a true consequence of visual field location and not simply mediated by visual acuity at that location.
  
  The enhancement is significant within a region of 6.4 dva around the center of gaze. This is a rather large region, especially considering that it extends also in the direction opposite to the saccade. I was expecting the enhancement to be more confined to the central foveal region. Was the effect shown in Figure 2D influenced by the fact that saccades in this task were characterized by a large undershoot (Fig 1 D)? Did the effect change if only saccades landing closer to the target were included in the analysis? There may not be enough data for resolving the time course, but maybe there are differences in the size of the main effect.
  
  Width of the profile: In general, the width of the enhancement profile is likely to be influenced by two experimental/analysis choices: the size of the probe stimulus presented during the experiment and the width of the moving window combining adjacent probe locations for analysis.
  
  Probe size: Since the probe itself had a comparably large diameter of 3 dva, even the leftmost significant point at -2.6 dva could be explained by an enhancement of the foveal portion of the probe. We had mentioned this briefly in the Discussion but realize that this point is crucial and should be made more explicit. Moving window width: We designed the experiment with the intention to densely sample a range of spatial locations during data collection and combine a certain number of adjacent locations using a moving window during analysis (see preregistration: https://osf.io/6s24m). To ensure the reliability of every data point, the width of this window was chosen based on how many trials were lost during preprocessing. We chose a window width of 7 locations as this ensured that each data point contained at least 30 trials on an individual-observer level. Nonetheless, the width of the resulting enhancement profile depends on the width of the moving window:
  
  We added these caveats to the Results section and incorporated the figure above into the Supplements. We now state explicitly that…
  
  “the main conclusions that can be drawn are that enhancement i) peaks in the center of gaze, ii) is not uniform throughout the tested spatial range as, for instance, global feature-based attention would predict, and iii) is asymmetrical, extending further towards the saccade target than away from it.”
  
  For the above reasons, the absolute width of the profile should be interpreted with caution.
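  The dependence of profile width on window width can be made concrete with a small sketch. It is not the authors' analysis code: the function name, the centred full-window convention and the toy counts are assumptions. Only the window width of 7 and the minimum of 30 trials per data point come from the response above.

```python
def pooled_hit_rates(hits, trials, width=7, min_trials=30):
    """Pool adjacent probe locations with a sliding window.

    hits[i] and trials[i] are the hit count and trial count at location i.
    Returns one (center_index, pooled_trials, hit_rate) tuple per full
    window; hit_rate is None when the window holds fewer than min_trials.
    """
    if len(hits) != len(trials):
        raise ValueError("hits and trials must have the same length")
    pooled = []
    for start in range(len(hits) - width + 1):
        n_hits = sum(hits[start:start + width])
        n_trials = sum(trials[start:start + width])
        center = start + width // 2
        rate = n_hits / n_trials if n_trials >= min_trials else None
        pooled.append((center, n_trials, rate))
    return pooled


# Toy data (hypothetical): congruent hits exceed incongruent hits at a
# single location only, index 10 of 21.
n_locations = 21
trials = [10] * n_locations
incongruent = [5] * n_locations
congruent = [5] * n_locations
congruent[10] = 8

for width in (3, 7):
    cong = pooled_hit_rates(congruent, trials, width=width)
    incong = pooled_hit_rates(incongruent, trials, width=width)
    enhanced = [c[0] for c, i in zip(cong, incong) if c[2] > i[2]]
    print(f"width={width}: enhanced window centers {enhanced}")
```

  In this toy case an effect confined to one location shows up in as many windows as the window is wide, which is one reason the absolute width of a smoothed profile is hard to interpret.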
  
  Saccadic landing accuracy: To address the reviewer’s question, we inspected the spatial enhancement profile separately for trials in which the saccade landed on the target (i.e., within a radius of 1.5 dva from its center) or off-target but still within the accepted landing area. This trial separation criterion, besides appearing meaningful, ensured that all observers contributed trials to every data point. We had never resolved the time course in this experiment and could therefore not collapse across time points as suggested by the reviewer. To increase the number of trials per data point, we instead increased the width of the moving window sliding across locations from 6 to 9 neighboring locations (but see caveat above).
  
  Considering only saccades that landed on the target (‘accurate’; A) yielded significant enhancement from -2.6 to 2.1 dva and from 3.2 dva throughout the measured range towards the saccade target. Saccades that landed off-target (‘inaccurate’; B) showed a more pronounced asymmetry. When only considering inaccurate saccades, enhancement reached significance between -1.1 and 4.4 dva.
  
  The increased asymmetry for inaccurate saccades may be related to predictive remapping: since inaccurate saccades were hypometric on average, the predictively remapped location of the target was shifted towards the target by the magnitude of the undershoot. Asymmetric enhancement would therefore have boosted congruency at the remapped target location across all trials. In consequence, we inspected if aligning probe locations to the remapped target location on an individual-trial level would lead to a narrower profile for inaccurate saccades. This was not the case. Instead, we observed two parafoveal maxima (C). Their position on the x-axis equals the mean remapping-dependent leftwards (2.0 dva) and rightwards (1.9 dva) displacement across trials. In other words, they correspond to the pre-saccadic center of gaze. Note that these profiles could not be fitted with a mixture of Gaussians and were fitted using polynomials instead.
  
  In sum, while we do not observe a clear narrowing of the enhancement profile for accurate saccades, the profile’s asymmetry is more pronounced for inaccurate eye movements. An increase in asymmetry could bear functional advantages since it would boost congruency at the remapped target location across all trials. Importantly though, this adjustment seems to rely on an estimate of average rather than single-trial saccade characteristics: aligning probe locations to the remapped attentional locus on an individual trial level provides further evidence that, irrespective of individual saccade endpoints, enhancement was aligned to the fovea. We have added these analyses to the Results section (Figure 3). We have also added the remapped profiles for all saccades and accurate saccades only to the Supplements.
  
  Is the size of the enhanced region around the center of gaze related to the precision of saccades? Presumably, if saccades are less precise a larger enhanced area may be more beneficial.
  
  This is a very interesting point. To address this question, we estimated each observer’s saccadic precision by computing bivariate kernel densities from their saccade landing coordinates. As we measured the horizontal extent of enhancement in our experiment, we defined the horizontal bandwidth as an estimate of saccadic imprecision. To estimate the size of the enhanced region for each observer, we created 10,000 bootstrapping samples for each observer’s congruent and incongruent HRs (4 locations combined at each step). We then determined the difference between the bootstrapped congruent and incongruent HRs and defined significantly enhanced locations as all locations for which ≤ 5% of these differences fell below zero. We then defined the width of the enhancement profile as the maximum number of consecutive significant locations.
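  The bootstrapping rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, trials are resampled with replacement independently per condition, and the pooling of 4 adjacent locations is omitted for brevity. The 10,000 samples and the 5% rule follow the description in the response.

```python
import random


def is_enhanced(congruent, incongruent, n_boot=10_000, alpha=0.05, rng=None):
    """Bootstrap test for one location.

    congruent and incongruent are lists of 0/1 trial outcomes (1 = hit).
    The location counts as enhanced when at most alpha of the bootstrapped
    differences (congruent HR minus incongruent HR) fall below zero.
    """
    rng = rng or random.Random()
    below_zero = 0
    for _ in range(n_boot):
        cong_sample = rng.choices(congruent, k=len(congruent))
        incong_sample = rng.choices(incongruent, k=len(incongruent))
        diff = (sum(cong_sample) / len(cong_sample)
                - sum(incong_sample) / len(incong_sample))
        if diff < 0:
            below_zero += 1
    return below_zero / n_boot <= alpha


def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def profile_width(congruent_by_loc, incongruent_by_loc, **kwargs):
    """Width of the enhancement profile, in number of locations."""
    flags = [is_enhanced(c, i, **kwargs)
             for c, i in zip(congruent_by_loc, incongruent_by_loc)]
    return longest_run(flags)
```

  A call such as `profile_width(cong_trials, incong_trials, rng=random.Random(1))` would return the number of consecutive significant locations for one observer, where both arguments are lists of per-location outcome lists.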
  
  Instead of a positive correlation, we observed a negative correlation between the bandwidth of landing coordinates (i.e., saccadic imprecision) and the size of the enhanced window (r = -.56, p = .117). In other words, there was a non-significant tendency that the less precise an observer’s saccades, the narrower their estimated region of enhancement. We furthermore inspected the magnitude of enhancement per position within the enhanced region. To do so, we computed the mean difference between congruent and incongruent HR across all positions in the enhanced region. The sizes of the orange circles in the figure above represent the resulting values (ranging from 2.9% to 13.3%). As saccadic precision decreases, the magnitude of enhancement per data point in the enhanced region tends to decrease as well. We therefore suggest that high saccadic precision is a sign of efficient oculomotor programming, which in turn allows peri-saccadic perceptual processes to operate more effectively. We added this analysis to the Supplements and refer to it in the Results section of the revised manuscript.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.01.11.475331v1
www.biorxiv.org www.biorxiv.org

New submission 17/05/2023, 15:00:07

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  HCN channels are atypically opened by the downward movement of gating charges during hyperpolarisation and have weak coupling between the VSD and pore domain; in the absence of an open state structure, extracting mechanistic information has been difficult. This manuscript is a continuation of a previous study on HCN channel gating that revealed how hyperpolarisation causes a downward movement of the VSD's S4, with breakage into two helices. The authors explore gating motions and the coupling between VSD and the pore domain using atomistic simulations. This includes microsecond MD with and without very strong -1V applied potentials to try to drive VSD-TMD changes to open the channel. In the end, however, the authors used a biased simulation approach (adiabatic bias) to enforce conformational change from resting to an open homology model of HCN based on hERG/rEAG. This microsecond simulation followed three interaction distances that were suggested to change between resting and open states based on free MD. This simulation caused pore opening and allowed a description of changes that may occur during gating, including a competition of S5-S6 and S6-S6 contacts and lipid binding locations, which may suggest lipid-dependent function and explain an unexpected closed structure at 0mV in micelles. While I feel the manuscript is written for the HCN expert audience, the mechanistic information in terms of hyperpolarisation-induced voltage gating makes it of much interest. The manuscript is presented at a high level, though there are a couple of points to address, including reproducibility of simulations and potential for more relation to experimental findings.
  
  We appreciate the comments, thank you, please find a detailed answer below.
  
  The authors carried out 1μs-MD simulations of the resting, activated, and a Y289D mutant at 0 mV, and then tried to drive the conformational change with a very large -1V voltage (double that studied previously). In 1 us MD, is the membrane stable with such a big voltage, as it would likely not be experimentally? Even with a volt applied, there was incomplete activation of the voltage sensors, despite timescales approaching that of activation.
  
  This reviewer is correct in cautioning against membrane rupturing effects in simulations with a voltage of this magnitude. We have indeed checked that the membrane and the protein remain intact under these conditions and can confirm that no poration occurs. As membrane poration is stochastic, it could indeed occur over microsecond timescales under 1V, but the probability remains low, and we were lucky to not face this situation herein. Note that whereas potentials of this magnitude could not be applied in experiments, they are relatively routinely used in MD simulations to speed up processes that are driven by changes in transmembrane potentials.
  
  Interestingly, other work from our lab (Rems et al. Biophysical Journal 119 (1) 190-205 (2020)) has shown that HCN1 voltage sensor domains are less prone to poration than other voltage sensor domains, for reasons that remain to be determined.
  
  Author Response Figure 1. Final snapshots from the simulations of the resting (blue), intermediate (yellow) and activated (red) states. The representation of the solvent (water+ions) in cyan showed no membrane poration at the end of the 1us simulations.
  
  For the pulling/ driving simulations (adiabatic bias MD) to change suspected interaction distances (V390-I302, N300-W281, and D290-K412), it seems to be just 1 simulation, without reproducibility. One has to wonder, if the simulation was redone from a very different initial conformation, would the results be the same (in addition to the distances themselves that were enforced by the ABMD). Moreover, the authors had to model the open state, such that the results depend on a homology model based on other CNBD channels, hERG / rEAG. Although the model stayed open for a microsecond, what other measures of accuracy of the homology model are there, such as preserved distances according to mutants/double mutants?
  
  The ABMD simulations were repeated, please refer to the response to essential revisions point 1 for details.
  
  For reasons mentioned by the reviewer as well as a reconsideration of our strategy to model channel opening, we have decided to omit homology models from the revised version of the paper.
  
  The authors find that activation involves hydrophobic forces that strengthen the intra-subunit S4/S5/S6 interface, as well as lipid headgroups that make contact with hydrophilic residues at this interface, with lipid tails also contributing to hydrophobic contacts. The authors see bending and rotation of the lower S4 and a displacement of S1 away from S4 that exposes the VSD-pore interface to lipids, with increased lipid contacts at S4 and S5 during activation. This indicates lipid tails may play a role in coupling in HCN1 and may explain the closed state micelle structure at 0mV. Two sites of lipid contact are identified, one engaging VSD residues and the other polar or charged residues on S5 and S6. No experiments are presented or proposed to test the predicted lipid sites. e.g. Mutation of key residues, such as the arginine and histidine seen binding lipid headgroups could be tested as proof of their involvement, or perhaps experiments with varied phosphate moieties? In the absence of new experiments, is there existing data that could help validate the findings?
  
  We thank this reviewer for this comment. As noted in the response to essential revisions point 3, such experiments are challenging, and have not been reported so far in HCN channels. We do agree that aspects of the mechanism we propose remain hypothetical awaiting further work, but are happy to report that importance of lipid interactions with the crucial salt bridge pair mentioned in the response to essential revisions point 3 has been completely independently validated, thus strengthening our mechanistic hypothesis substantially.
  
  During free MD simulation, the authors see tilting of S5 caused by activation of the Y289D mutation that brings D290 and K412 positions into proximity. How do we know that the adjacent mutant of Y289 to aspartate has not caused this, or was this interaction also seen in wild-type simulation? Fig.3c might suggest the wt activated simulation may see such an interaction, but it is unclear given the large C_alpha distances, as opposed to H-bonding distances.
  
  Indeed, Figure 3 appears to indicate that this interaction between D290 and K412 is present in the activated state when the mutation is reverted to the WT sequence. We have recalculated the interaction propensity using all atoms of the residues and present an updated Figure 3c in response.
  
  The authors predict that a D290-K412 salt bridge may be important for gating and sought to experimentally validate the interaction in the activated-open state using cysteine cross-bridging. As this is the only experimental backing in the paper, it is important to be able to judge its ability to report on the D290-K412 salt bridge. A comparison experiment demonstrating other crosslinks that do not favour the open state would have been helpful in this regard e.g. if crossbridging at similar locations (but not predicted to change interaction during gating) had little effect on I/Imax, then the result may be bolstered. Are there existing mutagenesis experiments that may suggest the importance of these residues (as well as for other key interaction distances identified)?
  
  Negative results in cross bridging and cysteine accessibility studies in general are difficult to interpret as the lack of a cadmium-specific effect may be due to inaccessibility of the site to cadmium, pairwise distance too far to bridge by cadmium, or bridging of the specified site without a functional effect. However, as reviewer 2 pointed out below, the Yellen group has performed extensive cross bridging experiments in the S4-S5 to C-linker region in spHCN and in most of these positions, the pairs favoring the open state are closer together in our models than pairs favoring the closed state or those without functional effect. We have added Videos 1-6 to highlight this comparison on our open state models and describe it in our updated discussion section.
  
  Rotation of the V390 side chain from a position facing the pore lumen to a position facing I302 on S5 is coupled to an increase of the pore radius at V390, an increased hydration of the pore intracellular gate, and K+ ion movement. Perhaps 5 or 6 ions cross in that single simulation. As K channel ion permeation can depend critically on starting ion configs (as well as the model/force field), reproducibility of this finding is important but does not appear to have been tested. How can we be sure that periods of permeation or no permeation in individual simulations are reliable?
  
  As mentioned in our response to essential revisions point 1, we have modified the collective variable set used in ABMD, and repeated the simulations in 4 replicates. Whereas the number of permeation events is low in each simulation (Figure 4 S1), the consistency across repeats indicates that these open pore models indeed represent conductive states. Given how short the simulations are, however, it appears unreasonable to infer conductance values from these observations.
  
  Reviewer #3 (Public Review):
  
  In this work, Elbahnsi and colleagues use enhanced sampling MD simulation to recapitulate, step by step, the electromechanical coupling between VSD and the pore in HCN1 channels. Building on the available cryoEM structures of HCN1 with the VSD in resting and active state, the authors characterize by MD a subset of interactions that seemingly stabilize the open channel. This subset is, in turn, used in enhanced-sampling simulations to guide channel opening. The main findings are that S4 movement induces a rearrangement of the hydrophobic interaction at the level of the S1, S4, and S5 interfaces. Occupancy of lipids seems therefore state-dependent and highlights their regulatory role in HCN gating.
  
  The approach is rather innovative, and it apparently allows the reconstruction of the whole mechanism of gating, pushing the predictive power of MD simulation well beyond its actual temporal limitations. At the same time, the initial choice of interactions is crucial for this approach, because the result cannot differ from the inputs. And reading the paper it does not emerge clearly how the correctness of the reconstructed gating pathway can be verified, if not by functional validation.
  
  We thank the reviewer for this thoughtful review. It has pushed us to reconsider our approach to enhance the sampling of channel activation and gating. Please refer to the detailed response below as well as the response in particular to essential revisions point 1.
  
  Here are my comments on the main interactions that were used to feed the final MD simulation:
  
  1) W281-N300: this interaction, previously identified and studied in SpH channels (Ramentol et al, 2020; Wu et al, 2021), has been elegantly confirmed in this paper. Its inclusion in the initial subset seems appropriate. In the other two cases, the choice of interactions requires further explanations and experimental validation.
  
  2) D290 and K412: the validation of this interaction shown in Figure 3 and suppl Figure 1 is missing a control, i.e., the effect of the addition of Cd++ on the wt channel. Please add.
  
  We have performed the control suggested. Please also refer to the answer to essential revisions point 2.
  
  3) Modelling the open state of HCN1 pore (page 18), is done on the structure of the distantly related hERG rather than on the available open pore structure of HCN4. This choice is justified as follows by the authors:
  
  a) "Available structures in the CNBD channel family for which representative structures have been solved in closed and open states".
  
  b) "The structural mechanism of pore gating (i.e. the ⍺ to 𝜋 helix occurring at the glycine657 hinge in hERG) observed in rEAG/hERG may be a conserved gating transition in the CNBD family of channels"
  
  I encourage the authors to consider the following:
  
  a) The structure of hERG channel is not available in the closed/open configuration, indeed the comparison must be done with the closed configuration of the related channel rEAG. On the contrary, HCN4 is available in the closed/open configurations. Moreover, one of the open pore structures shows S4-S5-S6 in a very similar conformation to the lock open mutant (F186C/S264C) of HCN1 (Saponaro et al, 2021). With an available HCN4 open structure, forcing HCN1 to the open pore structure of hERG channel (which opens in depolarization and is not regulated by cAMP) seems not necessary.
  
  In response to this point, we reconsidered our approach and chose to instead use a biasing distance that is consistently increased in CNBD channels of resolved structures, that between neighboring and cross-subunits V390. We have detailed our rationale in the response to essential revisions point 1.
  
  To my knowledge, hERG is the only channel of the CNBD family for which the transition ⍺ to 𝜋 helix reported by the Authors occurs in S6. It is not reported for other CNBD family members, in particular for the CNG channels mentioned by the Authors (Zheng et al., 2020; Xue et al., 2021, 2022). TAX-4 (Zheng et al) does not show it. Its pore opens by a right-handed twist of S6 at glycine 399, a conserved glycine in all CNG. Human CNGA1 too opens the pore by a rotational movement of S6 hinged at the equivalent glycine (glycine 385) (Xue et al, 2021). And the same occurs in the non-symmetrical channel CNGA1/B1 (Xue et al, 2022). So, it seems that CNG channels do not show the ⍺ to 𝜋 helix transition in the open pore. Moreover, hERG excluded, all other members of the CNBD family, CNG, EAG, and HCN4 included, do not bend at the hinge glycine 657 of hERG, but at another glycine (gly 648 in hERG numbering) located upstream. Further, their opening is due to a rotation of S6 associated with an outward movement, rather than to the lifting of the lower part of S6, as in hERG.
  
  After considering this reviewer’s comment, we were surprised to see that HCN1 is apparently prone to secondary structure deformation in S6, even when biasing the aforementioned distances, and thus enforcing no rotation at all in S6. We are intrigued by this observation and eagerly await experimental validation or disproval. In the meantime, we have made clear in the text that this hypothesis remains based exclusively on modeling work.
  
  4) V390-I302: this interaction is predicted to stabilize the open pore configuration and was included in the subset. The contact between V390 on S6 and I302 on S5 is observed in the homology model discussed above when the S6 is twisted at the glycine hinge, rotating the preceding residue (V390) out of its pore-lining position and is. Again, I can only disagree with this hypothesis because it has been experimentally demonstrated (Cheng et al, J Pharmacol Exp Ther. 2007 Sep;322(3):931-9) that the side chain of Valine390 is inside the cavity of the open pore of HCN1 channels as it controls the affinity for the pore blocker ZD7288.
  
  In accordance with other comments above, we have eliminated the bias applied to the V390-I302 distance. However, the new ABMD simulations with bias applied to encourage dilation at position 390 still involve rotation of V390 away from the central pore axis, albeit with bending of S6 at the upper glycine mentioned by this reviewer. The degree of rotation is lower than in our previous simulations so that V390 still lines the inner vestibule in the open state, consistent with the observation that this position influences the apparent affinity of open pore blockers.
  
  In conclusion, modelling the open state pore of HCN1 on hERG rather than on that of HCN4 seems not justified based on accumulated evidence in the published literature. Therefore, the choice of the authors to use it as the open pore model of HCN1 channels needs to be experimentally validated. One possibility is to mutate the glycine hinge, gly391 in HCN1, into an Alanine in order to remove the flexible hinge. If this mutation alters pore gating, it will support the choice of the Authors.
  
  Once more, we thank the reviewer for the comments, which have led us to reconsider a large part of our modeling work.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.05.20.492765v1
www.biorxiv.org www.biorxiv.org

New submission 22/09/2022, 11:09:53

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  There is emerging evidence that connexin43 hemichannels localized to mitochondria can influence their function. Here the authors demonstrated using an osteocyte cell model that connexin43 is localized to mitochondria and that this is enhanced in response to oxidative stress. Several lines of evidence were presented showing that mitochondrial connexin43 forms functional hemichannels and that connexin43 is required for optimal mitochondrial respiration and ATP generation. These aspects were major strengths of the study.
  
  The authors also show that connexin43 is recruited to mitochondria in response to oxidant stress, as a cell protective mechanism. This was primarily done using hydrogen peroxide to generate oxidant stress; primary osteocytes from Csf-1+/- mice, which are prone to Nox4 induced oxidant stress, also show enhanced mitochondrial connexin43 when compared with wild type osteocytes.
  
  Several approaches were used to demonstrate that connexin43 interacts with the ATP synthase subunit, ATP5J2, suggesting a direct role for connexin43 in the control of ATP synthesis by mediating mitochondrial ion homeostasis. Several experiments were done using a series of pHluorin fusion protein constructs as a proton sensor, these experiments hint at a potential role for connexin43 in regulating H+ permeability to support ATP production. However, the effects of inhibiting connexin43 on pH were modest, suggesting that additional roles for mitochondrial connexin43 in ATP generation should be considered.
  
  Thank you for your positive and thoughtful comments. We agree that additional roles for mitochondrial Cx43 may be possible. As an example, we consider that there may be a change in the stability of ATP synthase that occurs after mtCx43 deficiency. This and other possible roles of mtCx43 ought to be investigated in the future.
  
  Reviewer #3 (Public Review):
  
  This manuscript should be of broad interest to readers not only in the field of gap junction (GJ) mediated cell-to-cell communication but also to scientists and clinicians working on the function of mitochondria and metabolism. Their data elucidates a new function of Cx43 in regulating the energy (ATP) generation of mitochondria, e.g., under oxidative stress.
  
  The canonical function of gap junctions is in direct cell-to-cell communication by forming plasma membrane traversing channels that electrically and chemically connect the cytoplasms of adjacent cells. These channels are assembled from connexin proteins, connexin 43 (Cx43). However, more recently new, non-canonical cellular locations and functions of Cx43 have been discovered, e.g. mitochondrial Cx43 (mtCx43). However, very little is known about where Cx43 transported into mitochondria is derived from, how Cx43 is transported into mitochondria, where it is located in mitochondria, in which form Cx43 is present in mitochondria, (polypeptides, hemi-channels (HCs), complete GJ channels), and what the function of mtCx43 is. The authors addressed the latter question. The authors provide convincing evidence that mtCx43 modulates mitochondrial homeostasis and function in bone osteocytes under oxidative stress. Together, their study suggests that mtCx43 hemi-channels regulate mitochondrial ATP generation by mediating K+, H+, and ATP transfer across the mitochondrial inner membrane by directly interacting with mitochondrial ATP synthase (ATP5J2), leading to an enhanced protection of osteocytes against oxidative insult. These findings provide important information of a role of Cx43 functioning directly in mitochondria and not at the canonical location in the plasma membrane. While most of the functional assays presented in Figures 2-8 appear solid, the mitochondrial localization of Cx43, its translocation into mitochondria under oxidative stress, and its configuration as hemi-channels (Figure 1) is less convincing. I have five general comments that should be addressed:
  
  1) This study was performed in MLO-Y4 osteocyte cells. Is the H2O2 induced increase of mitochondrial Cx43 MLO-Y4 cell type or osteocyte specific, or is Cx43 playing a more general role in mitochondrial function, e.g. under oxidative stress? Osteoblasts such as MC3T3-E1 and MG63, and many other cell types endogenously express Cx43, and oxidative stress is a general physiological stressor, not only for osteocytes and bone cells. Attending to this question would address the generality of the findings for mitochondrial function.
  
  We thank the reviewer for bringing up these valid points; seeing the phenotype displayed in secondary cell types, such as osteoblasts, would be of great relevance and interest. To address this, we conducted new experiments on MC3T3-E1 cells (Figure 1-figure supplement 2). After 2 hrs of H2O2 treatment, Cx43 accumulated on the mitochondria, marked by MitoTracker. Statistical analysis also showed a significant increase in the colocalization between Cx43 and MitoTracker (Figure 1-figure supplement 2B). The colocalization coefficient is higher in the Ctrl group in MC3T3-E1 cells when compared with the MLO-Y4 Ctrl group, indicating a different response level in other cell lines. Osteoblasts seemed to be more sensitive to redox interference. Overall, these results support the point that under oxidative stress, mtCx43 may display a similar phenotype across multiple cell lines, although the degree of sensitivity may differ.
  
  2) The images of MLO-Y4 cells (Figure 1A) and the primary osteocytes isolated from Csf-1+/- and control mice (Figure 8) do not show visible gap junctions. I guess this is due to the fact that slides were stained with the Cx43(E2) antibody. I feel, staining of these cells in addition with the Cx43(CT) antibody would be helpful to get a better understanding on the distribution of Cx43 in gap junctions and undocked/un-oligomerized Cx43 in these cells.
  
  Thank you for the suggestion. To get a better understanding of the distribution of Cx43, either in GJ or HC form, we performed additional experiments in MLO-Y4 cells using the Cx43(CT) antibody and data are shown below. With Cx43(CT) staining, we observed more signals in the cells and on the plasma membrane. After H2O2 treatment, we observed increased and stronger signals localized on the mitochondria compared with the untreated control group. Stronger signals observed in the plasma membrane indicate the gap junction stained by Cx43(CT) antibody.
  
  3) The images of cells presented in Figure 1A are quite fuzzy. No mitochondria are visible, and the Cx43 staining is hazy and does not localize to any subcellular structures. Also, it is not clear if the higher resolution image presented in Figure 1C actually represents a mitochondrion. A good DIC image, or co-staining with another mitochondrial marker such as MitoTracker (as shown in Figure 4-S1) would make the localization and translocation of Cx43 into mitochondria upon oxidative stress more convincing. This is especially important as the translocation, although statistically significant, increases only by about 10% or less (Figure 1B). Such a small difference (also represented in the Western analyses presented in Figure 1D) could easily be artefactual, depending on how the correlation coefficient was generated. Of note in this respect is that control cells in Figure 1A appear larger (compare the size of the nuclei) and are spread out more than the H2O2 treated cells. Better, clearer images would make the mitochondrial localization/translocation more convincing.
  
  The reviewer made great points. To improve the image clarity, we redid the staining/imaging and determined the colocalization of SDHA and MitoTracker Deep Red. The result (shown below) suggested that under normal conditions without H2O2 treatment, SDHA and MitoTracker merged perfectly, while after H2O2 treatment for 2 hrs, mitochondria became fragmented and the SDHA signal exhibited a more dotted pattern compared to the MitoTracker. Overall, we feel that MitoTracker represents the distribution of mitochondria better. SDHA is a subunit of mitochondrial complex II, and the images we presented in Figure 1C were captured from isolated mitochondria under a confocal microscope with SDHA and Cx43(CT) co-staining. Considering the specificity of SDHA (see images below), we believe the Cx43 signal we captured demonstrates the mitochondrial localization/translocation. After using MitoTracker as a mitochondrial marker and higher-magnification images, the correlation coefficient increased from 0.35 to 0.47, a 32% increment with statistical significance. As to the nuclei size, some cells indeed have smaller sizes, which may be affected by varied local cell density. The new images represented in Figure 1A are much more consistent in the nuclei size.
  
  4) How pure are the mitochondria that were probed for Cx43 by Western shown in Figure 1D? The preparation method described is relatively simple, collecting the 10,000xg supernatant (here 9,000xg supernatant) as mitochondrial fraction. Is it possible that the Cx43 signal, at least in part, is derived from other, contaminating membranes, such as PM, Golgi, or ER? Testing the mitochondrial preparation by Western with marker proteins specific for these compartments would strengthen the author's results.
  
  The reviewer made a great suggestion. To address this, we did a western blot to test the mitochondrial purity. Indeed, this method using centrifugation is simple, and as expected there was some contamination with ER (marked by PDI) and Golgi (marked by STX6). However, to further confirm the purity of the mitochondrial fraction, fluorescent dyes for mitochondria (MitoTracker Deep Red), ER (ER-Tracker Blue-White), and nuclei (Hoechst) were used. The organelle-specific dyes indicated that most of the fraction was mitochondria. There was some contamination with ER fragments and minimal nuclear contamination. Combining our western blot and immunofluorescence data, it can be concluded that our Cx43 signal is primarily derived from mitochondria.
  
  5) The authors rely on previous studies to postulate that Cx43 in mitochondria forms hemichannels in their system, is localized in the inner membrane, and is oriented with the Cx43 C-termini facing the inter-membrane space (as schemed in Figure 8C). The authors use lucifer yellow (LY) dye transfer and carbenoxolone, but both are not hemi-channel specific probes. They are transferred by, and block GJ channels as well. Experiments using hemi-channel specific probes would be more convincing. This is important, as the information cited is based on only two references (Boengler et al., 2009; Miro-Casas et al., 2009), and it still is highly unclear how a membrane protein that is co-translationally inserted into the ER membrane, then traffics through the Golgi to be inserted into the plasma membrane is actually imported into mitochondria and in which state (monomeric, hexameric). Why the Cx43(CT) specific antibody traverses the outer mitochondrial membrane and reaches the Cx43CT while the Cx43(E2) specific antibody does not is neither described nor clear. Were these mitochondria permeabilized with Triton X-100 as described in M&M?
  
  We edited the Methods section. We did not use Triton X-100 to permeabilize mitochondria. PMP appeared to preserve mitochondrial inner membrane integrity, allowing us to assess the localization of Cx43(CT) antibody on mitochondria. We showed these new immunofluorescence images in Figure 5-figure supplement 2. PMP used as a plasma membrane permeabilizer has a 6x affinity with MOM compared with MIM. Meanwhile, no Cx43(E2) Ab signal was detected in mitochondria, suggesting the extracellular loop of Cx43 faces the matrix and cannot be accessed by Cx43(E2) antibody.
  
  The translocation of Cx43 to mitochondria was reported to involve the chaperone Hsp90-dependent TOM complex pathway (Rodriguez-Sinovas et al., 2006). After the translocation, whether mtCx43 forms gap junctions in mitochondria is unclear. Lucifer yellow is widely used in hemichannel-mediated dye uptake or gap junction-mediated dye transfer. In our case, considering the channel orientation, mtCx43 should form hemichannels, and Cx43(CT) Ab could be used as a specific Cx43 HCs blocker like the study reported in cardiomyocytes (Lillo et al., 2019).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.05.502934v1
www.biorxiv.org www.biorxiv.org

Dopamine D2Rs Coordinate Cue-Evoked Changes in Striatal Acetylcholine Levels

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  The authors showed that D2R antagonism did not affect the initial dip amplitude but shortened the temporal length of the dip and the rebound ACh levels. In addition, by using both ACh and DA sensors, the authors showed DA levels correlate with ACh dip length and rebound level, not the dip amplitude. Both pieces of evidence support their conclusion that DA does not evoke the dip but controls the overall shape of ACh dip. Overall the current study provides solid data and interpretation. The combination of D2R antagonist and CIN-specific Drd2 KO further support a causal relationship between DA and ACh dip. Overall, the experiments are well-designed, carefully conducted and the manuscript is well-written.
  
  At the behavioral level, the author found a positive correlation between total AUC (of ACh signal dip) and press latency in Figure 10, indicating cholinergic levels contribute to the motivation. The next logical experiment would be to compare the press latency between control and ChAT-Drd2KO mice, since KO mice have smaller AUC while not affecting DA. However, this piece of information was missing in the manuscript. The author instead showed the correlation between AUC and latency was disrupted, which is indirectly related to the conclusion and hard to interpret. Figure 10 showed that eticlopride elongates the press latency, in a dose-dependent manner. However, it is not clear what this press latency means and how it was measured in this CRF task (Since there is no initial cue in the CRF test, how can we define the press latency?).
  
  We did compare the press latency between control and ChATDrd2KO mice (Figure 10B). At baseline (saline), there is no difference in press latency between these two groups. We measured press latency as the time to press the lever after the lever has been extended. When the lever extends, it makes a sound (cue), which signals to the mice that a new trial has started. The fact that press latency is not enhanced in ChATDrd2KO mice was surprising to us. It is possibly due to compensation via other neuronal mechanisms that regulate press latency (see discussion to comment 6 of public review).
  
  Pearson r<0.5 is normally defined as a weak correlation. It is better to state r values and discuss that in the manuscript.
  
  A valid comment. We clarified our correlation analyses in the methods section (line 717):
  
  “We used a variance explained statistical analysis (R²) to determine the % of variance in our correlation analyses (example: a correlation of 0.5 means 0.5² × 100 = 25% of the variance in Y is “explained” or predicted by the X variable). When comparing correlation values, Fisher’s transformation was used to convert Pearson correlation coefficients to z-scores.”
  
  We also added this to the result section: e.g., line 256: “which accounts for 22% of the variance in the ACh decrease explained by the DA peak.”
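The two statistics named in this response can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names and the sample sizes passed to `compare_correlations` are hypothetical, and the two-sample z statistic shown is the standard way of comparing independent correlations after Fisher's transformation, which the response does not spell out.

```python
import math

def variance_explained(r: float) -> float:
    """Percent of variance in Y explained by X for a Pearson correlation r."""
    return r ** 2 * 100.0

def fisher_z(r: float) -> float:
    """Fisher transformation of a Pearson correlation coefficient (atanh)."""
    return math.atanh(r)

def compare_correlations(r1: float, n1: int, r2: float, n2: int) -> float:
    """z statistic for the difference between two independent correlations.

    Assumes each correlation comes from its own sample of size n > 3.
    """
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / standard_error

# A correlation of 0.5 explains 25% of the variance, as in the quoted example.
print(variance_explained(0.5))
```

A correlation near 0.47 would correspond to the 22% figure quoted from the results section, since 0.47² is about 0.22.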
  
  Is there any correlation between ACh AUC and other behavior indexes such as press speed or the time between press and reward licking?
  
  We don’t have the ability to measure press speed and there is no press rate because the lever retracts after the first lever press. We quantified the correlation between the time from press until head entry (press to reward latency) and ACh AUC, and the results are difficult to interpret. For Drd2fl/fl control mice we determined a weak negative correlation (the larger the ACh dip the lower the press to reward latency). In contrast, in ChATDrd2KO mice we found a weak positive correlation between ACh AUC and press to reward latency (the smaller the dip, the lower the press to reward latency). Given these conflicting results, it is difficult to determine how the ACh AUC affects press to reward latency.
  
  In figure 2B CS+ group, the author was focusing on the responses at CS+, however, the ACh dips at reward delivery seem to persist even after in this particular example. This might be an interesting phenomenon in which ACh got dissociated from DA signals, which needs further analysis from the author.
  
  We see a persistent signal at reward delivery in both DA and ACh up to the 8 days of testing. However, 1 mouse lost its optical fiber for the GACh signal so the data from Days 6-8 is from 2 mice. We also measured the correlation between DA and ACh at reward delivery for all 8 days of testing (see below). The correlation data is variable with the strongest correlation being observed on Day 2. It is possible that these signals could get dissociated after even more days of testing, but we do not have this data available.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.12.08.471871v1
www.biorxiv.org www.biorxiv.org

Isoform-specific mutation in Dystonin-b gene causes late-onset protein aggregate myopathy and cardiomyopathy

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  In general, the study has several novel comments, the experimental design is appropriate and the manuscript is well written. While the manuscript contains a lot of data, still it is a bit descriptive. There are also some issues, which should be addressed.
  
  1) In Figure 1E, the authors demonstrate a small but significant decrease in body weight of mutant mice. The difference is not so drastic. They also mentioned that some mice showed kyphosis. Please provide data on what percentage of mutant mice showed kyphosis. Please also provide individual hind limb muscle weight normalized with body weight.
  
  Thank you for your suggestions. Kyphosis was observed in some (more than one third of) Dst-b mutant mice, as shown in Author response image 1. MRI or CT imaging of the skeleton is necessary to accurately diagnose kyphosis; however, such imaging was not performed in this paper. Therefore, we would prefer not to provide data on what percentage of mutant mice showed kyphosis.
  
  We weighed the soleus of the hind limb and presented the data (lines 132-135).
  
  2) There is a lot of variability in the age of the mice employed for this study. For example, in Figure 3, the authors mentioned 23-month-old mice (Fig. 3a) and over 20-month-old and over 18-month-old mice. What was the exact age of the mice? Why were mice of three different ages used for the same set of experiments? The authors should also comment on whether the onset of myopathy in skeletal and cardiac muscle occurs at the same or different age in mutant mice.
  
  In response to the comments, we described the exact ages in each figure legend. The reason for the variability in the age of the mice is that we performed many different kinds of experiments at different time points. We described that the myopathy phenotypes occurred at around 16 months of age and older (lines 128-129). As for cardiomyopathy, fibrosis was observed at around 16 months of age and older (Figure 3D,E).
  
  3) Authors have studied protein aggregation only in the soleus muscle of mutant mice. Do the same types of aggregates also form in cardiomyocytes? They write that desmin aggregates were observed in cardiomyocytes of mutant mice. Please show those results in a supplemental figure.
  
  According to the suggestion, we presented the data on desmin aggregates in the cardiomyocytes of Dst-bE2610Ter/E2610Ter mice (Figure 4-figure supplement 1).
  
  4) In Figure 5, the authors suggest that mutant mice have mitochondrial abnormalities. However, this analysis is quite abstract and inconclusive. Immunohistochemical images show higher levels of CytoC and Tom20 whereas QRT-PCR demonstrates a significant decrease in mRNA levels of some of the mitochondria-related molecules. Authors should perform additional experiments to determine whether there is any difference in mitochondrial content between WT and mutant mice. In addition, they should perform some functional assays (i.e. OCR, seahorse experiment etc.) to measure whether mitochondrial oxidative phosphorylation capacity is affected in mutant mice.
  
  Thank you very much for the comment. Mitochondrial accumulation was a characteristic phenotype in Dst-bE2610Ter/E2610Ter muscle and also in other types of MFM. We performed quantitative analyses and added the data (Figure 5B). Mitochondrial accumulation was observed even at a young stage, when protein aggregates were not observed (Figure 3-figure supplement 1A). As the reviewer pointed out, it is important to demonstrate changes in mitochondrial function, but at this moment we do not have that assay system and would like to present it as data for a future paper, including analysis of mitophagy.
  
  5) The morphology of the mitochondria in TEM images shows features that are commonly observed during oxidative damage. Is there any evidence of oxidative stress in skeletal and cardiac muscle of mutant mice?
  
  Thank you very much for the insightful comment. Gene ontology and KEGG pathway analysis on RNA-seq data did not show alterations of oxidative stress in the heart. We performed q-PCR for genes associated with oxidative stress in soleus (Figure 1-figure supplement 3), which did not show alterations in oxidative stress. In the future, we would like to investigate this point.
  
  Reviewer #3 (Public Review):
  
  This manuscript by Yoshioka et al. provides an extensive analysis of cardiac and skeletal muscle in a mouse model of Dst-b mutation. The authors have generated the mutant mouse model to selectively mutate Dst-b isoform of Dystonin and show that such a mutation leads to cardiomyopathy and late-onset myofibrillar myopathy. This is a novel discovery which adds valuable information to the genetic basis and molecular mechanism of MFM mediated by Dst-b. However, the manuscript needs substantial revision and additional feasible experiments.
  
  In Figure 3A, the authors suggest that there are smaller myofibers in the mutated mice; however, they do not provide enough data to support that. Cross-sectional areas between the mutant and WT have to be counted and represented as bins. This can better show the presence of smaller myofibers and muscle degeneration in the mutant mice.
  
  Thank you for the helpful comment. We quantified the distribution of cross-sectional area (CSA) in the soleus and presented the data in Figure 3C. It indicates that there are smaller myofibers in the mutant mice.
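The binned CSA representation requested by the reviewer can be sketched as a percentage histogram. This is only an illustration under stated assumptions: the function name, the bin width, and the upper limit are hypothetical and are not taken from the manuscript, and fibers larger than the upper limit are folded into the last bin.

```python
import numpy as np

def csa_distribution(csa_um2, bin_width=200.0, max_csa=3000.0):
    """Return bin edges and the percentage of myofibers falling in each CSA bin.

    csa_um2: 1-D sequence of myofiber cross-sectional areas (hypothetical units: um^2).
    Values above max_csa are clipped so they are counted in the last bin.
    """
    areas = np.clip(np.asarray(csa_um2, dtype=float), 0.0, max_csa)
    edges = np.arange(0.0, max_csa + bin_width, bin_width)
    counts, _ = np.histogram(areas, bins=edges)
    percent = 100.0 * counts / counts.sum()
    return edges, percent

# Hypothetical measurements, for illustration only.
edges, percent = csa_distribution([100, 150, 250, 900], bin_width=200.0, max_csa=1000.0)
print(percent)
```

Comparing the per-bin percentages of WT and mutant muscles side by side shows whether the mutant distribution is shifted toward small-caliber fibers.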
  
  In Figure 3A-B, the authors show that mutant mice have significantly more myofibers with centrally located myonuclei indicating the constant degeneration and regeneration in the mutant mice. Another indicator of this is the number of activated muscle stem cells. Under homeostasis, authors can compare the number of quiescent muscle stem cells and activated muscle stem cells. If there is constant degeneration and regeneration in the mutant muscle, there will be more cycling muscle stem cells and that will further prove such phenotype in question. Alternatively, they can use EdU water and quantify the number of EdU+/Pax7+ cells between the mutant and WT.
  
  Thank you very much for the interesting comment. We agree that the subject of muscle regeneration in Dst-b mutant mice is interesting. We tried to address this issue by making ISH probes for Pax7 and Emerin, which label muscle stem cells (image below). However, we were unable to reach a conclusion at this time. We intend to address this issue in the future.
  
  In figure 2F, the authors show behavioral tests on the mutant mice of age 1 year. They do not show any significant difference in muscle strength. However, most of the myopathic phenotypes they observe are at 23 months of age, these behavioral tests can be repeated at that age to see if there is more muscle weakness in the mutant mice compared to the WT. Also, are these behavioral test readouts affected by the cardiomyopathy independent of skeletal muscle strength?
  
  We have used rotarod test and wire hang test to evaluate motor coordination and have reported impairment of motor performance in dt mice (Horie et al., 2020). The purpose of these behavior tests in the present study was to evaluate motor coordination of Dst-b mutant mice compared to dt mice, not to address the skeletal muscle function. The text has been changed to clarify this point (lines 121-123).
  
  Generally speaking, these behavioral tests, especially the rotarod test, may be affected by cardiac abnormalities. However, it is difficult to draw conclusions from the results of this study, since there were no significant differences in the behavioral experiments.
  
  They show in Figure 3B that the number of CNFs is affected to a different extent in different muscles. These muscles have a different composition of myofibers, one consisting mostly of slow-type fibers while the other is mostly of fast-type. Whether the Dst-b mutation affects muscle fiber types is not clear. Is there a difference?
  
  Thank you very much for the insightful comment. We performed qPCR to evaluate whether Dst-b mutation affects the myofiber type of soleus muscle (Figure 1-figure supplement 3B). Expression levels of the genes did not change between WT and Dst-b mutant mice.
  
  The cardiac myopathy phenotype that is clearly shown in figure 3 is shown in mice of 16 months of age whereas the skeletal muscle myopathy phenotype is shown in 23-month-old mice. The reason for the choice of the age of the mice should be discussed. Does the cardiac phenotype precede the skeletal muscle phenotype? Have they looked at the skeletal muscle phenotype at earlier ages? If so, that data should be provided as well and discussed.
  
  Thank you for the comment. We analyzed myopathy and cardiomyopathy phenotypes in mice aged between 16 and 23 months and then chose high-quality histological photographs. As shown in Figure 3B, CNFs increased in the soleus from all Dst-b mutant mice aged between 16 and 23 months. We added a description that skeletal myopathy phenotypes occurred in 16-month-old mice.
  
  The authors clearly show the formation of protein aggregates in the myofibers in the mutant mice. They further characterize the composition of these desmin aggregates by determining their co-aggregates such as plectin and αB-crystallin. Another component of the z-disk that has been shown to be involved in the aggregates in MFM is myotilin. The authors should also show the presence/absence and co-aggregation of this protein with the desmin aggregates present in the mutant mice.
  
  In accordance with the suggestion, we performed immunohistochemistry of myotilin. Myotilin was abnormally accumulated in myofibers of the soleus from Dst-b mutant mice. We thank the reviewer for the helpful comment and have added the data in Figure 4-figure supplement 2.
  
  The authors show abnormal accumulation of mitochondria through cyt c and Tom20 staining. The increased Tom20 levels in the mutant are shown in figure 5A which is from mice that are 23-month-old. However, in figure 3-figure supplement 1a they also show elevated Tom20 staining in the mutant mice that are only 1-2 months old. However, no other phenotype is observed at this age except for the disrupted mitochondria according to the data provided. This needs to be discussed and addressed.
  
  We would like to correct that the data in figure 3-figure supplement 1a is 3-4 months old mutant mice. These data show that mitochondrial accumulation precedes CNF and desmin aggregation. We have described this point in the text (lines 206-209).
  
  In Figure 5, the authors show changes in gene expression levels of genes involved in oxidative phosphorylation which supports the disrupted mitochondrial function. Additionally, ROS levels could be compared between the WT and mutant mice.
  
  To address the involvement of oxidative stress, we performed q-PCR for genes associated with oxidative stress response in soleus (Figure 1-figure supplement 3C). qPCR data did not show alterations in such genes. In the future, we would like to investigate this point.
  
  In Figure 5 authors show disrupted oxidative phosphorylation in the mutant soleus muscle. Is this also associated with the fiber-type switch? Since mouse soleus muscle is a mix of fast and slow fiber types, they can look at differences in gene expression of key marker genes for slow and fast myofibers.
  
  Thank you very much for the suggestion. We quantified expression levels of muscle fiber-type marker genes (Figure 1-figure supplement 3B). There is no data to suggest the fiber-type switch.
  
  In figure 2, the authors show that mutant mice increase their body weight at a normal pace until 13 weeks of age after which the mutant mice become lighter than their WT counterparts. Is this suggestive of loss of muscle mass? If so, the authors show the muscle atrophy phenotype in 23-month-old mice with cross-sections. Does this mean muscle atrophy starts at an earlier age at 16 months in these mice? Please provide details on the age of the mice for each experiment. In addition, in the text (line 121) authors phrase that the mutant mice become leaner. Lean usually means a decrease in fat mass and an increase in muscle mass. Is this the case? If so, there is no data to support that and the phenotype in the mutant mice suggests there is muscle atrophy in these mice. Therefore, it would not be appropriate to suggest that these mice get lean. However, it is interesting that the bodyweight of the mutant mice gets significantly lighter after 13 weeks. EchoMRI analysis can be performed between these mice to see the total body composition to determine if there is a change in the different type of fat, lean or water composition.
  
  Thank you for your comments. We provided exact ages in each figure legend. We described that skeletal myopathy phenotypes occur as early as 16 months of age, and CSA analysis showed an increase in small-caliber myofibers in the soleus of Dst-b mutant mice. However, muscle mass of the soleus normalized by body weight was not significantly different between control and Dst-bE2610Ter/E2610Ter mice. Therefore, muscle atrophy may not be pronounced enough to affect muscle weight.
  
  Because we have not quantified the fat mass in Dst-b mutant mice, we changed the phrase from “the mutant mice become leaner” to “they become lower body weight compare with WT mice” (line 120).
  
  Authors have performed RNA-Seq for the left ventricle from the mutant and the WT mice. Separate clustering of the WT and the mutant has to be shown at least through a PCA plot. Some IGV tracks to show the expression level changes in key genes between the mutant and WT should be shown as well. In addition, they could show how some of the genes involved in autophagy and protein degradation are affected since these are mainly the mechanism by which there is protein aggregation in MFM's.
  
  Thank you for your helpful comment. We performed principal component analysis (PCA) and hierarchical clustering. The data showed that transcriptomic features of WT and Dst-b mutant hearts are separated (Figure 8-figure supplement 1A, B). To evaluate the change in expression level of genes, we also performed real time-PCR (Figure 8-figure supplement 1C). Our Gene ontology analysis and KEGG pathway analysis on RNA-seq data in the heart did not suggest alterations in autophagy and protein degradation, while many genes responsible for the unfolded protein response were affected (Figure 8C, Figure 8-figure supplement 1C). Previous studies have reported that the unfolded protein response is abnormal in several animal models for myofibrillar myopathy (Winter et al., 2014; Fang et al., J Clin Invest, 2017). We would like to investigate underlying mechanisms of protein aggregates in Dst-b mutant myofibers in the future.
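For readers unfamiliar with how a PCA plot shows group separation, the following is a minimal sketch using a singular value decomposition. It is not the authors' pipeline: the function name, the samples-by-genes layout, and the toy expression values are assumptions made for illustration.

```python
import numpy as np

def pca_scores(expr, n_components=2):
    """Project samples onto the leading principal components.

    expr: (n_samples, n_genes) array, e.g. log-transformed expression values.
    Returns the sample scores and the fraction of variance explained per component.
    """
    expr = np.asarray(expr, dtype=float)
    centered = expr - expr.mean(axis=0)          # center each gene across samples
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

# Toy data: three hypothetical WT samples followed by three hypothetical mutant samples.
wt = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.1], [0.0, 0.1, 0.0]])
mutant = wt + np.array([5.0, 5.0, 0.0])
scores, explained = pca_scores(np.vstack([wt, mutant]))
print(scores[:, 0], explained)
```

When genotype is the dominant source of variation, the two groups take opposite signs on the first component, which is the kind of separation a PCA plot makes visible.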
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.17.484743v1
www.biorxiv.org www.biorxiv.org

Error Prediction Determines the Coordinate System Used for the Representation of Novel Dynamics

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author response:
  
  Reviewer #1 (Public Review):
  
  This paper proposes a novel framework for explaining patterns of generalization of force field learning to novel limb configurations. The paper considers three potential coordinate systems: Cartesian, joint-based, and object-based. The authors propose a model in which the forces predicted under these different coordinate frames are combined according to the expected variability of produced forces. The authors show, across a range of changes in arm configurations, that the generalization of a specific force field is quite well accounted for by the model.
  
  The paper is well-written and the experimental data are very clear. The patterns of generalization exhibited by participants - the key aspect of the behavior that the model seeks to explain - are clear and consistent across participants. The paper clearly illustrates the importance of considering multiple coordinate frames for generalization, building on previous work by Berniker and colleagues (JNeurophys, 2014). The specific model proposed in this paper is parsimonious, but there remain a number of questions about its conceptual premises and the extent to which its predictions improve upon alternative models.
  
  A major concern is with the model's premise. It is loosely inspired by cue integration theory but is really proposed in a fairly ad hoc manner, and not really concretely founded on firm underlying principles. It's by no means clear that the logic from cue integration can be extrapolated to the case of combining different possible patterns of generalization. I think there may in fact be a fundamental problem in treating this control problem as a cue-integration problem. In classic cue integration theory, the various cues are assumed to be independent observations of a single underlying variable. In this generalization setting, however, the different generalization patterns are NOT independent; if one is true, then the others must inevitably not be. For this reason, I don't believe that the proposed model can really be thought of as a normative or rational model (hence why I describe it as 'ad hoc'). That's not to say it may not ultimately be correct, but I think the conceptual justification for the model needs to be laid out much more clearly, rather than simply by alluding to cue-integration theory and using terms like 'reliability' throughout.
  
  We thank the reviewer for bringing up this point. We see and treat this problem of finding the combination weights not as a cue integration problem but as an inverse optimal control problem. In this case, there can be several solutions to the same problem, i.e., what forces are expected in untrained areas, which can co-exist and give the motor system the option to switch or combine them. This is similar to other inverse optimal control problems, e.g. combining feedforward optimal control models to explain simple reaching. However, compared to these problems, which fit the weights between different models, we proposed an explanation for the underlying principle that sets these weights for the dynamics representation problem. We found that basing the combination on each motor plan's reliability can best explain the results. In this case, we refer to ‘reliability’ as execution reliability and not sensory reliability, which is common in cue integration theory. We have added further details explaining this in the manuscript.
  
  “We hypothesize that this inconsistency in results can be explained using a framework inspired by an inverse optimal control framework. In this framework the motor system can switch or combine between different solutions. That is, the motor system assigns different weights to each solution and calculates a weighted sum of these solutions. Usually, to support such a framework, previous studies found the weights by fitting the weighted sum solution to behavioral data (Berret, Chiovetto et al. 2011). While we treat the problem in the same manner, we propose the Reliable Dynamics Representation (Re-Dyn) mechanism that determines the weights instead of fitting them. According to our framework, the weights are calculated by considering the reliability of each representation during dynamic generalization. That is, the motor system prefers certain representations if the execution of forces based on this representation is more robust to distortion arising from neural noise. In this process, the motor system estimates the difference between the desired generalized forces and generated generalized forces while taking into consideration noise added to the state variables that equivalently define the forces.”
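The idea of a weighted sum whose weights come from reliability rather than from fitting can be sketched as follows. This is not the authors' Re-Dyn computation, which the response does not specify in detail. It assumes, hypothetically, that each representation has an expected squared force error under noise and that weights are taken as the normalized inverse of that error; the function names and example numbers are illustrative only.

```python
import numpy as np

def reliability_weights(expected_sq_error):
    """Weights inversely proportional to each representation's expected squared error.

    Assumption for illustration: lower noise-induced error means higher reliability.
    The returned weights sum to 1.
    """
    inverse_error = 1.0 / np.asarray(expected_sq_error, dtype=float)
    return inverse_error / inverse_error.sum()

def combine_predictions(predicted_forces, weights):
    """Weighted sum of the forces predicted by each coordinate system.

    predicted_forces: (n_representations, n_directions) array.
    """
    return np.asarray(weights, dtype=float) @ np.asarray(predicted_forces, dtype=float)

# Hypothetical errors for the Cartesian, joint, and object representations.
weights = reliability_weights([1.0, 1.0, 2.0])
combined = combine_predictions([[1.0], [2.0], [3.0]], weights)
print(weights, combined)
```

Because the weights are derived from the errors and not from the behavioral data, a scheme of this kind makes a prediction for each movement direction without any fitted parameter, which is the distinction the response draws from fitted-weight models.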
  
  A more rational model might be based on Bayesian decision theory. Under such a model, the motor system would select motor commands that minimize some expected loss, averaging over the various possible underlying 'true' coordinate systems in which to generalize. It's not entirely clear without developing the theory a bit exactly how the proposed noise-based theory might deviate from such a Bayesian model. But the paper should more clearly explain the principles/assumptions of the proposed noise-based model and should emphasize how the model parallels (or deviates from) Bayesian-decision-theory-type models.
  
  As we understand the reviewer's suggestion, the idea is to estimate the weight of each coordinate system based on minimizing a loss function that considers the cost of each weight multiplied by a posterior probability that represents the uncertainty in this weight value. While this is an interesting idea, we believe that in the current problem, there are no ‘true’ weight values. That is, the motor system can use any combination of weights which will be true due to the ambiguous nature of the environment. Since the force field was presented in one area of the entire workspace, there is no observation that will allow us to update prior beliefs regarding the force nature of the environment. In such a case, the prior beliefs might play a role in the loss function, but in our opinion, there is no clear rationale for choosing unequal priors except guessing or fitting prior probabilities, which will resemble any other previous models that used fitting rather than predictions.
  
  Another significant weakness is that it's not clear how closely the weighting of the different coordinate frames needs to match the model predictions in order to recover the observed generalization patterns. Given that the weighting for a given movement direction is over-parametrized (i.e. there are 3 variable weights (allowing for decay) predicting a single observed force level), it seems that a broad range of models could generate a reasonable prediction. It would be helpful to compare the predictions using the weighting suggested by the model with the predictions using alternative weightings, e.g. a uniform weighting, or the weighting for a different posture. In fact, Fig. 7 shows that uniform weighting accounts for the data just as well as the noise-based model in which the weighting varies substantially across directions. A more comprehensive analysis comparing the proposed noise-based weightings to alternative weightings would be helpful to more convincingly argue for the specificity of the noise-based predictions being necessary. The analysis in the appendix was not that clearly described, but seemed to compare various potential fitted mixtures of coordinate frames, but did not compare these to the noise-based model predictions.
  
  We agree with the reviewer that fitted global weights, that is, an optimal weighted average of the three coordinate systems should outperform most of the models that are based on prediction instead of fitting the data. As we showed in Figure 7 of the submitted version of the manuscript, we used the optimal fitted model to show that our noise-based model is indeed not optimal but can predict the behavioral results and not fall too short of a fitted model. When trying to fit a model across all the reported experiments, we indeed found a set of values that gives equal weights for the joints and object coordinate systems (0.27 for both), and a lower value for the Cartesian coordinate system (0.12). Considering these values, we indeed see how the reviewer can suggest a model that is based on equal weights across all coordinate systems. While this model will not perform as well as the fitted model, it can still generate satisfactory results.
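Fitting one global weight per coordinate system, as opposed to predicting direction-specific weights, amounts to a small least-squares problem. The sketch below is an assumption about how such a fit could be set up, not the authors' procedure: the function name, the column order, and the synthetic prediction matrix are hypothetical, and non-negativity of the weights is not enforced.

```python
import numpy as np

def fit_global_weights(predictions, observed):
    """Least-squares fit of one global weight per coordinate system.

    predictions: (n_directions, 3) array with hypothetical column order
                 Cartesian, joint, object.
    observed:    (n_directions,) array of measured force compensation.
    """
    weights, *_ = np.linalg.lstsq(np.asarray(predictions, dtype=float),
                                  np.asarray(observed, dtype=float),
                                  rcond=None)
    return weights

# Synthetic check: build data from known weights and recover them.
theta = np.deg2rad(np.arange(0, 360, 45))
predictions = np.column_stack([np.cos(theta), np.cos(theta - 0.5), np.cos(2 * theta)])
true_weights = np.array([0.12, 0.27, 0.27])   # values quoted in the response
observed = predictions @ true_weights
print(fit_global_weights(predictions, observed))
```

The weights need not sum to 1, which leaves room for the decay mentioned by the reviewer; the quoted values sum to 0.66.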
  
  To better understand if a model based on global weights can explain the combination between coordinate systems, we perform an additional experiment. In this experiment, a model that is based on global fitted weights can only predict one out of two possible generalization patterns while models that are based on individual direction-predicted weights can predict a variety of generalization patterns. We show that global weights, although fitted to the data, cannot explain participants' behavior. We report these new results in Appendix 2.
  
  “To better understand if a model based on global weights can explain the combination between coordinate systems, we perform an additional experiment. We used the idea of experiment 3 in which participants generalize learned dynamics using a tool. That is, the arm posture does not change between the training and test areas. In such a case, the Cartesian and joint coordinate systems do not predict a shift in generalized force pattern while the object coordinate system predicts a shift that depends on the orientation of the tool. In this additional experiment, we set a test workspace in which the orientation of the tool is 90° (Appendix 2-figure 1A). In this case, for the test workspace, the force compensation pattern of the object-based coordinate system is in anti-phase with the Cartesian/joint generalization pattern. Any globally fitted weights (including equal weights) can produce either a non-shifted or 90° shifted force compensation pattern (Appendix 2-figure 1B). Participants in this experiment (n=7) showed similar MPE reduction as in all previous experiments when adapting to the trigonometric scaled force field (Appendix 2-figure 1C). When examining the generalized force compensation patterns, we observed a shift of the pattern in the test workspace of 14.6° (Appendix 2-figure 1D). This cannot be explained by the individual coordinate system force compensation patterns or any combination of them (which will always predict either a 0° or 90° shift, Appendix 2-figure 1E). However, calculating the prediction of the Re-Dyn model we found a predicted force compensation pattern with a shift of 6.4° (Appendix 2-figure 1F). The intermediate shift in the force compensation pattern suggests that any global based weights cannot explain the results.”
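The claim that global weights can only yield a 0° or 90° shift follows from the anti-phase relation and can be checked numerically. The sketch assumes, for illustration only, that the compensation pattern is a pure cos(2θ) function of movement direction, so that a 90° shift is exactly anti-phase; the function names are hypothetical. Under that assumption the weighted sum is (w_cj − w_obj)·cos(2θ), whose shift depends only on the sign of the weight difference.

```python
import numpy as np

theta = np.deg2rad(np.arange(0, 360, 15))          # movement directions
base = np.cos(2 * theta)                           # Cartesian/joint prediction, no shift
shifted = np.cos(2 * (theta - np.pi / 2))          # object-based prediction, 90 deg shift

def pattern_shift_deg(pattern):
    """Shift in degrees, within [0, 180), of a cos(2*theta)-like pattern.

    Uses the phase of the second harmonic: pattern ~ cos(2*theta - phase).
    """
    c = np.sum(pattern * np.cos(2 * theta))
    s = np.sum(pattern * np.sin(2 * theta))
    shift = np.degrees(np.arctan2(s, c)) / 2.0
    return round(float(shift), 6) % 180.0          # rounding removes float noise at 0

def combined_shift(w_cart_joint, w_obj):
    """Shift of the globally weighted sum of the two anti-phase predictions."""
    return pattern_shift_deg(w_cart_joint * base + w_obj * shifted)

print(combined_shift(0.6, 0.3), combined_shift(0.2, 0.5))
```

Equal weights cancel the pattern entirely, so the shift is undefined in that case. An intermediate shift such as the observed 14.6° requires weights that vary with movement direction, which is what the direction-specific predictions allow.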
  
  With regard to the suggestion that weighting is changed according to arm posture, two of our results lower the possibility that posture governs the weights:
  
  (1) In experiment 3, we tested generalization while keeping the same arm posture between the training and test workspaces, and we observed different force compensation profiles across the movement directions. If arm posture in the test workspaces affected the weights, we would expect identical weights for both test workspaces. However, any set of weights that can explain the results observed for workspace 1 will fail to explain the results observed in workspace 2. To better understand this point we calculated the global weights for each test workspace for this experiment and we observed an increase in the weight for the object coordinate system (0.41 vs. 0.5) and a reduction in the weights for the Cartesian and joint coordinate systems (0.29 vs. 0.24). This suggests that the arm posture cannot explain the generalization pattern in this case.
  
  (2) In experiments 2 and 3, we used the same arm posture in the training workspace and either changed the arm posture (experiment 2) or did not change the arm posture (experiment 3) in the test workspaces. While the arm posture for the training workspace was the same, the force generalization patterns were different between the two experiments, suggesting that the arm posture during the training phase (adaptation) does not set the generalization weights.
  
  Overall, this shows that it is not specifically the arm posture in either the test or the training workspaces that set the weights. Of course, all coordinate models, including our noise model, will consider posture in the determination of the weights.
  
  Reviewer #2 (Public Review):
  
  Leib & Franklin assessed how the adaptation of intersegmental dynamics of the arm generalizes to changes in different factors: areas of extrinsic space, limb configurations, and 'object-based' coordinates. Participants reached in many different directions around 360°, adapting to velocity-dependent curl fields that varied depending on the reach angle. This learning was measured via the pattern of forces expressed upon the channel wall of "error clamps" that were randomly sampled from each of these different directions. The authors employed a clever method to predict how this pattern of forces should change if the set of targets was moved around the workspace. Some sets of locations resulted in a large change in joint angles or object-based coordinates, but Cartesian coordinates were always the same. Across three separate experiments, the observed shifts in the generalized force pattern never corresponded to a change that was made relative to any one reference frame. Instead, the authors found that the observed pattern of forces could be explained by a weighted combination of the change in Cartesian, joint, and object-based coordinates across test and training contexts.
  
  In general, I believe the authors make a good argument for this specific mixed weighting of different contexts. I have a few questions that I hope are easily addressed.
  
  Movements show different biases relative to the reach direction. Although very similar across people, this function of biases shifts when the arm is moved around the workspace (Ghilardi, Gordon, and Ghez, 1995). These biases are thought to arise from several factors that would change across the different test and training workspaces employed here (Vindras & Viviani, 2005). My concern is that the baseline biases differ between these contexts and that the observed change in the force pattern across contexts is therefore not a function of generalization but a change in underlying biases. Baseline force channel measurements were taken in the different workspace locations and conditions, so these could be used to show whether such biases are meaningfully affecting the results.
  
  We agree with the reviewer and we followed their suggested analysis. In the following figure (Author response image 1) we plotted the baseline force compensation profiles in each workspace for each of the four experiments. As can be seen in this figure, the baseline force compensation is very close to zero and differs significantly from the force compensation profiles after adaptation to the scaled force field.
  
  Author response image 1.
  
  Baseline force compensation levels for experiments 1-4. For each experiment, we plotted the force compensation for the training, test 1, and test 2 workspaces.
  
  Experiment 3, Test 1 has data that seems the worst fit with the overall story. I thought this might be an issue, but this is also the test set for a potentially awkwardly long arm. My understanding of the object-based coordinate system is that it's primarily a function of the wrist angle, or perceived angle, so I am a little confused why the length of this stick is also different across the conditions instead of just a different angle. Could the length be why this data looks a little odd?
  
  Usually, force generalization is tested by physically moving the hand in unexplored areas. In experiment 3 we tested generalization using a tool which, as far as we know, was not tested in the past in a similar way to the present experiment. Indeed, the results look odd compared to the results of the other experiments, which were based on the ‘classic’ generalization idea. While we have some ideas regarding possible reasons for the observed behavior, it is out of the scope of the current work and still needs further examination.
  
  Based on the reviewer’s comment, we improved the explanation in the introduction regarding the idea behind the object-based coordinate system:
  
  “we could represent the forces as belonging to the hand or a hand-held object using the orientation vector connecting the shoulder and the object or hand in space (Berniker, Franklin et al. 2014).” The reviewer is right in their observation that the predictions of the object-based reference frame will look the same if we change the length of the tool. The object-based generalized forces, specifically the shift in the force pattern, depend only on the object's orientation but not its length (equation 4).
  
  The manuscript is written and organized in a way that focuses heavily on the noise element of the model. Other than it being reasonable to add noise to a model, it's not clear to me that the noise is adding anything specific. It seems like the model makes predictions based on how many specific components have been rotated in the different test conditions. I fear I'm just being dense, but it would be helpful to clarify whether the noise itself (and inverse variance estimation) are critical to why the model weights each reference frame how it does or whether this is just a method for scaling the weight by how much the joints or whatever have changed. It seems clear that this noise model is better than weighting by energy and smoothness.
  
  We have now included further details of the noise model and added to Figure 1 to highlight how noise can affect the predicted weights. In short, we agree with the reviewer that there are multiple ways to add noise to the generalized force patterns. We chose a simple option in which we simulate possible distortions to the state variables that set the direction of movement. Once we have calculated the variance of the force profile due to this distortion, one possible way to combine the force profiles is to use an inverse variance estimator. Note that it has been shown that an inverse variance estimator is an ideal way to combine signals (e.g., Shahar, D.J. (2017) https://doi.org/10.4236/ojs.2017.72017). However, we do not claim or try to provide evidence for this specific way of calculating the weights. Instead, we suggest that giving greater weight to the less variable force representation can predict both the current experimental results and past results.
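
  The inverse variance idea mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single scalar variance per coordinate system (the actual model may use direction-dependent variances), and the function names, variances, and force profiles are hypothetical values chosen only for illustration.

```python
def inverse_variance_weights(variances):
    """Return weights proportional to 1/variance, normalized to sum to 1."""
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    return [p / total for p in precisions]


def combine_profiles(profiles, variances):
    """Weighted sum of per-coordinate-system force profiles.

    profiles: one list of force values per coordinate system,
              all of the same length (one value per movement direction).
    """
    weights = inverse_variance_weights(variances)
    n_directions = len(profiles[0])
    return [
        sum(w * profile[i] for w, profile in zip(weights, profiles))
        for i in range(n_directions)
    ]


# Hypothetical variances for the Cartesian, joint, and object frames.
example_variances = [0.04, 0.08, 0.08]
example_weights = inverse_variance_weights(example_variances)

# Hypothetical force profiles over two movement directions.
example_profiles = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
example_combined = combine_profiles(example_profiles, example_variances)
```

  In this toy case the least variable representation receives the largest weight, which is the qualitative behavior the response describes.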
  
  Are there any force profiles for individual directions that are predicted to change shape substantially across some of these assorted changes in training and test locations (rather than merely being scaled)? If so, this might provide another test of the hypotheses.
  
  In experiments 1-3, in which there is a large shift of the force compensation curve, we found directions in which the generalized force was flipped in direction. That is, clockwise force profiles in the training workspace could change into counterclockwise profiles in the test workspace. For example, in experiment 2, for movement at 157.5° we can see that the force profile was clockwise for the training workspace (with a force compensation value of 0.43), whereas movement in the same direction was counterclockwise for test workspace 1 (force compensation equal to -0.48). Importantly, we found that the noise-based model could predict this change.
  
  Author response image 2.
  
  Results of experiment 2. Force compensation profiles for the training workspace (grey solid line) and test workspace 1 (dark blue solid line). For the 157.5° direction, the force applied by the participants changed from clockwise to counterclockwise. This was supported by a change in force compensation value (0.43 vs. -0.48). The noise-based model can predict this change, as shown by the predicted force compensation profile (green dashed line).
  
  I don't believe the decay factor that was used to scale the test functions was specified in the text, although I may have just missed this. It would be a good idea to state what this factor is where relevant in the text.
  
  We added an equation describing the decay factor (new equation 7 in the Methods section) according to this suggestion and Reviewer 1's comment on the same issue.
  
  Reviewer #3 (Public Review):
  
  The author proposed the minimum variance principle in the memory representation in addition to two alternative theories of the minimum energy and the maximum smoothness. The strength of this paper is the matching between the prediction data computed from the explicit equation and the behavioral data taken in different conditions. The idea of the weighting of multiple coordinate systems is novel and is also able to reconcile a debate in previous literature.
  
  The weakness is that although each model is based on an optimization principle, the derivation process is not written in the method section. The authors did not write about how they can derive these weighting factors from these computational principles. Thus, it is not clear whether these weighting factors are relevant to these theories or just hacking methods. Suppose the author argues that this is the result of the minimum variance principle. In that case, the authors should show a process of how to derive these weighting factors as a result of the optimization process to minimize these cost functions.
  
  The reviewer brings up a very important point regarding the model. As shown below, it is not trivial to derive these weights using an analytical optimization process. We demonstrate one issue with this optimization process.
  
  The force representation can be written as (similar to equation 6):
  
  We formulated the problem as minimizing the variance of the force according to the weights w:
  
  In this case, the variance of the force is the variance-covariance matrix which can be minimized by minimizing the matrix trace:
  
  We will start by calculating the variance of the force representation in the joint coordinate system:
  
  Here, the force variance is the result of a complex function that includes the joint angles as random variables. Expanding the last expression, although very complex, is still possible. In the resulting expression, some of the terms require calculating the variance of nested trigonometric functions of the random joint angle variable, for example:
  
  In the vast majority of these cases, analytical solutions do not exist. Similar issues can also arise when calculating the variance of complex products of trigonometric functions, such as in the multiplication of Jacobians (and inverse Jacobians).
  
  To overcome this problem, we turned to numerical solutions which simulate the variance due to the different state variables.
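
  To illustrate what such a numerical approach can look like, the following is a minimal Monte Carlo sketch. It does not reproduce the authors' simulation, which propagates noise through the arm kinematics; instead it assumes Gaussian noise on a single angle and uses a toy nested trigonometric function, sin(cos(theta)), as a stand-in. The function names, noise level, and sample count are all hypothetical.

```python
import math
import random
import statistics


def simulated_variance(func, mean_angle, angle_sd, n_samples=20000, seed=0):
    """Estimate Var[func(theta)] for theta ~ Normal(mean_angle, angle_sd)
    by drawing samples and computing the population variance."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    samples = [func(rng.gauss(mean_angle, angle_sd)) for _ in range(n_samples)]
    return statistics.pvariance(samples)


def nested_trig(theta):
    """Toy nested trigonometric function (an assumption for illustration)."""
    return math.sin(math.cos(theta))


# Variance of the toy function for a hypothetical 0.05 rad angular noise.
estimate = simulated_variance(nested_trig, mean_angle=0.5, angle_sd=0.05)
```

  For small noise the estimate can be checked against the first-order (delta method) approximation, Var[f(theta)] ≈ f'(theta0)^2 · sd^2, which is one way to sanity-check a simulation of this kind before applying it to functions with no closed-form variance.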
  
  In addition, I am concerned that the proposed model can cancel the property of the coordinate system by the predicted variance, and it can work for any coordinate system, even one that is not used in the human brain. When the applied force is given in Cartesian coordinates, the directionality in the generalization ability of the memory of the force field is characterized by the kinematic relationship (Jacobian) between the Cartesian coordinate and the coordinate of interest (Cartesian, joint, and object) as shown in Equation 3. At the same time, when a displacement (epsilon) is considered in a space and a corresponding displacement is linked with kinematic equations (e.g., joint displacement and hand displacement in 2 joint arms in this paper), the generated variances in different coordinate systems are linked to each other by the kinematic equation (Jacobian). Thus, how a small noise in a certain coordinate system generates the hand force noise (sigma_x, sigma_j, sigma_o) is also characterized by the kinematics (Jacobian). Thus, when the predicted force field (F_c, F_j, F_o) is divided by the variance (F_c/sigma_c^2, F_j/sigma_j^2, F_o/sigma_o^2), the directionality of the generalization force, which is characterized by the Jacobian, is canceled by the directionality of the sigmas, which is also characterized by the Jacobian. Thus, as can be read from Fig*D and E top, the weight in E-top of each coordinate system is always the inverse of the shift of force from the test force, by which the directionality of the generalization is always canceled.
  
  Once this directionality is canceled, no matter how to compute the weighted sum, it can replicate the memorized force. Thus, this model always works to replicate the test force no matter which coordinate system is assumed. Thus, I am suspicious of the falsifiability of this computational model. This model is always true no matter which coordinate system is assumed. Even though they use, for instance, the robot coordinate system, which is directly linked to the participant's hand with the kinematic equation (Jacobian), they can replicate this result. But in this case, the model would be nonsense. The falsifiability of this model was not explicitly written.
  
  As explained above, calculating the variability of the generalized forces given the random nature of the state variable is a complex function that is not summarized using a Jacobian. Importantly, the model is unable to reproduce or replicate the test force arbitrarily. In fact, we have already shown this (see Appendix 1- figure 1): when we attempt to explain the data with only a single coordinate system (or a combination of two coordinate systems), we are completely unable to replicate the test data despite using this model. For example, in experiment 4, when we don’t use the joint-based coordinate system, the model predicts zero shift of the force compensation pattern while the behavioral data show a shift due to the contribution of the joint coordinate system. Any arbitrary model (similar to the random model we tested, please see the response to Reviewer 1) would be completely unable to recreate the test data. Our model instead makes very specific predictions about the weighting between the three coordinate systems and therefore completely specified force predictions for every possible test posture. We added this point to the Discussion:
  
  “The results we present here support the idea that the motor system can use multiple representations during adaptation to novel dynamics. Specifically, we suggested that we combine three types of coordinate systems, where each is independent of the other (see Appendix 1- figure 1 for comparison with other combinations). Other combinations that include a single or two coordinate system can explain some of the results but not all of them, suggesting that force representation relies on all three with specific weights that change between generalization scenarios.”
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.01.04.425325v2
www.biorxiv.org

Mitochondrial genome sequencing of marine leukemias reveals cancer contagion between clam species in the Seas of Southern Europe

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  Garcia-Souto, Bruzos, and Diaz et al. analyzed hemic neoplasia in warty venus clams at multiple sites throughout Europe. They identified cases of disease in two locations, in Galicia and in the Mediterranean. They then use Illumina sequencing to discover that the samples with cancer DNA had reads which mapped to the mtDNA reference sequences from a different clam species in the same family, suggesting a cross-species transmissible cancer. By mapping reads to both the V. verrucosa and C. gallina mitogenomes they showed that more reads mapped to C. gallina in cancer samples compared to matched host tissue samples, and this was consistent across the whole mitogenome. Phylogenetic analysis of mtDNA genes of the host and cancer samples as well as identification of SNVs at a short region of one single-copy nuclear locus suggest that all cancer samples come from a single C. gallina transmissible cancer clone. All data agree that a single lineage of cancer from C. gallina is responsible for all identified cancers in V. verrucosa.
  
  There are a few sections where there are either unclear methods or the methods do not quite match the descriptions of the results. 1. Regarding mapping of reads to different reference Cox1 sequences (for Figure 2a): "Then, we mapped the paired-end reads onto a dataset containing non-redundant mitochondrial Cytochrome C Oxidase subunit 1 (Cox1) gene references from 137 Venerid clam species." I do not see where this is explained anywhere in the methods, where this list of references comes from, or what is in it.
  
  Answer: We retrieved a dataset of 3,745 sequences comprising all the barcode-identified venerid clam Cox1 fragments available from the Barcode of Life Data System (BOLD, http://www.boldsystemns.org/). Redundancy was removed using CD-HIT (Fu, et al. 2012), applying a cut-off of 0.9 sequence identity, and sequences were trimmed to cover the same region. Whole-genome sequencing data from both healthy and tumoral warty venus clams was mapped onto this dataset, containing 118 venerid species-unique sequences, using BWA-mem, filtering out reads with mapping quality below 60 (-q60) and quantifying the overall coverage for each sequence with samtools idxstats. PCR primers were designed with Primer3 v2.3.7 (Koressaar, et al. 2018) to amplify a fragment of 354 bp from the Cox1 mitochondrial gene of V. verrucosa and C. gallina (F: CCT ATA ATA ATT GGK GGA TTT GG, R: CCT ATA ATA ATT GGK GGA TTT GG). PCR products were purified with ExoSAP-IT and sequenced by Sanger sequencing.
  
  Action: We have included this new information in the methods section.
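
  As an illustration of the final quantification step described in the answer above, the sketch below parses the tab-separated output of samtools idxstats (reference name, sequence length, mapped reads, unmapped reads) and ranks the references by mapped read count. It is a minimal sketch rather than the authors' script; the reference names and counts in the example are hypothetical, and it assumes the low-quality reads were already filtered out before idxstats was run.

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` output into {reference: (length, mapped_reads)}.

    Each line has four tab-separated fields: reference name, sequence
    length, number of mapped reads, number of unmapped reads. The final
    '*' line (reads with no reference) is skipped.
    """
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, length, mapped, _unmapped = line.split("\t")
        if name == "*":
            continue
        counts[name] = (int(length), int(mapped))
    return counts


def rank_references(counts):
    """Return reference names sorted by mapped read count, highest first."""
    return sorted(counts, key=lambda name: counts[name][1], reverse=True)


# Hypothetical idxstats output for three Cox1 references.
example_idxstats = "\n".join([
    "\t".join(["cox1_species_A", "354", "1200", "0"]),
    "\t".join(["cox1_species_B", "354", "310", "0"]),
    "\t".join(["cox1_species_C", "354", "0", "0"]),
    "\t".join(["*", "0", "0", "42"]),
])
example_counts = parse_idxstats(example_idxstats)
example_ranking = rank_references(example_counts)
```

  Because the references were trimmed to cover the same region, raw mapped read counts are directly comparable here; references of unequal length would need length normalization first.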
  
  Regarding de novo assembly of mitogenomes: "Hence, we employed bioinformatic tools to reconstruct the full mitochondrial DNA (mtDNA) genomes in representative animals from the two species involved....Then, we mapped the paired-end sequencing data from the six neoplastic specimens with evidence of interspecies cancer transmission onto the two reconstructed species-specific mtDNA genomes." In contrast to this, the methods say, "Then, we run MITObim v1.9.1 (Hahn, Bachmann, & Chevreux, 2013) to assemble the full mitochondrial genome of all sequenced samples, using gene baits from the following Cox1 and 16S reference genes to prime the assembly of clam mitochondrial genomes." It is unclear which method was used.
  
  Answer: In total, we performed whole-genome sequencing on 23 samples from 16 clam specimens, which include eight neoplastic and eight non-neoplastic animals, using Illumina paired-end libraries with a 350 bp insert size and 150 bp reads. First, we assembled the mitochondrial genomes of one V. verrucosa (FGVV18_193), one C. gallina (ECCG15_201) and one C. striatula (EVCS14_02) specimen with MITObim v1.9.1 (Hahn, et al. 2013), using gene baits from the following Cox1 and 16S reference genes to prime the assembly of clam mitochondrial genomes: V. verrucosa (Cox1, with GenBank accession number KC429139; and 16S: C429301), C. gallina (Cox1: KY547757, 16S: KY547777) and C. striatula (Cox1: KY547747, 16S: KY547767). These draft sequences were polished twice with Pilon v1.23 (Walker, et al. 2014), and conflictive repetitive fragments from the mitochondrial control region were resolved using long-read sequencing with Oxford Nanopore Technologies (ONT) on a set of representative samples from each species and tumours. ONT reads were assembled with Miniasm v0.3 (Li 2016) and corrected using Racon v1.3.1 (Vaser, et al. 2017). Protein-coding genes, rDNAs and tRNAs were annotated on the curated mitochondrial genomes using the MITOS2 web server (Bernt, et al. 2013), and manually curated to fit ORFs as predicted by ORF-FINDER (Rombel, et al. 2002). Then, we employed the entire mitochondrial DNAs of V. verrucosa (FGVV18_193) and C. gallina (ECCG15_201) as “references” to map reads from individuals with neoplasia, filter reads matching either mitogenome and assemble and polish their two (healthy and tumoral) mitogenomes individually as above. Further healthy individuals were later sequenced and their mitogenomes assembled, to further investigate the geographic and taxonomic spread of this neoplasia.
  
  Action: We have included this information in the methods section (page 21-22), and in the results (pages 7 and 8). mtDNA annotations are now shown in Supplementary Figure 3. Nucleotide data for the mitochondrial DNA assemblies has been uploaded to GenBank under accession numbers MW662590-MW662611 and will be released upon publication or request.
  
  There is one minor claim which may not be fully supported by the data: the statement that, "The analysis of mitochondrial and nuclear gene sequences revealed no nucleotide divergence between the seven tumours sequenced." If I am understanding the filtering of the SNVs from the nuclear gene correctly, only the presence or absence of the 14 SNVs that were fixed within each of the two species were analyzed. Therefore, it is unclear whether the authors looked for any additional somatic mutations within the cancer lineage that would have occurred at other positions. For mitochondria, the authors state that sequences were "extracted from paired-end sequencing data," but it is not explained how this was done. The data suggest that there are no differences between cancer samples in the 13 coding genes and 2 rDNA genes, but data on possible SNVs in the intergenic regions is not shown.
  
  Answer: We obtained a preliminary nuclear assembly using short reads only. The resulting assemblies are therefore fragmented and incomplete, which limited the identification of candidate regions shared by the three genomes (V. verrucosa and both Chamelea clams). Out of the 44 candidate nuclear fragments we tested, only two (DEAH12 and TFIIH) gave good PCR products, adequate for Sanger sequencing. As mentioned above, we now provide additional data on a second gene (TFIIH), identified and selected on the same basis as DEAH12. We find 14 and 15 sites, respectively, for the DEAH12 and the TFIIH loci, with fixed SNVs (allele frequency >95%) that allowed us to discriminate between the three relevant species (V. verrucosa, C. gallina and C. striatula) and the tumour. These diagnostic nucleotides were then used to filter the reads from individuals with neoplasia harbouring both DNAs. Variation within the host lineage, but not within the tumour, was found along the nuclear DNA fragments employed in the ML phylogenies (see figure below).
  
  Figure. Molecular phylogenies based on the two selected nuclear markers. (a) DEAH12 gene and (b) TFIIH gene, and diagnostic loci discriminating among species and tumour. Bootstrap support values (500 replicates) from ML analyses above 50 are shown above the corresponding branches. Note all diagnostic nucleotides are identical between tumours (black dots).
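
  The notion of fixed, species-discriminating SNVs used above can be sketched as follows. This is an illustrative reading of the criterion, not the authors' pipeline: it assumes per-site allele frequencies are already available for each species, treats a site as fixed when the major allele frequency exceeds 0.95, and calls it diagnostic when the fixed alleles are not all identical across species. The function name, data layout, and example frequencies are hypothetical.

```python
def diagnostic_sites(allele_freqs, threshold=0.95):
    """Return indices of sites that are fixed in every species and whose
    fixed alleles differ between at least two species.

    allele_freqs: {species: [ {base: frequency, ...} for each site ]}
    """
    species = list(allele_freqs)
    n_sites = len(allele_freqs[species[0]])
    sites = []
    for i in range(n_sites):
        fixed = {}
        for sp in species:
            # Major allele at this site and its frequency.
            base, freq = max(allele_freqs[sp][i].items(), key=lambda kv: kv[1])
            if freq > threshold:
                fixed[sp] = base
        all_fixed = len(fixed) == len(species)
        if all_fixed and len(set(fixed.values())) > 1:
            sites.append(i)
    return sites


# Hypothetical allele frequencies at three sites for two species.
example_freqs = {
    "species_A": [{"A": 1.0}, {"C": 0.97, "T": 0.03}, {"G": 0.60, "A": 0.40}],
    "species_B": [{"A": 1.0}, {"T": 0.99, "C": 0.01}, {"A": 1.0}],
}
example_sites = diagnostic_sites(example_freqs)
```

  Only the second site qualifies in this toy input: the first is fixed for the same allele in both species, and the third is polymorphic in one of them.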
  
  Regarding the mtDNA, we first assembled the mitochondrial genomes of one V. verrucosa (FGVV18_193), one C. gallina (ECCG15_201) and one C. striatula (EVCS14_02) specimen with MITObim v1.9.1 (Hahn, et al. 2013). Then, we employed the entire mitochondrial DNAs from V. verrucosa (FGVV18_193) and C. gallina (ECCG15_201) as “references” to map reads from individuals with neoplasia, filter reads matching either mitogenome and assemble and polish their two (healthy and tumoral) mitogenomes individually as above. Further healthy individuals were later sequenced and their mitogenomes assembled, to further investigate the geographic and taxonomic spread of this neoplasia. Despite the usefulness of the mitochondrial control region (CR) to detect differences among lineages, we refrained from using it for two reasons. (1) The CR shows considerable variation in both length and sequence among the three species, making their alignment difficult (in fact, previous phylogenetic studies based on whole mitochondrial DNA sequences in Veneridae excluded the CR: https://doi.org/10.1111/zsc.12454), and (2) the CR contains quasi-but-not-identical tandem repeats, as in other mollusks (i.e., the Venerid Dosinia clams https://doi.org/10.1371/journal.pone.0196466 or the Littorina marine snails https://doi.org/10.1016/j.margen.2016.10.006). In our case, repeats are larger than the short-read insert size, and even though we could infer them by means of long-read sequencing, polishing the resulting consensus sequences to overcome the intrinsic error rate of those reads would yield inconclusive results, hindering the comparison between normal and tumoral haplotypes.
  
  Action: We updated the methods for the mitochondrial DNA analyses (pages 21-22, 24) and the nuclear DNA analyses (page 23). We now include new data in the results and discussion (pages 9-10).
  
  Reviewer #2 (Public Review):
  
  In rare but well-documented instances, certain types of cancers can transmit horizontally. These transmissible cancers have a clonal origin and have adapted to bypass allorecognition. A form of marine leukemia (hemic neoplasia or HM) belongs to this class of transmissible cancers and has been detected in several bivalve species (oysters, mussels, cockles and clams). Although HM mostly propagates within the same bivalve species, instances of cross-species transmission have been reported. To better understand the mode of transmission of HM, Garcia-Souto et al. analysed mitochondrial DNA (mtDNA) by next generation sequencing in different bivalve species collected in the Mediterranean Sea and the Atlantic Ocean. The authors found that HM isolated in Venus verrucosa contained mtDNA that actually matched Chamelea gallina. Analysis of the nuclear gene DEAH12 also showed single nucleotide polymorphisms (SNPs) matching C. gallina DNA. Based on mtDNA and DEAH12 sequences, the authors use Bayesian inference to generate phylogenetic trees showing that HM found in V. verrucosa is much closer to C. gallina than the host species. They conclude that HM propagated from C. gallina to V. verrucosa.
  
  Overall, the study is well performed with enough samples analysed. The results are quite convincing but there are also some concerns.
  
  Transmissible cancers are known to split into clades based on mtDNA differential rate of evolution and also to incorporate mtDNA from exogenous sources, so one has to be extra careful that the results prove cross-species transmission and not HM divergence into two clades and/or exogenous acquisition. Samples HM ERVV17-2997 and EMVV18-376, both at the N1 stage, appear devoid of C. gallina mtDNA and do not appear to have been screened for DEAH12. One explanation for this result is that there are too few HM cells in the samples (but supplementary Figure 1 shows some HM cells in ERVV17-2997). However, a different explanation is that these samples contain V. verrucosa mtDNA. ERVV17-2997 and EMVV18-376 could have been analysed in greater depth to verify that they also contained C. gallina mtDNA and typical DEAH12 SNPs.
  
  Answer: Despite the high sequencing coverage obtained for the sequenced individuals, we did not find foreign reads in the N1 tumours (ERVV17-2997 and EMVV18-73) at either the mitochondrial or the nuclear (i.e., DEAH12, TFIIH) level. This is most likely due to a very low proportion of neoplastic cells in their tissues.
  
  Action: We have added a sentence on page 8 that discusses this issue.
  
  To strengthen their argument, the authors could have analysed a few more nuclear genes for specific SNPs, although the sensitivity of this approach will depend on the depth of sequencing.
  
  Answer: We obtained a preliminary nuclear assembly using short reads only. The resulting assemblies are therefore fragmented and incomplete, which limited the identification of candidate regions shared by the three genomes (V. verrucosa and both Chamelea clams). Out of the 44 candidate nuclear fragments we tested, only two (DEAH12 and TFIIH) gave good PCR products, adequate for Sanger sequencing. As mentioned above, we now provide additional data on a second gene (TFIIH), identified and selected on the same basis as DEAH12. Individual ML phylogenies for these two fragments showed that tumours cluster together and separately from the host species and, in the case of DEAH12, closer to C. gallina. The MSC phylogeny was rebuilt including this new nuclear fragment. In addition, we conducted a comparative screening of tandem repeats on the genomes of C. gallina and V. verrucosa. Two DNA satellites, namely CL4 and CL17, of, respectively, 332 and 429 bp monomer size, were very abundant in C. gallina and in the tumoral animals, but absent from all healthy V. verrucosa specimens. FISH probes designed for these satellites mapped on the heterochromatic regions, mainly in subcentromeric and subtelomeric positions, of both C. gallina and the neoplastic metaphases found in V. verrucosa, but were absent from the normal metaphases of the host species V. verrucosa. These results were consistent with the genomic abundance of these satellites in the NGS data and strongly suggest that these chromosomes derive from C. gallina.
  
  Action: We include the analysis of one additional nuclear locus, TFIIH (pages 9-10). We have obtained new ML and MSC phylogenies including this new locus (pages 9-10, figures 3b-c). An additional FISH analysis looking for satellite DNA CL4 and CL11 was performed (page 10, figure 3d, supplementary figure 5). The methods section has been updated accordingly (pages 20-21, 23-24).
  
  It would have been interesting to have more information in the Discussion on the potential immunological barriers that this tumour needs to overcome for cross-species transmission.
  
  Answer: At a glance, we could argue that this transmissibility, within or across species, is prone to occur in bivalves due to their filter-feeding system and the fact that their immune system is not entirely developed and is yet to be completely understood, as the reviewer may know. It would also be tempting to suggest that some genetic restrictions might be in place that allow cancer contagion only between close taxa, but, unfortunately, we do not have the means to state that with our current data.
  
  Action: At this point, no specific action has been taken for this query. However, we are happy to include something in the discussion if the reviewer still thinks this is relevant for improving the manuscript.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.10.434714v1
www.biorxiv.org

Efficient differentiation of human primordial germ cells through geometric control reveals a key role for NODAL signaling

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  Jo et al. use a combination of micropatterned differentiation, single cell RNA sequencing and pharmacological treatments to study primordial germ cell (PGC) differentiation starting from human pluripotent stem cells. Geometrical confinement in conjunction with a pre-differentiation step allowed the authors to reach remarkable differentiation efficiencies. While Minn et al. already reported the presence of PGC-like cells in micropatterned differentiating human cultures by scRNA-Seq (as acknowledged by the authors), the careful characterization of the PGC-like population using immunostainings and scRNA-Seq is a strength of the manuscript. The attempt at mechanistically dissecting the signaling pathways required for PGC fate specification is somewhat weaker. The authors do not present sufficient evidence supporting the ability to specify PGC fate in the absence of Wnt signaling and the importance of the relative signaling levels of BMP to Nodal pathways; the wording of the text should be amended to better reflect the presented evidence or the authors should perform additional experiments to support these claims.
  
  We thank the reviewer for this comment. As described in more detail in the responses below, we have significantly strengthened the evidence for the rescue of Wnt inhibition by exogenous Activin treatment and have nuanced our interpretation. We believe that our data suggest low levels of Wnt may be required directly for PGC competence, while much higher levels are required indirectly to induce Nodal, with Nodal signaling being the limiting factor for PGC specification under the reference condition with BMP4 treatment only. We describe this in detail in the manuscript but summarize it here in a simplified diagram:
  
  We have also carried out additional experiments that match model predictions demonstrating the importance of relative BMP and Nodal signaling levels and amended the text to reflect the evidence as suggested. More details are provided below.
  
  The molecular characterization of why colonies confined to small areas differentiate much better would greatly increase the biological significance of the manuscript (the technical achievement of reaching such efficiency is impressive on its own).
  
  We believe the mechanism by which cells confined to small colonies differentiate to PGCLCs more efficiently is explained by a larger fraction of the cells being exposed to the necessary levels of BMP and Nodal signaling. In large colonies BMP signaling was shown to be restricted to a distance of 50-100 µm from the colony edge through receptor localization and secretion of inhibitors (Etoc et al., Dev Cell 2016). From this one would expect that BMP signaling extends a similar distance from the edge in small colonies, so that a larger fraction of cells are receiving the BMP signal needed to differentiate to PGCLCs. Because it was not previously shown that the length scale of BMP signaling and downstream signals are preserved as colony size is reduced, we have now included an analysis of BMP signaling (pSmad1 levels) and Nodal signaling (nuclear Smad2/3 levels) as a function of colony size (Figure 5i-k). This confirms our hypothesis and provides a potential mechanism.
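
  The geometric argument above can be made concrete with a short worked example. The sketch assumes an idealized circular colony with uniform cell density and a signaling zone of fixed depth measured inward from the edge; the fraction of cells in that zone is then 1 - ((R - d)/R)^2 for colony radius R and depth d. The function name and the radii used in the example are hypothetical and are not taken from the manuscript; only the 50-100 µm depth range comes from the text.

```python
def edge_fraction(radius_um, signaling_depth_um):
    """Fraction of a circular colony's area lying within
    `signaling_depth_um` of its edge (uniform cell density assumed)."""
    if radius_um <= 0:
        raise ValueError("radius must be positive")
    if signaling_depth_um <= 0:
        return 0.0
    if signaling_depth_um >= radius_um:
        return 1.0  # the signaling zone covers the whole colony
    inner_radius = radius_um - signaling_depth_um
    return 1.0 - (inner_radius / radius_um) ** 2


# Hypothetical colony radii with a 75 µm signaling depth
# (the midpoint of the 50-100 µm range quoted above).
large_colony = edge_fraction(radius_um=500, signaling_depth_um=75)
small_colony = edge_fraction(radius_um=100, signaling_depth_um=75)
```

  Under these assumptions roughly 28% of a 500 µm radius colony lies in the signaling zone, compared with about 94% of a 100 µm radius colony, which is the direction of the effect described in the response.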
  
  The authors propose a mathematical model based on BMP and Nodal signaling that qualitatively recapitulates their experimental data. While the authors should be commended for providing examples of other simple models that do not fully recapitulate their data, it would have been nice to see an attempt at challenging the model quantitatively. In particular, the authors do not take advantage of the ability to explore the BMP/Nodal phase space more systematically with their system.
  
  We thank the reviewer for this suggestion. Experimentally we have now tested the effect of 5 × 5 = 25 different combinations of BMP and Activin doses on PGCLC differentiation. We then challenged the mathematical model to predict the ‘phase diagram’ corresponding to these data, with good agreement (Figure 6f). It is important to note here that the model was fit using only data with 50 ng/ml of BMP, making this a true prediction. We also point out that the phase diagram predicted in this way is different from the one shown in Figure 6d, not only because of the lower resolution, but because Figure 6f shows the steady state after uniform stimulation in space and time (i.e. the response on the very edge), whereas the predicted phase diagram shows average expression at 42 h in a 100 µm range from the colony edge using the previously measured spatiotemporal gradients of BMP and Activin response. Finally, the data in Figure 6f show mean expression levels as opposed to the percentage of double-positive cells for the same data in Figure 4q, because our model does not simulate individual cells and noise, only allowing us to compare mean expression. We explain all this in the text now. As a minor change to facilitate comparison of data and model, we have now plotted the concentrations of BMP and Activin in Figure 6 rather than the scaled model parameters from 0 to 1. We also further optimized the model parameters, without qualitative changes.
  
  The authors' claim that PGCLC formation can be rescued by exogenous Activin when blocking endogenous Wnt production is surprising given the literature. The authors only show that they can restore a TFAP2C+SOX17+ population but do not actually stain for an established germ cell marker. It appears essential to perform a PRDM1 staining in these conditions (Figure 4A) to unambiguously identify this population.
  
  We have significantly extended our analysis of the effect of WNT inhibition and subsequent rescue of PGCs by Activin treatment. This includes staining for TFAP2C, NANOG, and PRDM1, and staining for LEF1 as a measure of WNT signaling. Figure 4 and Figure 4—figure supplement 1 now also include treatment with IWR-1, a different small molecule inhibitor of WNT signaling, as well as inhibition by IWR-1 and IWP2 at different times and different doses.
  
  The authors only provide weak evidence that the fates depend on the relative signaling levels of BMP and Nodal. Indeed, fewer cells acquire a fate the lower the BMP concentration they use, including the fates marked by Sox17 expression. It would be more convincing to show the assay of Figure 4F for a range of BMP concentrations at which the overall differentiation works sufficiently well.
  
  As suggested, we have now included a range of BMP concentrations. The reduction in PGCs at lower BMP doses is in line with our model and does not contradict a dependence on the relative signaling levels of BMP and Nodal, by which we mean that the optimal dose of Activin for PGCLC specification depends on the level of BMP and vice versa. We have amended the text to state this more clearly.
  
  References
  
  Chen, Di, Na Sun, Lei Hou, Rachel Kim, Jared Faith, Marianna Aslanyan, Yu Tao, et al. 2019. “Human Primordial Germ Cells Are Specified From Lineage-Primed Progenitors.” Cell Reports 29 (13): 4568–4582.e5. doi:10.1016/j.celrep.2019.11.083.
  
  Etoc, Fred, Jakob Metzger, Albert Ruzo, Christoph Kirst, Anna Yoney, M Zeeshan Ozair, Ali H Brivanlou, and Eric D Siggia. 2016. “A Balance Between Secreted Inhibitors and Edge Sensing Controls Gastruloid Self-Organization.” Developmental Cell 39 (3): 302–15. doi:10.1016/j.devcel.2016.09.016.
  
  Kobayashi, Toshihiro, Haixin Zhang, Walfred W C Tang, Naoko Irie, Sarah Withey, Doris Klisch, Anastasiya Sybirna, et al. 2017. “Principles of Early Human Development and Germ Cell Program From Conserved Model Systems.” Nature 546 (7658): 416–20. doi:10.1038/nature22812.
  
  Kojima, Yoji, Kotaro Sasaki, Shihori Yokobayashi, Yoshitake Sakai, Tomonori Nakamura, Yukihiro Yabuta, Fumio Nakaki, et al. 2017. “Evolutionarily Distinctive Transcriptional and Signaling Programs Drive Human Germ Cell Lineage Specification From Pluripotent Stem Cells.” Cell Stem Cell 21 (4): 517–532.e5. doi:10.1016/j.stem.2017.09.005.
  
  Sasaki, Kotaro, Tomonori Nakamura, Ikuhiro Okamoto, Yukihiro Yabuta, Chizuru Iwatani, Hideaki Tsuchiya, Yasunari Seita, et al. 2016. “The Germ Cell Fate of Cynomolgus Monkeys Is Specified in the Nascent Amnion.” Developmental Cell 39 (2): 169–85. doi:10.1016/j.devcel.2016.09.007.
  
  Tyser, R C V, E Mahammadov, S Nakanoh, et al. 2021. “Single-Cell Transcriptomic Characterization of a Gastrulating Human Embryo.” Nature 600: 285–89. doi:10.1038/s41586-021-04158-y.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.08.04.455129v1
www.biorxiv.org www.biorxiv.org

New submission 08/11/2022, 20:27:29

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1:
  
  This is a very timely paper that addresses an important and difficult-to-address question in the decision-making field - the degree to which information leakage can be strategically adapted to optimise decisions in a task-dependent fashion. The authors apply a sophisticated suite of analyses that are appropriate and yield a range of very interesting observations. The paper centres on analyses of one possible model that hinges on certain assumptions about the nature of the decision process for this task which raises questions about whether leak adjustments are the only possible explanation for the current data. I think the conclusions would be greatly strengthened if they were supported by the application and/or simulation of alternative model structures.
  
  We thank the reviewer for this positive appraisal of our study. We now entirely agree with their central comment about whether leak adjustments are the only (or even the best) explanation for the current data. We hope that the additional modelling sections that we have discussed in response to main comment 1 above have strengthened the paper. We have responded point-by-point to their public review, as this contained their main recommendations for revision.
  
  The behavioural trends when comparing blocks with frequent versus rare response periods seem difficult to tally with a change in the leak. […] Are there other models that could reproduce such effects? For example, could a model in which the drift rate varies between Rare and Frequent trials do a similar or better job of explaining the data?
  
  We can see why the reviewer has advocated for a possible change of drift rate (or ‘gain’ applied to sensory evidence) between conditions to explain our behavioural findings. We found, however, that changes in drift rate could elicit qualitatively similar changes in integration kernels to changes in decision threshold:
  
  Author response image 1.
  
  Changes in gain applied to incoming sensory evidence (A parameter in model) have similar effects on recovered integration kernels from Ornstein-Uhlenbeck simulation as changes in decision threshold.
  
  The likely reason for this is that the overall probability of emitting a response at any point in the continuous decision process is determined by the ratio of accumulated evidence to decision threshold. A similar logic applies to effects on reaction times and detection probability (main figure 2): increasing sensory gain or decreasing decision threshold will lead to faster reaction times and increased detection probability during response periods.
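
  The ratio argument can be sketched with a deliberately simplified accumulator. This is not the authors' Ornstein-Uhlenbeck model: internal noise is omitted, time is discrete, and the leak, gain, threshold and evidence statistics are illustrative assumptions. In this reduced form, doubling the gain and halving the threshold yield the same response time.

```python
import random


def first_crossing(evidence, gain, threshold, leak=0.1, dt=1.0):
    """Return the index of the first frame at which |x| reaches threshold.

    x follows a discrete leaky integrator without internal noise:
        x[t+1] = (1 - leak * dt) * x[t] + gain * evidence[t] * dt
    Returns None if the threshold is never reached.
    """
    x = 0.0
    for t, e in enumerate(evidence):
        x = (1.0 - leak * dt) * x + gain * e * dt
        if abs(x) >= threshold:
            return t
    return None


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical evidence stream: weak positive mean plus frame-to-frame noise.
evidence = [0.05 + rng.gauss(0.0, 0.3) for _ in range(2000)]

baseline = first_crossing(evidence, gain=1.0, threshold=1.0)
double_gain = first_crossing(evidence, gain=2.0, threshold=1.0)
half_threshold = first_crossing(evidence, gain=1.0, threshold=0.5)

print(baseline, double_gain, half_threshold)
```

  The last two printed values are identical because doubling the gain doubles x at every frame, which is the same as halving the threshold. With internal noise of fixed magnitude the match would no longer be exact, which fits the authors' description of the two effects as qualitatively similar.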
  
  Both parameters may even have a similar effect on ‘false alarms’, because (as the reviewer notes below) false alarms in our paradigm are primarily being driven by the occurrence of stimulus changes as well as internal noise. In fact, the false alarm findings mean it is difficult to fully reconcile all of our behavioural findings in terms of changes in a single set of model parameters in the O-U process. It is possible that other changes not considered within our model (such as expectations of hazard rates of inter-response intervals leading to dynamic thresholds etc.) may have had a strong impact upon the resulting false alarm rates. A full exploration of different variations of the O-U model (with varying urgency signals, hazard rates, etc.) is beyond the scope of this paper.
  
  For this reason, we have decided in our new modelling section to focus primarily on a single, well-established model (the O-U process) and explore how changes in leak and threshold affect task performance and the resulting integration kernels. We note that this is in line with the suggestion of reviewer #2, who focussed on similar behavioural findings to reviewer #1 but suggested that we look at decision threshold rather than drift rate as our primary focus.
  
  This ties in to a related query about the nature of the task employed by the authors. Due to the very significant volatility of the stimulus, it seems likely that the participants are not solely making judgments about the presence/absence of coherent motion but also making judgments about its duration (because strong coherent motion frequently occurs in the inter-target intervals). If that is so, then could the Rare condition equate to less evidence because there is an increased probability that an extended period of coherent motion could be an outlier generated from the noise distribution? Note that a drift rate reduction would also be expected to result in fewer hits and slower reaction times, as observed.
  
  As mentioned above, the rare and frequent targets are indeed matched in terms of the ease with which they can be distinguished from the intervening noise intervals. To confirm this, we directly calculated the variance (across frames) of the motion coherence presented during baseline periods and response periods (until response) in all four conditions:
  
  Author response image 2.
  
  The average empirical standard deviation of the stimulus stream presented during each baseline period (‘baseline’) and response period (‘trial’), separated by each of the four conditions (F = frequent response periods, R = rare, L = long response periods, S = short). Data were averaged across all response/baseline periods within the stimuli presented to each participant (each dot = 1 participant). Note that the standard deviation shown here is the standard deviation of motion coherence across frames of sensory evidence. This is smaller than the standard deviation of the generative distribution of ‘step’-changes in the motion coherence (std = 0.5 for baseline and 0.3 for response periods), because motion coherence remains constant for a period after each ‘step’ occurs.
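
  The last point of this caption, that holding each level constant lowers the across-frame standard deviation within a period, can be reproduced with a small simulation. The period length, the hold-time distribution and the function name below are assumptions made for illustration; only the generative standard deviation of 0.5 is taken from the caption, and any clipping of coherence values is ignored.

```python
import random
import statistics


def held_step_stream(n_frames, step_std, mean_hold, rng):
    """Piecewise-constant stream: draw a level from N(0, step_std), hold it
    for a random number of frames, then draw a new level."""
    frames = []
    while len(frames) < n_frames:
        level = rng.gauss(0.0, step_std)
        hold = 1 + int(rng.expovariate(1.0 / mean_hold))  # at least one frame
        frames.extend([level] * hold)
    return frames[:n_frames]


rng = random.Random(1)
step_std = 0.5                 # generative s.d. quoted for baseline periods
n_frames, mean_hold = 300, 30  # hypothetical period length and mean hold time

# Across-frame s.d. computed separately within each simulated period.
per_period_sd = [
    statistics.pstdev(held_step_stream(n_frames, step_std, mean_hold, rng))
    for _ in range(500)
]
mean_sd = statistics.fmean(per_period_sd)
print(round(mean_sd, 3), "<", step_std)
```

  Because each period contains only a handful of independent levels, the within-period spread underestimates the spread of the distribution those levels were drawn from.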
  
  Some adjustment of the language used when discussing FAs seems merited. If I have understood correctly, the sensory samples encountered by the participants during the inter-response intervals can at times favour a particular alternative just as strongly (or more strongly) than that encountered during the response interval itself. In that sense, the responses are not necessarily real false alarms because the physical evidence itself does not distinguish the target from the non-target. I don't think this invalidates the authors' approach but I think it should be acknowledged and considered in light of the comment above regarding the nature of the decision process employed on this task.
  
  This is a good point. We hope that the reviewer will allow us to keep the term ‘false alarms’ in the paper, as it does conveniently distinguish responses during baseline periods from those during response periods, but we have sought to clarify the point that the reviewer makes when we first introduce the term.
  
  “Indeed, participants would occasionally make ‘false alarms’ during baseline periods in which the structure of the preceding noise stream mistakenly convinced them they were in a response period (see Figure 4, below). Indeed, this means that a ‘false alarm’ in our paradigm has a slightly different meaning than in most psychophysics experiments; rather than it referring to participants responding when a stimulus was not present, we use the term to refer to participants responding when there was no shift in the mean signal from baseline.”
  
  And:
  
  “The fact that evidence integration kernels naturally arise from false alarms, in the same manner as from correct responses, demonstrates that false alarms were not due to motor noise or other spurious causes. Instead, false alarms were driven by participants treating noise fluctuations during baseline periods as sensory evidence to be integrated across time, and the physical evidence preceding ‘false alarms’ need not even distinguish targets from non-targets.”
  
  The authors report that preparatory motor activity over central electrodes reached a larger decision threshold for RARE vs. FREQUENT response periods. It is not clear what identifies this signal as reflecting motor preparation. Did the authors consider using other effector-selective EEG signatures of motor preparation such as beta-band activity which has been used elsewhere to make inferences about decision bounds? Assuming that this central ERP signal does reflect the decision bounds, the observation that it has a larger amplitude at the response on Rare trials appears to directly contradict the kernel analyses which suggest no difference in the cumulative evidence required to trigger commitment.
  
  Thanks for this comment. First, we should note that this finding emerged from an agnostic time-domain analysis of the data time-locked to button presses, in which we observed that the negative-going potential was greater (more negative) in RARE vs. FREQUENT trials. It is therefore only because the potential precedes each button press that we relate it to motor preparation; nonetheless, we note that Kelly and O’Connell (2013) found similar negative-going potentials at central sensors without applying a CSD transform (as in this study). Like them, we would relate this potential to either the well-established Bereitschaftspotential or the contingent negative variation (CNV).
  
  We agree that many other studies have focussed on beta-band activity as another measure of motor preparation, and to make inferences about decision bounds. To investigate this, we used a Morlet wavelet transform to examine the time-varying power estimate at a central frequency of 20 Hz (wavelet factor 7). We repeated the convolutional GLM analysis on this time-varying power estimate.
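
  For orientation, the kind of estimate described here can be sketched as follows. This is a generic complex Morlet convolution written in plain Python, not the authors' pipeline; it assumes that "wavelet factor 7" means seven cycles (the ratio of centre frequency to spectral standard deviation), and the sampling rate and test signal are hypothetical.

```python
import cmath
import math


def morlet_power(signal, sfreq, freq=20.0, n_cycles=7.0):
    """Time-varying power at `freq` via convolution with a complex Morlet wavelet.

    The Gaussian envelope has temporal s.d. n_cycles / (2 * pi * freq);
    the wavelet is truncated at +/- 3 s.d. and samples beyond the signal
    edges are treated as zero.
    """
    sigma_t = n_cycles / (2.0 * math.pi * freq)
    half = int(round(3.0 * sigma_t * sfreq))
    wavelet = [
        math.exp(-((k / sfreq) ** 2) / (2.0 * sigma_t ** 2))
        * cmath.exp(2j * math.pi * freq * k / sfreq)
        for k in range(-half, half + 1)
    ]
    norm = sum(abs(w) for w in wavelet)  # normalise by the envelope's sum
    power = []
    for t in range(len(signal)):
        acc = 0j
        for j, w in enumerate(wavelet):
            idx = t + half - j  # convolution: signal sample paired with wavelet[j]
            if 0 <= idx < len(signal):
                acc += signal[idx] * w
        power.append(abs(acc / norm) ** 2)
    return power


sfreq = 250.0  # hypothetical sampling rate (Hz)
beta = [math.cos(2 * math.pi * 20.0 * k / sfreq) for k in range(500)]
slow = [math.cos(2 * math.pi * 5.0 * k / sfreq) for k in range(500)]
print(morlet_power(beta, sfreq)[250], morlet_power(slow, sfreq)[250])
```

  With this normalisation a unit-amplitude 20 Hz sinusoid gives a power of about 0.25 away from the signal edges, while a 5 Hz sinusoid is almost entirely rejected. A decrease of this power estimate over time is what is referred to as beta desynchronisation.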
  
  We first examined average beta desynchronisation at a central cluster of electrodes (CPz, CP1, CP2, C1, Cz, C2) in the run-up to correct button presses during response periods. We found that a reliable beta desynchronisation occurred and, just as in the time-domain signal, this reached a greater threshold in the RARE trials than in the FREQUENT trials:
  
  Author response image 3.
  
  Beta desynchronisation prior to a correct response is greater over central electrodes in the RARE condition than in the FREQUENT condition.
  
  We agree with the reviewer that this is likely indicative of a change in decision threshold between rare and frequent trials. We also note that our new computational modelling of the O-U process suggests that this in fact reconciles well with the behavioural findings (changes in integration kernels). We now mention this at the relevant point in the results section:
  
  “As large changes in mean evidence are less frequent in the RARE condition, the increased neural response to |Δevidence| may reflect the increased statistical surprise associated with the same magnitude of change in evidence in this condition. In addition, when making a correct response, preparatory motor activity over central electrodes reached a larger decision threshold for RARE vs. FREQUENT response periods (Figure 7b; p=0.041, cluster-based permutation test). We found similar effects in beta-band desynchronisation prior to a correct response, averaged over the same electrodes; beta desynchronisation was greater in RARE than FREQUENT response periods. As discussed in the computational modelling section above, this is consistent with the changes in integration kernels between these conditions as it may reflect a change in decision threshold (figure 2d, 3c/d). It is also consistent with the lower detection rates and slower reaction times when response periods are RARE (figure 2 b/c).”
  
  We did also investigate the lateralised response (left minus right beta-desynchronisation, contrasted on left minus right responses). We found, however, that we were simply unable to detect a reliable lateralised signal in either condition using these lateralised responses. We suspect that this is because we have far fewer response periods than conventional trial-based EEG experiments of decision making, and so we did not have sufficient SNR to reliably detect this signal. This is consistent with standard findings in the literature, which report that the magnitude of the lateralised signal is far smaller than the magnitude of the overall beta desynchronisation (e.g. Doyle et al., 2005).
  
  P11, the "absolute sensory evidence" regressor elicited a triphasic potential over centroparietal electrodes. The first two phases of this component look to have an occipital focus. The third phase has a more centroparietal focus but appears markedly more posterior than the change in evidence component. This raises the question of whether it is safe to assume that they reflect the same process.
  
  We agree. We have now referred to this as a ‘triphasic component over occipito-parietal cortex’ rather than centroparietal electrodes.
  
  Reviewer #2:
  
  Overall, the authors use a clever experimental design and approach to tackle an important set of questions in the field of decision-making. The manuscript is easy to follow with clear writing. The analyses are well thought-out and generally appropriate for the questions at hand. From these analyses, the authors have a number of intriguing results. So, there is considerable potential and merit in this work. That said, I have a number of important questions and concerns that largely revolve around putting all the pieces together. I describe these below.
  
  Thanks to the reviewer for their positive appraisal of the manuscript; we are obviously pleased that they found our work to have considerable potential and merit. We seek to address the main comments from their public review and recommendations below.
  
  1) It is unclear to what extent the decision threshold is changing between subjects and conditions, how that might affect the empirical integration kernel, and how well these two factors can together explain the overall changes in behavior.
  
  I would expect that less decay in RARE would have led to more false alarms, higher detection rates, and faster RTs unless the decision threshold also increased (or there was some other additional change to the decision process). The CPP for motor preparatory activity reported in Fig. 5 is also potentially consistent with a change in the decision threshold between RARE and FREQUENT. If the decision threshold is changing, how would that affect the empirical integration kernel? These are important questions on their own and also for interpreting the EEG changes.
  
  This important comment, alongside the comments of reviewer 1 above, made us carefully consider the effects of changes in decision threshold on the evidence integration kernel via simulation. As discussed above (in response to ‘essential revisions for the authors’), we now include an entirely new section on how changes in decision threshold and leak may affect the evidence integration kernel, and be used to optimise performance across the different sensory environments. In particular, we agree with the reviewer that the motor preparatory activity that differs between RARE and FREQUENT is consistent with a change in decision threshold, and our simulations have suggested that our behavioural findings on evidence integration are also consistent with this change. These are detailed on pp.1-4 of the rebuttal, above.
  
  2) The authors find an interesting difference in the CPP for the FREQUENT vs RARE conditions where they also show differences in the decay time constant from the empirical integration kernel. As mentioned above, I'm wondering what else may be different between these conditions. Do the authors have any leverage in addressing whether the decision threshold differs? What about other factors that could be important for explaining the CPP difference between conditions? Big picture, the change in CPP becomes increasingly interesting the more tightly it can be tied to a particular change in the decision process.
  
  We fully agree with the spirit of this comment, and we have tried to consider much more carefully what the influences of decision threshold and leak would be on our behavioural analyses. As discussed in the response to reviewer 1, we think that the negative-going potential at the time of responses (which is greater in RARE vs. FREQUENT, main figure 7b) and the equivalent change in beta desynchronisation (see Author response image 3 above) both reflect a change in decision threshold between RARE and FREQUENT conditions. We have tried to make this link explicit in the revised results section:
  
  “As large changes in mean evidence are less frequent in the RARE condition, the increased neural response to |Δevidence| may reflect the increased statistical surprise associated with the same magnitude of change in evidence in this condition. In addition, when making a correct response, preparatory motor activity over central electrodes reached a larger decision threshold for RARE vs. FREQUENT response periods (Figure 7b; p=0.041, cluster-based permutation test). We found similar effects in beta-band desynchronisation prior to a correct response, averaged over the same electrodes; beta desynchronisation was greater in RARE than FREQUENT response periods. As discussed in the computational modelling section above, this is consistent with the changes in integration kernels between these conditions as it may reflect a change in decision threshold (figure 2d, 3c/d). It is also consistent with the lower detection rates and slower reaction times when response periods are RARE (figure 2 b/c).”
  
  I'll note that I'm also somewhat skeptical of the statements by the authors that large shifts in evidence are less frequent in the RARE compared to FREQUENT conditions (despite the names) - a central part of their interpretation of the associated CPP change. The FREQUENT condition obviously has more frequent deviations from the baseline, but this is countered to some extent by the experimental design that has reduced the standard deviation of the coherence for these response periods. I think a calculation of overall across-time standard deviation of motion coherence between the RARE and FREQUENT conditions is needed to support these statements, and I couldn't find that calculation reported. The authors could easily do this, so I encourage them to check and report it.
  
  See Author response image 2.
  
  3) The wide range of decay time constants between subjects and the correlation of this with another component of the CPP is also interesting. However, in trying to interpret this change in CPP, I'm wondering what else might be changing in the inter-subject behavior. For instance, it looks like there could be up to 4 fold changes in false alarm rates. Are there other changes as well? Do these correlate with the CPP? Similar to my point above, the changes in CPP across subjects become increasingly interesting the more tightly it can be tied to a particular difference in subject behavior. So, I would encourage the authors to examine this in more depth.
  
  Thanks for the interesting suggestion. We explored whether there might be any interindividual correlation in this measure with the false alarm rate across participants, but found that there was no such correlation. (See Author response image 4; plotting conventions are as in main figure 9).
  
  Author response image 4.
  
  No evidence of between-subject correlations in CPP responses and false alarm rates, in any of the four conditions.
  
  We hope instead that the extended discussion of how the integration kernel should be interpreted (in light of computational modelling) provides at least some increased interpretability of the between-subject effects that we report in figure 9.
  
  Reviewer #3 (Public Review):
  
  The main strength is in the task design which is novel and provides an interesting approach to studying continuous evidence accumulation. Because of the continuous nature of the task, the authors design new ways to look at behavioral and neural traces of evidence. The reverse-correlation method looking at the average of past coherence signals enables us to characterize the changes in signal leading to a decision bound and its neural correlate. By varying the frequency and length of the so-called response period, that the participants have to identify, the method potentially offers rich opportunities to the wider community to look at various aspects of decision-making under sensory uncertainty.
  
  We are pleased that the reviewer agrees with our general approach as a novel way of characterising various aspects of decision-making under uncertainty.
  
  The main weaknesses that I see lie within the description and rigor of the method. The authors refer multiple times to the time constant of the exponential fit to the signal before the decision but do not provide a rigorous method for its calculation and neither a description of the goodness of the fit. The variable names seem to change throughout the text which makes the argumentation confusing to the reader. The figure captions are incomplete and lack clarity.
  
  We apologise that some of our original submission was difficult to follow in places, and we are very grateful to the reviewer for their thorough suggestions for how this could be improved. We address these in turn below, and we hope that this answers their questions, and has also led to a significant improvement in the description and rigour of the methodology.
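
  For readers unfamiliar with the two quantities the reviewer refers to, the sketch below shows one generic way to obtain a response-triggered integration kernel and an exponential time constant with a goodness-of-fit value. It is not the authors' procedure: the log-linear least-squares fit is a stand-in chosen for simplicity, and all function and variable names are hypothetical.

```python
import math


def integration_kernel(stimulus, response_frames, window):
    """Average the `window` stimulus frames preceding each response.

    Element 0 of the result is the frame immediately before the response;
    responses with fewer than `window` preceding frames are skipped.
    """
    segments = [
        stimulus[r - window:r][::-1]
        for r in response_frames
        if r >= window
    ]
    if not segments:
        raise ValueError("no response has a full window of preceding frames")
    return [sum(col) / len(segments) for col in zip(*segments)]


def fit_decay(kernel, dt):
    """Least-squares fit of kernel[k] ~ A * exp(-k * dt / tau) on log values.

    Only strictly positive kernel values are used. Returns (A, tau, r_squared),
    where r_squared is computed in log space.
    """
    pts = [(k * dt, math.log(v)) for k, v in enumerate(kernel) if v > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive kernel values")
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("kernel does not decay with time before the response")
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in pts)
    ss_tot = sum((y - my) ** 2 for _, y in pts)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return math.exp(intercept), -1.0 / slope, r2


# Toy check: a noiseless exponential kernel with tau = 0.5 s sampled every 0.1 s.
toy_kernel = [2.0 * math.exp(-k * 0.1 / 0.5) for k in range(20)]
print(fit_decay(toy_kernel, dt=0.1))
```

  A fit in log space weights small kernel values heavily and discards non-positive ones, so a nonlinear fit on the raw kernel may be preferable for noisy data; the sketch only makes explicit what a reported time constant and goodness of fit would refer to.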
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.08.18.504278v2
www.biorxiv.org www.biorxiv.org

New submission 30/12/2022, 12:46:03

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  I am not a specialist in cryo-EM, so cannot comment on the technicalities of the structure reconstruction or methods used. I thus focus on the conclusions and observations that the authors provide in the manuscript and their relevance to functional photosynthesis.
  
  The authors attempt to resolve the structure of PSII from Dunaliella and noticed that three types of PSII could be identified: two conformational states, and a stacked configuration. There is no doubt that these structures add to our current knowledge of PSII and that they exist in abundance upon solubilisation of the sample. My main issue however is the relevance to in vivo conditions, and the efforts to exclude the possibility that pigment loss and conformational states and stacking are a reflection of ex-vivo manipulations.
  
  Our compact model contains 202 Chl molecules while the stretched conformation contains 206 Chls. All of the differences in Chl binding are attributed to CP29. We have compiled a table enumerating the different CP29 structures currently available from plants and green algae at a similar resolution to our work (Supplementary table 2). In the larger plant complexes (C2S2M2) CP29 contains 14 Chls, while CP29 in smaller C2S2 complexes contains 10-13 Chls, so it appears that some Chl loss from CP29 is associated with the release of LHCIIM. In the green algal structures, CP29 contains fewer Chls in general and shows a similar trend. The currently published structure most relevant to our work contains 8 Chls (6KAC), a somewhat lower number than both the compact and stretched models (9 and 11 Chls, respectively). The stretched orientation, which is the closest match to the known PSII core arrangement, therefore contains more Chls than comparable models. While the in vivo configuration is not known in the sense that it could contain more Chls, the current structure is apparently the closest representation of it.
  
  The presence of CP29 with lower Chl content in the Chlamydomonas C2S2 (6KAC, which is in a stretched orientation) supports a conclusion that pigment loss from CP29 alone is not sufficient to trigger the stretched-to-compact transition, although it is associated with it. In general, the precise orientation of CP29 is variable and seems to depend on the binding of additional LHCII; it is possible that some Chl loss accompanies these changes in vivo.
  
  I see a number of questions pertaining to this work. Starting from the two conformations of PSII, compact and stretched, the authors say that both are highly active based on oxygen measurements at a saturating light intensity. In the meantime, they report large variations in the chl content and positions of the chlorophyll molecules in these structures (also compared to other known PSIIs). This gives the impression that one can lose two chlorophylls, and freely modify the distance between others without losing efficiency, certainly a risky conclusion. Are the samples highly active also in light-limiting conditions? It is thought that even tiny movements and alterations in chl-chl distances alter their coupling and spectral properties, how come the variations in this report are so huge? In other words, the assay tests the charge separation activity of the PSII RC in the preps, but not the light-harvesting efficiency.
  
  The Chl content differences reported in this work amount to 2%. In our opinion this represents quite a low variation in pigment content, of the kind that exists in virtually any experiment involving large complexes. We agree that measurements of activity in limiting light conditions are interesting; however, this goes beyond the scope of the current work. Light harvesting efficiency in PSII is known to vary substantially as a result of additional mechanisms (NPQ in some of its forms), not associated with Chl loss or gain. While the formation of quenching centers is attributed to small structural changes within specific pigment protein complexes, what we are showing in this work are structural changes between pigment protein complexes. These can affect transfer rates between the different complexes but are distinct from the structural changes thought to accompany the formation of quenching centers within specific pigment protein complexes.
  
  How does one ascertain that the lost chlorophyll molecules in CP29 are not a preparation error? Does slightly increasing the detergent concentration impact the proportion of stretched:compact forms?
  
  The effect of detergent concentration on the proportion of the different forms was not tested directly. However, we do not detect many differences in the content of lipids or bound detergent molecules between the two conformations, suggesting that for these “ligands” the differences are not substantial. We can only distinguish these two forms at the very last stages of data processing. At the present state of cryoEM cost and time availability, mapping the effect of detergent concentration on the different orientations is outside our reach.
  
  On a similar note, how do the authors exclude that a certain interaction with this type of grid impacts the distribution of these complexes? Is it identical to a biologically separate preparation of algae? In case of discoveries of this type, it is of high importance to exclude as many possibilities of non-native conditions or influences on the structure.
  
  It’s hard to completely exclude grid and sample preparation issues. However, we employed relatively standard grids and vitrification conditions. The observed complexes are embedded in vitrified ice and do not interact with the grid directly. The differences we observed are mainly in the orientations of the PSII cores; all the interactions between PSII subunits within each core are preserved and agree with previously published structures. Since the interactions within the core and between cores involve the same physical principles, we think it’s fairly conservative to conclude that the observed core orientations are not an artefact of sample preparation.
  
  I would further like to encourage the authors to elaborate on the CP29 phosphorylation. What is the proportion of PSIIcomp that are phosphorylated? I assume it is not 100%, as in this case, the authors would propose that this is the effect that modulates between compact and stretched architectures.
  
  It’s difficult to estimate the proportion of observed phosphorylation/sulfinylation. To be detected in maps, most of the residues (above 50%) are probably modified. We attempted to estimate this by refining the atom occupancies of the Pi molecule on Ser84 and the oxygens attached to Cys218; both values suggested that about 70% of the complexes are modified. With regard to the possibility that these modifications can promote the formation of the compact state, we think that this is certainly a possibility, since these modifications were detected in this state and are in close proximity to each other. However, this can also result from the resolution differences of the maps, and the structural implications of both modifications are hard to predict. At this point we prefer to note their existence without further interpretation.
  
  In line 290, the authors highlight the structural heterogeneity within the two groups of PSII conformations. I would like to see what the distribution looks like for all the structures together: are the two (stretched and compact) specifically forming two heterogeneous distributions? Or is it possible that the distribution between the two is quasi-continuous? In other words, if the structures are not perfectly defined, how do the authors decide that two, and not more or fewer, subtypes exist?
  
  We went back and refined the initial particle group (containing both compact and stretched orientations) using multibody refinement with masks defining the two PSII monomers. This analysis showed the expected two peaks only in the first principal component, which accounted for ~38% of the variance in the dataset.
  
  Multibody refinement carried out on the combined particle dataset shows one very large PC accounting for about 38% of the variance and the presence of two distinct peaks in the particle distribution of the first PC.
  
  From this analysis it’s clear that there are two distinct classes in this particle set (as expected). As none of the other PCs shows any sign of multiple peaks, this analysis suggests that two distinct models are the best representation of this eukaryotic PSII. Whether these are quasi-continuous or distinct is more complex. There is continuity in this representation (particle distributions along the PC), and a different picture may appear if characters such as the CP29 state are considered, but the size of CP29 and the remaining heterogeneity do not provide enough signal to carry out this classification at the moment.
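  
  The "two peaks along the first PC" argument can be illustrated with a toy peak count on a one-dimensional histogram. This is only a sketch under stated assumptions: the amplitudes are synthetic (evenly spaced Gaussian quantiles rather than real particle data), the function names are hypothetical, and it is not the procedure the authors ran in Relion.

```python
from statistics import NormalDist


def gaussian_sample(mean, sd, n):
    """Deterministic stand-in for particle amplitudes: n evenly spaced quantiles."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


def histogram(values, lo, hi, n_bins):
    """Count values falling into n_bins equal-width bins on [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(n_bins - 1, int((v - lo) / width))] += 1
    return counts


def count_peaks(counts, min_fraction=0.1):
    """Count local maxima that reach at least min_fraction of the tallest bin."""
    floor = min_fraction * max(counts)
    padded = [0] + list(counts) + [0]
    peaks = 0
    for i in range(1, len(padded) - 1):
        rises = padded[i] > padded[i - 1]
        holds = padded[i] >= padded[i + 1]  # a flat two-bin top counts once
        if padded[i] >= floor and rises and holds:
            peaks += 1
    return peaks


# Hypothetical amplitudes along PC1: two states versus a single broad state.
two_state = gaussian_sample(-3.0, 1.0, 500) + gaussian_sample(3.0, 1.0, 500)
one_state = gaussian_sample(0.0, 1.0, 1000)

two_state_peaks = count_peaks(histogram(two_state, -8.0, 8.0, 32))
one_state_peaks = count_peaks(histogram(one_state, -8.0, 8.0, 32))
print(two_state_peaks, one_state_peaks)
```

  A peak count of this kind says nothing about whether the population between the two modes is empty, which is why the response treats the question of quasi-continuity separately.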
  
  Considering the stacked PSII, I also have a few concerns. Contrary to previous studies the authors do not assign a functional role to the stacking beyond the structural aspect. This could be better backed by a discussion about the closest chlorophyll a molecules across the stacked PSII, which given the rather large distance shown in fig. 4L seems to be too large for any EET across the stromal gap.
  
  The closest chl-chl distance that we can measure in the stacked PSII dimer is ~54 Å, with most distances in the ~70 Å range, making EET between stacked complexes very slow. We have added a statement clarifying this to our manuscript. In our opinion a structural role for the stacked PSII dimer is more likely.
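  
  To give a sense of scale for "very slow", the sketch below assumes Förster-type transfer, in which the rate falls off as the sixth power of the donor-acceptor distance with orientation and spectral overlap held equal. The 10 Å reference distance is a hypothetical value chosen only for illustration and does not come from the manuscript; the 54 Å and 70 Å values are those quoted in the response.

```python
def relative_forster_rate(distance_a: float, reference_a: float) -> float:
    """Transfer rate at distance_a relative to the rate at reference_a (both in Å).

    Assumes the Förster 1 / R**6 distance dependence and nothing else.
    """
    return (reference_a / distance_a) ** 6


closest, typical = 54.0, 70.0   # distances quoted in the response
reference = 10.0                # hypothetical in-complex pigment spacing

slowdown_closest = 1.0 / relative_forster_rate(closest, reference)
slowdown_typical = 1.0 / relative_forster_rate(typical, reference)

print(f"54 Å pair: ~{slowdown_closest:,.0f}-fold slower than a 10 Å pair")
print(f"70 Å pair: ~{slowdown_typical:,.0f}-fold slower than a 10 Å pair")
```

  Under this assumption even the closest cross-stack pair transfers four orders of magnitude more slowly than the hypothetical reference pair, which is consistent with the structural rather than energetic role proposed for stacking.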
  
  There is a report that suggests the presence of some density between the stacked PSII - could the authors comment on the differences between it and their work? Are the angles and positions conserved between these types of stacks? https://doi.org/10.1038/s41598-017-10700-8
  
  We referred to Albanese et al. in our manuscript. We isolated the C2S2 complex from a green alga, whereas the analysis in Albanese et al. was done on C2S2M1 complexes from pea, and this can account for some of the differences. At any rate, our conclusion that we don’t find any evidence for protein linkers in the stacked complex is stated clearly. The angles described in Albanese et al. are consistent with our analysis.
  
  Line 387, the authors state that due to the transient nature of the interactions across the stromal gap, the stacks could be "under-detected" in cryo-ET data. This statement is in my opinion misformulated. First, the transient interaction argument would apply the same (if not more due to changing conditions induced by the purification process) to the single particle analysis performed in this paper. Second, tomographic volumes detect hundreds of PSII in a suspended state. Any transient interaction that adds up to 25% of particle population in a steady state cell should be clearly visible, while the in situ data suggests not more than random cross-stromal-gap orientations. Of course, this can be a specificity of Chlamydomonas or a particular growth condition. The statement used by the authors could be indeed converted into: the PSII stacks are over-detected in vitro, and it is certainly a simpler explanation for their presence. It is also important to mention that PSII stacking alone is not the only reason for grana architecture - stacking with the antenna of larger complexes, absent in the authors' preparation could also contribute to grana maintenance; and auxiliary proteins such as CURT help with this issue as well. Here a recent demonstration of the importance of minor antenna should probably be also cited: https://doi.org/10.1101/2021.12.31.474624
  
  We used the term “flexible” rather than “transient” to describe the interactions within the stacked PSII dimer. Our data (and tomographic data) do not contain any temporal component. When we used the term under-detected we referred to the fact that PSII is mainly detected by the luminal extrinsic subunits. The flexibility detected in our analysis may affect the concurrent visibility of these features in the PSII complexes making up an individual PSII stack. Specifically, Wietrzynski et al. mainly analyze C2S2M2L2 complexes while our analysis only contained C2S2 complexes. It is likely that the different amounts of bound LHCII affect PSII stacking as well. For example, Wietrzynski et al. show some overlap between LHCII complexes and little overlap between cores in the larger complexes they analyzed. We observe mainly core-to-core overlap with little LHCII overlap in the smaller C2S2, although we did not observe any states where LHCs were not included in what appears to be the binding interface. We agree with the reviewer on the relevance of the Lhcb and CURT contributions to stacking but prefer to focus on what was directly demonstrated in our data. We clearly note that we are discussing in vitro results.
  
  Taking these last thoughts, I would like to finish by mentioning one more thing - almost philosophical. The authors are certainly at the forefront of the booming cryoEM revolution in biology which is profoundly changing the way we understand the living. There is absolutely zero doubt that this powerful technique is of the highest interest. But a growing number of structures of photosynthetic complexes remain puzzling, in particular with regard to their abundance in vivo (such as the PSII stacks) and functional relevance. How do we ascertain that these interactions are not due to in vitro preparation (isolation from cells, solubilisation)? Which ways can we use to try to exclude this (simple) hypothesis? I suggest that at least a small extent of biological replicas - experiments performed on separate batches, in different technical conditions, with slightly altered solubilization conditions, and so on - could shed light on the nature of these structures and their occurrence in vivo. Technical reps of the freezing+analysis pipeline could also be tried to see the variability. This would strongly reinforce this manuscript and its conclusions, and while not completely unequivocal (the stacked PSII, for example, could form upon each purification), a quantification of the effects would be of high interest.
  
  We certainly share the reviewer’s hope of being able to conduct cause-and-effect cryoEM experiments covering a complete set of experimental parameters. This is still beyond reach in terms of time and cost. Within each cryoEM experiment, however, all the analysis is consistent and, more importantly, transparent with regard to image analysis, which is the most important factor in our opinion. Preparation artefacts are always a possibility but, in our opinion, cryoEM is not affected by them differentially compared to other techniques. As we mentioned above, the particles are being observed suspended in vitreous ice. This is no different from, and one could say even better than, numerous low temperature spectroscopic observations on samples suspended in a glass state or crystals obtained in the presence of high concentrations of various agents. One thing that validates structural studies is the chemical detail (bond lengths, angles, etc.) underlying every model, which is consistent with known values to close tolerances.
  
  Reviewer #3 (Public Review):
  
  In this manuscript, Caspy et al. present a detailed structural analysis of eukaryotic photosystem II (PSII) isolated from the green alga Dunaliella salina. By combining single-particle cryo-EM with multibody refinement, the authors not only reveal a high-resolution (2.4Å) structure of the eukaryotic PSII, but also demonstrate alternate conformations and intrinsic flexibility of the overall complex. Stretched and compact conformations of the PSII dimer were readily identified within the single-particle dataset. From this structural analysis, the authors propose that excitation energy transfer properties may be modulated by changes in transfer distance between key chlorophyll molecules observed in different conformational states of the PSII dimer. Due to the high resolution of the maps obtained, the authors identify post-translational modifications and a sodium binding site based on the observed cryo-EM maps. Additionally, the authors analyze PSII complexes in stacked and unstacked configurations, and find that compact and stretched states also exist within the stacked PSII complexes. From their cryo-EM maps, the authors demonstrate that there is no direct protein-protein interaction between stacked PSII complexes, and rather propose a model wherein long-range electrostatic interactions mediated by divalent cations such as magnesium, can facilitate PSII stacking.
  
  The conclusions and models presented in the manuscript are mostly well justified by the data. The cryo-EM maps are high quality and the models appear generally well refined. However, some aspects of data processing and analysis, as well as the resultant conclusions need to be clarified.
  
  1) In general, it is not clear from the cryo-EM processing workflow (suppl. Fig 1) or the methods section when exactly symmetry was applied during 3D classification and refinement. In the case of C2S2 unstacked particles, when was symmetry first applied in the overall processing workflow? To identify the compact and stretched configurations of C2S2, did the 3D classification without alignment (and/or the refinement preceding this classification) have C2 symmetry applied? If so, have you considered the possibility that some particles may actually be asymmetric in some regions?
  
  We modified figure S1 to clearly indicate the use of symmetry and particle expansion. In general, we refined most of the particle sets without symmetry (C1). At the final processing stage of the unstacked PSII sets, after we separated both conformations, we used C2 symmetry to expand the data, this was followed by multibody refinement. No symmetry or symmetry expansion was used for the stacked PSII particle sets.
  
  2) Following multibody refinement in Relion individual maps and half-maps for each body will be generated. There is no mention in the methods of how these individual maps for each C2S2 "monomer" were combined to produce an overall map of the dimer following multibody refinement. There are several methods currently used to combine such maps, including taking the maximum or average of the two maps or using a model-based approach in phenix. The authors should be explicit about the method they used, any potential artifacts that may develop from this map combination process, and/or the interface between masks used in multibody refinement.
  
  We used phenix.combine_focused_maps to combine the maps. This is now indicated in the Methods section.
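  
  The simpler combination rules the reviewer mentions (voxel-wise maximum or average) can be sketched as follows, mainly to show why the choice matters near the mask interface. This is a hypothetical illustration on flattened toy arrays with made-up density values. It is not the model-based procedure implemented in Phenix, and the function names are assumptions.

```python
from typing import Callable, Sequence


def combine_maps(map_a: Sequence[float], map_b: Sequence[float],
                 rule: Callable[[float, float], float]) -> list:
    """Combine two maps sampled on the same grid, voxel by voxel."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must be sampled on the same grid")
    return [rule(a, b) for a, b in zip(map_a, map_b)]


def voxel_max(a: float, b: float) -> float:
    return max(a, b)


def voxel_mean(a: float, b: float) -> float:
    return (a + b) / 2.0


# Toy 1-D "maps": each body has signal only on its own side of the interface.
body_a = [0.9, 0.8, 0.1, 0.0]
body_b = [0.0, 0.2, 0.7, 0.9]

combined_max = combine_maps(body_a, body_b, voxel_max)
combined_mean = combine_maps(body_a, body_b, voxel_mean)
print(combined_max)
print(combined_mean)
```

  In this toy case averaging roughly halves the density wherever only one body carries signal, whereas the maximum keeps the stronger value but can also keep noise from the weaker map; either effect is the kind of interface artefact the reviewer asks about.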
  
  3) In addition to the point raised above, following multibody refinement there will be an individual FSC curve and resolution for each body. However, in supplemental figure 2 and supplemental table 1, only a single FSC curve and resolution are reported. Are these FSC curves/resolutions only reported for the better of the two bodies? If not, how was a single resolution calculated for the overall map of combined bodies?
  
  Both FSC curves were calculated and were highly similar, as expected following C2 expansion. This can also be evaluated from the local resolution maps which are highly similar between the two bodies. The reported resolutions are all taken from the displayed FSC curves generated through relion PostProcess.
  
  4) One of the major conclusions from the 3D classification and multibody refinement is that conformational changes and inherent flexibility of the PSII dimers have the potential to change distances between cofactors in the complex, ultimately leading to altered excitation energy transfer. However, it is unclear whether or not the authors believe one conformation over another may more readily support the evolution of oxygen. It would be nice if the authors could elaborate slightly upon this topic in the discussion.
  
  As discussed above the structural changes associated with the formation of quenching centers are not expected to be detected in the current work. The changes we observe can however affect the transfer to such centers and by doing so can play an important part in PSII biology. We do not detect any changes around the OEC and we don’t find any reason to think the two conformations are different with respect to their ETC.
  
  5) Along the lines of point 4 above, on line 95 the authors claim that "the high specific activity of 816 umol O2/ (mg Chl * hr) suggest that" both the C2S2 compact and stretched conformation are highly active. However, it is not clear to me why this measure of specific activity would suggest that both PSII conformations should have "high" activity. Maybe a reference here would help guide readers to previous measures of specific activity?
  
  Looking at specific activity from previously published structural studies on eukaryotic PSII, we find that Sheng et al., 2019 reported a specific activity of 272 µmol O2/ (mg Chl * hr). This difference can stem partially from the presence of larger complexes in their preparation, and the value is comparable to the activity that we measured in our As fraction (276 µmol O2/ (mg Chl * hr), Figure 1-figure supplement 9). Reported specific activity values from plants (Pisum sativum) are also similar: Su et al. reported a maximal value of 288 µmol O2/ (mg Chl * hr), again for larger complexes, which can explain some of the difference. However, the specific activity measured for the C2S2 PSII isolated in the current study is 2.8 X higher than this value, more than the difference in chl content, which ranges between 1.5 X and 2 X in favor of the larger complexes. If either one of the conformations is not as active, it would only mean that the other conformation displays even higher specific activity, which seems less likely. In addition, we find no difference around the oxygen evolution center or in the peripheral luminal subunits in either shape or map strength, so both orientations show highly similar structures around the regions which determine the oxygen evolution activity.
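  
  The arithmetic behind this comparison can be checked with a short sketch. The activity values are those quoted in the response (all per mg Chl per hour; the ratios do not depend on the unit prefix). The 50:50 split between conformations used at the end is a hypothetical number for illustration only, not a measured proportion, and the function names are assumptions.

```python
# Specific activities quoted in the response, in O2 per (mg Chl * hr).
C2S2 = 816.0
references = {
    "Sheng et al. 2019": 272.0,
    "As fraction (this study)": 276.0,
    "Su et al. (pea)": 288.0,
}

# Fold difference between the C2S2 preparation and each reference value.
ratios = {name: round(C2S2 / value, 2) for name, value in references.items()}


def required_activity(observed: float, active_fraction: float,
                      other_activity: float = 0.0) -> float:
    """Activity one conformation needs if the bulk value is a weighted mean.

    observed = active_fraction * x + (1 - active_fraction) * other_activity
    """
    return (observed - (1.0 - active_fraction) * other_activity) / active_fraction


# Hypothetical case: half of the particles are inactive.
needed = required_activity(C2S2, active_fraction=0.5)

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
print(f"activity required of the active half: {needed}")
```

  The second calculation restates the authors' argument: if one conformation contributed little, the other would need an activity well above an already high bulk value.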
  
  6) It is claimed that "more than 2100 water molecules were detected in the C2S2 compressed model", and the water distribution is shown in Figure 3. Obtaining resolutions capable of visualizing waters with cryo-EM is still a significant challenge. Upon visual inspection of the map supplied, it appears that several of the waters that were built into the atomic model simply do not have supporting peaks in the coulomb potential map above the level of noise. While some of the modeled waters are certainly supported by the map, in my opinion, there are many waters that simply are not, or at best are questionable. What method or tool was originally used to build waters into the model, and how were these waters subsequently validated during structure refinement?
  
  We followed standard methods for water placement and refinement in the preparation of the model, in addition to manually curating the water structure. However, in light of the reviewer’s comment we undertook additional rounds of refinement and inspection of the water molecules in the model. We removed a few hundred water molecules so that the total number of water molecules is now around 1700. All the water molecules in the present model should be well supported at map values higher than 2.5 sigma, and in our opinion the current water model should be regarded as conservative and as underestimating the number of bound water molecules. This also led to some improvements in additional validation statistics of the model, which are listed in Table 1. The new model has been deposited in the PDB and the new PDB validation report is included in our resubmission.
  
  7) The authors claim to identify several unique map densities during model building. One of these is a sodium ion close to the OEC, which is coordinated by D1-His337, several backbone carbonyls, and a water molecule. When looking closely at the cryo-EM map supplied, it appears that the coulomb potential map is quite weak for this sodium, and is only visible at quite low contour levels. In fact, the features for the coordinating water, and chloride ions located ~7-9A away are much stronger than the sodium. Do the authors have any explanation for why the cryo-EM map is significantly weaker for the sodium compared to the coordinating water or chloride ions in the same general vicinity? Similar to what they did for the other post-translational modifications, the authors should consider showing the actual cryo-EM map for the bound sodium in supplemental Figure 10 a,b.
  
  Our main support for the placement of a Na+ ion in this location stems from the analysis of Wang et al. Our maps show the presence of a density which is discernible at 4 σ, with an elongated shape suggesting the presence of multiple atoms/waters. Although in principle positive ions should have very strong densities in cryoEM maps due to their interactions with electrons, other factors such as occupancy, coordination and B-factor also play a role, making the distinction between water and sodium complicated and case specific. The sodium peak is not observed in unsharpened maps (as is the case for most of the water molecules which occupy conserved positions).
  
  We collected a few examples from comparable cases (cryo-EM maps of similar resolution ranges) where the presence of sodium ions is highly probable based on additional evidence. These map densities highlight the factors we discussed above. In cases ‘a’ (dual oxidase 1 prepared in high sodium conditions) and ‘b’ (human voltage-gated sodium channel), Na+ is observed in a highly coordinated state and, especially in ‘a’, shows the expected increase in density values compared to water molecules. However, cases ‘d’ (human Na+/K+ P-type ATPase) and ‘e’ (voltage-gated sodium channel) appear very similar to the proposed Na+ assignment in PSII. We conclude that map density alone is not enough to distinguish between Na+ and water molecules and rely on the additional experiments described by Wang et al., which show increased PSII activity at elevated Na+ levels in basic conditions.
  
  8) The cryo-EM maps showing CP29-Ser84 phosphorylation and CP47-Cys218 sulfinylation are quite convincing. However, it is interesting that these modifications are only observed in the compact conformation, and not in the stretched conformation. Can the authors elaborate on whether or not they believe the compact and stretched conformations could be a result of these posttranslational modifications, or vice versa?
  
  This is an interesting suggestion. In our opinion it is less likely that the modifications themselves trigger the transition between compact and stretched states. It is not clear how these modifications would stabilize the compact vs the stretched state. It is equally likely that these modifications are somehow triggered by the structural change. We cannot be certain that these modifications are not present in the stretched orientation as well but remain unobserved due to resolution differences. The correlation between the states and post-translational modifications should be verified before a discussion of their possible roles in the transitions.
  
  9) Do the authors believe that PSII dimers in the solution can readily interconvert between compact and stretched conformations? Or is the relative ratio of these conformations fixed at the time of membrane solubilization with decyl-maltoside?
  
  We think that it’s more probable that the transition between these states occurs in the membrane phase. The main reason for this is that pigment loss and structural transitions in CP29 are more likely to occur in the membrane than in aqueous/micelle environments.
  
  10) The model proposed for divalent cation-mediated stacking of PSII dimers is compelling, and seems to be in agreement with previous investigations that observed a lack of stacked dimers in cryo-EM preparations lacking calcium/magnesium. However, my understanding from reading the methods section is that the observed lack of density between the stacked PSII dimers was inferred from maps obtained after multibody refinement. Based on the way the masks to define bodies were created for multibody refinement (Fig. 4A), the region between stacked dimers would be highly prone to map artifacts following multibody refinement. Have the authors looked closely at the interfacial region between stacked dimers following conventional 3D classification/refinement to ensure that there are indeed no features observed in the interfacial region even at low contour levels?
  
  We’ve made several attempts to resolve differences in the space between the stacked PSII dimer. These include focused classification with masks containing selected volumes from this region and masks that include only one of the stacked PSII dimers to avoid signal subtraction in this region. None of these revealed any discernible features in this region. In addition, any stable binding of a bridging protein across the stacked dimer would probably be at least partially visible as additional density over the unstacked PSII. We searched for such features and found none.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.11.29.470333v1
www.biorxiv.org www.biorxiv.org

Linear summation of metabotropic postsynaptic potentials follows coactivation of neurogliaform interneurons

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  This manuscript by Gabor Tamas' group defines features of ionotropic and metabotropic output from a specific cortical GABAergic cell type, so-called neurogliaform cells (NGFCs), by using electrophysiology, anatomy, calcium imaging and modelling. Experimental data suggest that NGFCs converge onto postsynaptic neurons with sublinear summation of ionotropic GABAA potentials and linear summation of metabotropic GABAB potentials. The modelling results suggest a preferential spatial distribution of GABA-B receptor-GIRK clusters on the dendritic spines of postsynaptic neurons. The data provide the first experimental quantitative analysis of the distinct integration mechanisms of GABA-A and GABA-B receptor activation by the presynaptic NGFCs, and especially give insights into the logic of the volume transmission and the subcellular distribution of postsynaptic GABA-B receptors. Therefore, the manuscript provides novel and important information on the role of the GABAergic system within cortical microcircuits.
  
  We have made all changes humanly possible under the current circumstances and we are open to further suggestions deemed necessary.
  
  Reviewer #2:
  
  The authors present a compelling study that aims to resolve the extent to which synaptic responses mediated by metabotropic GABA receptors (i.e. GABA-B receptors) summate. The authors address this question by evaluating the synaptic responses evoked by GABA released from cortical (L1) neurogliaform cells (NGFCs), an inhibitory neuron subtype associated with volume neurotransmission, onto Layer 2/3 pyramidal neurons. While response summation mediated by ionotropic receptors is well-described, metabotropic receptor response summation is not, thereby making the authors' exploration of the phenomenon novel and impactful. By carrying out a series of elegant and challenging experiments that are coupled with computational analyses, the authors conclude that summation of synaptic GABA-B responses is linear, unlike the sublinear summation observed with ionotropic, GABA-A receptor-mediated responses.
  
  The study is generally straightforward, even if the presentation is often dense. Three primary issues worth considering include:
  
  1) The rather strong conclusion that GABA-B responses linearly summate, despite evidence to the contrary presented in Figure 5C.
  
  2) Additional analyses of data presented in Figure 3 to support the contention that NGFCs co-activate.
  
  3) How the MCell model informs the mechanisms contributing to linear response summation.
  
  These and other issues are described further below. Despite these comments, this reviewer is generally enthusiastic about the study. Through a set of very challenging experiments and sophisticated modeling approaches, the authors provide important observations on both (1) NGFC-PC interactions, and (2) GABA-B receptor mediated synaptic response dynamics.
  
  The differences between the sublinear, ionotropic responses and the linear, metabotropic responses are small. Understandably, these experiments are difficult – indeed, a real tour de force – from which the authors are attempting to derive meaningful observations. Therefore, asking for more triple recordings seems unreasonable. That said, the authors may want to consider showing all control and gabazine recordings corresponding to these experiments in a supplemental figure. Also, why are sublinear GABA-B responses observed when driven by three or more action potentials (Figure 5C)? It is not clear why the authors do not address this observation considering that it seems inconsistent with the study's overall message. Finally, the final readout – GIRK channel activation – in the MCell model appears to summate (mostly) linearly across the first four action potentials. Is this true and, if so, is the result inconsistent with Figure 5C?
  
  GABAB responses elicited by three and four presynaptic NGFC action potentials were investigated to better understand the extremes of the NGFC-PC connection. Although our spatial model suggests that in L1, at a single volumetric point, one or two NGFCs could provide a GABAB response through their respective volume transmission, it is still important that in a minority of cases three or more NGFCs could converge their output. The experiments in Fig 5 not only offer the mechanistic understanding that possible HCN channel activation and GABA reuptake do not significantly influence the summation of metabotropic receptor-mediated responses, but also provide additional information about extensive GABAB signaling from more than two NGFC outputs. Interestingly, in this experiment the summation up to two action potentials shows very similar linear integration to that seen in the triplet recordings. This result suggests that the temporal and spatial summation is identical when limited inputs are arriving at the postsynaptic target cell. A similar summation can be seen in our model up to two consecutive GABA releases. Three or four consecutive GABA releases in our model still produce linear summation, whereas our experiments show moderate sublinearity. One possible explanation for this inconsistency is vesicle depletion in NGFCs after multiple rapid releases of GABA, which was not taken into account in our model.
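  
  The distinction between linear and sublinear summation, and the way vesicle depletion could produce the latter, can be sketched with a toy calculation. All amplitudes and the depletion factor below are hypothetical values chosen for illustration; they are not taken from the recordings or from the MCell model, and a constant depletion factor is a deliberate simplification.

```python
def linearity_ratio(measured_sum: float, individual: list) -> float:
    """Measured compound response divided by the arithmetic sum of its parts.

    A value near 1 indicates linear summation; a value below 1, sublinear.
    """
    return measured_sum / sum(individual)


def depleting_train(first_amplitude: float, n_spikes: int, kept: float) -> float:
    """Compound response when each release is `kept` times the previous one."""
    return sum(first_amplitude * kept ** k for k in range(n_spikes))


unit = 1.0   # hypothetical single-spike GABAB response, arbitrary units
kept = 0.8   # hypothetical fraction of release retained per successive spike

ratios = {}
for n in range(1, 5):
    measured = depleting_train(unit, n, kept)
    ratios[n] = linearity_ratio(measured, [unit] * n)
    print(f"{n} spike(s): ratio = {ratios[n]:.3f}")
```

  Without depletion (kept = 1.0) the ratio stays at 1 for any number of spikes, which corresponds to the behaviour of the model as described in the response.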
  
  Presumably, the motivation for Figure 3 is that it provides physiological context for when NGFCs might be coactive, thereby providing the context for when downstream, PC responses might summate. This is a nice, technically impressive addition to the study. However, it seems that a relevant quantification/evaluation is missing from the figure. That is, the authors nicely show that hind limb stimulation evokes responses in the majority of NGFCs. But how many of these neurons are co-active, and what are their spatial relationships? Figure 3D appears to begin to address this point, but it is not clear if this plot comes from a single animal, or multiple? Also, it seems that such a plot would be most relevant for the study if it only showed alpha-actin 2-positive cells. In short, can one conclude that nearby, presumptive NGFCs co-activate, and is this conclusion derived from multiple animals?
  
  The aim of Fig. 3D was to indicate that the active cells, presumably NGFCs, are spatially located close to each other. The figure comes from a single animal. We agree with the reviewer and have therefore replaced the scatter plot in Fig. 3D with another one that provides information about the molecular profiles of the active/inactive cells. We made an effort to further analyze our in vivo data and the spatial localization of the monitored interneurons (see Author response image 3). The results are from 4 different animals; in these experiments numerous L1 interneurons are active during the sensory stimulus, as shown in the scatter plot. We calculated the shortest distance between all active cells and all ɑ-actinin2+ cells that were active in the experiments. The data suggest that in the case of identified active ɑ-actinin2+ cells, the interneuron somas were on average 182.69 ± 60.54 or 305.135 ± 34.324 μm from each other. Data from Fig. 2D indicate that the average axonal arborization of the NGFCs reaches ~200-250 μm away. Taking these two results together, in theory it is probable that the spatial localization would allow neighboring NGFCs to directly interact at the same spatial point.
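  
  The distance analysis described here can be sketched as a nearest-neighbour calculation on soma coordinates, followed by a comparison with the axonal reach. The coordinates are hypothetical, the 200 μm reach is the lower end of the range quoted in the response, and the function name is an assumption; the authors' actual analysis may differ.

```python
import math


def nearest_neighbour_distances(points):
    """For each point, the distance to its closest other point (needs >= 2 points)."""
    distances = []
    for i, p in enumerate(points):
        others = (math.dist(p, q) for j, q in enumerate(points) if j != i)
        distances.append(min(others))
    return distances


# Hypothetical soma positions of active cells in one field of view (μm).
somata = [(0.0, 0.0), (150.0, 80.0), (420.0, 60.0), (300.0, 310.0)]
axonal_reach_um = 200.0  # lower end of the ~200-250 μm range from Fig. 2D

nn = nearest_neighbour_distances(somata)
within_reach = [d <= axonal_reach_um for d in nn]

for soma, d, ok in zip(somata, nn, within_reach):
    print(f"{soma}: nearest active neighbour at {d:.1f} μm, within reach: {ok}")
```

  Comparing each nearest-neighbour distance with the reach of a single arbor is the most conservative reading; two arbors could in principle overlap at larger soma separations.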
  
  The inclusion of the diffusion-based model (MCell) is commendable and enhances the study. Also, the description of GABA-B receptor/GIRK channel activation is highly quantitative, a strength of the study. However, a general summary/synthesis of the observations would be helpful. Moreover, relating the simulation results back to the original motivation for generating the MCell model would be very helpful (i.e. the authors asked whether "linear summation was potentially a result of the locally constrained GABAB receptor - GIRK channel interaction when several presynaptic inputs converge"). Do the model results answer this question? It seems as if performing "experiments" on the model wherein local constraints are manipulated would begin to address this question. Why not use the model to provide some data – albeit theoretical – that begins to address their question?
  
  We re-formulated the problem to be addressed in this Results section. We acknowledge in the Discussion that our model has several limitations and, consequently, we restricted its application to a limited set of quantitative comparisons paired to our experimental dataset or directly related to pioneering studies on GABAB efficacy on spines vs shafts. We believe that a proper answer to the reviewer’s suggestion would be worth a separate and dedicated study with an extended set of parameters and a more elaborate model.
  
  In sum, the authors present an important study that synthesizes many experimental (in vitro and in vivo) and computational approaches. Moreover, the authors address the important question of how synaptic responses mediated by metabotropic receptors summate. Additional insights are gleaned from the function of neurogliaform cells. Altogether, the authors should be congratulated for a sophisticated and important study.
  
  Reviewer #3:
  
  The authors of this manuscript combine electrophysiological recordings, anatomical reconstructions and simulations to characterize synapses between neurogliaform interneurons (NGFCs) and pyramidal cells in somatosensory cortex. The main novel finding is a difference in summation of GABAA versus GABAB receptor-mediated IPSPs, with a linear summation of metabotropic IPSPs in contrast to the expected sublinear summation of ionotropic GABAA IPSPs. The authors also provide a number of structural and functional details about the parameters of GABAergic transmission from NGFCs to support a simulation suggesting that sublinear summation of GABAB IPSPs results from recruitment of dendritic shaft GABAB receptors that are efficiently coupled to GIRK channels.
  
  I appreciate the topic and the quality of the approach, but there are underlying assumptions that leave room to question some conclusions. I also have a general concern that the authors have not experimentally addressed mechanisms underlying the linear summation of GABAB IPSPs, reducing the significance of this most interesting finding.
  
  1) The main novel result of broad interest is supported by nice triple recording data showing linear summation of GABAB IPSPs (Figure 4), but I was surprised this result was not explored in more depth.
  
  We have chosen the approach of studying GABAB-GABAB interactions through the scope of neurogliaform cells and explored how neurogliaform cells as a population might give rise to the summation properties studied with triple recordings. This was a purposeful choice admittedly neglecting other possible sources of GABAB-GABAB interactions which possibly take place during high frequency coactivation of homogeneous or heterogeneous populations of interneurons innervating the same postsynaptic cell. We agree with the reviewer that the topic of summation of GABAB IPSPs is important and in-depth mechanistic understanding requires further separate studies.
  
  2) To assess the effective radius of NGFC volume transmission, the authors apply quantal analysis to determine the number of functional release sites to compare with structural analysis of presynaptic boutons at various distances from PC dendrites. This is a powerful approach for analyzing the structure-function relationship of conventional synapses but I am concerned about the robustness of the results (used in subsequent simulations) when applied here because it is unclear whether volume transmission satisfies the assumptions required for quantal analysis. For example, if volume transmission is similar to spillover transmission in that it involves pooling of neurotransmitter between release sites, then the quantal amplitude may not be independent of release probability. Many relevant issues are mentioned in the discussion but some relevant assumptions about QA are not justified.
  
  Indeed, pooling of neurotransmitter between release sites may affect quantal amplitude. We therefore examined quantal amplitude under low release probability conditions, using 0.7-1.5 mM [Ca]o to detect postsynaptic uniquantal events initiated by neurogliaform cell activation (Author response image 7). The quantal current amplitudes measured this way were similar to those obtained with the BQA method, with no significant difference (4.46±0.83 pA, n=4, P=0.8, Mann-Whitney test).
  
  3) The authors might re-think the lack of GABA transporters in the model since the presence and characteristics of GATs will have a large effect on the spread of GABA in the extracellular space.
  
  We agree that the presence of GAT could effectively shape the GABA exposure, e.g. (Scimemi 2014). During the development of the model, we took into consideration different possibilities and solutions to create the model’s environment. To our knowledge, there is no detailed electron microscopic study that would provide ultrastructural measurements of structural elements around the NGFC release sites and postsynaptic pyramidal cell dendrites in layer 1 while preserving the extracellular space. Moreover, quantitative information is scarce about the exact localization and density of the GATs along the membrane surface of glial processes around confirmed NGFC release sites. We felt that developing a functional environment that would contain GABA transporters without possessing such information would be speculative. Furthermore, during the development of the model it became clear that incorporating thousands of differentially located GABA transporters would massively increase the processing time of single simulations including monitoring each interaction between GATs and GABA molecules, and requiring computational power calculating the diffusion of GABA molecules in the extracellular space, even if GABA molecules are far from the postsynaptic dendritic site without any interaction.
  
  As an admittedly simple and constrained alternative, we decided to set a decay half-life for the GABA molecules released. This approach allows us to mimic the GABA exposure time of 20-200 ms, based on experimental data (Karayannis et al 2010). In the model the GABA exposure time was 114.87 ± 2.1 ms with decay time constants of 11.52 ± 0.14 ms. After ~200 ms all the released GABA molecules disappeared from the simulation environment.
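  
  The decay half-life approach described above can be illustrated with a short calculation. This is a sketch under the assumption that GABA clearance follows simple first-order (single-exponential) decay; the MCell implementation may differ, and the function names and the number of released molecules are hypothetical. Only the 11.52 ms time constant is taken from the response.

```python
import math

def half_life_from_tau(tau_ms):
    """Half-life (ms) of a first-order decay with time constant tau_ms."""
    return tau_ms * math.log(2)

def fraction_remaining(t_ms, tau_ms):
    """Fraction of molecules still present after t_ms of first-order decay."""
    return math.exp(-t_ms / tau_ms)

def expected_molecules(n_released, t_ms, tau_ms):
    """Expected number of molecules left at time t_ms."""
    return n_released * fraction_remaining(t_ms, tau_ms)

TAU_MS = 11.52        # decay time constant quoted in the response
N_RELEASED = 10_000   # hypothetical number of released GABA molecules

half_life_ms = half_life_from_tau(TAU_MS)
left_at_200_ms = expected_molecules(N_RELEASED, 200.0, TAU_MS)
```

  Under these assumptions the expected number of molecules left at 200 ms is far below one, which is in line with the statement that essentially all released GABA has left the simulation environment by then.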
  
  Detailed modeling of extracellular diffusion was outside the scope of our model; we were interested in investigating how the subcellular localization of receptors and channels determines the summation properties.
  
  4) I'm not convinced that the repetitive stimulation protocol of a single presynaptic cell shown (Figure 5) is relevant for understanding summation of converging inputs (Figure 4), particularly in light of the strong use-dependent depression of GABA release from NGFCs. It is also likely that shunting inhibition contributes to sublinear summation to a greater extent during repetitive stimulation than summation from presynaptic cells that may target different dendritic domains. The authors claim that HCN channels do not affect integration of GABAB IPSPs but one would not expect HCN channel activation from the small hyperpolarization from a relatively depolarized holding potential.
  
  Use-dependent synaptic depression of NGFC-induced postsynaptic responses was nicely documented by Karayannis and coworkers (2010), although they investigated the GABAA component of the responses and found that the depression is caused by desensitization of postsynaptic GABAA receptors. We are not aware of published experiments on the short-term plasticity of GABAB responses. In our experiments shown in Fig. 5 we found linearity in the summation of GABAB responses up to two action potentials and sublinearity for 3 and 6 action potentials. In fact, our results show that no synaptic depression is detectable in response to paired pulses: the amplitudes of the voltage responses were doubled compared to a single pulse, which means that the paired-pulse ratio is around 1. To verify our result, we repeated our dual recording measurements with one, two, three and four spikes initiated in the presynaptic neurogliaform cell (Author response image 6). Measuring both the amplitude and the overall charge of GABAB responses, we again found a linear relationship between the one- and two-spike protocols.
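  
  The linearity argument above amounts to comparing each measured response with the linear prediction (number of spikes times the single-spike response). A minimal sketch of that comparison follows; the function names and the 10% tolerance are assumptions made for illustration, not the authors' criteria.

```python
def linearity_ratio(measured, single_response, n_spikes):
    """Measured response divided by the linear prediction n_spikes * single_response."""
    if n_spikes < 1:
        raise ValueError("n_spikes must be at least 1")
    if single_response == 0:
        raise ValueError("single_response must be non-zero")
    return measured / (n_spikes * single_response)

def classify_summation(ratio, tolerance=0.1):
    """Label a ratio as linear, sublinear or supralinear (tolerance is an assumption)."""
    if ratio > 1 + tolerance:
        return "supralinear"
    if ratio < 1 - tolerance:
        return "sublinear"
    return "linear"
```

  For example, a two-spike response that is twice the single-spike amplitude gives a ratio of 1 and is labeled linear, whereas a three-spike response of only 2.1 times the single-spike amplitude gives 0.7 and is labeled sublinear.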
  
  Author response image 6 - Integration of GABAB receptor-mediated synaptic currents. (A) Representative recording of neurogliaform synaptic inhibition on a voltage-clamped pyramidal cell. Bursts of up to four action potentials were elicited in NGFCs at 100 Hz in the presence of 1 μM gabazine and 10 μM NBQX. (B) Summary of normalized IPSC peak amplitudes (left) and charge (right). (C) Pharmacological separation of neurogliaform-initiated inhibitory current.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2020.12.10.418913v1
www.biorxiv.org www.biorxiv.org

Single-cell profiling reveals periventricular CD56bright NK cell accumulation in multiple sclerosis

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  Overall the work is an impressive analysis of an understudied cell-type in human MS, and represents an important finding. The paper is well presented and the figures very clear. However, the manuscript is descriptive and, although this is not a problem by itself, the depth and limitations of the CyTOF (only 37 markers) leave the reader without a clear idea of what these cells could be doing.
  
  Some single-cell RNAseq and other ways to interrogate potential mechanisms and function would be particularly helpful here, but are perhaps beyond the scope of the paper.
  
  We thank the reviewer for this nice comment. We fully agree that a next informative step would be the investigation of the function and mechanisms of the NK cell populations in MS pathology. At this moment, that is indeed beyond the scope of the current manuscript. We do believe that our findings can guide future studies to explore potential mechanisms of NK cells in more depth.
  
  At minimum, more immunohistochemistry and smFISH or in situ hybridization to validate key findings (using the markers identified by CyTOF) and to add to the spatial relationships of NK cells with other border and brain cells would be informative.
  
  We appreciate this suggestion and have performed several immunohistochemical analyses to study the spatial relationship of NK cells and other immune and brain cells in the MS brain (Essential Revisions Fig. 1). We have stained the same cohort described in the manuscript for CD45, NKp46, GrB and Iba1 as well as CD45, NKp46, GrB and GFAP, to study the interaction of NK cells with microglia/macrophages and astrocytes, respectively, and with CD45+ immune cells in general. In MS lesions, we were able to detect a small but similar percentage of putative CD56bright NK cells (CD45+ NKp46+ GrB- cells) interacting with CD45+ Iba1- cells and with CD45+ Iba1+ cells (Essential Revisions Figure 1a-b). Due to astrogliosis, the processes of astrocytes densely populate the MS lesions and as such, we cannot infer if the interaction between NK cells and astrocytes is functional. Furthermore, the absolute number of NK cells in control brains is low, so we can only obtain reliable data from MS brains. As a result, we are unable to compare the observed interactions in MS lesions with a control condition. Of note, CD56bright NK cells are potent cytokine producers and their potential regulatory functions are not limited to contact-dependent interactions.
  
  Essential Revisions Fig. 1. Cellular interactions of Granzyme B- NK cells. (a) Representative immunohistochemical staining of Granzyme B- NK cells stained for CD45 (green), NKp46 (magenta) and negative for Granzyme B (cyan), together with microglia stained with Iba1 (red). Scale bar = 10 µm. (b) Pie chart displaying the percentage of CD45+ NKp46+ Granzyme B- cells interacting with CD45+ Iba1+ and CD45+ Iba1- cells in MS lesions. (c) Representative immunohistochemical staining of NK cells stained for CD45 (green), NKp46 (magenta) and negative for Granzyme B (cyan), together with astrocytes stained with GFAP (red). Scale bar = 10 µm.
  
  A major weakness of the study is that it is underpowered, and it is thus not clear how robust or representative these findings are in MS, given the heterogeneity of the disease, potential differences in sex and age, and the lack of healthy controls. (AD samples labelled as control.)
  
  We thank the reviewer for their comment. First, we would like to comment on the presumed lack of healthy controls. In this study, we included two ‘control’ groups: one consisted of non-neurological controls (“NNC”), free of any neurological disease, and the other consisted of neurological controls (“NC”), including patients with dementia and Alzheimer’s disease. We acknowledge that this terminology is confusing for the reader; we have therefore renamed the “NC” group of patients suffering from dementia to “Dementia” and the “NNC” group of donors without neurological disease to “Controls”.
  
  Secondly, while our sample size is rather small, it is comparable to other studies that use fresh post-mortem brain tissue (Böttcher et al., 2020). The use of this unique post-mortem brain tissue from human donors is severely limited by the number of well-characterized samples available, their demographics and clinical background. To overcome the underpowered design and possible effects of confounders such as sex and age, we validated our main finding by multiplex immunohistochemistry in a separate cohort. This included 5 controls (2 females, 3 males, f:m ratio of 0.667) and 7 MS cases (3 females and 4 males, f:m ratio of 0.75), with a similar female/male ratio and matched age (Wilcoxon rank sum test with continuity correction, p-value = 0.41). We have now included the characteristics of the validation cohort in the manuscript as well.
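  
  The female:male ratios quoted above follow directly from the reported cohort counts. A minimal arithmetic sketch is given below; the function name is hypothetical, and only the counts (2 females and 3 males in controls, 3 females and 4 males in MS) come from the response.

```python
def female_male_ratio(n_female, n_male):
    """Female:male ratio for a cohort; n_male must be non-zero."""
    if n_male == 0:
        raise ValueError("n_male must be non-zero")
    return n_female / n_male

# Counts of the validation cohort as reported in the response.
controls_ratio = female_male_ratio(2, 3)
ms_ratio = female_male_ratio(3, 4)
```

  Rounded to three decimals these give 0.667 for controls and 0.75 for MS cases, matching the values stated in the text.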
  
  “Finally, to confirm that CD56bright NK cells accumulate in periventricular brain regions in MS donors, we used multiplex immunohistochemistry in an independent cohort (Table 1), wherein MS and control groups were age-matched (Wilcoxon rank sum test with continuity correction, p-value = 0.41) and had a similar female:male ratio (0.667 in controls and 0.75 in MS).”
  
  Böttcher C, van der Poel M, Fernández-Zapata C, Schlickeiser S, Leman JKH, Hsiao CC, Mizee MR, Adelia, Vincenten MCJ, Kunkel D, Huitinga I, Hamann J, Priller J (2020) Single-cell mass cytometry reveals complex myeloid cell composition in active lesions of progressive multiple sclerosis. Acta neuropathologica communications, 8(1), 1-18
  
  It is also important to show the NK cells are actually in the parenchyma and interacting with other cells (e.g., microglia) of the lesion. If the authors have this tissue and antibodies to do that, this would add to the study. Moreover, the details on samples and controls should be more clearly communicated in the text and legends as well as the caveats and limitations of the study in the Discussion.
  
  The location of NK cells within the brain parenchyma is an important determinant of their function within the CNS. Thus, we included a basement membrane marker (collagen IV) in our multiplex IHC panel in order to exclude the cells within the vessel lumen. As this had not been clearly communicated, we have adjusted the sentence in the subsection Multiplex immunohistochemistry in the Methods (from “Cells within the lumen of vessels from the choroid plexus sections were excluded manually” to “Cells within the lumen of vessels were excluded manually with the aid of collagen IV staining.”). In Essential Revisions Fig. 1 we present the additional IHC experiments performed to explore the interactions of NK cells with other brain-resident cells. We thank the reviewer for alerting us to the confusing nomenclature. We have thus adjusted the labels of the three main groups throughout the manuscript as follows: Control (previously, NNC), Dementia (previously, NC) and MS (same as before). We have also expanded the limitations of this study in the Discussion.
  
  “Our study has two main limitations, first scarcity of fresh human tissue prevented having sex and age-matched groups with large sample sizes for the CyTOF analysis. To overcome the underpowered design and possible effects of confounders, we have validated our main finding by multiplex immunohistochemistry in a separate cohort with a similar age and female/male ratio. Secondly, there is a strong contribution of blood-derived immune cells in the choroid plexus, which precluded a clear distinction between circulating and stromal immune cells. This may have prevented the detection of choroid-plexus specific changes in the stroma, such as an accumulation of CD8+ T cells in the choroid plexus from MS donors, previously described by our group using immunohistochemistry [47]. In addition, the high proportion of granulocytes in the CP as detected by our CyTOF analysis likely originates from the circulation [47,63]. Contrariwise, the scarcity of B cells, despite the high vascularisation, is in line with previous reports [47,63]; and the detection of rare ASCs in the choroid plexus but not in the blood reassures their tissue specificity [63].”
  
  Reviewer #2 (Public Review):
  
  The data are extensive, valuable, convincing, and entirely descriptive (as studies using human post-mortem material must be, of necessity). What emerges is a detailed account of NK cells in specific regions of the MS brain (although here the authors slightly overplay how little is known about NK cells in MS). The study provides a very comprehensive resource. The authors speculate on what their data might mean in terms of disease dynamics in a reasonable and informed way, but much of what is concluded is inference not backed up by experimental studies that would allow this to be more than a resource paper.
  
  We thank the reviewer for his/her compliments and agree that in this manuscript we can only speculate on the role of NK cells and their migration to, or proliferation within, the brain. Only future research can resolve these questions. We have addressed these concerns accordingly in the Discussion and have removed any conclusive or far-fetched speculations that are not backed up by our own data.
  
  Reviewer #3 (Public Review):
  
  The authors introduce their work in the context of the prevailing uncertainties about the pathogenesis of multiple sclerosis (MS) and, in particular, seem to reference the initiation of immune lesions in early MS. However, the work itself addresses end-stage MS situations, which is quite possibly an entirely different landscape altogether, and may not be informative about MS initiation.
  
  We want to thank the reviewer for pointing out this misleading part of the text. We agree that our study does not provide any information on the initial stages of MS, and have therefore adjusted this part of the introduction to avoid confusion. “Brain regions around the ventricles are hotspots for MS lesions [8,21,39,52], but underlying mechanisms are poorly understood [41]. Since the majority of periventricular MS lesions occur around a central vessel [1,57], it has been suggested that vascular topography may influence MS pathology [33].”
  
  As a textual point, the manuscript makes far too many speculations about possible cell trafficking between compartments than is justified by a cross-section study.
  
  We appreciate this concern and have therefore toned down our speculations in the Results and Discussion sections.
  
  That said, the work itself is a carefully done descriptive characterisation of the leucocyte landscape found in the periventricular septum, choroid plexus (and peripheral blood) post-mortem from cases of multiple sclerosis (MS), non-MS neurological disease (dementia), and non-neurological controls (8-12 each). The material is rare, the post-mortem delays are quite short, the cell lineage characterisation is fairly extensive and some of the data are well supported by immunohistochemistry.
  
  We thank the reviewer for these compliments.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.09.17.460741v1
www.biorxiv.org www.biorxiv.org

Virus adaptation to heparan sulfate comes with capsid stability tradeoff

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #4 (Public Review):
  
  In this work, Tee et al. study the implications of Heparan Sulfate (HS) binding mutations observed on the Enterovirus A71 (EV-A71) capsid. HS-binding mutations are observed for several virus infections and are often presumed to be a cell culture adaptation. However, in the case of EV-A71, the presence of HS-binding mutations in clinical samples and the contradictory findings in animal studies have made the clinical relevance of HS-binding a subject of debate. Therefore, to better understand the role of HS-binding in EV-A71, the authors use a mouse-adapted EV-A71 variant (MP4) and compare it to a cell-adapted strong HS-binder (MP4-97R/167G). Using these two variants, the authors show that the strong HS-binder does not require acidification for uncoating and genome release. Furthermore, it is demonstrated that the capsid stability of the HS-binding variant is compromised, resulting in pH-independent uncoating. Overall, this study provides new insights demonstrating that seemingly beneficial mutations increasing viral replication may be counterbalanced by other unintended consequences.
  
  Strengths:
  
  The thoroughness of the experiments performed to demonstrate that the HS-binding phenotype results in pH-independent entry and capsid destabilisation is worth highlighting. In this regard, the authors have explored viral entry using a range of approaches involving lysosomotropic drugs, viral binding assays, and neutral red-labelled viruses coupled with diverse techniques such as FISH, RNAscope, and transient expression of constitutively active molecules to inhibit parts of the viral cycle. In my opinion, this is necessary to rule out the other downstream effects of the lysosomotropic drugs and to confirm the role of the HS-binding mutation in the entry phase. The use of in silico analysis coupled with negative staining electron microscopy and environmental challenge assays is notable. Finally, the demonstration of some of the work using a human-relevant strain is commendable.
  
  We appreciate the reviewer’s recognition of the significance of our study and the valuable advice.
  
  Weaknesses:
  
  A major weakness in this study is the focus on using a mouse-adapted EV-A71 strain (MP4). In the introduction, it is argued that HS-binding mutations are controversial due to their occurrence in cell culture. However, due to host limitations, mice are not the natural hosts for EV-A71 and thus, the same argument can be made for a mouse-adapted strain. It is not clear how different this strain is from circulating EV-A71 strains and the relevance of these findings to the human situation is questionable. This is particularly made evident in the discussion where it is highlighted that HS-binding variants (VP1-145G/Q mutants) have been associated with severe neurological cases while the same variants show attenuated phenotypes in mice and monkeys. This contrast between clinical data and animal studies should be highlighted in the introduction, rather than later in the discussion, as currently the in vivo animal studies are presented as the optimal situation and may lead to misconstrued conclusions from the results.
  
  As requested by the reviewer, we included new experiments performed with a clinical strain isolated from an immunosuppressed patient (Cordey et al., 2012). We compared the HCQ sensitivity of this human strain with and without the VP1 L97R and E167G mutations and confirmed a differential sensitivity to HCQ similar to that observed with the MP4 variant. This result is presented as a new supplementary figure (Figure 6-figure supplement 1) and is described in the Results section of the revised manuscript (Page 7, line 251).
  
  Page 7, lines 251: To determine if our observations are applicable to human strains, we examined the sensitivity of a closely related clinical strain. This strain was isolated from the respiratory tract of an immunosuppressed patient with a disseminated EV-A71 infection27. Additionally, we tested a strong HS-binding derivative that harbors the same VP1-L97R and E167G mutations as our MP4 double mutant. Notably, this human clinical strain shares 98.3% amino acid similarity with the MP4 variant used in this study and exhibits similar HS-binding phenotypes28. As shown in Figure 6-figure supplement 1, the original human strain was inhibited by HCQ, whereas the double mutant exhibited insensitivity to the drug.
  
  We also added the comment about discrepancy between clinical data and animal studies in the introduction as requested (page 2, lines 69-76): However, epidemiological surveillance of human EV-A71 infections19-21 and experimental evidence from 2D human fetal intestinal models22, human airway organoids23 and air-liquid interface cultures24 suggest that HS binding may enhance viral replication and virulence in humans. In addition, recent research has shown that EV-A71 can be released and transmitted via cellular extrusions25 or exosomes26, potentially preventing viral trapping of HS-binding strains in the circulation. Further studies are required to evaluate the true impact of HS-binding mutations on the spread and virulence of EV-A71 in both animal models and humans.
  
  An important consideration is that the results are based primarily on image analysis. The inclusion of RT-qPCR and/or plaque assays as supplementary data will help strengthen the findings.
  
  We have performed RT-qPCR to confirm the immunostaining data and included them in the supplementary data (Figure 1-figure supplement 1E). Reference to these data is made in the result section [Page 4, lines 114-116: These results were confirmed by viral load quantification with real-time RT-PCR (Figure 1-figure supplement 1E).]
  
  Moreover, there are suggestions of an intermediate binder having a different phenotype. As this intermediate binder is the clinical phenotype, data on the entry of this intermediate binder will be valuable.
  
  While we agree with the reviewer that the single mutant is an intermediate binder and exhibits a clinical phenotype, we decided to work with variants that display clear phenotypes, selecting MP4 and the double mutant, as the latter is fully attenuated in both immunocompetent and immunosuppressed mice (Weng et al., 2023). Additionally, we performed an experiment using HCQ, in which we observed an intermediate effect with the single mutant. This further supported our decision to proceed with MP4 and the double mutant for all experiments. The data supporting this are shown in Author response image 1, which we are sharing exclusively with the reviewer.
  
  Author response image 1.
  
  Differential sensitivity of MP4, MP4-97R and MP4-97R167G to Lysosomotropic drugs
  
  Another weakness in the study is the lack of contextualization of the results to current EV-A71 literature. For instance, SCARB2 is referred to as the internalization receptor but a recent study has shown that SCARB2 is not required for internalization (https://doi.org/10.1128%2Fjvi.02042-21). The findings from this study are consistent with the localization of SCARB2 in the lysosomal membranes. Furthermore, the same study has highlighted host sulfation as a key factor in EV-A71 entry. Post-translational sulfation introduces negatively charged residues on host proteins including HS and SCARB2. This increases the binding of HS-binding strains to these proteins. In this regard, the reduced infectivity upon soluble SCARB2 treatment may simply be due to enhanced binding rather than capsid opening as suggested in the results. Therefore, additional experiments (e.g. nSEM following soluble SCARB2 treatment) must be performed to support the conclusion of capsid opening, due to inherent instability, upon SCARB2 binding.
  
  We apologize for not citing this relevant literature excluding the role of SCARB2 in viral attachment. We have now included these references in the revised version of the manuscript (Page 2, lines 54-56: “Since SCARB2 is mostly localized on endosomal and lysosomal membrane and sparsely on plasma membrane3,5, it seems to play only a minor role in EV-A71 cell attachment6,7.”).
  
  We thank the reviewer for mentioning the possibility that the sulfation of SCARB2 may enhance its binding to the mutated virus compared to the wild-type virus, potentially explaining the selective competitive inhibition of this variant by soluble SCARB2 produced in mammalian cells. To investigate this hypothesis, we performed nsEM imaging of the double mutant incubated with soluble SCARB2 and we observed an increase in the proportion of empty capsids in the presence of soluble SCARB2 (4% versus 0.7%), supporting our original findings that the inactivation is indeed associated with capsid opening. The results are included in the revised manuscript in Figure 5-figure supplement 4 and described on Page 7, lines 243-245: “However, the double mutant exhibited a ~5-fold increase in empty capsid percentage after treatment with sSCARB2 (Figure 5-figure supplement 4), consistent with the functional data above.”
  
  In addition to the above, other existing literature on EV-A71 pathogenesis using organoids contradicts some of the explanations of differential phenotype in clinical observations versus mice models. In the introduction, it is suggested that reduced neurovirulence of HS-binding strains is due to binding to the vascular endothelia. However, the correlation of clinical severity to viremia (https://doi.org/10.1186/1471-2334-14-417) and the association of HS-binding mutants to clinical disease counteract this suggestion. Similarly, viral infection in human organoids with EV-A71 results in as low as 0.4% of the cells being infected (https://doi.org/10.1038/s41564-023-01339-5). In this case, if viral binding to (ubiquitously expressed) HS results in viral trapping then the HS-binding mutants should show lowered infectivity in organoid models rather than the observed higher infectivity (https://doi.org/10.3389/fmicb.2023.1045587, https://doi.org/10.1038/s41426-018-0077-2). Finally, EV-A71 release has also been shown to occur in exosomes (https://doi.org/10.1093%2Finfdis%2Fjiaa174) which effectively provides a protective lipid membrane. These recent findings must be incorporated into the article and will help better contextualize their findings.
  
  We appreciate the reviewer’s thoughtful comments. We do not believe that the correlation between clinical severity and viremia contradicts the viral trapping hypothesis. For strains that do not bind to HS, the absence of viral trapping could indeed lead to higher viral concentrations in the bloodstream, potentially increasing neurovirulence. However, we agree with the reviewer that other observations in humans, along with experimental data from more relevant models such as organoids, challenge the trapping hypothesis. We are grateful for the suggested citations and have incorporated these references in the introduction, where we discuss this point in more detail.
  
  Page 2, lines 69-76: “However, epidemiological surveillance of human EV-A71 infections19-21 and experimental evidence from 2D human fetal intestinal models22, human airway organoids23 and air-liquid interface cultures24 suggest that HS binding may enhance viral replication and virulence in humans. In addition, recent research has shown that EV-A71 can be released and transmitted via cellular extrusions25 or exosomes26, potentially preventing viral trapping of HS-binding strains in the circulation. Further studies are required to evaluate the true impact of HS-binding mutations on the spread and virulence of EV-A71 in both animal models and humans.”
  
  Overall, the authors present new findings with convincing methodology. The manuscript can be improved in the contextualization of the findings and highlighting the weakness in translating these findings to resolve the debate surrounding the relevance of HS-binding phenotype. The inclusion of additional experiments and data recommended to the authors will also help strengthen the manuscript.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.02.23.581741v1
www.biorxiv.org www.biorxiv.org

New submission 18/10/2022, 22:40:09

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  This manuscript investigates the gene regulatory mechanisms that are involved in the development and evolution of motor neurons, utilizing cross-species comparison of RNA-sequencing and ATAC-sequencing data from little skate, chick and mouse. The authors suggest that both conserved and divergent mechanisms contribute to motor neuron specification in each species. They also claim that more complex regulatory mechanisms have evolved in tetrapods to accommodate sophisticated motor behaviors. While this is strongly suggested by the authors' ATAC-seq data, some additional validation would be required to thoroughly support this claim.
  
  Strengths of the manuscript:
  
  1) The manuscript provides a valuable resource to the field by generating an assembly of the little skate genome, containing precise gene annotations that can now be utilized to perform gene expression and epigenetic analyses. The authors take advantage of this novel resource to identify novel gene expression programs and regulatory modules in little skate motor neurons.
  
  2) Cross-species RNA-seq and ATAC-seq data comparisons are combined in a powerful approach to identify novel mechanisms that control motor neuron development and evolution.
  
  Weaknesses:
  
  1) It is surprising that the analysis of RNA-seq datasets between mouse, chick, and little skate only identified 5 genes that are common between the 3 species, especially given the authors' previous work identifying highly conserved molecular programs between little skate and mouse motor neurons, including core transcription factors (Isl1, Hb9, Lhx3), Hox genes and cholinergic transmission genes. This raises some questions about the robustness of the sequencing data and whether the genes identified represent the full transcriptome of these motor neurons.
  
  To address reviewer #1’s questions, we have generated RNA sequencing data from mouse forelimb MNs and re-analyzed the RNA-seq data using only the homologous MN populations (Figure 3) among different species. As a result, many genes (1038 genes) are found to be commonly expressed in MNs across species, including many known MN marker genes. In the Results section, we have added the following:
  
  “The evolution of genetic programs in MNs was investigated unbiasedly by comparing highly expressed genes in pec-MNs (percentile expression > 70) of little skate with the ones from MNs of mouse and chick, two well-studied tetrapod species. In order to compare gene expression with homologous cell types from each species, we performed RNA sequencing on forelimb MNs of mouse embryos at embryonic day 13.5 (e13.5) and wing level MNs of chick embryos at Hamburger-Hamilton (HH) stage 26–27…”
  
  We have also compared our re-analysis with previous results in Figure 2–figure supplement 1, shown above. Most of the fin MN genes (21/24) are highly expressed in pecMNs (percentile > 70), consistent with the previous in situ experiments. In the Results we have added the following:
  
  “Although the total number of DEGs are different from the previous data (592 vs. 135 genes in pec-MN DEGs), which might be caused by different statistical analysis with different reference genome, previous RNA-seq data based on de novo assembly and annotation using zebrafish was mostly recapitulated in our DEG analysis based on our new skate genome (21 out of 24 previous fin MN marker genes have the expression level ranked above 70th percentile in Pec-MNs; Figure 2‒figure supplement 1).”
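  The percentile cutoff described above (genes ranked above the 70th percentile of expression, then counted against a marker list) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the definition of percentile rank (share of genes with strictly lower expression), and the toy expression values are all assumptions made for the example.

```python
from bisect import bisect_left


def percentile_ranks(expression):
    """Map each gene to the percentage of genes with strictly lower expression.

    `expression` is a dict of gene name -> expression value (hypothetical data).
    """
    values = sorted(expression.values())
    n = len(values)
    return {gene: 100.0 * bisect_left(values, v) / n
            for gene, v in expression.items()}


def markers_above(expression, markers, threshold=70.0):
    """Return the marker genes whose percentile rank exceeds `threshold`."""
    ranks = percentile_ranks(expression)
    return [g for g in markers if g in ranks and ranks[g] > threshold]


# Toy data: 100 made-up genes with expression equal to their index.
toy_expression = {f"gene{i}": float(i) for i in range(100)}
toy_markers = ["gene95", "gene71", "gene70", "gene10"]
high = markers_above(toy_expression, toy_markers)
print(f"{len(high)} of {len(toy_markers)} markers rank above the 70th percentile")
```

  With real data, the marker list would be the 24 previously reported fin MN genes and the expression table would come from the pec-MN samples; how ties and lowly expressed genes are handled would need to follow the authors' own definition.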
  
  2) The authors suggest based on analysis of binding motifs in their ATAC-seq data that the greater number of putative binding sites in the mouse MNs allows for a higher complexity of regulation and specialization of putative motor pools. This could certainly be true in theory but needs to be further validated. The authors show FoxP1 as an example, which seems to be more heavily regulated in the mouse, but there is no evidence that the FoxP1 expression profile is different between mouse and skate. It is suggested in Fig. 5 that FoxP1 might be differentially regulated by Snai1 in mouse and skate but the expression of Snai1 in MNs in either species is not shown.
  
  We have added further discussion and data about differential expression of Foxp1 in mouse and little skate in Figure 5–figure supplement 16 and have discussed as follows:
  
  “Foxp1, the major limb/fin MN determinant appears to be differentially regulated in tetrapod and little skate. Although Foxp1 is expressed in and required for the specification of all limb MNs in tetrapods, Foxp1 is downregulated in Pea3 positive MN pools during maturation in mice (Catela et al., 2016; Dasen et al., 2008). In addition, preganglionic motor column neurons (PGC MNs) in the thoracic spinal cord of mouse and chick express half the level of Foxp1 expression than limb MNs. Although PGC neurons have not yet been identified in little skate, we tested the expression level of Foxp1 using a previously characterized tetrapod PGC marker, pSmad. We observed that Foxp1 is not expressed in MNs that express pSmad (Figure 5‒figure supplement 3). Since there is currently no known marker for PGC MNs in little skate, our conclusion should be taken with caution.”
  
  As for Snai1, in the revision we performed a motif enrichment analysis with an unbiased gene list where Snai1 didn’t show up. However, when we performed an RNA in situ hybridization experiment for Snai1 (Figure 5–figure supplement 3), we found that Snai1 is expressed in MNs of both mouse and little skate, but not in chick, which has been shown previously (Cheung et al., 2005). In order to examine the function of Snai1 in the regulation of Foxp1 expression, we ectopically expressed Snai1 in chick spinal cord by performing in ovo electroporation. However, we did not detect any changes in Foxp1. Instead, we observed an increase in the number of neurons and abnormal MN exits from the spinal cord, which is reminiscent of a previous observation (Zander et al., 2014). Although we did not detect any changes in Foxp1 expression, we cannot rule out the possibility that Snai1 regulates Foxp1 in mouse and little skate, which may require a gene knockout experiment to test. Because binding sites of Snai1 were not enriched in the new gene sets that we analyzed in the revision, we have not discussed Snai1 further in the text.
  
  3) In their discussion section the authors state that they found both conserved and divergent molecular markers across multiple species but they do not validate the expression of novel markers in either category beyond RNA-seq, for example by in situ or antibody staining.
  
  We have added RNA in situ hybridization results in Figure 3C and Figure 3–figure supplement 1 and 2. Most of the genes were expressed in tissues in accordance with the sequencing results (6 out of 9 common MN genes; 4 out of 6 mouse specific genes; 5 out of 7 skate specific genes). Specifically, Uchl1, Slc5a7, Alcam, and Serinc1 are expressed in MNs of all three species; Coch, Ppp1rc, Ctxn1, and Clmp are expressed in MNs of mouse but not in MNs of other species; Eya1, Etv5, Dnmbp, and Spint1 are expressed in MNs of skate but not in MNs of other species. In the Results section, we have summarized the results as follows:
  
  “These results were validated by performing RNA in situ hybridization in tissue sections on a subset of species-specific genes …”
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.14.484236v1
www.biorxiv.org www.biorxiv.org

New submission 24/08/2022, 10:12:53

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  Regulation of NAD and its intermediary metabolites is of critical importance in axon degeneration and neurodegenerative disease. Mounting evidence supports a scenario in which low NAD and high NMN trigger axon degeneration by competitive allosteric inhibition/activation of SARM1. Strategies to increase NAD levels and/or lower NMN levels provide neuroprotection in a variety of contexts. NAD metabolism is a partially conserved process; however, there are key differences in pathway routes and dynamics between model organisms used for NAD research (yeast, worm, fly, zebrafish, mouse/mammalian systems). Drosophila is a key model organism for axon degeneration research based on its ease of use and range of available genetic tools; in addition, the effector of axon degeneration - SARM1 - was first identified in the fly. As the NAD synthesis pathways of Drosophila differ in some key respects from those of mammalian systems, it is important to test and develop tools to enable exploration of these pathways in the fly. Llobet Rosell and colleagues have developed clear and demonstrable tools in Drosophila for exploring NAD-related axon degenerative pathways by modulating the use of NMN via the addition of NMN consuming and NMN generating enzymes. They utilize Drosophila genetics to adequately support the claims made in the manuscript. Importantly, the authors clearly demonstrate that consuming NMN through an alternate route to NaMN provides neuroprotection and that the neuroprotective components of low NMN are upstream of SARM1. These should be useful tools for neuroscientists in the future to use Drosophila for neurodegenerative research.
  
  Strengths:
  
  • Clear demonstration that low NMN provides neuroprotection using novel, stable, enzymatic depletion of NMN (to NaMN).
  
  • Development of a novel Drosophila tool (NMN-D transgenics) to explore NMN metabolism in vivo, including a stabilized version to permit chronic NMN depletion.
  
  • Metabolomic profiles across the pathway to show all pathway changes (not just isolated NMN or NAD assays).
  
  • Neurodegenerative assays that include both histological outcomes (axon degeneration) and circuitry/functional outcomes. Data from both series of experiments support each other.
  
  • Assessment of other known potent axon degenerative genes via genetics in combination with the tools developed.
  
  • Staging of the molecular processes by strategic ablation of the inhibitory ARM domain on SARM1 (dSarm deltaARM). These experiments suggest that low NAD AND high NMN (i.e. the ratio between the two) is the critical factor that drives axon degeneration. Once NAD is low, axon degeneration cannot be rescued by further lowering of NMN. The dSarm deltaARM and dnmnat sgRNAs experiments support a hypothesis in which (high) NMN triggers, but doesn't execute, axon degeneration.
  
  We appreciate the reviewer's recognition of the quality of our research.
  
  Weaknesses:
  
  • The authors use murine NAMPT (mNAMPT) to increase NMN. The degeneration assays support the hypotheses made, yet mNAMPT doesn't actually increase NMN. Thus it is unclear in this setting whether mNAMPT promotes axon degeneration by an NMN-related mechanism or through another route. It is also unclear as to why the murine form was chosen versus a human or other orthologues, or changing the metabolism of the intrinsic pathway (NR and NRK).
  
  Why mNAMPT:
  
  We decided to use mouse NAMPT (mNAMPT) because it was readily available from Giuseppe Orsomando (Amici et al., 2017), and because we did not have access to human NAMPT (hNAMPT). We agree with the observation that under physiological conditions, the expression of mNAMPT does not change NMN. However, we argue that after injury, once dNmnat is degraded, the additional NMN synthesis provided by mNAMPT expression (in addition to dNrk) leads to a faster NMN accumulation. This is supported by the observation that NMNAT2 is more labile than NAMPT in mammals (Gilley and Coleman, 2010; Stefano et al., 2015).
  
  • The authors use metabolic profiling to look at the individual metabolites during axon degenerative events and treatments; however, it is unclear if any of these proteins or genes change as a consequence. This is likely not important for understanding the findings; however, it might be helpful in explaining the mNAMPT data.
  
  We agree with the idea to test whether there is a change induced at the mRNA or protein level when the metabolic flux is altered. To do this, first, we measured the relative expression levels of axon death and NAD+ synthesis genes (Figure 2 – figure supplement 1B). Then, we measured potential changes upon mNAMPT expression (Figure 4 – figure supplement 1). Importantly, while the Gal4-driven expression resulted in an increase of relative mNAMPT transcript abundance from 30 to 12,000, the change observed in the other genes was not notable. Compared to Actin–Gal4, dnrk is 2-fold lower in both the UAS-mNAMPT and Actin > mNAMPT backgrounds (control and experiment, respectively). Thus, overall, there appears to be no change in mRNAs of either axon death or NAD+ synthesis genes.
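  To put the quoted magnitudes on a common scale, the short sketch below converts them to fold changes and log2 fold changes. It only restates the numbers given in the response (relative mNAMPT abundance rising from 30 to 12,000; dnrk 2-fold lower); the helper name `fold_change` is hypothetical and nothing here reproduces the authors' qPCR analysis.

```python
import math


def fold_change(before, after):
    """Ratio of `after` to `before`; both must be positive relative abundances."""
    if before <= 0 or after <= 0:
        raise ValueError("relative abundances must be positive")
    return after / before


# Values quoted in the response: mNAMPT relative abundance, 30 -> 12,000.
mnampt_fc = fold_change(30, 12000)
mnampt_log2 = math.log2(mnampt_fc)

# A 2-fold lower level (as stated for dnrk) corresponds to a ratio of 0.5.
dnrk_log2 = math.log2(0.5)

print(f"mNAMPT: {mnampt_fc:.0f}-fold (log2 = {mnampt_log2:.2f})")
print(f"dnrk:   log2 fold change = {dnrk_log2:.0f}")
```

  On a log2 scale the driven transgene sits more than eight units above baseline, while a 2-fold shift is a single unit, which is one way to read the statement that the other genes did not change notably.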
  
  In the results, we changed the text accordingly:
  
  "We then tested the effect of mNAMPT on the NAD+ metabolic flux in vivo. Surprisingly, NAM, NMN, and NAD+ levels remained unchanged under physiological conditions (Figure 4C). However, we noticed 3-fold higher NR and a moderate but significant elevation of ADPR and cADPR levels upon mNAMPT overexpression (Figure 4C). We also asked whether mNAMPT impacts on NAD+ homeostasis thereby altering the expression of axon death or NAD+ synthesis genes. Besides the expected significant increase in the Gal4-mediated expression of mNAMPT, we did not observe any notable changes at the mRNA level (Figure 4 – figure supplement 1)."
  
  • The authors repeatedly introduce a novel PncC antibody. However, no details on this, its generation, or its testing are found within the manuscript as presented. The antibody detects with several bands. The authors speculate that this could be a degradation product but nothing substantial is shown.
  
  In Materials and methods, we added a new section:
  
  "PncC antibody generation. Rabbit anti-PncC antibodies were generated by Lubioscience under a proprietary protocol. The immunogen used was purified from Escherichia coli, strain K12, corresponding to the full protein sequence of NMN-D. The amino acid sequence is the following: MTDSELMQLSEQVGQALKARGATVTTAESCTGGWVAKVITDIAGSSAWFERGFVTYSNEAKAQMIGVREETLAQHGAVSEPVVVEMAIGALKAARADYAVSISGIAGPDGGSEEKPVGVWFAFATARGEGITRRECFSGDRDAVRRQATAYALQTLWQQFLQNT"
  
  We also updated the results referencing it.
  
  "We found that both wild-type and enzymatically dead NMN-D enzymes are equally expressed in S2 cells, as detected by newly generated PncC antibodies (Materials & Methods, Figure 1–figure supplement 2). Notably, we observed two immunoreactivities per lane, with the lower band being a potential degradation product."
  
  In addition, we now provide evidence for why we believe that the upper band is NMN-D, while the lower one is a degradation product. In the figure attached below, the samples of the first five lanes were denatured at 70 °C, while the samples of the last two lanes were denatured at 95 °C (each for 10 min). The resulting Western blot shows that at 70 °C, there is more unspecific background but no lower degradation product, while at 95 °C, the background is drastically reduced but a lower degradation product appears. NMN-D is indicated by an asterisk. We feel that it is important to show these data here in the rebuttal, but that including them in the manuscript would confuse readers.
  
  • Olfactory receptor neuron degeneration assays are shown in Fig. 1 but no quantitative data are presented to support the images.
  
  We agree that a quantification would support our observation. However, it is difficult to precisely quantify individual axons in the ORN injury assay, for two main reasons:
  
  1. Severed axons are often bundled, thus the exact number cannot be scored.
  
  2. Due to the removal of the cell body, the axonal GFP intensity decreases over time because mCD8::GFP is no longer synthesized. This adds another level of difficulty.
  
  Nevertheless, we added numbers to each example in Figure 1D and E, where we quantified the % of brains in which severed preserved axons were observed, similar to Figure 2 in (MacDonald et al., 2006).
  
  In the results section, we changed the text as indicated below:
  
  "We extended the ORN injury assay and found preservation at 10, 30, and 50 dpa (Figure 1E). While quantifying the precise number of axons is technically not feasible, severed preserved axons were observed in all 10, 30, and 50 dpa brains, albeit fewer at later time points (MacDonald et al., 2006). Thus, high levels of NMN-D confer robust protection of severed axons for multiple neuron types for the entire lifespan of Drosophila."
  
  In the Figure 1 legend, we changed the text accordingly:
  
  "D Low NMN results in severed axons of olfactory receptor neurons that remain morphologically preserved at 7 dpa. Examples of control and 7 dpa (arrows, site of unilateral ablation). Lower right, % of brains with severed preserved axon fibers. E Low NMN results in severed axons that remain morphologically preserved for 50 days. Representative pictures of 10, 30, and 50 dpa, from a total of 10 brains imaged for each condition (arrows, site of unilateral ablation). Lower right, % of brains with severed preserved axon fibers."
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.01.30.478002v2
www.biorxiv.org www.biorxiv.org

New submission 20/08/2023, 18:02:47

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  eLife assessment:
  
  This paper conducts human and rodent experiments of non-invasive diffusion MRI estimates of axon diameter with the aim to establish whether these estimates provide biologically specific markers of axonal degeneration in MS. It will be of interest to researchers developing quantitative MRI methods and scientists studying neurodegeneration. The experiments provide evidence for the sensitivity of these markers, but do not directly validate axon diameter and do not reflect common pathological mechanisms across rodents and humans.
  
  We thank the Editor for the appreciation of our work. Thanks to the addition of an extensive electron microscopy paradigm, we now include a direct validation of axonal damage and expand on the common pathological mechanisms across the two species. The new results are detailed in the manuscript and summarized in Fig. 3.
  
  Reviewer #1 (Public Review):
  
  1.1 My primary concern relates to how meaningful the human-rodent comparisons are, and whether these comparisons really advance our understanding of AxCaliber estimates in MS. I applaud the aim to conduct "matched" experiments in both rodent models and human disease. It is a strength that the experiments are aligned with respect to the MRI measurements (although there are some caveats to this mentioned below). But beyond that, the overlap is not what one might hope for: the pathology would seem to be very distinct in humans and rodents, and the histological validation is not specific to what the MRI measurements claim to estimate. To summarize the main findings: (i) in a rat model of general axonal degeneration, axon calibre estimates correlate with neurofilaments; (ii) in MS in humans, axon calibre estimates correlate with demyelinating lesions. This gives a picture of AxCaliber estimates correlating with neuropathology, but is this something that has not already been established in the literature? If the aim is to validate AxCaliber, then there is a logic in using a rodent model that isolates alterations to axonal radius, but what then does this add to the existing literature in that space? If the aim is to study MS (for which AxCaliber results have been previously reported in Huang et al), then why not use a rodent model of MS?
  
  We thank the reviewer for their very insightful comments. Indeed, multiple sclerosis (MS) is a chronic neuroinflammatory and neurodegenerative disease of unknown etiology. An enormous effort has been made to obtain animal models that simulate the pathogenesis of this disease. However, while several models exist recapitulating distinct aspects of the disease (mostly related to demyelination), MS fundamentally remains a disease that only affects humans. This does not mean that EAE or lysolecithin models do not provide information on specific aspects and are therefore valuable. In fact, we believe that trying to replicate the pathological mechanisms of this disease in an animal model goes beyond the scope of the present work. In this work, our intention is to validate a biomarker of axonal damage preclinically, and for this, we use a model of axonal degeneration. We do not claim that this model should be valid to capture the complex clinical and pathological manifestation of MS, but we do think that it is a necessary step to ensure MRI sensitivity to axonal pathology. Why necessary? Because all the available (very limited) MRI literature which provides some form of validation: i) only focuses on healthy tissue, and ii) has an n of 1. Our preclinical paradigm gives conclusive evidence that the MRI axonal diameter proxy detects axonal damage as an increase in the mean diameter. This is now detailed in the discussion.
  
  After this necessary preclinical validation, we then apply the same framework to a human disease like MS that, among other manifestations, is believed to also cause axonal pathology. The improvements with respect to the one published work about axonal diameter in MS are: i) the whole brain analysis, which allowed us to characterize the extent of these early alterations outside the demyelinated lesions; and ii) the larger sample size, which allowed us to uncover an association with disease duration, strengthening our hypothesis about increased axonal diameter being a marker of early disease (new Fig. 5).
  
  Regarding the nonspecificity of histological validation, we thank the reviewer for this insightful comment, which triggered an additional analysis that we believe has added further value to the paper. Using electron microscopy, we found that in our model of neurodegeneration, axonal damage is indeed reflected as an increase in axon diameter (new Fig. 3). These recent findings strongly support the validation of our noninvasive diffusion MRI estimates of axon diameter alterations as an early-stage hallmark of normal-appearing tissue in MS.
  
  Coming back to the comparison between pathology in humans and in rodents, the EM data also support our choice of preclinical model, showing axonal swelling, the same phenomenon reported and characterized in recent postmortem histological data in the normal-appearing white matter of MS patients (Luchicchi et al., Ann Neurol 2021) and in lesions (Fisher et al. Ann Neurol 2007).
  
  All in all, we are confident that the new data support the validity of this translational approach and shed new light on the degenerative aspect of MS.
  
  Changes in the manuscript
  
  • Discussion, pag. 12: It is important to stress that the aim of this work is not to propose a new animal model of MS, a disease that only affects humans, but rather to validate axonal damage detection (independently from the pathology that has induced it) through noninvasive MRI and apply the framework to characterize axonal pathology in MS.
  
  1.2 I appreciate that both rodent and patient studies are time intensive, major endeavors. Nevertheless, the number of subjects is very low in both rodent (n=9) and human (MS=10, control=6) studies. At the very least, this should be more openly acknowledged. But I'm concerned that this is a major weakness of the paper. Related to this, I find it hard to tell how carefully multiple comparison correction was performed throughout. It seems reasonably clear for the TBSS analyses, but then other analyses were performed in ROIs. Are these multiple comparisons corrected as well? Similarly, in Methods, I am confused by the statement that: "post hoc t tests corrected for multiple comparisons whenever a significant effect was detected". What does this mean?
  
  We thank the reviewer for this comment. We agree that a small sample size was a weakness of the previous version of the paper, and therefore, in the new version, we have substantially increased the n for both animal and human experiments (from n=9 to 19 in animals, from 16 to 21 in humans). We removed the ROI analysis in the new version, and thus the confusing statement, and clarified the strategy for multiple comparisons.
  
  Changes in the manuscript
  
  • Data analysis, pag. 18: Lesion masks were excluded from the statistical analysis, and multiple comparisons across clusters were controlled for by using threshold-free cluster enhancement.
  
  1.3 While I do not think the text is in any sense deliberately misleading, I think the authors would do well to either tone down their claims or consider more carefully the implications of the text in many places. Some that stuck out for me are:
  
  Throughout, language in the paper (e.g., "Paired t tests were used to assess differences in the axonal diameter") presumes that the AxCaliber estimates specifically reflect axon diameter. I think the jury is out over whether this is true, particularly for measurements conducted with limited hardware specs. At the very least, I would encourage the author to refer to these measurements throughout as "estimates" of axon diameter.
  
  Thank you for this clarification. We have indeed changed the notation, and now consistently refer to the estimates of axon diameter through MRI as the “MRI axonal diameter proxy”.
  
  1.4 The authors suggest that their results provide "new tools for patient stratification" based on differences in lesion type, but it isn't clear what new information these markers would confer given that the lesions are differentiated based on T1w hypo/hyperintensities. In other words, these lesions are by definition already differentiable from a much simpler MRI marker.
  
  Thank you for this insightful comment. The reviewer is right, and following the general reviewers’ assessment we have decided to not include the lesion analysis in the new version of the manuscript.
  
  1.5 The authors note in the Discussion that: "sensitive to early stages of axonal degeneration, even before alterations in the myelin sheet are detected". Whether intentional or not, the implication in the context of this study is that this would hold for MS (that these markers would detect axonal degeneration preceding demyelination). While there is some discussion of alterations to axonal diameter in MS, the authors do not discuss whether these are the same mechanisms thought to occur in the IBO intervention used here.
  
  Thank you for this comment. Indeed, the scope of the paper is not to assess whether or not axonal swelling precedes myelin alterations, so we agree with the reviewer that this sentence might be misleading and have removed it from the text. While we do not claim that ibotenic acid injections are able to replicate the complex clinical and pathological manifestation of MS (and we have now made this clear in the revised manuscript, see our response to comment 1.1), the electron microscopy paradigm indicates the presence of axonal swelling in the damaged fimbria, which is indeed the same pathological manifestation found in MS post-mortem data (see e.g. Fisher et al. Ann Neurol 2007).
  
  1.6 In the Discussion, the authors note the lack of evidence for a relationship with disability or disease duration, but nevertheless, go on to interpret the "trends" they do observe. I would advise strongly against this: the authors acknowledge that their numbers are low, so I would avoid the temptation to speculate here.
  
  The reviewer is 100% correct. We should have refrained from speculating. In the new version of the paper, however, thanks to the larger human cohort, we were able to find significant associations with disease duration in voxelwise analysis of the white matter skeleton in standard space and in the whole white matter in single subject space (new Figure 5).
  
  1.7 In the Discussion, the authors state that "the use of neurofilaments has also been well validated in MS". Well validated for what? MS is a complex disease with a broad range of pathology, so this statement could be read to mean "neurofilaments are known to be altered in MS". However, in the context of this paragraph, the implication would seem to be that neurofilaments are a well-established proxy for axonal diameter. Is that the implication, and if so what general evidence is there for this?
  
  We thank the reviewer for this insightful comment. Indeed, altered neurofilaments are not conclusive evidence of increased axonal diameter. In this context, the addition of electron microscopy data in the new manuscript version supports the claim.
  
  Reviewer #2 (Public Review):
  
  Diffusion MRI is sensitive to the brain microstructure, and it has been used to assess the integrity of white matter for nearly 3 decades. Its main limitation is the limited specificity, which makes it difficult to link changes in diffusion parameters to a given pathological substrate. Recently, methods based on diffusion MRI that enable the noninvasive estimation of axonal diameter have become available. This paper aims at validating one such method using an experimental model of neurodegeneration. The authors found a significant correlation between axonal diameter estimated by MRI and a histological marker of neurodegeneration. Although this is of great interest, as it demonstrates that this method is sensitive to neurodegeneration, a direct validation would require a measurement of axonal diameter using electron or confocal microscopy, rather than a correlation with a measure of axonal degeneration not directly related to axonal diameter. So, although these data are compelling, they do not prove that the increase in axonal diameter suggested by diffusion MRI corresponds to actual axonal swelling. The authors also apply the same method to compare the white matter of patients with multiple sclerosis (MS) and healthy controls, showing widespread increases in axonal diameter in the patients. These data are compelling, but again, not conclusive. Other factors such as gliosis could bias the MRI measurement and lead to an apparent increase in axonal diameter.
  
  We would like to thank the reviewer for the positive assessment of our work and for the valuable suggestion. We are confident that the new version of the manuscript, by including an extensive validation based on electron microscopy, has addressed the reviewer's criticisms.
  
  Reviewer #3 (Public Review):
  
  3.1 In this paper, Toschi et al. performed dMRI to in vivo estimate axon diameter in the brain and demonstrated that multi-compartmental modeling (AxCaliber) is sensitive to microstructural axonal damage in rats and axon caliber increase in demyelinating lesions in MS patients, suggesting that axon diameter mapping provides a potential biomarker to bridge the gap between medical imaging contrasts and biological microstructure. In particular, authors injected ibotenic acid (IBO) and saline in the left and right rat hippocampus, respectively, and compared in vivo estimated axon diameter and ex vivo neurofilament staining in left and right fimbria. The axon size estimation was larger in the fimbria of IBO injection side, where the neurofilament intensity is higher. Correlation of axon size estimation and neurofilament intensity was observed in both injection sides. Further, higher axon diameter estimation was observed in normal appearing white matter (NAWM) of MS patients, compared with the healthy subjects. The axon size estimation increased in hypointense lesions of T1 weighted contrast, but not in isointense lesions. Through the comparison of dMRI-estimated axon size and histology-based fluorescence intensity, authors indirectly validated the sensitivity of axon diameter mapping to the tissue microstructure in the rat brain, and further explored the axon size change in the brain of MS patients. However, the dMRI protocol and biophysical modeling in this study were not fully optimized to maximize the sensitivity to axon size estimation, and the dMRI-estimated axon size (4.4-5.4 micron) was much larger than values reported in previous histological studies (0.5-3 micron) [Barazany et al., Brain 2009]. Finally, although the modified AxCaliber model incorporated two fiber bundles in different directions, the fiber dispersion in each bundle was not considered (c.f. fiber dispersion ~20-30 degree in corpus callosum), potentially leading to overestimated axon diameter.
  
  We thank the reviewer for their appreciation of our work, which we believe is substantially improved in this revised version through the inclusion of an electron microscopy paradigm. Below, the point-by-point response to the specific points raised.
  
  3.2 The conclusions in this study are supported by experimental results. However, the dMRI protocol and biophysical model could be further optimized and validated: 1. To in vivo estimate the axon diameter ~1 micron using dMRI, strong diffusion weighting (b-value) should be applied to maximize the signal decay due to intra-axonal restricted diffusion and minimize the signal contribution of extra-cellular hindered diffusion. However, authors only applied maximal b-value = 4000 s/mm2, much smaller than values ~15,000-20,000 s/mm2 in previous studies [Assaf et al., MRM 2008; Huang et al., BSAF 2020, 225:1277]. The use of low diffusion weighting in this study leads to a lower bound ~4-6 micron for accurate diameter estimation, the so-called resolution limit in [Nilsson et al., NMR Biomed 2017, 30:e3711]. In other words, the estimated axon diameter is potentially overestimated and related with the imaging protocol and image quality, confounding the biological interpretation.
  
  We thank the reviewer for this insightful comment. Indeed, while the resolution limit is a concern, the chosen b-value has been a compromise between sensitivity to small structure and SNR, as indicated by recent animal (Crater et al., 2022) and human (Jensen et al., 2016; McKinnon et al., 2017; Moss et al., 2019) work, pointing at 3000-4000 s/mm2 as the b-value for which the intra-axonal water signal is dominant. In addition, a paper from the laboratory that first developed the Axcaliber method recently came out (Gast et al., 2023, DOI: 10.1007/s12021-023-09630-w) demonstrating that an MRI protocol with a maximum b-value between 3000 and 4000 s/mm2 (and even lower) is sufficient to capture, in vivo and in humans, various well-known aspects of axonal morphometry (e.g., the corpus callosum axon diameter variation) as well as other aspects that are less explored (e.g., axon diameter-based separation of the superior longitudinal fasciculus into segments). The same paper contains resources and further bibliography supporting the fact that experimental evidence suggests that the contribution of intra-axonal water to restricted diffusion signals dominates other factors (see Online Resource 1, section A of the same paper). To challenge this recent evidence from a neurobiology perspective, we include in the supplementary material a subset of experiments in animals with lower maximum b-value (2500 s/mm2, Fig. S1), where we are able to detect the same effect of increased MRI axonal diameter proxy in the injected hemisphere compared to control.
  
  We would like to add that while extremely valuable and informative, simulation studies such as the excellent study by Veraart et al., 2020, are inevitably valid under certain assumptions. Among them, some critical ones are i) the need to neglect nonaxonal cells such as glia, ii) assuming that the bulk diffusivity of water in cerebral tissue would be the same as that of free water, and iii) impermeable barriers. All these assumptions are expected to play a role in the estimated resolution limit, a role difficult to quantify but likely substantial.
  
  For this reason, we believe that our approach, which is 100% focused on neurobiology and measurements performed in real tissue, can offer a different perspective and fuel the ongoing debate on axonal diameter measurement feasibility. We acknowledge the value of the reviewer comment and discuss the issue of b-value in the discussion (see also comment 1.8).
  
  Changes in the manuscript
  
  • Discussion, page 12: Despite some inevitable minor differences due to different brain sizes and magnet features, the human protocol was built to match the main characteristics of the preclinical diffusion sequence, such as the b-value and diffusion time range. The chosen b-value has been a compromise between sensitivity to small structures and the signal-to-noise ratio (SNR), as indicated by recent animal (Crater et al., 2022) and human (Gast et al., 2023; Jensen et al., 2016; McKinnon et al., 2017; Moss et al., 2019) work, pointing at 4000 s/mm2 as the b-value for which the intra-axonal water signal is dominant. However, following recent work supporting sensitivity of diffusion-weighted MRI to axonal diameter even at lower b-values (Gast et al., 2023), we tested a protocol with a lower b-value in a subset of animals, with the aim of facilitating future clinical AxCaliber studies. We found no qualitative differences in the outcome (MRI axonal diameter proxy was increased following fimbria damage). Further work and perhaps more realistic simulations, considering real cell composition and morphology, are needed to clarify this issue.
  
  3.3 In this study, the positive correlation of dMRI-estimated axon size and neurofilament fluorescence intensity is indeed an encouraging result, and yet this validation is indirect since it relies on the positive correlation between neurofilament intensity and axon diameter in histology.
  
  The reviewer correctly points out a severe limitation of the previous manuscript version, which is now addressed by including an extensive electron microscopy evaluation, recapitulated in new Fig. 3.
  
  3.4 Authors did not consider the fiber dispersion in the proposed dMRI model. This can lead to overestimated axon diameter, even in the highly aligned WM, such as corpus callosum with ~20-30 degree dispersion in histology [Ronen et al., BSAF 2014, 219:1773; Leergaard et al., PLoS One 2010, 5(1), e8595] and MRI [Dhital et al., NeuroImage 2019, 189, 543; Novikov et al., NeuroImage 2018, 174:518].
  
  The reviewer is correctly pointing out an important characteristic of white matter microstructure, namely fibre dispersion. However, we would like to point out that the use of a second fiber population is expected to mitigate this effect by absorbing some axonal directional dispersion in areas of a single fiber. To support this, we quantified dispersion as the angle between the two main fiber orientations captured by the AxCaliber fit, as shown in Author response image 1 for two representative subjects (one control, upper line, and one MS, lower line; the “dispersion” maps are masked by a white matter probability mask, and superimposed on a T2w). Indeed, the angle between the two main fibres in the corpus callosum is around 20 degrees or lower, compatible with the bibliography cited by the reviewer, and higher in other white matter areas known to be characterized by fiber crossing and dispersion.
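The dispersion measure described in this response, the angle between two fitted fibre orientations, can be illustrated with a minimal sketch. The function name `fibre_angle_deg` and the example vectors are hypothetical and are not taken from the authors' pipeline. The sketch assumes that orientations are axial (a vector and its negation describe the same fibre), so the absolute value of the cosine is used.

```python
import math


def fibre_angle_deg(v1, v2):
    """Angle in degrees between two fibre orientations given as 3D vectors.

    Assumption: orientations are axial, so v and -v are treated as the
    same direction and the result lies in [0, 90] degrees.
    """
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("orientation vectors must be non-zero")
    # Clamp to 1.0 to guard against rounding slightly above 1.
    cos_angle = min(1.0, abs(dot) / (n1 * n2))
    return math.degrees(math.acos(cos_angle))


# Hypothetical example: two orientations 20 degrees apart in the x-y plane.
theta = math.radians(20.0)
angle = fibre_angle_deg((1.0, 0.0, 0.0), (math.cos(theta), math.sin(theta), 0.0))
```

Applied voxel by voxel to the two orientations returned by a two-bundle fit, a function of this kind would yield a map comparable in spirit to the one described above; converting with `math.radians` gives the same quantity in radians.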
  
  Author response image 1.
  
  Angle in radians between the two main fiber orientations captured by the AxCaliber fit, as shown below for two representative subjects (one control, upper line, and one MS, lower line). The dispersion maps are masked by a white matter probability mask (P>=0.95), and superimposed on a T2-weighted image.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.27.489694v1
www.biorxiv.org www.biorxiv.org

New submission 20/06/2022, 14:21:09

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This paper shows that a principled, interpretable model of auditory stimulus classification can not only capture behavioural data on which the model was trained but somewhat accurately predict behaviour for manipulated stimuli. This is a real achievement and gives an opportunity to use the model to probe potential underlying mechanisms. There are two main weaknesses. Firstly, the task is very simple: distinguishing between just two classes of stimuli. Both model and animals may be using shortcuts to solve the task, for example (this is suggested somewhat by Figure 8 which shows the guinea pig and model can both handle time-reversed stimuli).
  
  The task structure is indeed simple. In the context of categorization tasks that are typically used in animal experiments, however, we would argue that we are at the higher end of stimulus complexity. Auditory categories used in most animal experiments typically employ a category boundary along a single stimulus parameter (for example, tone frequency or modulation frequency of AM noise). Only a few recent studies (for example, Yin et al., 2020; Town et al., 2018) have explored animal behavior with “non-compact” stimulus categories. Thus, we consider our task a significant step towards more naturalistic tasks.
  
  We were also faced with the practical factor of the trainability of guinea pigs (GPs). Prior to this study, guinea pigs have been trained using classical conditioning and aversive reinforcement on detecting tone frequency (e.g., Heffner et al., 1971; Edeline et al., 1993). More recently, competitive training paradigms have been developed for appetitive conditioning, using a single “footstep” sound as a target stimulus and manipulated sounds as non-target stimuli (Ojima and Horikawa, 2016). But as GPs had never been trained on more complex tasks before our study, we started with a conservative one vs. one categorization task. We mention this in the Discussion section of the revised manuscript (page 27, line 665).
  
  To determine whether these results hold for more complex tasks as well, after receiving the reviews of the original manuscript, we trained two GPs (that were originally trained and tested on the wheeks vs. whines task) further on a wheeks vs. many (whines, purrs, chuts) task. As earlier, we tested these GPs with new exemplars and verified that they generalized. In the figure below, the average performance of the two GPs on the regular (training) stimuli and novel (generalization) stimuli are shown in gray bars, and individual animal performances are shown as colored discs. The GPs achieved high performance for the novel stimuli, demonstrating generalization. We also implemented a 4-way WTA stage for a wheek vs. many model and verified that the model generalized to new stimuli as well.
  
  For frequency-shifted calls, these two GPs performed better for wheeks vs. many compared to the average for wheeks vs. whines shown in the main manuscript. The 4-way WTA model closely tracked GP behavioral trends.
  
  The psychometric curves for wheeks vs. many categorization in noise (different SNRs) did not differ substantially from the wheeks vs. whines task.
  
  We focused our one vs. many training on the two conditions that showed the greatest modulation in the one vs. one tasks. However, these preliminary results suggest that the one vs. one results presented in the manuscript are likely to extend to more complex classification tasks as well. We chose not to include these new data in the revised manuscript because we performed these experiments on only 2 animals, which were previously trained on a wheeks vs. whines task. In future studies, we plan to directly train animals on one vs. many tasks.
  
  Secondly, the predictions of the model do not appear to be quite as strong as the abstract and text suggest.
  
  We now replace subjective descriptors with actual effect size numbers to avoid overstating results. We also include additional modeling (classification based on the long-term spectrum) and discuss alternative possibilities to provide readers with points of comparison. Thus, readers can form their own opinions of the strengths of the observed effects.
  
  The model uses "maximally informative features" found by randomly initialising 1500 possible features and selecting the 20 most informative (in an information-theoretic sense). This is a really interesting approach to take compared to directly optimising some function to maximise performance at a task, or training a deep neural network. It is suggestive of a plausible biological approach and may serve to avoid overfitting the data. In a machine learning sense, it may be acting as a sort of regulariser to avoid overfitting and improve generalisation. The 'features' used are basically spectro-temporal patterns that are matched by sliding a cross-correlator over the signal and thresholding, which is straightforward and interpretable.
  
  This intuition is indeed accurate – the greedy search algorithm (described in the original vision paper by Ullman et al., 2002) sequentially adds features that add the most hits and the least false alarms compared to existing members of the MIF set to the final MIF set. The latter criterion (least false alarms) essentially guards against over-fitting for hits alone. A second factor is the intermediate size and complexity of MIFs. When MIFs are too large, there is certainly overfitting to the training exemplars, and the model does not generalize well (Liu et al., 2019).
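The greedy selection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation or the Ullman et al. algorithm itself. It assumes each candidate feature is summarised by the set of within-category exemplars it detects (hits) and the set of out-of-category exemplars it detects (false alarms), and it scores a candidate by new hits minus new false alarms relative to the features already chosen. The function name, the scoring rule, and the toy data are all hypothetical.

```python
def greedy_select(candidates, n_select):
    """Greedily pick features that add the most new hits and fewest new false alarms.

    candidates: dict mapping feature name -> (hits, false_alarms), where each
    element is a set of exemplar identifiers detected by that feature.
    Returns the list of selected feature names in selection order.
    """
    chosen = []
    covered_hits, covered_fas = set(), set()
    remaining = dict(candidates)

    def gain(name):
        # Assumed score: exemplars newly detected minus false alarms newly incurred.
        hits, fas = remaining[name]
        return len(hits - covered_hits) - len(fas - covered_fas)

    while remaining and len(chosen) < n_select:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # no remaining feature improves the set
        hits, fas = remaining.pop(best)
        chosen.append(best)
        covered_hits |= hits
        covered_fas |= fas
    return chosen


# Hypothetical toy candidates (exemplar ids are arbitrary integers).
toy = {
    "A": ({1, 2, 3, 4}, {10}),
    "B": ({1, 2}, set()),
    "C": ({5, 6}, {10}),
    "D": ({4, 5}, {11, 12}),
}
selected = greedy_select(toy, n_select=3)
```

In this toy case, feature B is never chosen because everything it detects is already covered by A, which is the sense in which redundant features add little; feature D is rejected because its new false alarms outweigh its single new hit.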
  
  It is surprising and impressive that the model is able to classify the manipulated stimuli at all. However, I would slightly take issue with the statement that they match behaviour "to a remarkable degree". R^2 values between model and behaviour are 0.444, 0.674, 0.028, 0.011, 0.723, 0.468. For example, in figure 5 the lower R^2 value comes out because the model is not able to use as short segments as the guinea pigs (which the authors comment on in the results and discussion). In figure 6A (speeding up and slowing down the stimuli), the model does worse than the guinea pigs for faster stimuli and better for slower stimuli, which doesn't qualitatively match (not commented on by the authors). The authors state that the poor match is "likely because of random fluctuations in behavior (e.g., motivation) across conditions that are unrelated to stimulus parameters" but it's not clear why that would be the case for this experiment and not for others, and there is no evidence shown for it.
  
  Thank you for this feedback. There are two levels at which we addressed these comments in the revised manuscript.
  
  First, regarding the language – we have now replaced subjective descriptors with the statement that the model captures ~50% of the overall variance in behavioral data. The ~50% number is the average overall R2 between the model and data (0.6 and 0.37 for the chuts vs. purrs and wheeks vs. whines tasks respectively). We leave it to readers to interpret this number.
  
  Second, our original manuscript lacked clarity on exactly what aspects of the categorization behavior we were attempting to model. As recent studies have suggested, categorization behavior can be decomposed into two steps – the acquisition of the knowledge of auditory categories, and the expression of this knowledge in an operant task (Kuchibhotla et al., 2019; Moore and Kuchibhotla, 2022). Our model solely addresses how knowledge regarding categories is acquired (through the detection of maximally informative features). Other than setting a 10% error in our winner-take-all stage, we did not attempt to systematically model any other cognitive-behavioral effects such as the effect of motivation and arousal. Thus, in the revised manuscript, we have included a paragraph at the top of the Results section that defines our intent more clearly (page 5, line 117). We conclude the initial description of the behavior by stating that these factors are not intended to be captured by the model (page 6, line 171). We also edited a paragraph in the Discussion section for clarity on this point (page 26, line 629).
  
  In figure 11, the authors compare the results of training their model with all classes, versus training only with the classes used in the task, and show that with the latter performance is worse and matches the experiment less well. This is a very interesting point, but it could just be the case that there is insufficient training data.
  
  This could indeed be the case, and we acknowledge this as a potential explanation in the revised manuscript (page 22, line 537; page 27, line 653). Our original thinking was that if GPs were also learning discriminative features only using our training exemplars, they would face a similar training data constraint as well. But despite this constraint, the model’s performance is above d’=1 for natural calls – both training and novel calls; it is only the similarity with behavior on the manipulated stimuli that is lower than the one vs. many model. This phenomenon warrants further investigation.
  
  Reviewer #2 (Public Review):
  
  Kar et al. aim to further elucidate the main features representing call type categorization in guinea pigs. This paper presents a behavioral paradigm in which 8 guinea pigs (GPs) were trained in a call categorization task between pairs of call types (chuts vs purrs; wheek vs whines). The GPs successfully learned the task and are able to generalize to new exemplars. GPs were tested across pitch-shifted stimuli and stimuli with various temporal manipulations. Complementing this data is multivariate classifier data from a model trained to perform the same task. The classifier model is trained on auditory nerve outputs (not behavioral data) and reaches an accuracy metric comparable to that of the GPs. The authors argue that the model performance is similar to that of the GPs in the manipulated stimuli, therefore, suggesting that the 'mid-level features' that the model uses may be similar to those exploited by the GPs. The behavioral data is impressive: to my knowledge, there is scant previous behavioral data from GPs performing an auditory task beyond audiograms measured using aversive conditioning by Heffner et al. in 1970. [One exception that is notably omitted from the manuscript is Ojima and Horikawa 2016 (Frontiers)]. Given the popularity of GPs as a model of auditory neurophysiology these data open new avenues for investigation. This paper would be useful for neuroscientists using classifier models to simulate behavioral choice data in similar Go/No-Go experiments, especially in guinea pigs. The significance of the findings rests on the similarity (or not) of the model and GP performance as a validation of the 'intermediary features' approach for categorization.
At the moment the study is underpowered for the statistical analysis the authors attempt to employ which frequently relies on non-significant p values for its conclusions; using a more sophisticated approach (a mixed effects model utilizing single trial responses) would provide a more rigorous test of the manipulations on behavior and allow a more complete assessment of the authors' conclusions.
  
  We thank the reviewer for their feedback and the suggestion for a more robust statistical approach. We have now replaced the repeated measures ANOVA based statistics for the behavior and model where more than 2 test conditions were presented (SNR, segment length, tempo shift, and frequency shift) with generalized linear models with a logit link function (logistic activation function). In these models, we predict the trial-by-trial behavioral or model outcome from predictors including stimulus type (Go or Nogo), parameter value (e.g., SNR value), parameter sign (e.g., positive or negative freq. shift), and animal ID as a random effect. To evaluate whether parameter value and sign had a significant contribution to the model, we compare this ‘full’ model against a null model that only has stimulus type as a predictor and animal ID as a random effect. These analyses are described in detail in the Materials and Methods section of the revised manuscript (page 36, line 930).
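The model comparison described in this response can be written out explicitly. The following is only a sketch of one plausible formulation: the symbols are introduced here for illustration, the absence of interaction terms is an assumption, and the response does not state which test statistic was used, so the likelihood-ratio form shown last is likewise an assumption.

```latex
% y_{ij}: outcome on trial i for animal j (1 = Go response, 0 = no response)
% T_{ij}: stimulus type (Go or Nogo), V_{ij}: parameter value (e.g., SNR),
% S_{ij}: parameter sign (e.g., positive or negative frequency shift)
% u_j: random intercept for animal j

% Full model
\operatorname{logit} \Pr(y_{ij} = 1)
  = \beta_0 + \beta_1 T_{ij} + \beta_2 V_{ij} + \beta_3 S_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)

% Null model: stimulus type and the animal random effect only
\operatorname{logit} \Pr(y_{ij} = 1)
  = \beta_0 + \beta_1 T_{ij} + u_j

% One common way to compare nested models (assumed here):
\Lambda = 2\left(\ell_{\text{full}} - \ell_{\text{null}}\right)
```

Under this reading, a large value of the statistic relative to its reference distribution would indicate that parameter value and sign jointly improve the prediction of trial-by-trial outcomes beyond stimulus type alone.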
  
  These analyses reveal significant effects of segment length changes, and weak effects of tempo changes on behavior (as expected by the reviewer). Both the behavior and model showed similar statistical significance (except tempo shift for wheeks vs. whines) for whether performance was significantly affected by a given parameter.
  
  The behavioral data presented here are descriptive. The central conceptual conclusions of the manuscript are derived from the comparison between the model and behavioral data. For these comparisons, the p-value of statistical tests is not used. We realized that a description of how we compared model and behavioral data was not clear in the original manuscript. To compare behavioral data with the model, we fit a line to the d’ values obtained from the model plotted against the d’ values obtained from behavior, and computed the R2 value. We used the mean absolute error (MAE) to quantify the absolute deviation between model and behavior d’ values. Thus, high R2 values would signify a close correspondence between the model and behavior regardless of statistical significance of individual data points. We now clarify this in page 12, line 289. We derive R2 values for individual stimulus manipulations, as well as an overall R2 by pooling across all manipulations (presented in Fig. 11). This is now clarified in page 21, line 494.
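The comparison metrics named above (d', the R2 of a line fit, and MAE) can be sketched in a few lines. This is an illustrative sketch, not the authors' analysis code. It assumes the standard signal-detection definition d' = z(hit rate) - z(false alarm rate) and uses the fact that the R2 of a simple least-squares line equals the squared Pearson correlation. All function names and numbers below are hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate, fa_rate):
    """Sensitivity index: z(hit rate) - z(false alarm rate).

    Rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    need a correction before use, which is not handled here.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def r_squared(x, y):
    """R^2 of the least-squares line predicting y from x (squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either variable is constant")
    return sxy ** 2 / (sxx * syy)


def mean_absolute_error(x, y):
    """Mean absolute deviation between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical d' values for four conditions (not data from the study).
behavior_d = [0.4, 1.1, 1.9, 2.5]
model_d = [0.6, 1.0, 2.2, 2.4]
fit_quality = r_squared(behavior_d, model_d)
deviation = mean_absolute_error(behavior_d, model_d)
```

The two numbers answer different questions: R2 measures how well the model tracks the trend in behavior across conditions, while MAE measures how far apart the d' values are in absolute terms, so a model can score well on one and poorly on the other.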
  
  Reviewer #3 (Public Review):
  
  The authors designed a behavioral experiment based on a Go/No-Go paradigm to train guinea pigs on call categorization. They used two different pairs of call categories: chuts vs. purrs and wheeks vs. whines. During training, the animals changed their behavioral strategies. Initially, they did not associate the auditory stimuli with rewards, and hence they overweighted the No-Go behavior (low hit and false alarm rates). Subsequently, they learned the association between auditory stimuli and reward, leading to overweighting of the Go behavior (high hit and false alarm rates). Finally, they learned to discriminate between the two call categories and showed the corresponding behaviors, i.e. they suppressed the Go behavior for No-Go stimuli (improved discrimination performance due to stable hit rates but lower false alarm rates).
  
  In order to derive a mechanistic explanation of the observed behaviors, the authors implemented a computational feature-based model, with which they mirrored all animal experiments, and subsequently compared the resulting performances.
  
  Strengths:
  
  In order to construct their model, the authors identified several different sets of so-called MIFs (most informative features) for each call category, that were best suited to accomplish the categorization task. Overall, model performance was in general agreement with behavioral performance for both the chuts vs. purrs and wheeks vs. whines tasks, in a wide range of different scenarios.
  
  Different instances of their model, i.e. models using different sets of MIFs, performed equally well. In addition, the authors could show that guinea pigs and models can generalize to categorize new call exemplars very rapidly.
  
  The authors also tested the categorization performance of guinea pigs and models in a more realistic scenario, i.e. communication in noisy environments. They find that both guinea pigs and the model exhibit similar categorization-in-noise thresholds.
  
  Additionally, the authors investigated the effect of temporal stretching/compression of calls on categorization performance. Remarkably, this had virtually no negative effect on either models or animals, and both performed equally well, even for time reversal. Finally, the authors tested the effect of pitch change on categorization performance, and found very similar effects in guinea pigs and models: discrimination performance crucially depends on pitch change, i.e. systematically decreases with the percentage of change.
  
  Weaknesses:
  
  While their computational model can explain certain aspects of call categorization after training, it cannot explain the time course of different behavioral strategies shown by the guinea pigs during learning/training.
  
  Thank you for bringing this up – in hindsight the original manuscript lacked clarity on exactly what aspects of the behavior we were trying to model. As recent studies have suggested, categorization behavior can be decomposed into two steps – the acquisition of the knowledge of auditory categories, and the expression of this knowledge in an operant task (Kuchibhotla et al., 2019; Moore and Kuchibhotla, 2022). Our model solely addresses how knowledge regarding categories is acquired (through the detection of maximally informative features). Other than setting a 10% error in our winner-take-all stage, we did not attempt to systematically model any other cognitive-behavioral effects such as the effect of motivation and arousal, or behavioral strategies. Thus, in the revised manuscript, we have included a paragraph at the top of the Results section that defines our intent more clearly (page 5, line 117). We conclude the initial description of the behavior by stating that these factors are not intended to be captured by the model (page 6, line 171). We also edited a paragraph in the Discussion section for clarity on this point (page 26, line 629).
  
  Furthermore, the model cannot account for the fact that short-duration segments of calls (50ms) already carry sufficient information for call categorization in the guinea pig experiment. Model performance, however, only plateaued after a 200 ms duration, which might be due to the fact that the MIFs were on average about 110 ms long.
  
  The segment-length data indeed demonstrates a deviation between the data and the model. As we had acknowledged in the original manuscript, this observation suggests further constraints (perhaps on feature length and/or bandwidth) that need to be imposed on the model to better match GP behavior. We originally did not perform this analysis because we wanted to demonstrate that a model with minimal assumptions and parameter tuning could capture aspects of GP behavior.
  
  We have now repeated the modeling by constraining the features to a duration of 75 ms (the lowest duration for which GPs show above-threshold performance). We found that the constrained MIF model better matched GP behavior on the segment-length task (R2 of 0.62 and 0.58 for the chuts vs. purrs and wheeks vs. whines tasks; with the model crossing d’=1 for 75 ms segments for most tested cases). The constrained MIF model maintained similarity to behavior for the other manipulations as well, and yielded higher overall R2 values (0.66 for chuts vs. purrs, 0.51 for wheeks vs. whines), thereby explaining an additional 10% of variance in GP behavior.
  
  In the revised manuscript, we included these results (page 28, line 699), and present results from the new analyses as Figure 11 – Figure Supplement 2.
  
  In the temporal stretching/compressing experiment, it remains unclear, if the corresponding MIF kernels used by the models were just stretched/compressed in a temporal direction to compensate for the changed auditory input. If so, the modelling results are trivial. Furthermore, in this case, the model provides no mechanistic explanation of the underlying neural processes. Similarly, in the pitch change experiment, if MIF kernels have been stretched/compressed in the pitch direction, the same drawback applies.
  
  We did not alter the MIFs in any way for the tests – the MIFs were purely derived by training the animal on natural calls. In learning to generalize over the variability in natural calls, the model also achieved the ability to generalize over some manipulated stimuli. The fact that the model tracks GP behavior is a key observation supporting our argument that GPs also learn MIF-like features to accomplish call categorization.
  
  We had mentioned at a few places that the model was only trained on natural calls. To add clarity, we have now included sentences in the time-compression and frequency-shifting results affirming that we did not manipulate the MIFs to match test stimuli. We also include a couple of sentences in the Discussion section’s first paragraph stating the above argument (page 26, line 615).
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.03.09.483596v1
www.medrxiv.org www.medrxiv.org

New submission 05/10/2022, 11:02:13

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Causality is important and desired but usually difficult to establish. In this work, Park et al. conducted a comprehensive phenome-wide, two-sample Mendelian randomization analysis to infer the causal effects of plasma triglyceride (TG) levels on 2,600 disease traits. They identified causal associations between plasma TG levels and 19 disease traits, related to both atherosclerotic cardiovascular diseases (ASCVD) and non-ASCVD diseases. They used biobank-scale data in both discovery analysis and replication analysis.
  
  The conclusions of this work are mostly supported by the data and analysis, but some aspects need to be clarified and extended.
  
  (1) The datasets used in this study may not be very consistent. For example, UKB participants are aged 40-69 years old at recruitment. In addition, UKB is United Kingdom-based and FinnGen is Finland-based. So the definition of outcomes may not be identical. The authors should discuss the differences between the datasets and their potential effects.
  
  The reviewer is correct about the differences between UKB and FinnGen and that the definition of clinical outcomes between the two datasets may not be identical due to differences in healthcare systems and population demographics. We now mention this in the discussion section as a potential limitation.
  
  Manuscript changes:
  
  Line 520-539: “Third, UKB and FinnGen have innate differences in participant demographics and medical coding systems, due in part to the former being based in the United Kingdom and the latter in Finland. As such, potential misclassification of participants in case-control assignment is a liability to this study. We exercised caution in mapping UKB traits to FinnGen traits, but we were unable to reliably map all “categorical” traits from UKB to corresponding traits in FinnGen, testing for replication only 221 of the 598 associations that were nominally significant in the primary analysis. We note however that, despite geographical differences, both datasets largely involve White European participants of older age, with the mean age in UKB and FinnGen being 56.5 and 59.8, respectively.”
  
  (2) The discovery analysis and replication analysis are not completely independent because data from UKB have been used in both analyses. In discovery, the data were used for association with outcomes, while in replication they were used for association with exposure. The authors may want to explain if this may cause problems.
  
  The reviewer is correct that UKB data were used in both the discovery and replication analyses with the caveat that the discovery analysis used UKB for outcomes while using GLGC for exposures, whereas the replication analysis used UKB for exposures while using FinnGen for outcomes. We believed this would be a creative use of three different datasets and a strength of the study; however, we agree that examining the implications of this study design is needed to acknowledge potential biases. We now expand on this in the discussion section as a potential limitation.
  
  Manuscript changes:
  
  Lines 539-545: “Fourth, discovery and replication analyses were not completely independent, since UKB data were used in both analyses. This could potentially exacerbate demographic and measurement biases inherent to UKB; however, we show that taking a traditional replication approach using GLGC instead of UKB for selecting exposure instruments in replication returns comparable Tier 1 results (Supplementary Files 5), while losing statistical power to highlight many of the Tier 2 and 3 results.”
  
  (3) As stated in the manuscript, there are three assumptions for MR analysis. The validity of the results depends on the validity of the assumptions. The last two assumptions are usually difficult to validate. To the authors' credit, they conducted sensitivity analyses addressing horizontal pleiotropy, which is related to assumption 3. It would be helpful if the authors can discuss those assumptions explicitly.
  
  We now explicitly state the assumptions of Mendelian randomization in the introduction section and discuss the validity of these assumptions in the discussion section.
  
  Manuscript changes:
  
  Lines 501-514: “The study has several limitations. First, MR is a powerful but potentially fallible method that relies on several key assumptions, namely that genetic instruments are (i) associated with the exposure (the relevance assumption); (ii) have no common cause with the outcome (the independence assumption); and (iii) have effects on the outcome solely through the exposure (the exclusion restriction assumption) (Hartwig et al., 2016). In MR, (i) is relatively straightforward to test, while (ii) and (iii) are difficult to establish unequivocally. As a prominent example, horizontal or type I pleiotropy has been shown to be common in genetic variation, which can bias MR estimates (Verbanck et al., 2018) (Jordan et al., 2019). This occurs when a genetic instrument is associated with multiple traits other than the outcome of interest. To detect and correct for this as best as possible, we used various MR tests as sensitivity analyses that each aim to adjust for or account for the presence of horizontal pleiotropy, including MR-PRESSO, as well as MR-Egger and weighted median methods. There is no universally accepted method that is perfectly robust to horizontal pleiotropy, but we take the best current approach by using multiple methods and examining the consistency of results.”
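  As an illustration of one of the sensitivity analyses named above, the following is a minimal sketch of the MR-Egger point estimates: a weighted regression of SNP-outcome effects on SNP-exposure effects with an intercept, where a non-zero intercept suggests directional horizontal pleiotropy. The function name and the toy summary statistics are hypothetical, standard errors and significance tests are omitted, and this is not the authors' pipeline.

```python
def mr_egger_point_estimates(beta_exposure, beta_outcome, se_outcome):
    """Weighted least squares of beta_outcome on beta_exposure with an intercept.

    Weights are inverse outcome variances (1 / se^2). Returns (intercept, slope):
    the slope is the pleiotropy-adjusted causal estimate, and the intercept
    estimates the average directional pleiotropic effect.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if len(beta_exposure) < 3:
        raise ValueError("MR-Egger needs at least three instruments")

    weights = [1.0 / se ** 2 for se in se_outcome]
    s_w = sum(weights)
    s_wx = sum(w * x for w, x in zip(weights, beta_exposure))
    s_wy = sum(w * y for w, y in zip(weights, beta_outcome))
    s_wxx = sum(w * x * x for w, x in zip(weights, beta_exposure))
    s_wxy = sum(w * x * y for w, x, y in zip(weights, beta_exposure, beta_outcome))

    denom = s_w * s_wxx - s_wx ** 2
    if denom == 0:
        raise ValueError("exposure effects must not all be identical")

    slope = (s_w * s_wxy - s_wx * s_wy) / denom
    intercept = (s_wy - slope * s_wx) / s_w
    return intercept, slope


# Hypothetical toy data: every SNP has outcome effect 0.02 + 0.5 * exposure effect,
# i.e. a causal slope of 0.5 plus a constant pleiotropic offset of 0.02.
toy_bx = [0.1, 0.2, 0.3, 0.4]
toy_by = [0.02 + 0.5 * x for x in toy_bx]
toy_se = [0.01, 0.01, 0.02, 0.02]

egger_intercept, egger_slope = mr_egger_point_estimates(toy_bx, toy_by, toy_se)
```

  In this toy case the regression recovers the offset as the intercept, whereas a method that forces the line through the origin would fold that offset into the causal estimate.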
  
  Reviewer #2 (Public Review):
  
  This work conducted a Mendelian randomization analysis between TG and a large number of disease traits in biobanks. They leverage the publicly available summary statistics from the European samples from the UK Biobank and FinnGen. A solid but routine summary-statistics-based MR study is conducted. Several significant causal associations from TG to phenotypes are called by setting a p-value cutoff with Bonferroni correction. Sensitivity analyses are conducted, which generate largely consistent results. The research problem is important and relevant for public health as well as drug development. Overall this is a solid execution of current methods on appropriate data sources and yields a convincing result. The interpretation of the results in the discussion is also well-balanced.
  
  While the paper does have strengths in principle, a few technical weaknesses are observed.
  
  They used UK Biobank as the discovery and FinnGen as the replication. But the two cohorts are rather used symmetrically. Especially for the Tier 3 (NB), it seems to be an attempt of reusing the replication cohort as the discovery. I wonder if that would create additional multiple testing burden as a greater number of hypotheses are considered.
  
  We thank the reviewer for this thought-provoking comment. As the reviewer is aware, MR studies have generally not accounted for multiple testing in the past since they have usually focused on single exposures and/or single diseases. Ours is one of the few MR studies to take a phenome-wide, high-throughput approach, so determining the optimal threshold for balancing true-positive vs. false-positive discovery is an important aspect of the study warranting discussion.
  
  We agree that Tier 3 results carry the least stringent level of statistical evidence (i.e., nominally significant in discovery using UK Biobank and Bonferroni-significant in replication using FinnGen), and that these results should be interpreted with caution. As a phenome-wide study, a significant aim of this work was to generate hypotheses, and so, we decided to present our results using the three tiers of statistical evidence to highlight as many promising associations as possible for further investigation. Nevertheless, we now express extra caution in the results and discussion sections regarding Tier 2 and 3 results, and we also note as a limitation that these results especially require external replication.
  
  Manuscript changes:
  
  Lines 438-444: “Regarding non-ASCVDs, we present suggestive genetic evidence of potentially causal associations between plasma TG levels and uterine leiomyomas (uterine fibroids), diverticular disease of intestine, paroxysmal tachycardia, hemorrhage from respiratory passages (hemoptysis), and calculus of kidney and ureter (kidney stones). Due to the weaker statistical evidence supporting these associations, special caution is encouraged when interpreting these results to infer causality, and further replication and validation studies are essential for all Tier 2 and Tier 3 results.”
  
  The replication p-value cutoff is a bit statistically lenient. In a typical discovery-replication setting the two stages are conducted sequentially and replication should go through the Bonferroni adjustment on the number of significant signals from discovery that is tested in the replication. For example, in this case, in tier 2, the cutoff should be 0.05/39. This may make the association of leiomyoma of the uterus slightly non-significant though. Similar cutoff should be applied to tier 3 as well.
  
  We thank the Reviewer for highlighting this important point. We agree that in a standard two-stage discovery and replication study design, the Bonferroni adjustment should be based on the number of significant signals from discovery that is tested in the replication. We had initially considered this approach but chose the current tiered approach based on a number of factors:
  
  First, we had initially considered performing a standard meta-analysis between UK Biobank and FinnGen datasets and using the Bonferroni adjustment of the total number of tests. However, it was not possible to reliably map the phenotypes between UK Biobank and FinnGen on a large scale due to different classification schemes.
  
  Second, we had noticed that if we only focus on the sequential two-stage design, then we would be ignoring strong causal relationships observed in FinnGen that passed Bonferroni adjustment but may only be nominally associated in UK Biobank. Although not as strong as Tier 1 findings, we believe that these findings warranted some consideration. This is particularly relevant since differences in the strength of the causal relationship could be attributed to the different populations studied, sample size, different health systems used to measure disease outcomes, differences in statistical power in the MR tests between the two stages (e.g., number of IVs), amongst others.
  
  Third, we wanted to point out that the total adjustment for number of phenotypes tested using Bonferroni is a very conservative adjustment because the multiple EHR phenotypes have varying degrees of redundancy and correlation. We believe the appropriate Bonferroni-adjusted P-value cutoff is somewhere in between the Bonferroni adjustment of total number of phenotypes, and the nominal P-value (no adjustment for number of phenotypes).
  
  Although somewhat unconventional, we came up with this tiered P-value approach to overcome the points mentioned above. We have now included text to further explain our approach and to mention that tier 2 and tier 3 results require further replication and validation.
  
  Manuscript changes:
  
  Lines 266-283: “This presentation is somewhat unconventional and partly arises from the study’s use of three different datasets for instrument selection. In a traditional two-stage discovery and replication design, Bonferroni adjustment is based on the number of significant signals from discovery that is tested in replication. Here, we used three tiers of statistical evidence to present results because a standard meta-analysis between UKB and FinnGen was not possible, given it was not possible to reliably map all phenotypes between the two datasets. Additionally, Bonferroni-significant results in the replication analysis would have been ignored in FinnGen in a sequential two-stage design if they were also only nominally associated in UKB. The three tiers are defined below:”
  
  Lines 441-444: “Due to the weaker statistical evidence supporting these associations, special caution is encouraged when interpreting these results to infer causality, and further replication and validation studies are essential for all Tier 2 and Tier 3 results.”
  
  Lines 498-500: “However, we reiterate that this Tier 3 association was only nominally significant in discovery, while Bonferroni-significant in replication, and future studies are needed to validate the statistical evidence.”
  
  Lines 565-567: “However, caution is still warranted in inferring causality, as MR depends on specific assumptions and the validity of those assumptions must be carefully assessed. Thus, diverse study designs remain necessary to triangulate evidence on the causal effects of plasma TG levels.”
  
  The causal effect of TG on leiomyoma of the uterus is weak, as indicated by both the sub-significant replication result and the non-significant MR-PRESSO result. Similarly, I would recommend more caution about the weak statistical rigor when interpreting Tier 2 and Tier 3 results.
  
  We agree with the Reviewer. We have now emphasized more caution in interpreting Tier 2 and Tier 3 results. We have also explicitly restated the weaker statistical evidence underlying these results and noted the need for future validation. Please see our detailed response to the Comment above.
  
  Manuscript changes:
  
  Lines 498-500: “However, we reiterate that this Tier 3 association was only nominally significant in discovery, while Bonferroni-significant in replication, and future studies are needed to validate the statistical evidence.”
  
  Another methodological choice that might need justification is the use of UKB TG GWAS loci (1,248 SNPs) as the instruments for FinnGen. This may create some subtle interference with the use of UKB as outcomes in the discovery analysis. It may be minor, but some justification or at least some discussion of potential limitations should be included. What about the alternative of using GLGC as instruments in replication?
  
  We agree with the reviewer that the use of UKB TG GWAS loci (1,248 SNPs) as instruments for FinnGen outcomes needs additional justification. We now detail this decision in the text as copied below.
  
  Additionally, we now present new data comparing MR results on FinnGen outcomes when selecting TG instruments from UKB GWAS versus GLGC GWAS. Statistical significance after Bonferroni correction was set to 0.05/221, where 221 was the number of disease traits nominally significant in UKB that were tested in FinnGen. We note that the results were fairly consistent. All Tier 1 results remained Bonferroni significant, whether using TG SNPs from UKB or GLGC. Though statistical significance decreased for the remaining diseases of interest, the direction of causality remained consistent, and three disease traits remained significant (hypertension, aortic aneurysm, and alcoholic liver disease). These results support that instrumenting TG using 1,248 SNPs from UKB might carry more power than the 141 SNPs from GLGC, allowing for the detection of associations in our initial replication analysis using UKB for exposures and FinnGen for outcomes. We now include this analysis in the text and include the figure below, as well as its underlying data, as supplementals (Supplementary File 5).
  
  Manuscript changes:
  
  Lines 229-236: “We selected UKB TG GWAS loci as the instruments for replication on FinnGen outcomes, rather than GLGC TG GWAS loci, to diversify the source of TG instruments and mitigate potential biases associated with one TG GWAS. Moreover, UKB GWAS included a larger study population than GLGC GWAS, providing a greater number of genetic instruments that can together explain more of the variance in plasma TG levels, and thus, greater statistical power and precision. Nevertheless, we also performed the replication analyses using TG instruments from GLGC and included these results as supplemental data (Supplementary File 5).”
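  The Bonferroni cutoffs quoted in this exchange (0.05/39 suggested by the reviewer for Tier 2, and 0.05/221 used by the authors for the replication comparison) can be reproduced with a short sketch. The helper names and the example p-value are hypothetical and serve only to show how the choice of denominator changes which results count as significant.

```python
def bonferroni_cutoff(alpha: float, n_tests: int) -> float:
    """Return the per-test significance cutoff alpha / n_tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, cutoff: float) -> bool:
    """A result is called significant when its p-value is below the cutoff."""
    return p_value < cutoff


ALPHA = 0.05

# Cutoff suggested by the reviewer: adjust for the 39 discovery signals in Tier 2.
tier2_cutoff = bonferroni_cutoff(ALPHA, 39)

# Cutoff used by the authors: adjust for the 221 nominally significant
# UKB traits that were tested in FinnGen.
replication_cutoff = bonferroni_cutoff(ALPHA, 221)

# Hypothetical replication p-value, chosen only for illustration.
example_p = 0.001
passes_tier2_cutoff = is_significant(example_p, tier2_cutoff)
passes_replication_cutoff = is_significant(example_p, replication_cutoff)
```

  Because the cutoff shrinks as the number of tested hypotheses grows, the same p-value can clear the 39-test cutoff while failing the 221-test cutoff.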
  
  For disease outcomes (line 188), UKB European sample size is ~400,000 rather than ~500,000. Can the author clarify the sample size they used?
  
  We thank the reviewer for catching this detail. We have now clarified the sample size of UKB European participants in the Methods section, and we also included the exact sample size of each disease trait GWAS (cases and controls) in Supplementary Figure 1.
  
  Manuscript changes:
  
  Lines 194-201: “Pan-UKB had performed 16,131 GWASs on 7,221 phenotypes in ~420,531 UKB participants of European ancestry using genetic and phenotypic data (PanUKBTeam, 2020). A total of 7,221 total phenotypes had been categorized as “biomarker”, “continuous”, “categorical”, “ICD-10 code”, “phecode”, or “prescription” (PanUKBTeam, 2020). We filtered for outcomes to retain categorical, ICD-10, and phecode types; non-null heritability in European ancestry as estimated by Pan-UKB; and relevance to disease, excluding medications. This yielded 2,600 traits for primary analysis. The exact sample size of each GWAS for each of these traits is provided in Supplementary File 1.”
  
  It would be reassuring to the reader if the TG measurements were measured in a treatment-naïve manner. GLGC accounted for treatment (at least LDL, check paper for TGs; if they didn’t, there must be reason). Maybe not UKB.
  
  We now provide information about whether the lipid measurements were measured in a treatment-naïve manner in the Methods for GLGC and UKB. We also address this point in the discussion section as a potential limitation.
  
  Manuscript changes:
  
  Lines 179-180: “We note that the GLGC GWAS had excluded individuals known to be on lipid-lowering medications.”
  
  Lines 187-188: “We note that the Pan-UKB GWAS study did not exclude participants based on their use of lipid-lowering medications.”
  
  Lines 545-546: “Fifth, the GLGC GWAS used to select instruments for plasma TG levels in discovery had accounted for lipid-lowering treatment, while the UKB GWAS used in replication had not.”
  
  "Phenome-wide MR is a high-throughput extension of MR that, under specific assumptions, estimates the causal effects of an exposure on multiple outcomes simultaneously." - I guess it is more informative to mention the specific assumptions, at least briefly, in the introduction so it is easier for the reader to interpret the results.
  
  We agree with the reviewer that it would be informative to explicitly state the assumptions of Mendelian randomization. We now explicitly state these assumptions in the introduction.
  
  Manuscript changes:
  
  Lines 123-129: “Phenome-wide MR is a high-throughput extension of MR that estimates the causal effects of an exposure on multiple outcomes simultaneously. As in conventional MR, this method uses genetic variants as instrumental variables (IV) to proxy modifiable exposures (Davey Smith & Ebrahim, 2003), and importantly, it relies on three critical assumptions: (1) The genetic variant is directly associated with the exposure; (2) The genetic variant is unrelated to confounders between the exposure and outcome; and (3) The genetic variant has no effect on the outcome other than through the exposure (Davey Smith & Ebrahim, 2003).”
  
  Reviewer #3 (Public Review):
  
  Park and Bafna et al. applied a genetics-based epidemiological approach, the Mendelian randomization analysis (MR), to evaluate the potential causal roles of triglycerides across 2,600 disease traits (i.e., the phenome). In a typical two-sample MR framework, they utilized existing genome-wide association study (GWAS) summary statistics from two separate studies. They are Global Lipids Genetics Consortium (GLGC) and UK Biobank in the discovery analysis, and UK Biobank and FinnGen in the replication analysis. This replication design is a great strength of the study, enhancing the robustness and reproducibility of the results. For the candidate pairs of causal associations, the authors further perform multiple sensitivity analyses to evaluate the robustness of the results to possible violations of assumptions in MR. To disentangle the independent effects of triglycerides from other lipid fractions (i.e., LDL-cholesterol and HDL-cholesterol), the authors performed multivariable MR analysis. In the end, possible causal associations were revealed in three tiers, based on statistical significance in the two-stage analysis. The results support the causal effects of triglycerides in increasing the risk of atherosclerotic cardiovascular disease. They also reveal novel conditions, which are either new treatable conditions (e.g., leiomyoma, hypertension, calculus of kidney and ureter) for repurposing of triglycerides-lowering drug, or possible side effects (e.g., alcoholic liver disease) the triglyceride-lowering treatment should pay special attention to.
  
  The analysis approaches in the paper are standard and solid. The discovery-replication study design is a great strength. Correction for multiple testing was implemented in a conservative way. The sensitivity analyses and MVMR strengthen the robustness of the results. The manuscript is very clearly written and pleasant to read. The limitations were well-presented. The conclusions and interpretations are mostly supported by the data, with one major concern as explained below. But overall, in addition to the specific findings, this study could be an exemplar study for the use of phenome-wide MR in identifying treatable conditions and side effects for most existing drugs.
  
  1) My major concern is about reverse causation. For example, having atherosclerotic cardiovascular disease increases circulating triglycerides. Reverse causation can induce false positives in MR analysis. With the existing data in this study, the authors can perform a reverse MR to evaluate the effect of the 19 disease traits on triglycerides. Ruling out the presence of reverse causation is important to make sure that the current findings are not false positives.
  
  We agree with the reviewer that performing reverse MR would be important to rule out reverse causation. We now present new results using reverse MR, selecting instruments for disease from UKB and instruments for TG from GLGC (i.e., reversing the discovery analysis). We provide an interpretation of these new results in the discussion section and present the underlying data, including the number of genetic variants used, in Supplementary File 6. Please note we could only perform reverse MR on 9 of the 19 diseases of interest, due to insufficient genetic data in GLGC to extract the specific exposure instruments. As expected, we observed significant associations (orange) between “disorders of lipoprotein metabolism” and “hyperlipidemia” with plasma TG levels; however, all other estimates were non-significant, suggesting unidirectional associations for the remaining seven disease traits. We now include the figure below and its underlying data as supplements (Supplementary File 6).
  
  Manuscript changes:
  
  Lines 258-261 “Finally, we performed bidirectional or reverse MR on significant results to examine the potential presence of reverse causation. We selected instruments for each disease as described above from Pan-UKB and instruments for plasma TG levels from GLGC, essentially reversing the discovery stage design using a fixed-effect IVW method.”
  
  Lines 368-373: “Finally, we performed reverse MR to estimate the effects of significant disease traits on plasma TG levels, selecting instruments from UKB and GLGC, respectively. Genetic data were sufficiently available to perform this analysis for 9 of the 19 diseases of interest. These results are presented in Supplementary File 6. Expectedly, “disorders of lipoprotein metabolism” and “hyperlipidemia” had positive effects on plasma TG levels; however, no other examined disease trait showed results suggesting reverse causation.”
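  The fixed-effect IVW method mentioned in the reverse MR description can be sketched from summary statistics alone. The version below uses the common first-order weights (inverse variance of the SNP-outcome effects) and hypothetical toy numbers; the function name is an assumption, and the code does not reproduce the authors' instrument selection or software. For reverse MR, the disease trait supplies the exposure effects and plasma TG supplies the outcome effects.

```python
import math


def ivw_fixed_effect(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    estimate = sum(bx * by / se^2) / sum(bx^2 / se^2)
    se       = sqrt(1 / sum(bx^2 / se^2))
    where bx and by are per-SNP effects on exposure and outcome,
    and se is the standard error of by.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one instrument is required")

    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    information = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if information == 0:
        raise ValueError("instruments must have non-zero exposure effects")

    return numerator / information, math.sqrt(1.0 / information)


# Hypothetical toy instruments: every per-SNP ratio by / bx equals 0.5.
toy_beta_exposure = [0.10, 0.20, 0.15]
toy_beta_outcome = [0.05, 0.10, 0.075]
toy_se_outcome = [0.01, 0.02, 0.015]

ivw_estimate, ivw_se = ivw_fixed_effect(
    toy_beta_exposure, toy_beta_outcome, toy_se_outcome
)
```

  Swapping which trait plays the exposure role is all that distinguishes the forward and reverse analyses in this sketch; in practice the instruments must also be reselected for the new exposure, as the authors describe.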
  
URL

medrxiv.org/content/10.1101/2022.07.21.22277900v1
www.biorxiv.org www.biorxiv.org

Translation Inhibitory Elements from Hox a3 and a11 mRNAs use uORFs for translation inhibition

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #2:
  
  Non-canonical pathways for regulating protein synthesis serve important roles for controlling gene expression in critical developmental pathways. Homeobox (Hox) genes encode many mRNAs regulated at the level of translation. A general feature for many of these mRNAs has been the proposal that they are regulated by Internal Ribosome Entry Sites (IRESs) and possess sequences in the 5'-untranslated regions (5'-UTR) of the mRNA that prevent canonical cap-dependent translation, termed "translation inhibitory elements" or TIEs. However, the mechanisms by which these Hox mRNAs are regulated remain unclear. Here, the authors focus on two Hox mRNAs, Hox a3 and Hox a11, and find they use entirely different means to achieve the same end of repressing cap-dependent translation. Hox a3 uses the non-canonical translation initiation factor eIF2D and an upstream open reading frame (uORF), whereas a11 uses a "start-stop" uORF followed by a thermodynamically stable stem-loop to inhibit translation. Overall, the experiments support the major conclusions drawn by the authors, and nail down mechanisms that have been left unresolved since the Hox mRNAs were first discovered to be regulated at the level of translation. These results will be of wide interest to the translation and developmental biology fields.
  
  Some issues the authors should consider:
  
  1) The mapping of the TIE boundaries is in general well-supported by the luciferase reporter experiments. However, there seems to be a disconnect in the luciferase values in Fig. 1B compared to the western blots in Supplementary Fig. 1D. For example, in the a3 case the 106 and 113 bands don't seem to correspond to levels consistent with the luciferase activity. For a11, the 153 band is not consistent with the luciferase activity. Also, the gels at the bottom are confusing. Should 74 in the left gel be 77? It would help to have a clearer explanation in the figure legend.
  
  The reviewer is right: supplementary figure 1D is misleading. We have clarified the data with a new supplementary figure 1D. The gels presented in this figure are not western blots; they are SDS-PAGE analyses of the translated product (i.e. Renilla luciferase protein) in the presence of 35S-methionine. Since the function of TIE elements was measured in comparison with reporters that do not contain any TIE element, we loaded a reference on each gel (lanes w/o TIE) for quantification purposes. Since the exposure time of distinct gels was variable, one should not compare the intensities between gels. We added the quantification of the gel intensity relative to the reference construct (w/o TIE). We agree with the reviewer that the two gels at the bottom are not informative, and we removed them from the new supplemental figure 1D.
  
  2) The results in the various sucrose gradients are not entirely convincing as presented. In all these cases, the experiment would benefit from the use of high-salt conditions (See Lodish and Rose, 1977, JBC 252, 1181-ff) in the gradient to remove background 80S not engaged with mRNAs. For the +cycloheximide sample in Fig. 8, this looks more like a "half-mer" between a monosome and disome, rather than a standard polysome.
  
  We do not agree with the point raised by the reviewer on sucrose gradients. This appears to be due to a misunderstanding of the experiments conducted. We would like to point out that the plots shown in the manuscript represent the percentage of mRNA transcripts labelled with a radioactive cap that were introduced in cell-free translation extracts. Since we monitor only radioactivity, only the radioactive mRNA transcripts tested in these experiments are observed; consequently, there is no background from 80S that are not engaged with mRNAs. Such background 80S are visible on the OD profile now shown in a new supplementary figure S6. However, non-engaged 80S are not radioactive, and mRNAs that are not engaged in the 80S are found in the RNP fraction. The absence of radioactive background 80S is further corroborated by the use of edeine, which prevents the codon-anticodon interaction (see data below).
  
  When we set up our experimental strategy, we first used edeine to validate our protocol; in this case no radioactive 80S is observed, confirming that no background 80S is present in our assays. In conclusion, peaks at the level of 80S can only be radioactive mRNA engaged in an 80S. We have extended the figure legend to clarify the experiments conducted.
  
  Concerning Fig 8, we agree that this experiment is not conclusive and propose to remove it as mentioned in response to a comment from reviewer #1.
  
  3) In Fig. 7, it would be helpful to see the absolute level of translation from the reporters, as it is not clear what the baseline level of translation is in the knockdown cell lines. It's hard to judge the eIF4E knockdown case in particular without this information. Also in panel B, the GGCCC147 cell line is missing.
  
  As previously mentioned, we agree that Fig 7 is misleading and we have completely remodelled the figure in the revised manuscript. See also point 5 from reviewer #1. Because the GGCCC147 mutation had no effect in RRL, we decided not to test it in HEK cells and focused on the GGCC107 that has a significant effect both in RRL and in HEK cells.
  
  4) From the MS experiments in Fig. 6 and Supplementary Fig. 6, the authors focus on eIF2D, which makes sense. But they don't comment on two other highly suggestive hits in the a3 vs. beta-globin and a3 vs. a11 comparisons. These are eIF5B and HBS1L. Both are highly suggestive of what might be going on with the eIF2D-dependent translation mechanism. They don't show up in the GMP-PNP samples in Supplementary Fig. 6, which is interesting and would deserve a comment.
  
  We are grateful for this very interesting comment. As suggested, we have inserted a comment related to HBS1L and eIF5B in the discussion of the manuscript.
  

URL

biorxiv.org/content/10.1101/2021.01.19.427285v1
www.biorxiv.org www.biorxiv.org

New submission 01/10/2023, 17:45:28

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  This paper evaluates the effect of knocking out CST7 (cystatin F) on the APPNL-G-F Alzheimer's disease mouse model. They found sexually dimorphic outcomes, with differential transcriptional responses, increased phagocytosis (but interestingly a higher plaque burden) in females and suppressed inflammatory microglial activation in males (but interestingly no change in plaque burden). This study offers new insight into the functional role of CST7, which is upregulated in a subset of disease-associated microglia in AD models and human brain. Despite the discovery of disease-associated microglia several years ago, there has been little effort in understanding the function of the different genes that make up this profile, making this paper especially timely. Overall, the experiments are well-controlled and the data support the main conclusions. The manuscript could be strengthened by addressing the comments and clarifying questions below, which could impact the interpretation of the data and findings.
  
  1) In the first section discussing CST7 expression levels in AD models, it would be good to involve a discussion of levels of CST7 change in human AD samples. There are sufficient available datasets to look at this, and it would help us understand how comparable the animal models are to human patients. For example, while in mice CST7 is highly enriched in microglia/macrophages, in human datasets it seems like it is not quite so specific to microglia - it is equally expressed in endothelial cells. This might have a significant impact on the interpretation of the data, and it would be good to introduce and assess the findings in mice through the human subjects lens. There is a discussion of the human data in the discussion section, but it would be more appropriately assessed in the same way as the mouse data and comparatively presented in the results section. The authors could also include the data from Gerrits et al. 2021 in their first figure.
  
  We agree with the reviewer on the importance of considering the work in the context of human disease. While CST7 is not as strongly upregulated in human AD brain as it is in mouse, expression is observed predominantly in myeloid cells in the brain, with very minimal expression detected in endothelial cells (see screenshots in Author response image 1 from the Brain Myeloid Landscape platform, http://research-pub.gene.com/BrainMyeloidLandscape/BrainMyeloidLandscape2/), and is enriched in AD clusters vs homeostatic clusters in scRNASeq studies (Gerrits et al., 2021). We attempted immunostaining for human CF (CST7) in AD brains to assess expression and co-localisation with microglial markers but failed to validate any of the antibodies tested. Additionally, King et al., 2023 (PMID: 36547260) recently showed an increase in CST7 expression in bulk hippocampal RNASeq in AD vs mid-life controls, suggesting an ageing/AD mechanism. CST7 has also been shown to be expressed following overexpression of TREM2 in human microglia in vitro, and siRNA-mediated knockdown of its expression leads to an increase in phagocytosis (Popescu et al., 2023 - PMID: 36480007), mirroring our data and suggesting a conserved role in human cells. Overall, we believe that, even in the context of mouse models, the understanding of the function of genes upregulated in disease is of importance to the field and that this study paves the way for further work investigating human CST7 in disease. We have added this (with citations to the datasets mentioned) to the discussion (highlighted).
  
  Author response image 1
  
  2) The differential RNAseq data is perhaps one of the most striking results of this paper; however it is difficult to see exactly how similar the male v female APPNL-G-F profiles are, in addition to the genes shared or not between the KO condition. Venn diagrams, in addition to statistical tests, would enhance this part of the paper and add more clarity.
  
  We have added Venn diagrams showing DEGs between male and female AppNL-G-F microglia vs WT control to illustrate how similar the male and female AppNL-G-F profiles are. Additionally, to exemplify the Cst7KO-sex interaction, we have added a Venn diagram showing DEGs between male and female AppNL-G-F microglia vs. AppNL-G-FCst7-/- microglia (Fig. 2 – Fig. supplement 3). We confirm we have derived all differential gene expression changes reported (including those represented in the Venn diagrams) using appropriate Padj statistical approaches (see Methods).
  
  3) A major argument in the paper is a continuation of Sala-Frigerio 2019, which says that the female phenotype is an acceleration of the male phenotype. Does this mean that if males were assessed at later timepoints, they would be more similar to the females? Or are there intrinsic differences that never resolve? It would be helpful to see a later timepoint for males to get at the difference between these two options.
  
  This is an interesting question. While we acknowledge that addressing it empirically with a later timepoint could add insight, we believe it would actually need multiple closely spaced timepoints, as choosing the optimal single later timepoint is difficult (and likely not possible at all) for the reasons below. We also believe that data already published, combined with our observations, show that a cell-intrinsic effect most likely explains our sex-specific differences.
  
  First, we emphasize that the acceleration of the microglial phenotype in female AppNL-G-F mice previously published is fairly subtle and relative rather than absolute, e.g. the DAM/ARM microglia state represents ~50% of all microglia in males and ~55% of all microglia in females at 12 months old; both sexes therefore have similarly abundant microglia in the state that most highly expresses Cst7. Indeed, after the age at which DAM/ARM state microglia appear in appreciable numbers (~6 months), females and males both have an abundance of them. It is important to note that a 12-month male is far more “progressed” than a 6-month female, hence the stepped age effect is temporally short.
  
  Second, Cst7 deletion in the AppNL-G-F mice condition caused qualitative differences affecting distinct genes and/or overlapping genes moving in different directions between female and male mice - if a stepped age effect explained sex differences from Cst7 deletion, given that it could only be stepped by a very short timeframe (several weeks maximum) from reasoning above, we would expect to see similar qualitative changes but of different magnitude in female and male mice arising from Cst7 deletion; this is not the pattern we see.
  
  Third, beyond 12 months old, regression from ARM/DAM actually occurs, again making it unlikely males would “catch up” with females to show the same profile from Cst7 deletion but just at an older age – practically, this also complicates choosing a single later timepoint (and age-related systemic morbidity emerges as a potential confounder as well).
  
  In summary, while the acceleration of the DAM signature in female microglia offers an intriguing possible explanation for our observation of sexual dimorphism in response to deletion of one of the key genes in this signature, we believe it more likely that intrinsic effects are responsible for the sex-related impact of Cst7 deletion. Taking the alternative perspective, even if a stepped age effect in the underlying progression of the model could explain our findings, exposing this pattern would need multiple timepoints with short gaps between them (e.g. monthly at 12, 13, 14, 15 months old) to provide the temporal resolution; we would not have the resources to conduct such a resource-intensive and lengthy study. We hope this reasoning appears logical. Conscious of the importance of conveying this in our manuscript, we have revised the Discussion to capture, as concisely as possible, some of the key points outlined above.
  
  4) If the central argument is that CST7 in females decreases phagocytosis and in males increases microglia activation, are there changes in amyloid plaque burden or structure in the APPNL-G-F /CST 7 KO mice compared to APPNL-G-F/CST7 WT that reflect these changes? Please address. If not, how does this affect the functional interpretation of differential expression observed in phagocytic/reactive microglia genes? Pieces of this are discussed but it could be clearer.
  
  We emphasise the data already presented in Fig 6 and Fig. 6 – Fig. Supplement 2 showing altered Aβ burden (6E10 staining) and plaque count (MeX04) but no change in plaque area. Regarding the functional interpretation of Cst7-dependent gene changes in microglia beyond the endolysosomal function we present in figures 3-5, we have included additional data using simple immunohistochemistry, as suggested by the reviewer, to assess synapse abundance. We show loss of Sy38 coverage around plaques (Fig. 6I) and a moderate but significant decrease in coverage between AppNL-G-F/Cst7-/- vs AppNL-G-F brains only in females (Fig. 6J). This reflects the effect observed with plaque coverage whereby we observe increased burden in AppNL-G-F/Cst7-/- vs AppNL-G-F females but not males (Fig. 6B-F) suggesting the increased plaque burden in Cst7-/- female mice may lead to increased synapse loss. We would also emphasise that altered expression of phagolysosomal genes could affect disease in ways beyond interactions with amyloid and synapses.
  
  5) It is confusing that increased phagocytosis in the APPNL-G-F/CST7 KO females leads to greater plaque burden, considering proteolysis is not affected. What might explain this observation? Additionally, it is interesting that suppression of microglial activation doesn't lead to an increase in plaques in the male APPNL-G-F/CST7 KO mice. How does the profile of phagocytic microglia in the male APPNL-G-F/CST7 KO mice differ from the APPNL-G-F males?
  
  We emphasize our comments on this topic in the discussion, where we speculate that the greater plaque burden in females is linked to increased uptake of Aβ (which we observe in Fig. 4B&C) and deposition into plaques, as suggested by Huang et al., 2021 (PMID: 33859405), d’Errico et al., 2022 (PMID: 34811521) and Shabestari et al., 2022 (PMID: 35705056). Regarding the lack of effect in males despite the suppression of inflammatory genes, we agree this is a curious observation, although it may point to as yet ill-defined mechanisms for how inflammatory pathways influence plaque pathology. Unfortunately, we were not able to specifically compare the profile of phagocytic microglia in AppNL-G-F vs AppNL-G-FCst7-/- as we did not perform single-cell RNASeq. However, our bulk RNASeq profiling suggests modest downregulation of phagocytic/endolysosomal genes (e.g. Lilrb4a, Fig. 2I) and reduced expression of LAMP2 in microglia by immunostaining. We have added further comment on this in the discussion.
  
  6) Seems that the authors have potentially discovered an unusual mechanism for how CST7 could regulate cell autonomous function without impacting its canonical protease target. The authors deal with this extensively in the discussion but an ELISA or ICC to localize CST7 to microglia in vitro or in vivo would help address this point.
  
  We have added FISH data localising Cst7 expression to IBA1+ cells specifically around plaques in App brains (Fig. 1B-E). We agree that assessing the subcellular localisation and any non-microglial expression of Cystatin-F (the protein coded by Cst7) would offer valuable insight into the protease target and may reveal details on the precise mechanism by which CF deletion leads to the phenotype we observe in this study. However, despite attempting numerous commercially available and gifted antibodies to detect CF, we were unable to validate (using Cst7-/- as controls) any methods other than FISH.
  
  7) The authors focus on plaques in their final figure, however dysregulated microglial phagocytosis could impact many other aspects of brain health. Simple immunohistochemistry for synapses and myelin/oligodendrocytes (especially given the results of the in vitro phagocytosis assay) could provide more insight here.
  
  We fully agree with the reviewer. As also outlined in our responses elsewhere, phagocytic changes could have multiple consequences, and we have included additional data using immunohistochemistry as advised for synapses in WT, AppNL-G-F, and AppNL-G-F/Cst7-/- brains. We show loss of Sy38 coverage around plaques (Fig. 6I) and a moderate but significant decrease in coverage between AppNL-G-F/Cst7-/- vs AppNL-G-F brains only in females (Fig. 6J). This reflects the effect observed with plaque coverage whereby we observe increased burden in AppNL-G-F/Cst7-/- vs AppNL-G-F females but not males (Fig. 6B-F) suggesting the increased plaque burden in Cst7-/- female mice may lead to increased synapse loss.
  
  We also performed immunohistochemistry for the myelin markers MAG and MBP but found no plaque-associated pathology. Finally, we searched for dystrophic neurites using LAMP1 but found that the antibody stained microglial lysosomes rather than dystrophic neurites in this model (see Author response image 2), an observation that has been made by others (Sharoar et al., 2021 - PMID: 34215298).
  
  Overall, our data suggest Cst7 may play a protective role in females, limiting phagocytosis, reducing plaque burden and blunting synapse loss.
  
  Author response image 2.
  
  Reviewer #3 (Public Review):
  
  In this manuscript, Daniels et al explored the role of Cystatin F in an Aβ-driven mouse model of Alzheimer's disease. By crossing a constitutive knockout mouse lacking the gene that encodes Cystatin F, Cst7, to the AppNL-G-F mouse line, the authors describe impairments in microglial gene expression and phagocytic function that emerge more prominently in females versus males lacking Cst7. A strength of the study is its focus: given mounting evidence that microglia are a hub of neurological dysfunction with particular potential to trigger or exacerbate neurodegenerative disorders, it is essential to determine the changes in microglia that occur pathologically to promote disease progression. Similarly, the wide-spread identification of the gene in question, Cst7, as upregulated in AD models makes this gene a good target for mechanistic studies.
  
  The paper in its current form also has several weaknesses which limit the insights derived, weaknesses that are largely related to the experimental tools and approaches chosen by the authors to test their hypotheses. For example, the paper begins with a figure replotting data from previous studies showing that Cst7 is upregulated in mouse models of Alzheimer's disease. Though relevant to the current study, there are no new insights provided here. Next, the authors perform bulk RNA-sequencing on microglia isolated from male and female mice in the Cst7-/-; AppNL-G-F mouse line. In the methods, it is unclear whether the authors took precautions to preserve the endogenous transcriptional state of these cells given evidence that microglia can acquire a DAM-like signature simply due to the process of dissociation (Marsh et al, Nature Neuroscience, 2022). If the authors did not control for this, their results may not support the conclusions they draw from the data. Relatedly, it appears the authors pooled all microglia together here, instead of just isolating DAMs specifically or analyzing microglia at single-cell resolution, which could reveal the heterogeneous nature of the role of Cst7 in microglia. In addition to losing information about heterogeneity, another concern is that they could be diluting out the major effects of the model on microglial function by including all microglia. Overall, the biggest issue I have with the RNA-sequencing data is the lack of validation of the gene expression changes identified using a different method that does not require dissociation, like immunohistochemistry or fluorescence in situ hybridization. Especially given the limited number of genes they found to be mis-regulated (see Fig. 2 E and G), I worry that these changes might simply be noise, especially since the authors provide no further evidence of their mis-regulation. Without further validation, the data presented are not sufficient to support the authors' claims.
  
  We believe we have addressed this comment in the “Essential Revisions (for the authors)” section above. Please see again below:
  
  We took standard precautions to minimise the risk of aberrant ex vivo cell activation, including maintaining cells on ice during non-enzyme steps of the procedure and carrying out preps in small batches to minimise time taken from removal of brain to purification of microglial RNA. Importantly, we also validated key expression data by in situ methods such as RNA FISH for Cst7 and Lilrb4a (Fig. 1B-E, Fig 2. - Fig. supplement 3) thus eliminating dissection-induced effects. Additionally, when performing qPCR on microglia from non-disease mice to test the disease-specific role of Cst7-dependent gene regulation we did not observe the same gene changes (Fig 2. - Fig. supplement 4) which, if such changes were dependent on tissue dissociation, we would expect to observe in WT or disease animals. We utilised the resources provided by Marsh et al. 2022 to search for overlap between enzyme-induced genes and our DEG lists from our key comparisons. We found the enzyme-induced gene set had very minimal overlap with any of our comparisons with overlap of only 4 genes between enzyme-induced genes and Cst7-dependent genes in males and no overlap between enzyme-induced genes and Cst7-dependent genes in females. We would further point out that the disease-induced microglial RNAseq profile in the AppNL-G-F Cst7+/+ (i.e. disease WT) condition mirrors those observed previously by multiple methods including in situ profiling (Zeng et al 2023 - PMID: 36732642) and RiboTag approaches (Kang et al 2018 - PMID: 30082275). We believe these combined approaches provide convincing validation of the RNAseq data.
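
The overlap check described above amounts to intersecting each DEG list with the enzyme-induced gene set. A minimal sketch of that comparison follows, assuming each list is available as plain gene symbols; the function name `overlap_report` and all gene identifiers are hypothetical placeholders, not the gene lists or code used in the study.

```python
def overlap_report(deg_lists, artifact_genes):
    """For each named DEG list, return the sorted genes shared with artifact_genes."""
    artifact = set(artifact_genes)
    return {name: sorted(set(genes) & artifact) for name, genes in deg_lists.items()}


# Placeholder identifiers only (hypothetical, not real study data).
enzyme_induced = ["GeneA", "GeneB", "GeneC", "GeneD"]
deg_lists = {
    "male_Cst7_dependent": ["GeneB", "GeneX", "GeneY"],
    "female_Cst7_dependent": ["GeneZ", "GeneW"],
}

report = overlap_report(deg_lists, enzyme_induced)
for name, shared in report.items():
    # Print the comparison name, the number of shared genes, and the genes themselves.
    print(name, len(shared), shared)
```

A small or empty intersection, as reported in the response, is what argues against the DEG lists being driven by dissociation artefacts.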
  
  In assessing the changes in microglial function and Aβ pathology that occur in males and females of the Cst7-/-; AppNL-G-F line, the authors identify some differences between how females and males are affected by the loss of Cst7. While the statistical analyses the authors perform as given in the figure legends appear to be correct, the plots do not show significant changes between males and females for a given parameter. Take for example Figure 3H. Loss of Cst7 decreases IBA+Lamp+ microglia in males but increases this parameter in females. However, it does not appear that there is a significant difference in IBA+Lamp+ microglia in male versus female mice lacking Cst7. If there is no absolute difference between males and females, can the differential effects of Cst7 knockout on the sexes really be so relevant to the sexual dimorphism observed in the disease? I question this connection, but perhaps a greater discussion of what the result might mean by the authors would be helpful for placing this into context.
  
  We understand the reviewer’s perspective and we agree that the interpretations could be presented and explained better in the text - we have updated the discussion as suggested to address this.
  
  We designed our study initially to search for sex-specific effects of Cst7. Therefore, whilst our ANOVA does include main effects analysis for disease or sex, we carried out post-hoc analysis primarily to investigate effects of Cst7 deletion within sex. In the case of Fig. 3H pointed out by the reviewer, we observe a main effect for disease in the ANOVA and for disease-sex interaction but not for sex. Post-hoc analysis revealed the sex-specific effects of Cst7 we describe in the manuscript. This approach on analysis was also taken by Hoghooghi et al. (2020 - PMID: 33027652) who show related pathway gene Cstc is detrimental in EAE in females but not males (included in the discussion in this manuscript). The observation in Fig. 3H that there appears to be a Cst7 effect in males and females but not a sex effect in Cst7-/- is accurate but a relative anomaly in this study. Generally, we find that, alongside Cst7 deletion affecting females differently to males, we also see a sex effect in Cst7-/- animals but not in Cst7+/+ animals i.e. absolute levels in disease condition as well as relative changes from control to disease condition are different between males and females. This is exemplified in Fig. 4B&C where we observe increased microglial Aβ in female Cst7-/- animals vs male Cst7-/- animals and in Fig. 6D where we observe increased Aβ plaque burden in female Cst7-/- animals vs male Cst7-/- animals. This is most strikingly demonstrated in the case of our RNASeq data where we observe a difference in sex-dependent genes in AppNL-G-F vs AppNL-G-F/Cst7-/- (Fig. 2 – Fig. supplement 3B) implying removal of the Cst7 gene led to an ‘unlocking’ of sexual dimorphism in our cohort which we comment on in the discussion.
  
  Finally, the use of in vitro assays of microglial function can be helpful as secondary analyses when coupled with in vivo or ex vivo approaches, but are not on their own sufficient to support the authors' conclusions. Quantitative engulfment assays (see Schafer et al, Neuron, 2012) on brain tissue showing that male and female microglia lacking Cst7 engulf different amounts of material (e.g. plaques, synapses, myelin) in the intact brain would be more convincing.
  
  We agree that in vitro assays for microglial function are not always sufficient as standalone methods to support conclusions on functions in disease. The reviewer may have missed our in vivo MeX04 uptake assays (Fig 4A-D), which use flow cytometry measurements on isolated microglia and so reflect microglial uptake in vivo following pre-mortem MeX04 injection. This experiment showed increased microglial Aβ in female Cst7-/- animals vs male Cst7-/- animals (Fig. 4B&C). Our in vitro assays complement and extend insight in ways not possible in vivo; for example, they offer key insight into uptake/degradation kinetics that would be extremely challenging to obtain in vivo.
  
  In general, a major limitation to the insights that can be derived in the study is the decision of the authors to perform all experiments at a single late-stage time point of 12 months of age. As this is quite far into disease progression for many AD models, phenotypic changes identified by the authors could arise due to the downstream effects of plaque deposition and therefore may not implicate Cst7 as a mechanism driving neurodegeneration rather than one of many inflammatory changes that accompany AD mouse models nearing the one-year time point. A related problem is that the study uses a constitutive KO mouse that has lacked Cst7 expression throughout life, not just during disease processes that increase with aging. In summary, the topic of the article is important and timely, but the connection between the data and the authors' conclusions is not as strong as it could be.
  
  As described above, Cst7 expression is absent at steady-state and low until 6-12 months. Therefore, we predict that deletion would have little effect until 12+ months whereby cells expressing Cst7 have had the temporal window to affect disease pathology, as we find in the current study. This was a key part of the reasoning in our choice of the 12-month age for analyses. The negligible expression of Cst7 at baseline/early stages of disease suggests constitutive KO of the gene will not impact the phenotype until disease onset. This is substantiated by the lack of any genotype-related differences in the WT vs Cst7-/- comparisons in the non-disease condition.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.11.18.516922v1
www.biorxiv.org

A mechano-osmotic feedback couples cell volume to the rate of cell deformation

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1:
  
  The paper uses a microfluidic-based method of cell volume measurement to examine single cell volume dynamics during cell spreading and osmotic shocks. The paper successfully shows that the cell volume is largely maintained during cell spreading, but small volume changes depend on the rate of cell deformation during spreading, and cell ionic homeostasis. Specifically, the major conclusion that there is a mechano-osmotic coupling between cell shape and cell osmotic regulation, I think, is correct. Moreover, the observation that fast-deforming cells have a larger volume change is informative.
  
  The authors examined a large number of conditions and variables. It's a paper rich in data and general insights. The detailed mathematical model, and specific conclusions regarding the roles of ion channels and cytoskeleton, I believe, could be improved with further considerations.
  
  We thank the referee for the nice comment on our work and for the detailed suggestions for improving it.
  
  Major points of consideration are below.
  
  1) It would be very helpful if there is a discussion or validation of the FXm method accuracy. During spreading, the cell volume change is at most 10%. Is the method sufficiently accurate to consider 5-10% change? Some discussion about this would be useful for the reader.
  
  This is an important point and we are sorry if it was not made clear in our initial manuscript. We have now made it more clear in the text (p. 4 and Figure S1E and S1F).
  
  The important point is that the absolute accuracy of the volume measure is indeed in the 5 to 10% range, but the relative precision (repeated measures on the same cell) is much higher, rather in the 1% range, as detailed below based on experimental measures.
  
  1) Accuracy of absolute volume measurements. The accuracy of the absolute measure of the volume depends on several parameters which can vary from one experiment to the other: the exact height of the chamber, and the biological variability from one batch of cells to another (we found that the distribution of volumes in a population of cultured cells depends strongly on the details of the culture, such as seeding density and substrate, which we normalized as much as possible to reduce this variability, as described in previous articles, e.g. see ref. 2). To estimate this variability overall, the simplest approach is to compare the average volume of the cell population in different experiments, carried out in different chambers and on different days.
  
  Graph showing the initial average volume of cells +/- STD for 7 spreading experiments and 27 osmotic shock experiments, expressed as a % deviation from the average volume over all the experiments.
  
  The average deviation is 10.9 +/- 8%.
  
  2) Precision of relative volume measurements. When the same cell is imaged several times in a time-lapse experiment, as it is spreading on a substrate, or as it is swelling or shrinking during an osmotic shock, most of the variability occurring from one experiment to another does not apply. To experimentally assess the precision of the measure, we performed high time resolution (one image every 30 ms) volume measurements of 44 spread cells during 9 s. During this period of time, the volume of the cell should not change significantly, thus giving the precision of the measure.
  
  Graph showing the coefficient of variation of the volume (STD/mean) for each individual cell (n=44) across the almost 300 frames of the movie. This shows that on average the precision of volume measurements for the same cell is 0.97±0.21%. In addition, if more precision were needed, averaging several consecutive measures could further reduce the noise, a method which is very commonly used but that we did not have to apply to our dataset.
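
As an illustration of the two quantities mentioned above, the per-cell coefficient of variation and the effect of averaging consecutive frames can be sketched as follows. This assumes the sample standard deviation is used for STD, and the trace is synthetic (Gaussian noise around an arbitrary volume of 1000 with 1% noise); the function names and all numbers are illustrative assumptions, not the authors' analysis code or data.

```python
import random
from statistics import mean, stdev


def coefficient_of_variation(volumes):
    """Sample standard deviation divided by the mean (STD/mean)."""
    return stdev(volumes) / mean(volumes)


def block_average(volumes, k):
    """Average consecutive, non-overlapping groups of k frames."""
    return [mean(volumes[i:i + k]) for i in range(0, len(volumes) - k + 1, k)]


# Synthetic trace of 300 frames (illustrative values only, not measured data).
rng = random.Random(0)
trace = [rng.gauss(1000.0, 10.0) for _ in range(300)]

cv_raw = coefficient_of_variation(trace)
cv_avg = coefficient_of_variation(block_average(trace, 5))
print(f"raw CV: {cv_raw:.4f}, 5-frame averaged CV: {cv_avg:.4f}")
```

Averaging k consecutive frames trades time resolution for precision: the averaged trace has k times fewer points, so it only helps when the volume changes slowly compared with the averaging window.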
  
  We have included these results in the revised manuscript, since they might help the reader to estimate what can be obtained from this method of volume measurement. We also point the reviewer to previous research articles using this method and showing both population averages and time-lapse data (refs. 2–8). Another validation of our volume measurement method comes from the relative volume changes in response to osmotic shock (Ponder’s relation) measured with FXm, which gave results very similar to the numbers of previously published studies. We actually performed these experiments to validate our method, since the results are not novel.
  
  2) The role of cell active contraction (myosin dynamics) is completely neglected. The membrane tether tension results, LatA and Y-compound results all indicate that there is a large influence of myosin contraction during cell spreading. I think most would not be surprised by this. But the model has no contribution from cortical/cytoskeletal active stress. The authors are correct that the osmotic pressure is much larger than hydraulic pressure, which is related to active contraction. But near steady state volume, the osmotic pressure difference must be equal to hydraulic pressure difference, as demanded by thermodynamics. Therefore, near equilibrium they must be close to each other in magnitude. During cell spreading, water dynamics is near equilibrium (given the magnitude of volume change), and therefore is it conceptually correct to neglect myosin active contraction? BTW, 1 solute model does not imply equal osmolarity between cytoplasm and external media. 1 solute model with active contraction was considered before, e.g., ref. 17 and Tao, et al, Biophys. J. 2015, and the steady state solution gives hydraulic pressure difference equal to osmotic pressure difference.
  
  This is an excellent point raised by the referee. We have two types of answers for this. First an answer from an experimental point of view, which shows that acto-myosin contractility does not seem to play a direct role in the control of the cell volume, at least in the cells we used here. Based on these results we then propose a theoretical reason why this is the case. It contrasts with the view proposed in the articles mentioned by the referee for a reason which is not coming from the physical principles, with which we fully agree, but from the actual numbers, available in the literature, of the amount of the various types of osmolytes inside the cell. We give these points in more details below and we hope they will convince the referee. We also now mention them explicitly in the main text of the article (p. 6-7, Figure S3F) and in the Supplementary file with the model.
  
  A. Experimental results
  
  To test the effect of acto-myosin contraction on cell volume, we performed two experiments:
  
  1) We measured the volume of the same cell before and after treatment with the Rho kinase (ROCK) inhibitor Y-27632, which decreases cortical contractility. The experiment was performed on cells plated on poly-L-lysine (PLL), as in the osmotic shock experiments. On this substrate cells adhere, which allows the solution to be changed, but they do not spread and remain rounded. This allowed us to evaluate the effect of the drug. Cells were plated on PLL-coated glass. The change of medium itself (with control medium) induced a change of volume of less than 2%, similar to control osmotic shock experiments (maybe due to shear stress). When the cells were treated with Y-27, the change of volume was similar to the change with the control medium (now commented in the text p. 6-7, Figure S3F). To make the analysis more complete, we distinguished the cells that remained round throughout the experiment from the cells which spread slightly, since spreading could have an effect on volume. Indeed, we observed that treatment with Y-27 induced more cells to spread (Figure S3F), probably because the cortex was less tense, allowing the adhesive forces on PLL to induce more spreading (ref. 9). Nevertheless, the spreading remained rather slow and the volume change of cells treated or not with Y-27 was not significantly different. This shows that, in the absence of fast spreading induced by Y-27, the reduction of contractility per se does not have any effect on the cell volume.
  
  Graphs showing proportion of cells that spread during the experiments (left); average relative volume of round (middle) and spread (right) control (N=3, n=77) and Y-27 treated cells (N=4, n=297).
  
  2) To evaluate the impact of a reduction of contractility in the total absence of adhesion, we measured the average volume of control cells versus cells which have been pretreated with Y-27, plated on a non-adhesive substrate (PLL-PEG treatment). This experiment showed that the volume of the cells evolved similarly in time for both conditions, proving that contractility per se has no effect on the cell volume or cell growth, in the absence of spreading.
  
  Graphs showing average relative volume of control (N=5, n=354) and Y-27 (N=3, n=292) treated cells plated on PLL-PEG (left); distributions of initial volume for control (middle) and Y-27 treated cells (right) represented on the left graph.
  
  Taken together these results show that inhibition of contractility per se does not significantly affect cell volume. It thus confirms our interpretation of our results on cell spreading that reduction of contractility has an effect on cell volume, specifically in the context of cell spreading, primarily because it affects the spreading speed.
  
  B. Theoretical interpretation
  
  In accordance with our experiments, in our model, the effect of contractility is implicitly included in the model because it modulates the spreading dynamics, which is an input to the model, i.e. through the parameters tau_a and A_0.
  
  We do not include the effect of contractility directly in the water transport equation because our quantitative estimates indicate that the contribution of the hydrostatic pressure to the volume (or the volume change) is negligible in comparison to the osmotic pressure, even for small variations near the steady-state volume. The main point is that the concentration of ions inside the cell is actually much lower than outside of the cell (refs. 10, 11). The difference is about 100 mM and corresponds mostly to small nonionic trapped osmolytes, such as metabolites (ref. 12). The osmotic pressure corresponding to this is about 10^5 Pa. Taking the cortical tension to be of the order of 1 mN/m and the cell size to be about ten microns, we get a hydrostatic pressure difference of about 100 Pa due to cortical tension. A significant change in cell volume, of the order observed during cell spreading (let’s consider a ten percent decrease), will increase the osmotic pressure of the trapped nonionic osmolytes by 10^4 Pa (their number in the cell remaining identical). For this osmotic pressure to be balanced by an increase in the hydrostatic pressure, the cortical tension would need to increase by a factor of 100, which we consider to be unrealistic. Therefore, we find it reasonable to ignore the contribution of the hydrostatic pressure difference in the water flux equation. This is also consistent with the new experiments presented above, which show that inhibition of cortical contractility changes the cell volume by less than what can be detected by our measures (thus likely at most in the 1% range). This is now explained in the main text and Supplementary file.
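
The orders of magnitude quoted above can be checked with the van 't Hoff relation and Laplace's law. The sketch below is a back-of-the-envelope estimate under assumed values that are not stated in the response: a temperature of about 310 K (so RT ≈ 2.6 × 10^3 J/mol) and a spherical cell of radius R_cell ≈ 10 µm. Here Δc is the trapped-osmolyte concentration, γ the cortical tension, and δΠ the change in osmotic pressure for a 10% volume decrease at a fixed number of trapped osmolytes.

```latex
\begin{align}
\Pi &= RT\,\Delta c \approx (2.6\times10^{3}\ \mathrm{J\,mol^{-1}})\,(100\ \mathrm{mol\,m^{-3}}) \approx 2.6\times10^{5}\ \mathrm{Pa},\\
\Delta P &= \frac{2\gamma}{R_{\mathrm{cell}}} \approx \frac{2\times10^{-3}\ \mathrm{N\,m^{-1}}}{10^{-5}\ \mathrm{m}} = 2\times10^{2}\ \mathrm{Pa},\\
\delta\Pi &= \Pi\left(\frac{V}{0.9\,V}-1\right) \approx 0.11\,\Pi \approx 3\times10^{4}\ \mathrm{Pa},\\
\frac{\delta\Pi}{\Delta P} &\approx 10^{2}.
\end{align}
```

Under these assumptions the ratio reproduces the factor of about 100 by which the cortical tension would have to rise to balance the osmotic change.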
  
  Regarding the minimal model required to define cell volume, the reason why we believe a one-solute model is not sufficient is fundamentally the same as above: the concentration of trapped osmolytes is comparable to the total osmolarity, which means that their contribution to the total osmotic pressure cannot be discarded. Secondly, within the simplest one-solute model, the pump and leak dynamics fixes the inner osmolyte concentration but does not involve the actual cell size. The most natural term that depends on the size is the Laplace pressure (inversely proportional to the cell size in a spherical cell model). But as discussed above, this term may only permit osmotic pressure differences of the order of 100 Pa, corresponding to an osmolyte concentration difference of the order of 0.1 mM. That is only a tiny fraction of the external medium osmolarity, which is about 300 mM. Such a model could thus only work with extremely fine tuning of the pump and leak rates, to values with less than about 1% variation. Furthermore, such a model could not explain finite volume changes upon osmotic shocks without involving huge (100-fold) cell surface tension variations, as discussed above. For these reasons, we believe that the one-solute model is not appropriate to describe our experiments, and we feel that a trapped population of nonionic osmolytes is needed to balance the osmolarity difference created by the solute pump and leak.
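
The conversion from a Laplace pressure to an equivalent osmolyte concentration difference can be made explicit with the van 't Hoff relation. As a rough sketch, assuming RT ≈ 2.6 × 10^3 J/mol (about 310 K, a value not stated in the response) and writing c_ext for the external osmolarity:

```latex
\Delta c = \frac{\Delta P}{RT}
\approx \frac{10^{2}\ \mathrm{Pa}}{2.6\times10^{3}\ \mathrm{J\,mol^{-1}}}
\approx 4\times10^{-2}\ \mathrm{mol\,m^{-3}} = 0.04\ \mathrm{mM},
\qquad
\frac{\Delta c}{c_{\mathrm{ext}}} \approx \frac{0.04\ \mathrm{mM}}{300\ \mathrm{mM}} \approx 10^{-4}.
```

This is consistent with the bound of order 0.1 mM quoted above, and shows how small the admissible concentration difference is relative to the external osmolarity.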
  
  In the revised version of the manuscript, we have added a section in the Supplementary file and in the main text explaining this approximation in more detail.
  
  3) The authors considered the role of Na, K, and Cl in the model, and used pharmacological inhibitors of the NHE exchanger. I think this part of the experiments and model is somewhat weak. I am not sure the conclusions drawn are robust. First, there are many ion channels/pumps regulating Na, K and Cl, the most important of which is the NaK exchanger. NHE also involves H, and this is not in the model. The ion flux expressions in the model are also problematic. The authors correctly include voltage and concentration dependences, but use a constant active term S_i in SM eq. 3 for active pumping. I am not sure this is correct. Ion pump fluxes have been studied, and expressions based on experimental data have been proposed. A study of Na, K, Cl dynamics, and membrane voltage on cell volume dynamics was published in Yellen et al, Biophys. J. 2018. In that paper, they used different expressions based on previously proposed flux expressions. It might be correct that for small concentration differences, their expressions can be linearized or approximated to yield expressions similar to those used here. But this point should be considered more carefully.
  
  We thank the reviewer for this comment. Indeed, we had not properly justified our use of the NHE inhibitor EIPA. Our aim was not to directly affect the major ion pumps involved in volume regulation (which would indeed rather be the Na+/K+ exchanger), because that would likely strongly impact the initial volume of the cell and not only the volume response to spreading, making the interpretation more difficult. We based our choice on previous publications, e.g. (13), showing that EIPA inhibited the main fast volume changes previously reported for cultured cells: it was shown to inhibit volume loss in spreading cells, as well as mitotic cell swelling (14,15). Using EIPA, we also found that, while the initial volume was only slightly affected, the volume loss was completely abolished even in fast-spreading cells (Y-27 and EIPA combined treatment, Figure S5H). This clearly shows that the volume loss behavior can be abolished without changing the speed of spreading, which was our main aim with this experiment.
  
  The most direct effect of inhibiting NHE exchangers is to change the cell pH (16,17), which, given the low number of protons in the cell (a negligible contribution to the cell’s osmotic pressure), cannot affect the cell volume directly. A well-studied mechanism through which proton transport can indirectly affect cell volume is the effect of pH on ion transporters, or the coupling between NHE and the HCO3/Cl exchanger. The latter case is well studied in the literature (18). In brief, the flux of protons out of the cell through the NHE, driven by the Na gradient, leads to an outflux of HCO3 and an influx of Cl. The change in Cl concentration then affects the osmolarity and the cell volume.
  
  We thus performed hyperosmotic shocks with this drug and found that, as expected, it had no effect on the immediate volume change (Ponder’s relation), but affected the rate of volume recovery (combined with cell growth). Overall, the cells treated with EIPA showed a faster volume increase, which is what is expected if the active pumping rate is reduced. This is in contrast with the above-mentioned mechanism of volume regulation, which would lead to a reduced volume recovery in EIPA-treated cells. This leads us to conclude that NHE perturbation potentially has another effect. Changing the pH will have a large impact on many other processes; in particular, it can affect ion transport (16).
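  
  For readers unfamiliar with Ponder’s relation, the immediate volume change after an osmotic shock can be sketched as below. This is an illustration in the standard Boyle-van 't Hoff form, not the fitting procedure used in the study. The osmotically active fraction and the shock osmolarity are hypothetical values chosen only to make the example concrete.

```python
# Ponder's relation (Boyle-van 't Hoff form): relative cell volume right
# after an osmotic shock. All parameter values below are hypothetical.


def ponder_relative_volume(iso_mosm: float, shock_mosm: float,
                           active_fraction: float) -> float:
    """Return V / V_iso = R * (P_iso / P) + (1 - R).

    R is the osmotically active volume fraction, between 0 and 1.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must lie between 0 and 1")
    if iso_mosm <= 0.0 or shock_mosm <= 0.0:
        raise ValueError("osmolarities must be positive")
    return active_fraction * iso_mosm / shock_mosm + (1.0 - active_fraction)


# Hypothetical example: 300 mOsm isotonic medium, shock to 500 mOsm,
# 70% of the cell volume osmotically active.
relative_volume = ponder_relative_volume(300.0, 500.0, 0.7)
print(f"relative volume after the shock: {relative_volume:.2f}")
```

  In this form the immediate volume change depends only on the osmolarity ratio and the osmotically active fraction, which is why a drug that acts on pumping rates would not be expected to alter it.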
  
  On the model side, the referee correctly points out that many ion transporters known to play a role in volume regulation are not included in Eq. 3. In the revised manuscript we now start with a more general ion transport equation. We show that the main equation (Eq. 1, or Supplementary file Eq. 13) relating volume change to tension is not affected by this generalization. This is because we consider only the linear relation between small changes in volume and tension. We note that the generic description of the PLM (Supplementary file Eqs. 1-6) is general and does not require the pump and channel rates to be constant; both \Lambda_i and S_i can be functions of potential and ion concentration as well as membrane tension. It is only later in the analysis that we make the assumption that these parameters depend only on tension. This point is now made clear in the Supplementary file.
  
  There is a huge body of work, both theoretical and experimental, in which the effect of different ion transporters on cell volume is analyzed. The aim of this work is not to provide an analysis of cell volume and the effect of various co-transporters; it is limited to understanding the coupling between cell spreading, surface tension and cell volume.
  
  To analytically estimate the sign of the mechano-osmotic coupling parameter alpha, we use a minimal model. For this we indeed take the pump and channel rates to be constant. Since this is again a perturbative expansion around the steady-state concentration, electric potential, and volume, the expression of alpha can easily be computed for a model with more general ion transporters. This generalization would come at the cost of additional parameters in the expression of alpha. We decided to keep the simpler transport model because the goal of this estimate is merely to show that the sign of alpha is not a given and depends on the relative values of the parameters. Even for the simple model we present, the sign of alpha can be changed by varying parameters within reasonable ranges.
  
  Given these points, and the clarification of our reasons for using EIPA in our experiments, a full mechanistic explanation of the effect of this drug is beyond the scope of this work. For this reason, we do not analyze the effect of EIPA on the model parameter alpha in detail. We have now clarified our interpretation of these results in the main text of the article.
  
  Reviewer #2:
  
  The work by Venkova et al. addresses the role of plasma membrane tension in cell volume regulation. The authors study how different processes that exert mechanical stress on cells affect cell volume regulation, including cell spreading, cell confinement and osmotic shock experiments. They use live cell imaging, FXm (cell volume) and AFM measurements and perform a comparative approach using different cell lines. As a key result the authors find that volume regulation is associated with cell spreading rate rather than absolute spreading area. Pharmacological assays further identified Arp2/3 and NHE1 as molecular regulators of volume loss during cell spreading. The authors present a modified mechano-osmotic pump and leak model (PLM) based on the assumption of a mechanosensitive regulation of ion flux that controls cell volume.
  
  This work presents interesting data and theoretical modelling that contribute new insight into the mechanisms of cell volume regulation.
  
  We thank the referee for the nice comments on our work. We really appreciate the effort (s)he made to help us improve our article, including the careful inspection of the figures. We think our work is much improved thanks to his/her input.
  
  Reviewer #3:
  
  The study by Venkova and co-workers examines the coupling between cell volume and the osmotic balance of the cell. Of course, a lot of work has already been done on this subject, but the main specific contribution of this work is to study the fast dynamics of volume changes after several types of perturbations (osmotic shocks, cell spreading, and cell compression). The combination of volume dynamics at very high time resolution and the robust fits obtained from an adapted Pump and Leak Model (PLM) makes the article a step forward in our understanding of how cell volume is regulated during cell deformations. The authors clearly show that:
  
  -The rate at which the cell deforms directly impacts the volume change
  
  -Below a certain deformation rate (either by cell spreading or external compression), the cells adapt fast enough not to change their volume. The plot dV/dt vs dA/dt shows a clear proportionality relation.
  
  -The theoretical description of volume change dynamics with the extended PLM makes the overall conclusions very solid.
  
  Overall the paper is very well written, contains an impressive amount of quantitative data, comparing several cell types and physiological and artificial conditions.
  
  We thank the referee for the positive comment on our work.
  
  My main concern about this study is related to the role of membrane tension. In the PLM model, the coupling of cell osmosis to cell deformation is made through the membrane-tension-dependent activity of ion channels. While the role of ion channels is extensively tested, it brings some surprising results. Moreover, the tension is measured only at fixed time points, and the comparison to theoretical predictions is not always as convincing as expected: when comparing fig 6I and 6J, I see that the predictions show that EIPA (+ or - Y27), CK-666 (+ or - Y27) and Y27 alone should have lower tension than in the control conditions, and this is clearly not the case in fig 6J. But I would not like to place too much emphasis on those discrepancies, as the drugs in the real case must have broad effects that may not be directly comparable to the theory.
  
  We apologize for the mislabeling of Figure 6I (now Figure 5I). This plot shows the theoretical estimate of the difference in tension (in units of homeostatic tension) between the case where the cell loses volume upon spreading (as observed in experiments) and the hypothetical situation where the cell does not lose volume upon spreading (alpha = 0). A positive value of the tension difference predicts that the cell tension would have been higher if the cell had not lost volume upon spreading, which is the case for the treatments with EIPA and CK-666 (+ Y27) and corresponds to what we found experimentally.
  
  It thus matches our experimental observations for drug treatments that reduce or abolish the volume loss during spreading and that correspond to a higher tether force only at short times.
  
  We have corrected the figure and figure legend and explained it better in the text.
  
  But I wonder if the authors would have a better time showing that the dynamics of tension are as predicted by theory in the first place, as comparing theoretical predictions with experiments using drugs with pleiotropic effects may be hazardous.
  
  Actually, a recent publication (https://doi.org/10.1101/2021.01.22.427801) shows that tension follows volume changes during osmotic shocks, and overall finds the same dynamics of volume changes as in this manuscript. I am thus wondering if the authors could use the same technique as described in this paper (FLIM of flipper probe) to study the dynamics of tension in their system, or at least refer to this paper to support their claim that tension is the coupling factor between volume and deformation.
  
  As suggested by the referee, we tried to use the FLIPPER probe. We first tried to reproduce the osmotic shock experiments by adding 4% PEG400 (+~200 mOsm) or 50% H2O (-~170 mOsm) to HeLa cells and measuring the average probe lifetime before and after the shock. We found a significantly lower probe lifetime for the hyperosmotic condition compared with control, and a non-significant but slightly higher lifetime for the hypoosmotic shock. The magnitude of the lifetime changes was comparable to that in the study cited by the reviewer, but the quality of our measurements did not allow us to obtain a better resolution. Next, we measured the average lifetime for control and CK-666+Y-27 treated cells 30 min and 3 h after plating, because the tether force values were highest for CK-666+Y-27 at 30 min. We did not see a change in lifetime in control cells between 30 min and 3 h (which we also did not see with tether pulling). Cells treated with CK-666+Y-27 showed slightly lower lifetime values than control cells, but at both 30 min and 3 h after plating, which means that this did not correspond to the transient effect of fast spreading but probably rather to an effect of the drugs on the measurement.
  
  Graph showing FLIPPER lifetime before and after osmotic shock for HeLa cells plated on PLL-coated substrate. Left: control (N=3, n=119) and hyperosmotic shock (N=3, n=115); Right: control (N=3, n=101) and hypoosmotic shock (N=3, n=80). p-values were obtained by t-test.
  
  Graph showing FLIPPER lifetime for control cells just after plating on PLL-coated glass (the same control data as in the previous graph), and 30 min (control: N=3, n=88; Y-27+CK-666: N=3, n=130) and 3 h (control: N=3, n=78; Y-27+CK-666: N=3, n=142) after plating on fibronectin-coated glass. p-values were obtained by t-test.
  
  Because the cell to cell variability might mask the trend of single cell changes in lifetime during spreading, we also tried to follow the lifetime of individual cells every 5 min along the spreading. Most illuminated cells did not spread, while cells in non-illuminated fields of view spread well, suggesting that even with an image every 5 minutes and the lowest possible illumination, the imaging was too toxic to follow cell spreading in time. We could obtain measures for a few cells, which did not show any particular trend, but their spreading was not normal. So we cannot really conclude much from these experiments.
  
  Graph showing FLIPPER lifetime changes for 3 individual cells plated on fibronectin-coated glass (shown in blue, magenta and green) and average lifetime of cells from non-illuminated field (cyan, n=7)
  
  Our conclusions are the following:
  
  1) We are able to visualize some change in the lifetime of the probe for osmotic shock experiments, similar to the published results, but with a rather large cell to cell variability.
  
  2) The spreading experiments comparing 30 minutes and 3 hours, in control or drug treated cells did not reproduce the results we observed with tether pulling, with a global effect of the drugs on the measures at both 30 min and 3 hours.
  
  3) Following single cells in time led to too much toxicity and prevented normal spreading.
  
  We think that this technology, which is still in its early development, especially in terms of the microscope setup required (we do not have one in our Institute, so we had to use a platform in another institute with limited time for experiments), cannot be implemented within the frame of the revision of this article to provide reliable results. We thus consider that these experiments belong to a further development of the work and are out of the scope of this study. It would be very interesting to compare in detail the older and more established method of tether pulling with the novel method of the FLIPPER probe, during cell spreading and in other contexts. To our knowledge this has never been done so far, so we cannot do it within the frame of this study. It is not clear from the literature that the two methods would measure the same thing in all conditions, even if they might match in some.
  
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.06.08.447538v2
www.biorxiv.org

New submission 29/07/2022, 08:59:49

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #2 (Public Review):
  
  In this manuscript, the authors performed single-cell RNA sequencing (scRNA-seq) analysis on bone marrow CD34+ cells from young and old healthy donors to understand the age-dependent cellular and molecular alterations during human hematopoiesis. Using a logistic regression classifier trained on young healthy donors, they identified cell-type composition changes in old donors, including an expansion of hematopoietic stem cells (HSCs) and a reduction of committed lymphoid and myeloid lineages. They also identified cell-type-specific molecular alterations between young and old donors and age-associated changes in differentiation trajectories and gene regulatory networks (GRNs). Furthermore, by comparing the single-cell atlas of normal hematopoiesis with that of myelodysplastic syndrome (MDS), they characterized cellular and molecular perturbations affecting normal hematopoiesis in MDS.
  
  The present manuscript provides a valuable single-cell transcriptomic resource to understand normal hematopoiesis in humans and the age-dependent cellular and molecular alterations. However, their main claims are not well supported by the data presented. All results were based on computational predictions, not experimentally validated.
  
  Major points:
  
  1) The authors constructed a regularized logistic regression trained on young donors with manually annotated cell types and predicted cell type labels of cells from old and MDS samples. As the manual annotation of cell types was implicitly assumed as ground truth in this manuscript, I'm wondering whether the predicted cell types in old and MDS samples are consistent with the manual annotation. They should apply the same strategy used in young samples for manual annotation to old and MDS samples, and evaluate how accurate their classifier is.
  
  We performed manual annotation for each MDS sample independently, and for the integrated dataset of the 3 healthy elderly donors. To do so, we performed unsupervised clustering with Seurat and annotated the clusters using the same set of canonical marker genes that we used for the young data. We then analyzed the correspondence between the annotated clusters and the GLMnet predictions. Results are shown in Figure 1a. We observe that the biggest disagreements between the methods occur between adjacent identities, such as HSC and LMPP, GMP and GMP with a more prominent granulocyte profile, or MEP, early and late erythroid. When we explore these disagreements along the erythroid branch, we see that they occur particularly close to the border between subpopulations (Figure 1b). This is consistent with the continuous nature of differentiation and the difficulty of establishing boundaries between cell compartments. However, we observe that mislabeling between different hematopoietic lineages is rare.
  
  In addition, unsupervised clustering was not always able to directly separate the data into the expected subpopulations. We see different clusters containing the same cell types (e.g. LMPP1, LMPP2), as well as individual clusters containing cells with different identities (e.g. pDC and monocyte progenitors). This is usually due to sources of variability other than cell identity in the data. Additional supervised fine-tuning by local subclustering and merging would be needed to correct for this. In contrast, we believe that our GLMnet-based method focuses on gene expression related to identity, resulting in a classification that is better suited to our purpose.
  
  Figure 1. Comparison between GLMnet predictions and manually annotated clusters. A) Heatmaps showing percentages of cells in manually annotated clusters (columns) that have been assigned to each of the cell identities predicted by our GLMnet classification method (rows). The analysis was performed independently for the elderly integrated dataset and for every MDS sample. B) UMAP plots showing disagreements in classification between adjacent cell compartments in the erythroid branch. Cells from one erythroid cluster per patient are colored by the identity assigned by the GLMnet classifier. Cells in gray are not in the highlighted cluster, nor labeled as MEP, erythroid early or erythroid late by our classifier.
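  
  The per-cluster percentages summarized in panel A can be computed along the following lines. This is a minimal sketch rather than the code used for the figure: the function name is hypothetical and the toy labels are made up for illustration.

```python
from collections import Counter, defaultdict


def percent_by_cluster(manual_labels, predicted_labels):
    """For each manually annotated cluster, return the percentage of its
    cells assigned to each predicted identity (each cluster sums to 100)."""
    if len(manual_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    # counts[manual cluster][predicted identity] = number of cells
    counts = defaultdict(Counter)
    for manual, predicted in zip(manual_labels, predicted_labels):
        counts[manual][predicted] += 1

    table = {}
    for manual, predicted_counts in counts.items():
        total = sum(predicted_counts.values())
        table[manual] = {
            predicted: 100.0 * n / total
            for predicted, n in predicted_counts.items()
        }
    return table


# Toy labels, invented for illustration only.
manual = ["HSC", "HSC", "HSC", "HSC", "LMPP", "LMPP"]
predicted = ["HSC", "HSC", "HSC", "LMPP", "LMPP", "LMPP"]

table = percent_by_cluster(manual, predicted)
for cluster, row in table.items():
    print(cluster, row)
```

  Normalizing within each manually annotated cluster, rather than over all cells, keeps small clusters visible and makes disagreements between adjacent identities easy to read off.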
  
  2) The cell-type composition changes in Figures 1 and 4 were descriptively presented without providing the statistical significance of the changes. In addition, the age-dependent cell-type composition changes should be validated by flow cytometry.
  
  We thank the reviewer for the comment. Significance of the changes is included in Supplementary File 3. In addition, we included the percentage of several cell types we validated by flow cytometry, namely HSCs, GMPs and MEPs, in young and elderly healthy individuals in the manuscript, as Figure 1-figure supplement 3. Similarly to what we detected in our bioinformatic analyses, flow cytometry data demonstrated a significant increase in the percentage of HSCs, as well as an increasing trend in MEPs and a slight decrease in the percentage of GMPs in elderly individuals, corroborating our previous results.
  
  3) In Figure 2, the authors used two different pseudo-time inference methods, STREAM, and Palantir. It is not clear why they used two different methods for trajectory inference. Do they provide the same differentiation trajectories? How robust are the results of trajectory inference algorithms? It seems to be inconsistent that the pseudotime inferred by STREAM was not used for downstream analysis and the new pseudotime was recalculated by using Palantir.
  
  We thank the reviewer for the comment. The reason for using two different methods to perform similar analyses is that each provides specific outputs that can be used to perform a more robust and comprehensive analysis. STREAM makes it possible to unravel the differentiation trajectories in a single-cell dataset with an unsupervised approach. In addition, the visualization provided by STREAM (Figure 2C and 2D) makes the results simple for the reader to interpret. On the other hand, Palantir provides a more robust analysis to dissect how gene expression dynamics interact and change with differentiation trajectories. For this reason, we decided to use this second method to investigate how specific genes were altered in the monocytic compartment.
  
  As a resource article, the showcase of different methods can be valuable as it provides examples on how each tool can be used to obtain specific results, which can help any reader to decide which might be the best tool for their specific case.
  
  To confirm that the pseudotime results are similar, we performed a correlation analysis of the pseudotime values obtained with each method. We observed a correlation coefficient of 0.78 (p < 2.2e-16), confirming the similarity between the two tools.
  
  Figure 2. Correlation analysis of pseudotime values obtained with STREAM and PALANTIR.
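  
  A correlation of this kind can be computed per cell once both pseudotime vectors are aligned on the same cells. The sketch below uses a Pearson coefficient as an assumption, since the response does not state which coefficient was used. The pseudotime values are invented and the names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented pseudotime values for the same six cells under two methods.
stream_pseudotime = [0.0, 0.1, 0.3, 0.5, 0.8, 1.0]
palantir_pseudotime = [0.05, 0.0, 0.4, 0.45, 0.9, 0.95]

r = pearson_r(stream_pseudotime, palantir_pseudotime)
print(f"Pearson r = {r:.2f}")
```

  Because two pseudotime scales need not be linearly related, a rank-based coefficient may be the more cautious choice when only the ordering of cells is being compared.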
  
  4) In Figure 2D, some HSCs seem to be committed to the erythroid lineage. The authors should carefully examine whether these HSCs are genuinely HSCs, not early erythroid progenitors.
  
  We thank the reviewer for the comment. We have performed a deep analysis regarding the classification of HSCs (See Figure 3). Our analyses reveal that none of the cells classified as HSCs express early erythroid progenitor markers. We have also used STREAM to show the expression of these markers along the obtained trajectory and observed that erythroid markers show expression in the erythroid trajectory but not in the HSC compartment (Figure 4).
  
  Figure 3. Expression of marker genes in the HSC compartment. Dot plot depicting the normalized scaled expression of canonical marker genes by HSC of the 5 young and 3 elderly healthy donors. Marker genes are colored by the cell population they characterize. Dot color represents expression levels, and dot size represents the percentage of cells that express a gene.
  
  Figure 4. Expression of erythroid markers in STREAM trajectories. Expression of GATA1 and HBB (erythroid markers) in the predicted differentiation trajectories.
  
  5) It is not clear how the authors draw a conclusion from Figure 3D that the number of common targets between transcription factors is reduced. Some quantifications should be provided.
  
  We thank the reviewer for the comment. We have updated the manuscript to better reflect our findings and to emphasize that the predicted regulatory network of HSCs in elderly donors appears as an independent network compared with that of the young donors (Page 6, line 36).
  
  “Overall, we observed that the predicted regulatory network of elderly HSCs (Figure 3d) appeared as an independent network compared to the young GRN. This finding could result in the loss of co-regulatory mechanisms in the elderly donors.”
  
  6) The constructed GRNs and related descriptions were based solely on the SCENIC analysis. By providing the results of an orthogonal prediction method for GRNs, the authors should evaluate how robust and consistent their predictions are.
  
  We thank the reviewer for the comment regarding the method used to build gene regulatory networks. As a resource article, our manuscript describes a complete workflow covering different aspects of single-cell analysis. These steps range from automated classification through trajectory inference to GRN prediction. All the selected algorithms have already been benchmarked and compared against other tools that perform similar analyses. SCENIC has already been benchmarked against other algorithms (11) and by others (12).
  
  We agree with the reviewer that these new predictions could strengthen our findings; however, we believe that such orthogonal predictions would be a better fit if our article were intended for the Research Article category instead of Tools and Resources.
  
  7) The observed age-dependent cellular and molecular alterations in human hematopoiesis are interesting, but I'm wondering whether the observed alterations are driven by inflammatory microenvironment or intrinsic properties of a subpopulation of HSCs affected by clonal hematopoiesis (CH). To address this, the authors can perform genotyping of transcriptomes (GoT) on old healthy donors with CH. By comparing the transcriptomes of cells with and without CH mutations, we can evaluate the effects of CH on age-associated molecular alterations.
  
  We thank the reviewer for the comment. Unfortunately, performing GoT (genotyping of transcriptomes) on the healthy donors requires modifying the standard 10x Genomics workflow to amplify the targeted locus and transcript of interest. This would require collecting new samples, optimizing the method and performing a new analysis from scratch (from sequencing through to analysis). We believe this is beyond the scope of the manuscript. Moreover, we do not have enough material to create new single-cell libraries; this would require adding new donors and, as a result, a completely new analysis to perform the integration.
  
  Reviewer #3 (Public Review):
  
  The authors have performed a transcriptional analysis of young/aged hematopoietic stem/progenitor cells which were obtained from normal individuals and those with MDS.
  
  The authors generated an important and valuable dataset that will be of considerable benefit to the field. However, the data appear to be over-interpreted at times (for example, GSEA analysis does not have "functionality", as the authors claim). On the other hand, a comparison between normal-aged HSC and HSC from MDS patients appears to be under-explored in trying to understand how this disease (which is more common in the elderly) disrupts HSC function.
  
  A more extensive cross-referencing of other normal HSPC/MDS HSPC datasets from aged humans would have been helpful to highlight the usefulness of the analytical tools that the authors have generated.
  
  Major points
  
  1) The authors detail methodology for identification of cell types from single-cell data - GLMnet. This portion of the text needs to be clarified as it is not immediately clear what it is or how it's being used. It also needs to be explained by what metric the classifier "performed better among progenitor cell types" and why this apparent advantage was sufficient to use it for the subsequent analysis. This is critical since interpretation of the data that follows depends on the validation of GLMnet as a reliable tool.
  
  We thank the reviewer for the comment. We have updated the corresponding section to better describe how GLMnet is used and to explain that our decision to use GLMnet as our cell type annotation method, instead of other available tools such as Seurat, is based on the results of the benchmark described in Figure 1-figure supplement 1. We also describe the main differences between our method and Seurat (see Answer to Reviewer 1, Question #4).
  
  2) The finding of an increased number of erythroid progenitors and decreased number of myeloid cells in aged HSPC is surprising since aging is known to be associated with anemia and myeloid bias. Given that the initial validation of GLMnet is insufficiently described, this result raises concerns about the method. Along the same lines, the authors report that their tool detects a reduced frequency of monocyte progenitors. How does this finding correlate with the published data on aging humans? Is monocytopenia a feature of normal aging?
  
  We thank the reviewer for this comment, as changes in the output of HSCs as a consequence of aging are of high interest. According to the literature, there is clear evidence of the loss of lymphoid progeny with age (13,14), which is in agreement with our results. However, in the case of the myeloid compartment, the effects of aging are not as clear. Studies in mice have indeed observed that the loss of lymphoid cells is accompanied by increased myeloid output, starting at the level of GMPs (Rossi et al. 2005; Florian et al. 2012; Min et al. 2006). But studies on human individuals have not found changes in the numbers of these myeloid progenitors (Kuranda et al. 2011; Pang et al. 2011). In addition, in the mentioned studies, myeloid production was measured exclusively by its white blood cell fraction. More recent studies have focused on the other myeloid compartments: megakaryocyte and erythroid cells. Results point towards an increase in platelet-biased HSCs with age (Sanjuan-Pla et al. 2013; Grover et al. 2016) and a possible expansion of megakaryocytic and erythroid progenitor populations (Yamamoto et al. 2018; Poscablo et al. 2021; Rundberg Nilsson et al. 2016), which may represent a compensatory mechanism for the ineffective differentiation towards this lineage in elderly individuals. This is in line with the accumulation of MEPs we see in our data. Finally, and in accordance with the reduced frequency of monocyte progenitors observed, it has been shown that with increasing age there is a gradual decline in the monocyte count (15).
  
  Regarding the concerns about our classification method raised by the reviewer, we have performed additional validations that we describe in answers to reviewer 1 comment #4 and reviewer 2 comment #1. To further confirm that the changes in cellular proportions we found are real, we applied two additional classification methods: Seurat transfer and Celltypist (16) to the elderly donors dataset. We obtained a similar expansion in MEPs, together with reduction of monocytic progenitors with the three methods (Figure 5).
  
  Figure 5. Classification of HSPCs from elderly donors. Barplot showing proportions of every cell subpopulation per elderly donor, resulting from three classification methods: GLMnet-based classifier, Seurat transfer and Celltypist. For the three methods, cells with prediction scores < 0.5 were labeled as “not assigned”.
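  
  The "not assigned" rule described in the legend can be illustrated with a short sketch. It assumes each classifier returns one score per candidate identity for every cell and that the top score is the one compared with the threshold. The function name and the example scores are hypothetical.

```python
def assign_label(class_scores, threshold=0.5):
    """Return the top-scoring identity, or "not assigned" when the best
    prediction score falls below the threshold."""
    if not class_scores:
        raise ValueError("class_scores must not be empty")
    label, score = max(class_scores.items(), key=lambda item: item[1])
    return label if score >= threshold else "not assigned"


# Hypothetical per-cell prediction scores.
confident_cell = {"HSC": 0.8, "LMPP": 0.15, "MEP": 0.05}
ambiguous_cell = {"HSC": 0.4, "LMPP": 0.35, "MEP": 0.25}

print(assign_label(confident_cell))
print(assign_label(ambiguous_cell))
```

  Applying the same threshold to all three methods keeps the proportion of unassigned cells comparable across them.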
  
  3) The use of terminology requires more clarity in order to better understand what kind of comparison has been performed, i.e. whether global transcriptional profiles are being compared, or those of specific subset populations. Also, the young/aged comparisons are often unclear, i.e. it's not evident whether the authors are referring to genes upregulated in aged HSC and downregulated in young HSC or vice versa. A more consistent data description would make the paper much easier to read.
  
  We thank the reviewer for this comment. We have updated the manuscript to provide more clarity in the description of the different comparisons made in our analyses. Most changes are located in the Transcriptional profiling of human young and elderly hematopoietic progenitor systems sub-section within the Results.
  
  4) The link between aging and MDS is not explored but could be an informative use of the data that the authors have generated. For example, anemia is a feature of both aging and MDS whereas neutropenia and thrombocytopenia only occur in MDS. Are there any specific pathways governing myeloid/platelet development that are only affected in MDS?
  
  Thank you for raising this comment. We believe that discriminating events that take place during healthy aging from those associated with MDS will be helpful to understand this particular disease, as it is so closely related to age. This is why, when analyzing MDS, we have considered young and elderly donors as two separate sets of healthy controls, the elderly donors being the most suitable one for comparisons with MDS samples.
  
  With regard to the comment on myeloid and platelet development, the GSEA analysis gives potentially useful information. MYC targets and oxidative phosphorylation are significantly enriched in the MEP compartment from MDS patients when compared to elderly donors, indicating that these progenitors may recover a more active profile with the disease. Hypoxia-related genes, on the other hand, are more active in HSCs and MEPs from healthy elderly donors than in MDS. Hypoxia is known to be implicated in megakaryocyte and erythroid differentiation (17).
  
  5) MDS is a very heterogeneous disorder and while the authors did specify that they were using samples from MDS with multilineage dysplasia, more clinical details (blood counts, cytogenetics, mutational status) are needed to be able to interpret the data.
  
  We thank the reviewer for the comment. All the clinical details for each MDS patient are included in Supplementary File 5.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.07.30.454542v1
www.biorxiv.org

New submission 23/09/2023, 15:24:31

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #3 (Public Review):
  
  Dysbiosis has a substantial impact on host physiology. Using the nematode C. elegans and E. coli as a model of host-microbe interactions, Yang et al. defined a mechanism by which the host deals with gut dysbiosis to maintain fitness. They found that E. coli accumulating in the intestine secreted indole, a tryptophan metabolite, which activated the transcription factor DAF-16. DAF-16 induced the expression of lys-7 and lys-8, which in turn limited E. coli proliferation in the gut of worms and maintained the longevity of worms. Finally, these authors demonstrated that indole activated DAF-16 via TRPA-1 in neurons of worms.
  
  This study revealed a new mechanism of host-microbe interaction. The concept of their work is of broad interest and the results they present are convincing. However, there are some issues that need to be addressed to support the conclusions.
  
  Major issues
  
  1) The authors isolated the crude extract from a high-performance liquid chromatograph (HPLC). A candidate compound was detected by activity-guided isolation and further identified as indole with mass spectrometry and NMR data. The HPLC fractionations and activity-guided isolation experiments should be described in more detail with a schematic figure to reveal how these experiments were performed and how indole was identified. Showing a chemical characterization of indole in Figure 2A is not sufficient for the evaluation of the results. Rather, a figure comparing the 26th fraction with standard indole by MS and NMR would be more appealing.
  
  We appreciate the concerns of the reviewer. Activity-guided isolation was performed as follows: The crude extract of E. coli supernatant metabolites was divided into 45 fractions according to polarity using an Ultimate 3000 HPLC (Thermofisher, Waltham, MA) coupled with an automated fraction collector. After freeze-drying each fraction, 1 mg of metabolites was dissolved in DMSO for the DAF-16 nuclear localization assay in worms (please see new Supplementary Table S2). The 26th fraction, which had DAF-16 nuclear translocation-inducing activity, was then separated on a silica gel column (200-300 mesh) with a continuous gradient of decreasing polarity (100%, 70%, 50%, 30%, petroleum ether/acetone) to yield four fractions (26a-d). Only fraction 26b could induce DAF-16 nuclear translocation. This fraction was further separated using a Sephadex LH-20 column to yield 32 fractions. The 26b-11th fraction, which had DAF-16 nuclear translocation-inducing activity, contained a single compound identified by thin layer chromatography, mass spectrometry and nuclear magnetic resonance (NMR). The compound exhibited a quasimolecular ion peak at m/z 181.0782 [M+H]+ in the positive APCI-MS, and was assigned a molecular formula of C8H7N. A comparison of the 1H NMR and 13C NMR spectra with the data reported in the literature revealed that the compound was indole (Yagudaev, 1986). The figure shows the comparison of the 26b-11 fraction with the standard indole by MS (Author response image 1).
  
  Author response image 1.
  
  High resolution mass spectrum of the candidate compound and indole.
  
  2) DAF-16::GFP was mainly located in the cytoplasm of the intestine in worms expressing daf-16p::daf-16::gfp fed live E. coli OP50 on Day 1 (Figure 1A and 1B). The nuclear translocation of DAF-16 in the intestine was increased in worms fed live E. coli OP50 on Days 4 and 7, but not in age-matched WT worms fed heat-killed (HK) E. coli OP50 (Figure 1A and 1B). Since DAF-16 functions downstream of DAF-2, have the levels of DAF-2 been tested during aging on OP50 and (HK) OP50, or with and without indole supplementation?
  
  In response to the reviewer’s suggestion, we carried out RT-PCR experiments in 4-day-old and 7-day-old worms. It has been shown that DAF-2 initiates a kinase cascade that leads to the phosphorylation and cytoplasmic retention of DAF-16. By contrast, a reduction in DAF-2 signaling leads to the dephosphorylation of DAF-16, allowing its nuclear translocation. We therefore tested the expression of daf-2 in 4-day-old and 7-day-old worms fed OP50 and (HK) OP50. We found that the mRNA levels of daf-2 were significantly increased in worms on days 4 and 7 in the presence of either live or dead E. coli OP50, compared with those in worms on day 1 (Author response image 2A). In addition, supplementation with indole did not alter the mRNA levels of daf-2 in young adult worms (Author response image 2B). To conclude, the activation of DAF-16 is independent of DAF-2.
  
  Author response image 2.
  
  DAF-16 nuclear translocation is independent of DAF-2. (A) The mRNA levels of daf-2 were gradually increased in worms with age. P < 0.01; *P < 0.001; ns, not significant. (B) The mRNA levels of daf-2 were not altered after treatment with indole for 24 hours. ns, not significant.
  
  3) In lines 155-157, the authors argued that the increase in the levels of indole in worms results from the intestinal accumulation of live E. coli OP50, rather than exogenous indole produced by E. coli OP50 on the NGM plates. However, the work also showed that supplementation with indole (50-200 μM) could significantly increase the indole levels in young adult worms on Day 1 (Figure 2-figure supplement 3B), which could induce nuclear translocation of DAF-16 in worms (Figure 2B). This result suggested that worms could take in indole from the external culture environment. The concentration of indole in OP50 and (HK) OP50 could be measured.
  
  We appreciate the concerns of the reviewer. Reviewer #2 also pointed out this problem. In this study, our data showed that the levels of indole were 30.9, 71.9, and 105.9 nmol/g dry weight in worms fed live E. coli OP50 on days 1, 4, and 7, respectively (Figure 2C). This increase in the levels of indole in worms was accompanied by an increase in CFU of live E. coli OP50 in the intestine of worms with age (Figure 2C). In addition, we determined the levels of indole in worms fed HK E. coli OP50, and found that the levels of indole were 28.2, 31.6, and 36.1 nmol/g dry weight in worms fed HK E. coli OP50 on days 1, 4, and 7, respectively (Figure 2-figure supplement 3A). It should be noted that the levels of indole in worms fed dead E. coli OP50 on day 1 were comparable to those in worms fed live E. coli OP50 on day 1 (30.9 vs 28.2 nmol/g dry weight). However, the levels of indole were not increased in worms fed HK E. coli OP50 on days 4 and 7. Furthermore, the observation that DAF-16 was retained in the cytoplasm of the intestine in worms fed live E. coli OP50 on day 1 (Figure 1A and 1B) also indicated that indole produced by E. coli OP50 on the NGM plates is not enough to induce DAF-16 nuclear translocation. By contrast, supplementation with indole (50-200 μM) significantly increased the indole levels in worms on day 1 (Figure 2-figure supplement 3B), which could induce nuclear translocation of DAF-16 in worms (Figure 2B). Thus, the increase in the levels of indole in worms with age results from intestinal accumulation of live E. coli OP50, rather than indole produced by E. coli OP50 on the NGM plates.
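  
  The contrast between the two feeding conditions is easier to see as fold changes relative to day 1. The short calculation below uses only the indole levels quoted in the response (nmol/g dry weight); the variable and function names are illustrative, and no statistical test is implied.

```python
# Indole levels in worms (nmol/g dry weight), as quoted in the response above.
live_op50 = {1: 30.9, 4: 71.9, 7: 105.9}
heat_killed_op50 = {1: 28.2, 4: 31.6, 7: 36.1}


def fold_change(levels, baseline_day=1):
    """Express each day's level as a multiple of the baseline day's level."""
    base = levels[baseline_day]
    return {day: value / base for day, value in levels.items()}


live_fold = fold_change(live_op50)
hk_fold = fold_change(heat_killed_op50)

for day in (1, 4, 7):
    print(f"day {day}: live {live_fold[day]:.2f}x, heat-killed {hk_fold[day]:.2f}x")
```

  On these numbers the level in worms fed live bacteria roughly triples by day 7, while the level in worms fed heat-killed bacteria stays within about 1.3-fold of day 1.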
  
  4) Recent work showed that the multicopy DAF-16 transgene acts differently from the single copy GFP knock in DAF-16 transgene. Which DAF-16 transgene was used in this work?
  
  The strain we used is TJ356. Its genotype has been described as zIs356 [daf-16p::daf-16a/b::GFP+rol-6(su1006)] (Lee, Hench, & Ruvkun, 2001; Lin, Hsin, Libina, & Kenyon, 2001), from the Caenorhabditis Genetics Center (CGC).
  
  5) In lines 190-193, the authors argued that supplementation with indole (100 μM) inhibited the CFU of E. coli K-12 in WT worms, but not daf-16(mu86) mutants, on Days 4 and 7 (Figure 3H and 3I). These results suggest that endogenous indole is involved in maintaining a normal lifespan in worms. This is an overstatement. The data here more likely suggest that indole could inhibit the proliferation of E. coli through DAF-16.
  
  We really appreciate this reviewer’s preciseness. In response to the reviewer’s suggestion, we have changed "...indole is involved in maintaining a normal lifespan in worms" to "...indole produced by bacteria in the gut could inhibit the proliferation of E. coli via DAF-16 in worms".
  
  6) Sonowal (2017) reported that AHR mediates indole-promoted lifespan extension at 16 °C. Yet this work argued that RNAi knockdown of ahr-1 did not affect the nuclear translocation of DAF-16 in worms fed E. coli K12 strain on Day 7 (Figure 4-figure supplement 1A) or young adult worms treated with indole (100 μM) for 24 h. The difference between these two works should be discussed.
  
  We really appreciate this reviewer’s preciseness. It has been shown that AHR-1 mediates indole-promoted lifespan extension in worms at 16 °C (Sonowal et al., 2017). However, our data show that AHR-1 is not involved in indole-induced nuclear translocation of DAF-16 at 20 °C. This means that AHR-1-mediated and TRPA-1-mediated lifespan extension by indole are essentially different. In our study, indole is added to NGM plates when worms reach the young adult stage. In the study by Sonowal et al., indole is supplemented at the L1 larval stage. In addition, the lifespan of C. elegans varies at different temperatures (Xiao et al., 2013). Thus, indole may promote lifespan extension via different mechanisms, depending on exposure time and temperature.
  
  7) Sonowal (2017) conducted mRNA profiling for worms growing on K12 and K12 ΔtnaA. Is TRPA1 in their de-regulated gene list? Have other de-regulated genes been tested in this work?
  
  We appreciate the concerns of the reviewer. We found that TRPA-1 is not included in the de-regulated gene list. Sonowal et al. focus on the gene expression profiles in worms from L1 larvae to young adults, whereas we pay attention to gene expression profiles in worms from young adults to aged worms. Thus, we did not test the de-regulated genes in their work.
  
  8) How does indole activate TRPA1? In the absence of trpa1, what is the concentration of indole in worms? Since TRPA1 is a channel, is there any possibility that TRPA1 is involved in the transport of indole? It is really interesting and surprising that neuronal TRPA-1, but not intestinal TRPA-1, mediates the beneficial effect of indole. How does indole specifically activate TRPA-1 in neurons to preserve the longevity of worms?
  
  We appreciate the concerns of the reviewer. TRPA1 is a nonselective cation channel permeable to Ca2+, Na+, and K+ (Zygmunt & Hogestatt, 2014). It is unlikely that TRPA1 is capable of transporting heterocyclic organic compounds, such as indole.
  
  In response to the reviewer’s suggestion, we measured the indole content in trpa-1(ok999) worms. We found that the levels of indole in trpa-1(ok999) worms on days 4 and 7 were slightly increased compared to those in WT worms on the same days (Author response image 3).
  
  Recently, Ye et al. have demonstrated that indole and indole-3-carboxaldehyde (IAld) are agonists of TRPA1, which is conserved in vertebrates (Ye et al., 2021). Thus, it is most likely that indole acts as an agonist of TRPA-1 in C. elegans by directly binding to TRPA-1. One possibility is that activation of TRPA-1 in neurons by indole could induce a pathway that releases a neurotransmitter, which in turn triggers a signaling pathway to extend the lifespan of worms via activating DAF-16 in a non-cell autonomous manner. In contrast, the activation of TRPA-1 in the intestine by indole is unable to release such a neurotransmitter. Indeed, TRPA1 induces the release of calcitonin gene-related peptide in perivascular sensory nerves, leading to membrane hyperpolarization and arterial dilation on smooth muscle cells (Talavera et al., 2020). Moreover, the activation of TRPA1 by indole and IAld induces the secretion of the neurotransmitter serotonin in zebrafish (Ye et al., 2021).
  
  Author response image 3.
  
  The indole levels in trpa-1 mutants are increased on days 4 and 7, compared with those in WT worms. *P < 0.05.
  
  9) How was neuronal- and intestinal-specific knockdown of trpa-1 by RNAi conducted? And what is the tissue-specific expression pattern of trpa-1? Speculating how indole was transported to neuron cells is pretty appealing.
  
  We appreciate the concerns of the reviewer. SID-1 is required cell-autonomously for systemic RNAi (Winston, Molodowitch, & Hunter, 2002). Thus, sid-1 mutants are resistant to RNAi. In the neuronal- and intestinal-specific RNAi strains, sid-1 was expressed under control of the neuronal-specific unc-119 and the intestinal-specific vha-6 promoters, respectively. Although it has been reported that TRPA-1 is expressed in neurons, muscles, hypodermal cells, and the intestine, Xiao et al. proved that only TRPA-1 expressed in the intestine and neurons contributes to life extension at low temperature (Xiao et al., 2013). The transporter of indole has not been identified. In Arabidopsis, ATP-binding cassette (ABC) transporter G family 37 (ABCG37) has been reported to transport a range of indole derivatives (Ruzicka et al., 2010). However, all fifteen C. elegans ABC transporters share less than 30% sequence identity with ABCG37. Thus, it is impossible to determine which one is the transport channel for indole and indole derivatives in C. elegans.
  
  10) Supplementation with indole only up-regulated the expression of lys-7 and lys-8 in worms subjected to intestinal-specific (Figure 7-figure supplement 2C), but not neuronal-specific, RNAi of trpa-1 (Figure 7-figure supplement 2D). If this is the case, should the addition of indole specifically induce the expression of lys-7p::gfp or lys-8p::gfp in neurons?
  
  We really appreciate this reviewer’s preciseness. Indeed, lys-7 and lys-8 are expressed in both neurons and the intestine (Author response image 4A and 7B). However, the expression of lys-8p::gfp and lys-7p::gfp in neurons was not altered in worms after treatment with indole or knockdown of trpa-1 by RNAi (Author response image 4C and 4D).
  
  Author response image 4.
  
  The expression of LYS-7 and LYS-8 in neurons is not altered after treatment with indole or knockdown of trpa-1 by RNAi. (A and C) Representative images of lys-7p::gfp (A) and lys-8p::gfp (C). Both lys-7 and lys-8 could be expressed in neurons and the intestine. (B and D) Quantification of fluorescent intensity of lys-7p::gfp (B) and lys-8p::gfp (D) in neurons. These results are means ± SD of three independent experiments. ns, not significant.
  
  11) The authors demonstrated that the K-12 ΔtnaA strain had undetectable tnaA mRNA or indole levels. Furthermore, the deletion of tnaA significantly inhibited the nuclear translocation of DAF-16 in worms. However, mutations in E. coli still have non-specific effects as there are several transposon insertions or polar mutations influencing downstream genes. The authors should demonstrate that only disruption of TnaA causes the failure of nuclear translocation of DAF-16.
  
  In response to the reviewer’s suggestion, we rescued the expression of tnaA in the K-12 ΔtnaA strain. As expected, the indole level in the supernatant of the K12 ΔtnaA::tnaA strain cultures was 34.1 μmol/L, which was comparable to that in the K12 strain cultures (42.5 μmol/L) (new Figure 2-figure supplement 4D). In addition, DAF-16 nuclear accumulation was increased in worms grown on the K12 ΔtnaA::tnaA strain on days 4 and 7 (new Figure 2-figure supplement 4E).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.12.19.520989v1
www.biorxiv.org

PBN-PVT projection modulates negative emotions in mice

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response:
  
  Reviewer #1 (Public Review):
  
  In their manuscript entitled "PBN-PVT projection modulates negative emotions in mice", Zhu et al. combine circuit mapping techniques with behavioral manipulations to interrogate the function of anatomical projections from the parabrachial nucleus (PBN) to the paraventricular nucleus of the thalamus (PVT). The study addresses an important scientific question, since the PVT and particularly the posterior PVT is known to be mostly sensitive to aversive signals, but the neural circuit mechanisms underlying this process remain unknown. Here the authors contribute important evidence that PBN inputs to the PVT may be critical for this process. Specifically, the authors identify that the PVT receives glutamatergic projections from the PBN that promote aversive behavioral responses but do not modulate nociception. The latter finding is intriguing considering that the PBN is an important node in pain processing and that the PVT has recently emerged as a modulator of pain. Overall, the study includes an impressive array of techniques and manipulations and offers insight into an important scientific question. The authors' conclusions will be significantly strengthened by the inclusion of some additional experiments and controls.
  
  It is in my view problematic that the authors used different genetic strategies to target the PBN-PVT pathway. For example, in Figure 1 the authors used Vglut2-cre mice for the anterograde tracings but later on in the same figure used constitutively expressed ChR2 in the PBN to assess functional connectivity with the PVT using ex-vivo patch-clamp electrophysiology. In Figure 2 the authors once again employed Vglut2-Cre mice to target PBN projections to the PVT and manipulate these projections optogenetically during behavioral tests. However, in the following figure (Fig. 3) the authors then use a retro-Cre approach and chemogenetics. The interchangeable use of these different manipulations is not warranted by data presented by the authors. For example, it is unclear whether all PBN neurons projecting to the PVT are glutamatergic and express VGLUT2. When using the constitutively expressed ChR2 in the PBN to demonstrate glutamatergic projections to the PVT, the authors may be faced with potential contamination from adjacent brain stem structures like the LC and DRN, which project to the PVT and are known to contain glutamatergic neurons (vglut1 and vglut3, respectively). As another example, for Figure 4, why did the authors not use Vglut2-cre mice and inhibit PBN terminals in the PVT as in Figure 2?
  
  We agree with the reviewer and have restructured the manuscript. We first present the slice recording results from wild-type mice (Figure 1). We recorded both EPSCs and IPSCs and found light-induced EPSCs in 34 of 52 neurons and light-induced IPSCs in 4 of 52 neurons. Please see Page 5 Line 119 to Line 121. We carefully examined the ChR2 virus infection area (please see Fig R1 below). We found dense ChR2-mCherry+ neurons in the PBN. We also observed ChR2-mCherry+ neurons in the nearby ventrolateral periaqueductal gray (VLPAG), locus coeruleus (LC), cuneiform nucleus (CnF), and laterodorsal tegmental nucleus (LDTg). The dorsal raphe nucleus (DR) was not infected. We agree with the reviewer that there could be potential contamination from the LC, which releases dopamine and norepinephrine to the PVT via the LC-PVT projection. We have discussed this on Page 13 Line 375 to Line 380.
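  
  For readers who want the recorded counts as proportions, the sketch below converts 34 of 52 (EPSCs) and 4 of 52 (IPSCs) into percentages and attaches a Wilson score interval to the EPSC fraction. The interval is an illustrative addition and is not an analysis reported by the authors; the function name and the choice of z = 1.96 (an approximate 95% level) are assumptions.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 is roughly 95%)."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


recorded_neurons = 52                     # neurons recorded, from the response
epsc_fraction = 34 / recorded_neurons     # neurons with light-induced EPSCs
ipsc_fraction = 4 / recorded_neurons      # neurons with light-induced IPSCs
epsc_low, epsc_high = wilson_interval(34, recorded_neurons)

print(f"EPSC: {epsc_fraction:.1%} (interval {epsc_low:.1%} to {epsc_high:.1%})")
print(f"IPSC: {ipsc_fraction:.1%}")
```

  With only 52 recorded neurons the interval is wide, so the counts support "mostly excitatory" input rather than a precise connection probability.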
  
  Figure R1. AAV-hSyn-ChR2-mCherry virus infection showcase. LPBN, lateral parabrachial nucleus; MPBN, medial parabrachial nucleus; VLPAG, ventrolateral periaqueductal gray; LC, locus coeruleus; CnF, cuneiform nucleus; LDTg, laterodorsal tegmental nucleus; DR, dorsal raphe nucleus; scp, superior cerebellar peduncle. Scale bar: 200 μm.
  
  We performed tdTomato staining with VgluT2 mRNA in situ hybridization and found that about 94.4% of tdTomato+ neurons express VgluT2 mRNA. These results indicate that the majority of PVT-projecting PBN neurons are glutamatergic. These new results have been included in Figure 1R−U.
  
  Then we used VgluT2-ires-Cre mice to perform tracing (Figure 1−figure supplement 2) and behavioral tests (optogenetic activation in Figure 2, optogenetic inhibition in Figure 4). We also performed pharmacogenetic activation of PVT-projecting PBN neurons in wild-type mice (Figure 3). We observed that pharmacogenetic activation of the PVT-projecting PBN neurons reduced the center duration in the OFT, similar to the OFT result with optogenetic activation. We also observed that pharmacogenetic activation of the PVT-projecting PBN neurons induced freezing behaviors. Our pharmacogenetic activation experiment supported the hypothesis that PBN-PVT projections modulate negative affective states.
  
  We have now performed optogenetic inhibition of the PBN-PVT projections using VgluT2-ires-Cre mice. We found that inhibition of PBN-PVT projections reduces 2-MT-induced aversion-like behaviors and footshock-induced freezing behaviors. These new results have been included in Figure 4, Figure 4−figure supplement 1 and 2, and were described in the text. Please see the text Page 9 Line 254 to Page 10 Line 274.
  
  Related to the previous point, in the retrograde labeling experiment (Fig. 1) it would be useful if the authors determined what fraction of retrogradely labeled cells are indeed VGLUT2+. For behavioral experiments employing the retro-Cre approach the authors may be manipulating a heterogeneous population of PBN neurons which could be influencing their behavioral observations. In general, the authors should ensure that a similar population of PBN-PVT neurons is being assessed throughout the study.
  
  We have now performed tdTomato staining with VgluT2 mRNA in situ hybridization and found that approximately 94.4% of tdTomato+ neurons expressed VgluT2 mRNA. These results indicated that the majority of PVT-projecting PBN neurons are glutamatergic. These new results have been included in Figure 1R−U and were described in the text. Please see Page 5 Line 129 to Line 132.
  
  The authors' grouping of the behavioral data into the first vs the last four minutes of light stimulation in the OF does not seem to be properly justified and appears rather arbitrary. Also related to data analysis, the unpaired t-test analysis in the fear conditioning experiment in Figure 4J seems inappropriate. ANOVA with group comparisons is more appropriate here.
  
  To provide a more detailed profile of the behaviors in the OFT, we further divided the laser ON period (5−10 minutes) into five one-minute periods and analyzed the velocity, non-moving time, travel distance, center time, and jumping. We found that the velocity and non-moving time were increased, and the center time was decreased in the ChR2 mice during most periods. Furthermore, we observed that the travel distance and jumping behaviors were increased only in the first one-minute period in ChR2 mice. These new results have been included in Figure 2−figure supplement 2 and were described in the text. Please see Page 7 Line 179 to Line 189. We also discussed this on Page 14 Line 396 to Line 403.
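  
  The per-minute breakdown described above amounts to binning a tracked measure over the laser ON window (5−10 minutes) into five one-minute bins. A minimal sketch of that binning follows; the function name, the (time, value) sample format, and the synthetic velocity trace are hypothetical and do not reflect the authors' tracking software.

```python
def bin_means(samples, start_s=300.0, end_s=600.0, bin_s=60.0):
    """Average (time_s, value) samples within consecutive bins of the window [start_s, end_s).

    Returns one mean per bin, or None for a bin that received no samples.
    """
    n_bins = int((end_s - start_s) // bin_s)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for time_s, value in samples:
        if start_s <= time_s < end_s:
            index = int((time_s - start_s) // bin_s)
            sums[index] += value
            counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Synthetic velocity trace sampled once per second over a 15-minute session;
# the value simply steps up each minute so the binning is easy to check by eye.
velocity_trace = [(t, 1.0 + (t // 60)) for t in range(0, 900)]

laser_on_means = bin_means(velocity_trace)
print(laser_on_means)
```

  The same function can be applied to non-moving time, travel distance, or center time, which keeps the five bins aligned across measures.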
  
  We have now performed optogenetic inhibition of PBN-PVT projections during footshock-induced freezing behavior in Vglut2-ires-Cre mice (Figure 4J−K). We also revised the statistics (unpaired Student's t-test) and calculated the percentage of freezing behavior over 10 minutes, which matched the constant optogenetic inhibition. Similar changes have been made in Figure 4−figure supplement 3K.
  
  Considering the persistence of the effect in the OF following optogenetic stimulation of PBN-PVT afferents, the lack of such a persistent effect in the RTPA is hard to reconcile. By performing additional experiments the authors attempt to settle this discrepancy by proposing that the PBN-PVT pathway promotes aversion but does not facilitate negative associations. I find this conclusion to be problematic. If the pathway is critical for conveying aversive signals to the PVT, one expects that at the very least it would be required for the formation of associative memories involving aversive stimuli. However, the authors do not show data to this effect. Instead they show that animals decrease their acute defensive reactions to aversive stimuli (2-MT and fear conditioning), but do not show whether associative memory related to this experience (e.g. fear memory retrieval) is impacted by manipulations of the PBN-PVT pathway.
  
  We have now performed several experiments to examine the effects of the PBN-PVT projections on aversion formation and memory retrieval.
  
  We first performed a prolonged conditioned place aversion that mimics drug-induced place aversion. And we found that optogenetic activation of PBN-PVT projections did not induce aversion in the postconditioning test on Day 4. These new results have been included in Figure 2−figure supplement 2H−I and described in the text. Please see Page 7 Line 196 to Line 199.
  
  Then, we performed the classical auditory fear conditioning test and found that optogenetic inhibition of PBN-PVT projections during footshock in the conditioning period did not affect freezing levels in the contextual test or cue test (Laser OFF trials). Inhibition of PBN-PVT projections during the contextual test or cue test (Laser ON trials) did not affect freezing levels either. These data suggest that PBN-PVT projections are not crucial for associative fear memory formation or retrieval. These new results have been included in Figure 4−figure supplement 2 and described in the text. Please see Page 10 Line 268 to Line 274. We also discussed this on Page 15 Line 430 to Page 16 Line 473.
  
  A similar lack of connection between aversive signals within the PVT and the PBN pathway is found in the photometry data presented in Figure 5. While importantly the authors' observation of aversive modulation of the pPVT reproduces data from other recent studies, the question here is whether the increased activity of PVT neurons is mediated by input from the PBN. The cFos experiment included in this figure attempts to draw this connection, but empirical evidence is required.
  
  We have now performed the dual Fos staining experiment and the optoelectrode experiment.
  
  In the dual Fos staining experiment, we found that there was a broad overlap between optogenetic stimulation-activated neurons (expressing the Fos protein) and footshock-activated neurons (expressing the fos mRNA) (Figure 6−figure supplement 1B−E).
  
  In the optoelectrode experiment, there was also a broad overlap between laser-activated and footshock-activated neurons. This result was consistent with the dual Fos staining result, suggesting that PVTPBN neurons were activated by aversive stimulation. Next, we analyzed the firing rates of PVT neurons during footshock sweeps with laser and footshock sweeps without laser. We found that the footshock stimulus with laser activated 30 of 40 neurons and increased the overall firing rates of the 40 neurons compared with footshock without laser (Figure 6I). These results indicated that activation of PBN-PVT projections could enhance PVT neuronal responses to aversive stimulation.
  
  These new results have been included in Figure 6, Figure 6−figure supplement 1, and described in the text. Please see Page 10 Line 295 to Page 11 Line 317. We also discussed these results on Page 15 Line 422 to Line 429.
  
  Reviewer #2 (Public Review):
  
  Zhu et al. investigated the connectivity and functional role of the projections from the parabrachial nucleus (PBN) to the paraventricular nucleus of the thalamus (PVT). Using neural tracers and in vitro electrophysiological recordings, the authors showed the existence of monosynaptic glutamatergic connections between the PBN and PVT. Further behavioral tests using optogenetic and chemogenetic approaches demonstrated that activation of the PVT-PBN circuit induces aversive and anxiety-like behaviors, whereas optogenetic inhibition of PVT-projecting PBN neurons reduces fear and aversive responses elicited by footshock or the synthetic predator odor 2MT. Next, they characterized the anatomical targets of PVT neurons that receive direct innervation from the PBN (PVTPBN). The authors also showed that PVTPBN neurons are activated by aversive stimuli and chemogenetically exciting these cells is sufficient to induce anxiety-like behaviors. While the data mostly support their conclusions, alternative interpretations and potential caveats should be addressed in the discussion.
  
  Strength:
  
  The authors used different behavioral tests that collectively support a role for PBN-PVT projections in promoting fear- and anxiety-like behaviors, but not nociceptive or depressive-like responses. They also provided insights into the temporal participation of the PBN-PVT circuit by showing that this pathway regulates the expression of affective states without contributing to the formation of fear-associated memories. Because previous studies have shown that activation of projection-defined PVT neurons is sufficient to induce the formation of aversive memories, the differences between the present study and previous findings reinforce the idea of functional heterogeneity within the PVT. The authors further explored this functional heterogeneity in PVT by using an anterograde viral construct to selectively label PVT neurons that are targeted by PBN inputs. Together, these results connect two important brain regions (i.e., PBN and PVT) that were known to be involved in fear and aversive responses, and provide new information to help the field elucidate the complex networks that control emotional behaviors.
  
  Weakness:
  
  The authors should avoid anthropomorphizing the behavioral interpretation of the findings and generalizing their conclusions. In addition, there is a series of potential caveats that could interfere with the interpretation of the results, all of which must be discussed in the article. Examples include the long duration of the laser stimulation protocol, the possibility of antidromic effects following photoactivation of PBN terminals in PVT, and the existence of collateral PBN projections that could also be contributing to the observed behavioral changes. Additional clarification about the exclusive glutamatergic nature of the PBN-PVT projection should be provided and the present findings should be reconciled with prior studies showing the existence of GABAergic PBN-PVT projections.
  
  We agree with the reviewer. We have now revised the text carefully to avoid using subjective terms. We showed the light-induced EPSCs and IPSCs results in Figure 1, and we performed RNAscope experiments to clarify the glutamatergic nature of the PVT-projecting PBN neurons (Figure 1 and Figure 1−figure supplement 1). We also added discussion about the laser stimulation protocol, the possibility of antidromic effects, and collateral projections. Please see Page 14 Line 413 to Page 15 Line 418, and Page 16 Line 449 to Line 457.
  
  We also added several experiments to dissect the effect of manipulation of the PBN-PVT projection in fear memory acquisition and retrieval. These new results have been included in Figure 4−figure supplement 2 and described in the text. Please see Page 10 Line 268 to Line 274. We also discussed this on Page 15 Line 430 to Page 16 Line 473.
  
  Reviewer #3 (Public Review):
  
  Zhu YB et al. investigated the functional role of the projection from the parabrachial nucleus (PBN) to the thalamic paraventricular nucleus (PVT) in processing negative emotions. They found that the PBN sends excitatory projections to the PVT. Activation of the PBN-PVT projection induces anxiety-like and fear-like behaviors, while inhibition of this projection relieves fear and aversion.
  
  Strengths:
  
  The authors dissected the anatomical and functional connections between the PBN and the PVT by using comprehensive modern neuroscience techniques including viral tracing, electrophysiology, optogenetics and pharmacogenetics. They clearly demonstrated the significant role of the PBN-PVT projection in modulating negative emotions.
  
  Weaknesses:
  
  The PBN contains a variety of neuronal subtypes that express distinct molecular markers such as CGRP, Tac1, Pdyn, and Nts. The PBN also sends projections to multiple targets, including the VMH, PAG, BNST, CEA and ILN, that could mediate distinct functions. What is the neuronal identity of PVT-projecting PBN neurons? How are the PVT projection and the other projections organized: are they overlapping or relatively independent pathways? These important questions were not examined in this study, which makes it hard to relate this finding to the existing literature.
  
  We have now performed the RNAscope experiments detecting VgluT2, Tac1, Tacr1, Pdyn mRNA, and fluorescent immunostaining detecting CGRP protein in the PBN. We found that about 94.4% of tdTomato+ neurons express VgluT2 mRNA. We also found that tdTomato+ neurons were only partially co-labeled with Tacr1, Tac1, or Pdyn mRNA, but not with CGRP. These results indicate that the majority of PVT-projecting PBN neurons are glutamatergic. These new results have been included in Figure 1, Figure 1−figure supplement 1, and were described in the text. Please see Page 5 Line 129 to Line 140.
  
  We have also shown the collateral projections from PVT-projecting neurons in Figure 1−figure supplement 3, described them on Page 6 Line 148 to Line 151, and discussed them on Page 16 Line 449 to Line 457.
  
  scietyType:AuthorResponse

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.03.11.434900v1
www.biorxiv.org www.biorxiv.org

New submission 30/12/2022, 17:27:52

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  “A sample size of 3 idiopathic seems underpowered relative to the many types of genetic changes that can occur in ASD. Since the authors carried out WGS, it would be useful to know what potential causative variants were found in these 3 individuals and even if not overlapping if they might expect to be in a similar biological pathway.
  
  If the authors randomly selected 3 more idiopathic cell lines from individuals with autism, would these cell lines also have altered mTOR signaling? And could a line have the same cell biology defects without a change in mTOR signaling? The authors argue that the sample size could be the reason for lack of overlap of the proteomic changes (unlike the phospho-proteomic overlaps), which makes the overlapping cell biology findings even more remarkable. Or is the phenotyping simply too crude to know if the phenotypes truly are the same?”
  
  We appreciate these thoughtful comments and also agree that, of several models, our studies indicate the possibility of mTOR alteration in multiple forms of ASD. As above, we are currently pursuing this hypothesis with newly acquired DOD support. With regard to the I-ASD population, we agree that there is a large variety of genetic changes that can occur in genetically undefined ASDs. Indeed, this is precisely why we expected to see “personalized” phenotypes in each I-ASD individual when we embarked on this study. At that time, several years ago, we had planned to expand the analyses to more I-ASD individuals to assess for additional personalized phenotypes. However, as our studies progressed, we were surprised to find convergence in our I-ASD population in terms of neurite outgrowth and migration, and later proteomic results showing convergence in mTOR. We found it particularly remarkable that this convergence was evident despite a sample size of 3. When we had the opportunity to extend our studies to the 16p11.2 deletion population, we were thrilled to conduct the first comparison between I-ASD and a genetically defined ASD and, as such, the scope of the paper turned towards this comparison. We do agree that analyses of the other I-ASD individuals would be a beneficial endeavor, both to understand how pervasive NPC migration and neurite deficits are in autism and to assess the presence of mTOR dysregulation. Furthermore, it would be important to see whether alterations in other pathways could also lead to similar cell biological deficits, though we know that other studies of neurodevelopmental disorders have found such cellular dysregulations without reporting concurrent mTOR dysregulation. Given our current grant funding to extend these analyses, such experiments within this manuscript would not be feasible.
  
  Regarding the phenotyping methods used, we decided to assess neurite outgrowth and migration as they are both cytoskeleton dependent processes that are critical for neurodevelopment and are often regulated by the same genes. Furthermore, similar analyses have been applied to Fragile-X Syndrome, 22q11.2 deletion syndrome, and schizophrenia NPCs (Shcheglovitov A. et al., 2013; Mor-Shaked H. et al., 2016; Urbach A. et al., 2010; Kelley D. J. et al., 2008; Doers M. E. et al., 2014; Brennand K. et al., 2015; Lee I. S. et al., 2015; Marchetto M. C. et al., 2011). As such, it seems that multiple underlying etiologies can lead to similar dysregulated cellular phenotypes that can contribute to a variety of neurodevelopmental disorders. On a more global level, there are only a few different cellular functions a developing neuron can undergo, and these include processes such as proliferation, survival, migration, and differentiation. Thus, to understand neurodevelopmental disorders, it is important to study the more “crude” or “global” cellular functions occurring during neurodevelopment to determine whether they are disrupted in disorders such as ASD. In our studies we find that there are indeed dysregulations in many of these basic developmental processes, indicating that the typical steps that occur for normal brain cytoarchitecture may be disrupted in ASD. To understand why, we then further utilized molecular studies to “zoom” in on potential mechanisms which implicated common dysregulation in mTOR signaling as one driver for these common cellular phenotypes. As suggested, we did complete WGS on all the I-ASD individuals and did not see any overlapping genetic variants between the three I-ASD individuals as mentioned in our manuscript. The genetic data was published in a larger manuscript incorporating the data (Zhou A. et al., 2023). 
However, there were variants that were unique to each I-ASD individual which were not seen in their unaffected family members, and it is possible these variants could be contributing to the I-ASD phenotypes. We also utilized IPA to conduct pathway analysis on the WGS data utilizing the same approach we used in the analysis of the p-proteome and proteome data. From the WGS data, we selected high read-quality variants that were found only in I-ASD individuals and had a functional impact on protein (i.e., excluding synonymous variants). The enriched pathways obtained from this data were strikingly different from the pathways we found in the p-proteome analysis and are now included in supplemental Figure 6 in the manuscript. Briefly, the top 5 enriched pathways were: O-linked glycosylation, MHC class 1 signaling, Interleukin signaling, Antigen presentation, and regulation of transcription.
  
  Reviewer #2 (Public Review):
  
  1) I found interpreting how differential EF sensitivity is connected to the rest of the story difficult at times. First, it is unclear why these extracellular factors were picked. These are seemingly different in nature (a neuropeptide, a growth factor and a neuromodulator) targeting largely different pathways. This limits the interpretation of the ASD subtype-specific rescue results. One way of reframing that could help is that these are pro-migratory factors, instead of EFs broadly defined, that fail to promote migration in I-ASD lines due to a shared malfunctioning of the intracellular migration machinery or cell-cell interactions (possibly through tight junction signaling, Fig S2A). Yet, this doesn't explain the migration/neurite phenotypes in 16p11 lines where EF sensitivity is not altered, overall implying that divergent EF sensitivity is independent of the underlying mTOR state. What is the proposed model that connects all three findings (divergent EF sensitivity based on ASD subtypes, 2 mTOR classes, convergent cellular phenotypes)?
  
  We thank you for the kind assessment of our manuscript and for the thought-provoking questions posed. In terms of extracellular factors, for our study, we defined an extracellular factor as any growth factor, amino acid, neurotransmitter, or neuropeptide found in the extracellular environment of the developing cells. The EFs utilized were selected due to their well-established role in regulation of early neurodevelopmental phenotypes, their expression during the “critical window” of mid-fetal development (as determined by the Allen Brain Atlas), and in the case of 5-HT, its association with ASD (Abdulamir H. A. et al., 2018; Adamsen D. et al., 2014; Bonnin A. et al., 2011; Bonnin A. et al., 2007; Chen X. et al., 2015; El Marroun H. et al., 2014; Hammock E. et al., 2012; Yang C. J. et al., 2014; Dicicco-Bloom E. et al., 1998; Lu N. et al., 1998; Suh J. et al., 2001; Watanabe J. et al., 2016; Gilmore J. H. et al., 2003; Maisonpierre P. C. et al., 1990; Dincel N. et al., 2013; Levi-Montalcini R., 1987). Lastly, prior experiments in our lab with a mouse model of neurodevelopmental disorders had shown atypical responses to EFs (IGF-1, FGF, PACAP). As such, when we first chose to use EFs in human NPCs we wanted to know 1) whether human NPCs even responded to these EFs, 2) whether EFs regulated neurite outgrowth and migration and 3) whether there would be a differential response in NPCs derived from those with ASD. Our studies were initiated on the I-ASD cohort and given the heterogeneity of ASD we had hypothesized we would get “personalized” neurite and migration phenotypes. For this reason, we also wanted to select multiple types of EFs that worked on different signaling pathways. Ultimately, instead of personalized phenotypes we found that all the I-ASD NPCs did not respond to any of the EFs tested whereas the 16p11.2 deletion NPCs did – this was therefore the only difference we found between these two “forms” of ASD. 
As noted, in I-ASD the lack of response to EFs can be ameliorated by modulating mTOR. However, in the 16p11.2 deletion, despite similar mTOR dysregulation as seen in I-ASD, there is no EF impairment. We do not have a cohesive model to explain why the 16pDel individuals differ from the I-ASD model other than to point to the p-proteomes which do show that the 16pDel NPCs are distinct from the I-ASD NPCs. It seems that mTOR alteration can contribute to impaired EF responsiveness in some NPCs but perhaps there is an additional defect that needs to be present in order for this defect to manifest, or that 16p11.2 deletion NPCs have specific compensatory features. For example, as noted in the thoughtful comment, the p-proteome canonical pathway analysis shows tight junction malfunction in I-ASD which is not present in the 16pDel NPCs and it could be the combination of mTOR dysregulation + dysregulated tight junction signaling that has led to lack of response to EFs in I-ASD. Regardless, we do not think the differences between two genetically distinct ASDs diminish the convergent mTOR results we have uncovered. That is, regardless of whatever defects are present in the ASD NPCs, we are able to rescue them with mTOR modulation, which has fascinating implications for treatment and conceptualization of ASD. Lastly, we see our EF studies as an important inclusion as they show that in some subtypes of ASD, lack of response to appropriate EFs could be contributing to neurodevelopmental abnormalities. Moreover, lack of response to these EFs could have implications for treatment of individuals with ASD (for example, SSRIs are commonly used to treat co-morbid conditions in ASD but if an individual is unresponsive to 5-HT, perhaps this treatment is less effective). We have edited the manuscript to include an additional discussion section to address the EFs more thoroughly and have included a few extra sentences in the introduction as well!
  
  2) A similar bidirectional migration phenotype has been described in hiPSC-derived human cortical interneurons generated from individuals with Timothy Syndrome (Birey et al 2022, Cell Stem Cell). Here, the authors show that intracellular calcium influx that is excessive in Timothy Syndrome or pharmacologically dampened in controls results in similar migration phenotypes. The authors can consider referring to this report in support of the idea that bimodal perturbations of cardinal signaling pathways can converge upon common cellular migration deficits.
  
  We thank you for pointing out the similar migration phenotype in the Timothy Syndrome paper and have now cited it in our manuscript. We have also expanded on the concept of “too much or too little” of a particular signaling mechanism leading to common outcomes.
  
  3) Given that the authors have access to 8 I-ASD hiPSC lines, it'd be very informative to assay the mTOR state (e.g. pS6 westerns) in NPCs derived from all 8 lines instead of the 3 presented, even without assessing any additional cellular phenotypes, which the authors have shown to be robust and consistent. This can help readers get a better sense of the proportion of high-mTOR vs low-mTOR classes in a larger cohort.
  
  We have already addressed this in response to reviewer 1 and the essential revisions section, providing our reasoning for not expanding the study to all 8 I-ASD individuals.
  
  4) Does the mTOR modulation rescue EF-specific responses to migration as well (Figure 7)?
  
  We did not conduct sufficient replicates of the rescue EF specific responses to migration due to the time consuming and resource intensive nature of the neurosphere experiments. Unlike the neurite experiments, the neurosphere experiments require significantly more cells, more time, selection of neurospheres based on a size criterion, and then manual trace measurements. We did one experiment in Family-1 where we utilized MK-2206 to abolish the response of Sib NPCs to PACAP. Likewise, adding SC-79 to I-ASD-1 neurospheres allowed for response to PACAP.
  
  Author response image 1.
  
  Author response image 2.
  
  Reviewer #3: Public Review
  
  We appreciate the kind, detailed and very thorough review you provided for us!
  
  The results on the mTOR signaling pathway as a point of convergence in these particular ASD subtypes is interesting, but the discussion should address that this has been demonstrated for other autism syndromes, and in the present manuscript, there should be some recognition that other signaling pathways are also implicated as common factors between the ASD subtypes.
  
  With regard to the mTOR pathway, we had included the other ASD syndromes in which mTOR dysregulation has been seen, including tuberous sclerosis, Cowden Syndrome, NF-1, as well as Fragile-X, Angelman, Rett and Phelan McDermid, in the final paragraph of the discussion section “mTOR Signaling as a Point of Convergence in ASD”. We have now expanded our discussion to note that other signaling pathways such as MAPK, cyclins, WNT, and reelin have also been implicated as common factors between the ASD subtypes.
  
  The conclusions of this paper are mostly well supported by data, but for the cell migration assay, it is not clear if the authors control for initial differences in the inner cell mass area of the neurospheres in control vs ASD samples, which would affect the measurement of migration.
  
  Thank you for this thoughtful comment! When we first started collecting our migration data, inner cell mass size was indeed a major concern for which we controlled in our methods. First, when plating the neurospheres, we would only collect spheres when a majority of spheres were approximately 100 µm in diameter. Very large spheres often could not be imaged due to being out of focus and very small spheres would often disperse when plated. Thus, there were some constraints on the variability of inner cell mass size.
  
  Furthermore, when we initially collected data, we conducted a proof of principle test to see if initial inner cell mass area (henceforth referred to as initial sphere size or ISS) influenced migration data. To do so, we obtained migration and ISS data from each diagnosis (Sib, NIH, I-ASD, 16pASD). Then we utilized RStudio to see if there is a relationship between Migration and ISS in each diagnosis category using the model call lm(Migration ~ ISS, data = bydiagnosis). In this call, lm indicates linear modeling, the tilde (~) specifies that Migration is modeled as a function of ISS, and the term data = bydiagnosis allows the data to be organized by diagnosis.
  
  The results were expressed as R-squared values indicating the correlation between ISS and Migration for each diagnosis and the p-value showing statistical significance for each comparison. As shown in Author response table 1, there is minimal correlation between Migration and ISS in each data set. Moreover, there are no statistically significant relationships between Migration and ISS, indicating that initial sphere size DOES NOT influence migration data in any of our data-sets.
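
As an illustration only, the per-diagnosis regression described above can be sketched in Python rather than R. This is a minimal sketch under stated assumptions, not the authors' analysis: the function names (fit_line, fit_by_diagnosis) and the toy numbers are hypothetical, and only the slope, intercept, and R-squared are computed (the p-values the authors report are not reproduced here).

```python
from collections import defaultdict


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("all ISS values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor model, R-squared is the squared Pearson correlation.
    r_squared = 0.0 if syy == 0 else (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


def fit_by_diagnosis(records):
    """Fit Migration ~ ISS separately for each diagnosis.

    records: iterable of (diagnosis, iss, migration) tuples.
    """
    grouped = defaultdict(lambda: ([], []))
    for diagnosis, iss, migration in records:
        grouped[diagnosis][0].append(iss)
        grouped[diagnosis][1].append(migration)
    return {d: fit_line(xs, ys) for d, (xs, ys) in grouped.items()}


# Made-up numbers for illustration only; these are NOT data from the study.
toy_records = [
    ("Sib", 1.0, 3.0), ("Sib", 2.0, 5.0), ("Sib", 3.0, 7.0),
    ("I-ASD", 1.0, 2.0), ("I-ASD", 2.0, 1.0), ("I-ASD", 3.0, 2.0),
]
fits = fit_by_diagnosis(toy_records)
```

In this toy input the "Sib" group lies exactly on a line (R-squared of 1), while the "I-ASD" group has no linear trend (R-squared of 0); an R-squared near 0 in every group is the pattern the authors describe for their own data.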
  
  Author response table 1.
  
  Lastly, utilizing R, we modeled what predicted migration would be like for Sib, NIH, I-ASD, and 16pASD if we accounted for ISS in each group. Raw migration data was then plotted against the predicted data as in Author response image 3.
  
  Author response image 3.
  
  As shown in the graph, there are no statistical differences between the raw migration data (the data that we actually measured in the dish) and the modeled data in which ISS is accounted for as a variable. As such, we chose not to normalize to or account for ISS in our other experiments. We have now included the above RStudio analyses in our supplemental figures (Figure S1) as well.
  
  Also, in Fig 5 and 6, panels I and J omit the effects of drug on mTOR phosphorylation as shown for other conditions.
  
  Both SC-79 and MK2206 were selected in our experiments after thorough analysis of their effects on human epithelial cells and other cultured cells (citations in manuscript). However, initially, we did not know whether either of these drugs would modulate the mTOR pathway in human NPCs, thus, in Figures 5A,5D, 6A and 6D we chose to focus on two of our data-sets to establish the effect of these drugs in human NPCs. Our experiments in Family-1 and Family-2 showed us that SC-79 increases PS6 in human NPCs while MK-2206 downregulates it. Once this was established, we knew the drugs would have similar effects in the NPCs from the other families. Thus, we only conducted a proof of principle test to confirm the drug does indeed have the intended effect in I-ASD-3 and 16pDel. We have included these proof of principle westerns in Figure 5I, 5K, 6I and 6K to show that the effects of these drugs are reproducible across all our NPC lines. We did not include quantification since the data is only from our single proof of principle western.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.09.17.508382v1
www.biorxiv.org www.biorxiv.org

Eye Movements Reveal Spatiotemporal Dynamics of Active Sensing and Planning in Navigation

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Zhu et al. found that human participants could plan routes almost optimally in virtual mazes with varying complexity. They further used eye movements as a window to reveal the cognitive computations that may underlie such close-to-optimal performance. Participants’ eye movement patterns included: (1) Gazes were attracted to the most task-relevant transitions (effectively the bottleneck transitions) as well as to the goal, with the share of the former increasing with maze complexity; (2) Backward sweeps (gazes moving from goal to start) and forward sweeps (gazes from start to goal) respectively dominated the pre-movement and movement periods, especially in more complex mazes. The authors explained the first pattern as the consequence of efficient strategies of information collection (i.e., active sensing) and connected the second pattern to neural replays that relate to planning.
  
  The authors have provided a comprehensive analysis of the eye movement patterns associated with efficient navigation and route planning, which offers novel insights for the area through both their findings and methodology. Overall, the technical quality of the study is high. The "toggling" analysis, the characterization of forward and backward sweeps, and the modeling of observers with different gaze strategies are beautiful. The writing of the manuscript is also elegant.
  
  I do not see any weaknesses that cannot be addressed by extended data analysis or modeling. The following are two major concerns that I hope could be addressed.
  
  We thank the reviewer for their positive assessment of our work!
  
  First, the current eye movement analysis does not seem to have touched the core of planning: evaluating alternative trajectories to the goal. Instead, planning-focused analyses such as forward and backward sweeps were all about the actually executed trajectory. What may participants’ eye movements tell us about their evaluation of alternative trajectories?
  
  This is an important point that we previously overlooked because our experimental design did not incorporate mutually exclusive alternative trajectories. Nonetheless, there are many trials in which participants had access to several possible trajectories to the goal. Some of those alternatives may be trivially suboptimal (e.g. highly convoluted trajectory, taking a slightly curved instead of straight trajectory, or setting out on the wrong path and then turning back). Using two simple constraints described in the Methods (no cyclic paths, limited amount of overlap between alternatives), we algorithmically identified the number of non-trivial alternative trajectories (or options) on each trial that were comparable in length to the chosen trajectory (within about 1 standard deviation). A few examples are shown below for the reviewer.
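
Independently of the authors' example figures, the selection procedure described above can be sketched as follows. This is a hedged illustration, not the authors' algorithm: the arena is assumed to be an undirected graph given as an adjacency dictionary, overlap is assumed to be the fraction of shared edges, and the names (simple_paths, overlap_fraction, find_alternatives) and both thresholds (length_tol, max_overlap) are hypothetical stand-ins for the constraints described in the Methods.

```python
def simple_paths(graph, start, goal):
    """Yield every cycle-free path from start to goal (depth-first search)."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            yield path
            continue
        for nxt in graph.get(node, ()):
            if nxt not in path:  # constraint 1: no cyclic paths
                stack.append((nxt, path + [nxt]))


def edge_set(path):
    """Undirected edges traversed by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def overlap_fraction(path_a, path_b):
    """Shared edges divided by the edge count of the shorter path."""
    a, b = edge_set(path_a), edge_set(path_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def find_alternatives(graph, chosen, length_tol, max_overlap):
    """Return alternative paths comparable in length to the chosen one.

    A candidate is kept only if its length (in edges) is within length_tol
    of the chosen path and its overlap with the chosen path and with every
    previously kept alternative does not exceed max_overlap (constraint 2).
    """
    start, goal = chosen[0], chosen[-1]
    chosen_len = len(chosen) - 1
    kept = [chosen]
    for path in sorted(simple_paths(graph, start, goal), key=len):
        if abs((len(path) - 1) - chosen_len) > length_tol:
            continue
        if all(overlap_fraction(path, other) <= max_overlap for other in kept):
            kept.append(path)
    return kept[1:]  # drop the chosen path itself


# Toy arena for illustration only: two two-step routes plus a cross link.
toy_graph = {
    "S": ["A", "B"],
    "A": ["S", "G", "B"],
    "B": ["S", "G", "A"],
    "G": ["A", "B"],
}
alternatives = find_alternatives(toy_graph, ["S", "A", "G"], length_tol=0, max_overlap=0.5)
```

With these toy settings the only equal-length alternative to S-A-G is S-B-G; loosening length_tol to 1 also admits the two three-step routes through the cross link. Exhaustive path enumeration is exponential in the worst case, so this sketch is only suitable for small graphs.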
  
  The more plausible trajectory options there were, the more time participants spent gazing upon these alternatives during both pre-movement and movement (Figure 4 – figure supplement 1D – left). This is not a trivial effect resulting from the increase in surface area comprising the alternative paths because the time spent looking at the chosen trajectory also increased with the number of alternatives (Figure S8D – middle). Instead, this suggests that participants might be deliberating between comparable options.
  
  Consistent with this, the likelihood of gazing at alternative trajectories peaked early on during pre-movement and well before performing sweeping eye movements (Figure 5D). During movement, the probability of gazing upon alternatives increases immediately before participants make a turn, suggesting that certain aspects of deliberation may also be carried out on the fly just before approaching choice points. Critically, during both pre-movement and movement epochs, the fraction of time spent looking at the goal location decreased with the number of alternatives (Figure 4 – figure supplement 1D – right), revealing a potential trade-off between deliberative processing and looking at the reward location. Future studies with more structured arena designs are needed to better understand the factors that lead to the selection of a particular trajectory among alternatives, and we mention this in the discussion (line 445):
  
  "Value-based decisions are known to involve lengthy deliberation between similar alternatives. Participants exhibited a greater tendency to deliberate between viable alternative trajectories at the expense of looking at the reward location. Likelihood of deliberation was especially high when approaching a turn, suggesting that some aspects of path planning could also be performed on the fly. More structured arena designs with carefully incorporated trajectory options could help shed light on how participants discover a near-optimal path among alternatives. However, we emphasize that deliberative processing accounted for less than one-fifth of the spatial variability in eye movements, such that planning largely involved searching for a viable trajectory."
  
  Second, what cognitive computations may underlie the observed patterns of eye movements has not received a thorough theoretical treatment. In particular, to explain why participants tended to fixate the bottleneck transitions, the authors hypothesized active sensing, that is, participants were collecting extra visual information to correct their internal model about the maze. Though active sensing is a possible explanation (as demonstrated by the authors’ modeling of "smart" observers), it is not necessarily the only or most parsimonious explanation. It is possible that their peripheral vision allowed participants to form a good-enough model about the maze and their eye movements solely reflect planning. In fact, that replays occur more often at bottleneck states is an emergent property of Mattar & Daw’s (2018) normative theory of neural replay. Forward and backward replays are also emergent properties of their theory. It might be possible to explain all the eye movement patterns (fixating the goal and the bottleneck transitions, and the forward and backward replays) based on Mattar & Daw’s theory in the framework of reinforcement learning. Of course, some additional assumptions that specify eye movements and their functional roles in reinforcement learning (e.g., fixating a location is similar to staying at the corresponding state) would be needed, analogous to those in the authors’ "smart" observer models. This unifying explanation may not only be more parsimonious than the authors’ active sensing plus planning account, but also be more consistent with the data than the latter. After all, if participants had used fixations to correct their internal model of the maze, they should not have shown so little improvement across trials in the same maze.
  
  We thank the reviewer for this reference. We note the strong parallels between our eye movement results and that study in the discussion, in addition to proposing experimental variations that will help crystallize the link. Below, we included our response that was incorporated into the Discussion section (beginning at line 462).
  
  "In [a] highly relevant theoretical work, Mattar and Daw proposed that path planning and structure learning are variants of the same operation, namely the spatiotemporal propagation of memory. The authors show that prioritization of reactivating memories about reward encounters and imminent choices depends upon its utility for future task performance. Through this formulation, the authors provided a normative explanation for the idiosyncrasies of forward and backward replay, the overrepresentation of reward locations and turning points in replayed trajectories, and many other experimental findings in the hippocampus literature. Given the parallels between eye movements and patterns of hippocampal activity, it is conceivable that gaze patterns can be parsimoniously explained as an outcome of such a prioritization scheme. But interpreting eye movements observed in our task in the context of the prioritization theory requires a few assumptions. First, we must assume that traversing a state space using vision yields information that has the same effect on the computation of utility as does information acquired through physical navigation. Second, peripheral vision allows participants to form a good model of the arena such that there is little need for active sensing. In other words, eye movements merely reflect memory access and have no computational role. Finally, long-term statistics of sweeps gradually evolve with exposure, similar to hippocampal replays. These assumptions can be tested in future studies by titrating the precise amount of visual information available to the participants, and by titrating their experience and characterizing gaze over longer exposures. We suspect that a pure prioritization-based account might be sufficient to explain eye movements in relatively uncluttered environments, whereas navigation in complex environments would engage mechanisms involving active inference. 
Developing an integrative model that features both prioritized memory-access as well as active sensing to refine the contents of memory, would facilitate further understanding of computations underlying sequential decision-making in the presence of uncertainty."
  
  In the original manuscript, we referred to active sensing and planning in order to ground our interpretation in terminology that has been established in previous works by other groups, which had investigated them in isolation. Although the role of active sensing could be limited, we are unable to conclude that eye movements solely reflect planning. Even if peripheral vision is sufficient to obtain a good-enough model of the environment, eye movements can further reduce uncertainty about the environment structure, especially in cluttered environments such as the complex arena used in this study. This reduction in uncertainty is not inconsistent with a lack of performance improvement across trials. This is because the lack of improvement could be explained by a failure to consolidate the information gathered by eye movements and propagate it across trials, an interpretation that would also explain why planning duration is stable across trials (Figure 2 – figure supplement 2B). Furthermore, participants gaze at alternative trajectories more frequently when more options are presented to them. However, we acknowledge that this is a fundamental question; we have identified it as an important topic for follow-up studies and outline experiments to delineate the precise extent to which eye movements reflect prioritized memory access vs active sensing. Briefly, we can reduce the contribution of active sensing by manipulating the amount of visual information – ranging from no information (navigating in the dark) to partial information (foveated rendering in VR headset). Likewise, we can increase the contribution of memory by manipulating the length of the experiment to ensure participants become fully familiar with the arena. Yet another manipulation is to use a fixed reward location for all trials such that experimental conditions would closely match the simulations of the prioritization model. We are excited about performing these follow-up experiments.
  
  Reviewer #2 (Public Review):
  
  In this study the authors sought to understand how the patterns of eye-movements that occur during navigation relate to the cognitive demands of navigating the current environment. To achieve this the authors developed a set of mazes with visible layouts that varied in complexity. Participants navigated these environments seated on a chair by moving in immersive virtual reality.
  
  The question of how eye-movements relate to cognitive demands during navigation is a central and often overlooked aspect of navigating an environment. Studying eye-movements in dynamic scenarios that enable systematic analysis is technically challenging, which is why so few studies have tackled this issue.
  
  The major strengths of this study are the technical development of the set up for studying, recording and analysing the eye-movements. The analysis is extensive and allows greater insight than most studies exploring eye-movements would provide. The manuscript is also well written and argued.
  
  A current weakness of the manuscript is that several other factors have not been considered that may relate to the eye-movements. More consideration of these would be important.
  
  We thank the reviewer for their positive assessment of the innovative aspects of this study. We have tried to address the weaknesses by performing additional analyses described below.
  
  In the experimental design it appears possible to separate the length of the optimal path from the complexity of the maze. But that appears not to have been done in this design. It would be useful for the authors to comment on this, as these two parameters seem critically important to the interpretation of the role of eye-movements - e.g. a lot of scanning might be required for an obvious, but long path, or a lot of scanning might be required to uncover a short path through a complex maze.
  
  This is a great point. We added a comment to the Discussion at line 489 to address this:
  
  "Future work could focus on designing more structured arenas to experimentally separate the effects of path length, number of subgoals, and environmental complexity on participants’ eye movement patterns."
  
  To make the most of our current design, we performed two analyses. First, we regressed trial-specific variables simultaneously against path length and arena complexity. This analysis revealed that the effect of complexity on behavior persists even after accounting for path length differences across arenas (Figure 4 – figure supplement 3). Second, path length is but one of many variables that collectively determine the complexity of the maze. Therefore, we also analyzed the effects of multiple trial-specific variables (number of turns, length of the optimal path, and the degree to which participants are expected to turn back the initial direction of heading to reach the goal, regardless of arena complexity) on eye movements. This revealed fine-grained insights on which task demands most influenced each eye movement quality that was described. More complex arenas posed, on average, greater challenges in terms of longer and more winding trajectories, such that eye movement qualities which increased with arena complexity also generally increased with specific measures of trial difficulty, albeit to varying degrees. We added additional plots to the main/supplementary figures and described these analyses under a new heading (“Linear mixed effects models”) in the Methods section.
  
  Similarly, it was not clear how the number of alternative plausible paths was considered in the analysis. It seems possible to have a very complex maze with no actual required choices that would involve a lot of scanning to determine this, or a very simple maze with just two very similar choices but which would involve significant scanning to weigh up which was indeed the shortest.
  
  Thank you for the suggestion. In conjunction with our response to the first comment from Reviewer #1, we used some constraints to identify non-trivial alternative trajectories – trajectories that pass through different locations in the arena but are roughly similar in length (within about 1 SD of the chosen trajectory). In alignment with your intuition, the most complex maze, as well as the completely open arena, did not have non-trivial alternative trajectories. For the three arenas of medium complexity, the more open arenas had more non-trivial alternative trajectories.
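  
  The selection rule described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the grid-cell representation of a trajectory, the function names, and the `max_shared` overlap threshold used to decide that two trajectories "pass through different locations" are all assumptions introduced here. Only the length criterion (within about 1 SD of the chosen trajectory) comes from the response.

```python
def path_length(path):
    """Euclidean length of a trajectory given as a list of (x, y) points."""
    return sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(path, path[1:]))


def nontrivial_alternatives(chosen, candidates, length_sd, max_shared=0.5):
    """Return candidates that are similar in length to `chosen` but spatially distinct.

    length_sd  : tolerance on the length difference (about 1 SD in the response).
    max_shared : hypothetical threshold on the fraction of a candidate's
                 locations that it may share with the chosen trajectory.
    """
    chosen_len = path_length(chosen)
    chosen_cells = set(chosen)
    selected = []
    for cand in candidates:
        cells = set(cand)
        similar_length = abs(path_length(cand) - chosen_len) <= length_sd
        shared = len(chosen_cells & cells) / len(cells)
        if similar_length and shared <= max_shared:
            selected.append(cand)
    return selected


# Toy 3x3 arena: the chosen path goes right then up.
chosen = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
mirror = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]                   # same length, different cells
detour = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3), (2, 2)]   # too long
alternatives = nontrivial_alternatives(chosen, [mirror, chosen, detour], length_sd=1.0)
```

  In this toy case only the mirrored path qualifies: the chosen path itself is rejected as trivial because it shares all of its locations, and the detour is rejected on length.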
  
  When we analyzed the relative effect of the number of alternative trajectories on eye movement, we found that both possibilities you suggested are true. On trials with many comparable alternatives, participants indeed spend more time scanning the alternatives and less time looking at the goal (Figure S8D). Likewise, in the most complex maze where there are no alternatives, participants still spent much more time (than simpler mazes) learning about the arena structure at the expense of looking at the goal (Figure 3E-F). This analysis yielded interesting new insights into how participants solved the task and opens the door for investigating this trade-off in future work. More generally, because both deliberation and structure learning appear to drive eye movements, they must be factored into studies of human planning.
  
  Can the affordances linked to turning biases and momentum explain the error patterns? For example, paths that require turning back on the current trajectory direction to reach the goal will be more likely to cause errors, and patterns of eye-movements that might be related to such errors.
  
  Thank you for this question. In conjunction with the trial-specific analyses on the effect of the length of the trajectory (Point #1) on errors and eye movement patterns, we also looked into how the number of turns and the relative bearing (angle between the direction of initial heading and the direction of target approach) affected participants’ behavior. Turns and momentum do not affect the relative error (distance of the stopping location to the target) as much as the trajectory length does, which was unexpected (Figure 1 – figure supplement 1F). This supports that errors were primarily caused by forgetting the target location, and this memory leak gets worse with distance (or time). However, turns have an influence on eye movements in general. For example, more turns generally result in an increase in the fraction of time that participants spend gazing upon the trajectory (Figure 4 – figure supplement 1A) and sweeping (Figure 4D). Furthermore, the number of turns decreased the fraction of time participants spent gazing at the target during movement (Figure 2D).
  
  Why were half the obstacle transitions misremembered for the blind agent? This seems a rather arbitrary choice. More information to justify this would be useful.
  
  We tested out different percentages and found qualitatively similar results. The objective was to determine the patterns of eye movements that would be most beneficial when participants have an intermediate level of knowledge about the arena configuration (rather than near-zero or near-perfect), because during most trials, participants can also use peripheral vision to assess the rough layout, but they do not precisely remember the location of the obstacles. We added this explanation to Appendix 1, where the simulation details have been made in response to a suggestion by another reviewer.
  
  The description of some of the results could usefully be explained in more simple terms at various points to aid readers not so familiar with the RL formulation of the task. For example, a key result reported is that participants skew looking at the transition function in complex environments rather than the reward function. It would be useful to relate this to everyday scenarios, in this case broadly to looking more at the junctions in the maze than at the goal, or near the goal, when the maze is complex.
  
  This is a great suggestion. We added an everyday analogy when describing the trade-off on line 258.
  
  "The trade-off reported here is roughly analogous to the trade-off between looking ahead towards where you’re going and having to pay attention to signposts or traffic lights. One could get away with the former strategy while driving on rural highways whereas city streets would warrant paying attention to many other aspects of the environment to get to the destination."
  
  The authors should comment on their low participant sample size. The sample seems reasonable given the reproducibility of the patterns, but it is much lower than most comparable virtual navigation tasks.
  
  Thank you for the recommendation. We had some difficulties recruiting human participants who were willing to wear a headset which had been worn by other participants during COVID-19, and some participants dropped out of the study due to feeling motion sickness. To ameliorate the low sample size, we collected data on four more participants and performed analyses to confirm that the major findings may be observed in most individual participants. Participant-specific effects are included in the new plots made in response to Points # 1-3, and the number of participants with a significant result for each figure/panel has been included as Appendix 2 – table 3.
  
  Reviewer #3 (Public Review):
  
  In this article, Zhu and colleagues studied the role of eye movements in planning in complex environments using virtual reality technology. The main findings are that humans can 1) near optimally navigate in complex environments; 2) gaze data revealed that humans tend to look at the goal location in simple environments, but spend more time on task relevant structures in more complex tasks; 3) human participants show backward and forward sweeping mostly during planning (pre-movement) and execution (movement), respectively.
  
  I think this is a very interesting study with a timely question and is relevant to many areas within cognitive neuroscience, notably decision making, navigation. The virtual reality technology is also quite new for studying planning. The manuscript has been written clearly. This study helps with understanding computational principles of planning. I enjoyed reading this work. I have only one major comment about statistical analyses that I hope authors can address.
  
  We thank the reviewer for the accurate description and positive assessment of our work.
  
  Number of subjects included in analyses in the study is only nine. This is a very small sample size for most human studies. What was the motivation behind it? I believe that most findings are quite robust, but still 9 subjects seems too low. Perhaps authors can replicate their finding in another sample? Alternatively, they might be able to provide statistics per individual and only report those that are significant in all subjects (of course, this only works if reported effects are super robust. But only in such a case 9 subjects are sufficient.)
  
  Thank you for the suggested alternatives. Due to the pandemic, we had some difficulties recruiting human participants who were willing to wear a headset which had been worn by other participants. We collected data on four more participants and included them in the analyses, and also confirmed that the major findings are observed in most individuals. The number of participants with a significant result for each analysis has been included in Figure 1 – figure supplement 3 and Appendix 2 – table 3.
  
  Somewhat related to the previous point, it seems to me that authors have pooled data from all subjects (basically treating them as 1 super-subject?) I am saying this based on the sentence written on page 5, line 130: "Because we are interested in principles that are conserved across subjects, we pooled subjects for all subsequent analyses." If this is not the case, please clarify that (and also add a section on "statistical analyses" in Methods.) But if this is the case, it is very problematic, because it means that statistical analyses are all done based on a fixed-effect approach. The fixed effect approach is infamous for inflated type I error.
  
  Your interpretation is correct and we acknowledge your concern about pooling participants. We had done this after observing that our results were consistent across participants but this was not demonstrated. We have now performed analyses sensitive to participant-specific effects and find that all major results hold for most participants, and we included additional main and supplementary bar plots (and tables in Appendix 2) showing per-participant data. The new plots/table show the effect of independent variables (mainly trial/arena difficulty) on dependent variables for each participant, as well as general effects conserved across participants. A new paragraph was added to the Methods section to describe the “Linear mixed effects models” which we used.
  
  Again, quite related to the last two points: please include degrees of freedom for every statistical test (i.e. every reported p-value).
  
  Degrees of freedom (df) are now included along with each p-value.
  

Tags

scietyType:AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.04.26.441482v1
www.biorxiv.org www.biorxiv.org

Reassessing face topography in primary somatosensory cortex and remapping following hand loss

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Using fMRI-based univariate and multivariate analyses, Root, Muret, et al. investigated the topography of face representation in the somatosensory cortex of typically developed two-handed individuals and individuals with a congenital and acquired missing hand. They provide clear evidence for an upright face topography in the somatosensory cortex in all three groups. Moreover, they find that one-handers, but not amputees, show shorter distances from lip representations to the hand area, suggesting a remapping of the lips. They also find a shift away of the upper face from the deprived hand area in one-handers, and significantly greater dissimilarity between face part representations in amputees and one-handers. The authors argue that this pattern of remapping is different to that of cortical neighborhood theories and points toward a remapping of face parts which have the ability to compensate for hand function, e.g., using the lips/mouth to manipulate an object.
  
  These findings provide interesting insights into the topographic organization of face parts and the principles of cortical (re)organization. The authors use several analytical approaches, including distance measures between hand- and face-part-responsive regions and representational similarity analysis (RSA). Particularly commendable is the rigorous statistical analysis, such as the use of Bayesian comparisons, and careful interpretation of absent group differences.
  
  We thank the reviewer for their positive and constructive feedback.
  
  Reviewer #2 (Public Review):
  
  After amputation, the deafferented limb representation in the somatosensory cortex is activated by stimulation of other body parts. A common belief is that the lower face, including the lips, preferentially "invades" deafferented cortex due to its proximity to cortex. In the present study, this hypothesis is tested by mapping the somatosensory cortex using fMRI as amputees, congenital one-handers, and controls moved their forehead, nose, lips or tongue. First, they found that, unlike its counterpart in monkeys, the representation of the face in the somatosensory cortex is right-side up, with the forehead most medial (and abutting the hand) and the lips most lateral. Second, there was little evidence of "reorganization" of the deafferented cortex in amputees, even when tested with movements across the entire face rather than only the lips. Third, congenital one-handers showed significant reorganization of deafferented cortex, characterized principally by the invasion of the lower face, in contrast to predictions from the hypothesis that proximity was the driving factor. Fourth, there was no relationship between phantom limb pain reports and reorganization.
  
  As a non-expert in fMRI, I cannot evaluate the methodology. That being said, I am not convinced that the current consensus is that the representation of the face in humans is flipped compared to that of monkeys. Indeed, the overwhelming majority of somatosensory homunculi I have seen for humans has the face right side up. My sense is that the fMRI studies that found an inverted (monkey-like) face representation contradict the consensus.
  
  Thank you for pointing this out. As we tried to emphasise in the introduction, very few neuroimaging studies actually investigated face somatotopy in humans, with inconsistent results. We agree the default consensus tends to be dominated by the up-right depiction of Penfield’s homunculus (recently replicated by Roux et al, 2018). However, due to methodological and practical constraints, alignment across subjects in the case of intracortical recordings is usually difficult to achieve, and thus makes it difficult to assess the consistency in topographical organisation. Moreover, previous imaging studies did not manage to convincingly support Penfield’s homunculus. For these two key reasons, the spatial orientation of the human facial homunculus is still debated. A further limiting factor of previous studies is that the vast majority of studies investigating face (re)mapping in humans focused solely on the lip representation, using the cortical proximity hypothesis to interpret their results. Consequently, as we highlight above in our response to the Editor, there is a widespread and false representation in the human literature of the lips neighbouring the hand area.
  
  To account for the reviewer’s critique and convey some of this context, we changed our title from: Reassessing face topography in primary somatosensory cortex and remapping following hand loss; to: Complex pattern of facial remapping in somatosensory cortex following congenital but not acquired hand loss. This was done to de-emphasise the novelty of face topography relative to our other findings.
  
  We also rewrote our introduction (lines 79-94) as follows:
  
  “The research focus on lip cortical remapping in amputees is based on the assumption that the lips neighbour the hand representation. However, this assumption goes against the classical upright orientation of the face in S1 [26–30], as first depicted in Penfield’s Homunculus and in later intracortical recordings and stimulation studies [26–29], with the upper-face (i.e., forehead) bordering the hand area. In contrast, neuroimaging studies in humans studying face topography provided contradictory evidence for the past 30 years. While a few neuroimaging studies provided partial evidence in support of the traditional upright face organisation [31], other studies supported the inverted (or ‘upside-down’) somatotopic organisation of the face, similar to that of non-human primates [32,33]. Other studies suggested a segmental organisation [34], or even a lack of somatotopic organisation [35–37], whereas some studies provided inconclusive or incomplete results [38–41]. Together, the available evidence does not successfully converge on face topography in humans. In line with the upright organisation originally suggested by Penfield, recent work reported that the shift in the lip representation towards the missing hand in amputees was minimal [42,43], and likely to reside within the face area itself. Surprisingly, there is currently no research that considers the representation of other facial parts, in particular the upper-face (e.g., the forehead), in relation to plasticity or PLP.”
  
  We also updated the discussion accordingly (lines 457, 469-477, 490-492).
  
  Similarly, it is not clear to me how the observations (1) of limited reorganization in amputees, (2) of significant reorganization in congenital one-handers, and (3) of the lack of relationship between PLP and reorganization is novel given the previous work by this group. Perhaps the authors could more clearly articulate the novelty of these results compared to their previous findings.
  
  Thank you for giving us the opportunity to clarify this important point. The novelty of these results can be summarised as follows:
  
  (1) Conceptually, it is crucial for us to understand if deprivation-triggered plasticity is constrained by the local neighbourhood, because this can give us clues regarding the mechanisms driving the remapping. We provide strong topographic evidence about the face orientation in controls, amputees and one-handers.
  
  (2) The vast majority of previous research on brain plasticity following hand loss (both congenital and acquired) in humans has exclusively focused on the lower face, and lips in particular. We provide systematic evidence for stable organisation and remapping of the neighbouring upper face, as well as the lower face. We also study topographic representation of the tongue (and nose) for the first time.
  
  (3) The vast majority of previous research on brain remapping following hand loss (both congenital and acquired, neuroimaging and electrophysiological) was focused on univariate activity measures, such as the spatial spread of units showing a similar feature preference, or the average activity level across individual units. We are going beyond remapping by using RSA, which allows us to ask not only if new information is available in the deprived cortex (as well as the native face area), but also whether this new information is structured consistently across individuals and groups. We show that representational content is enhanced in the deprived cortex of one-handers whereas it is stable in amputees relative to controls (and to their intact hand region).
  
  (4) Based on previous studies, the assumption was that reorganisation in congenital one-handers was relatively unspecific, affecting all tested body parts. Here, we provide evidence for a more complex pattern of remapping, with the forehead representation seemingly moving out of the missing hand region (and the nose representation being tentatively similar to controls). That is, we show not just “invasion” but also a shift of the neighbour away from the hand area which has never been documented (or in fact suggested).
  
  (5) Using Bayesian analyses we provide definitive evidence against a relationship between PLP and forehead remapping, providing first and conclusive evidence against the remapping hypothesis, based on cortical neighbourhood.
  
  Our inclination is not to add a summary paragraph of these points in our discussion, as it feels too promotional. Instead, we have re-written large sections of the introduction and discussion to better emphasise each of these points separately throughout the text, where the context is most appropriate. Given the public review strategy taken by eLife, the novelty summary provided above will be available for any interested reader, as part of the public review process. However, should the reviewer feel that a novelty summary paragraph is required (or an emphasis on any of the points summarised above), we will be happy to revise the manuscript accordingly.
  
  Finally, Jon Kaas and colleagues (notably Niraj Jain) have provided evidence in experiments with monkeys that much of the observed reorganization in the somatosensory cortex is inherited from plasticity in the brain stem. Jain did not find an increased propensity for axons to cross the septum between face and hand representations after (simulated) amputation. From this perspective, the relevant proximity would be that of the cuneate and trigeminal nuclei and it would be critical to map out the somatotopic organization of the trigeminal and cuneate nuclei to test hypotheses about the role of proximity in this remapping.
  
  Thank you for highlighting this very relevant point, which we are well aware of. We fully agree with the reviewer that this is an important goal for future study, but functional imaging of the brainstem in humans is particularly challenging and would require ultra high field imaging (7T) and specialised equipment. We have encountered much local resistance due to hypothetical issues for MRI safety for scanning amputees in this higher field strength, meaning we are unable to carry out this research ourselves. Our former lab member Sanne Kikkert, who is now running her independent research programme in Zurich, has been working towards this goal for the past 4 years. So we can say with confidence that this aim is well beyond the scope of the current study. In response to your comment, we mentioned this potential mechanism in the introduction (lines 98-101), we ensured that we only referred to “cortical proximity” throughout our manuscript, and we circle back to this important point in the discussion.
  
  Lines 539-543: “Moreover, even if the remapping we observed here goes against the theory of cortical proximity, it can still arise from representational proximity at the subcortical level, in particular at the brainstem level [44,45]. While challenging in humans, mapping both the cuneate and trigeminal nuclei would be critical to provide a more complete picture regarding the role of proximity in remapping.”
  
  Reviewer #3 (Public Review):
  
  In their study, the authors set out to challenge the long-held claim that cortical remapping in the somatosensory cortex in hand deprived cortical territories follows somatotopic proximity (the hand region gets invaded by cortical neighbors) as classically assumed. In contrast to this claim, the authors suggest that remapping may not follow cortical proximity but instead functional rules as to how the effector is used. Their data indeed suggest that the deprived hand area is not invaded by the forehead, which is the cortical neighbor, but instead by the lips, which may compensate for hand loss in manipulating objects. Interestingly the authors suggest this is mostly the case for one-handers but not in amputees, for whom the reorganization seems more limited in general (but see my comments below on this last point).
  
  This is a remarkably ambitious study that has been skilfully executed on a strong number of participants in each group. The complementarity of state-of-the-art uni- and multi-variate analyses are in the service of the research question, and the paper is clearly written. The main contribution of this paper, relative to previous studies including those of the same group, resides in the mapping of multiple face parts all at once in the three groups.
  
  We are grateful to the reviewer for appreciating the immense effort that this study involved.
  
  In the winner takes all approach, the authors only include 3 face parts but exclude from the analyses the nose and the thumb. I am not fully convinced by the rationale for not including nose in univariate analyses - because it does not trigger reliable activity - while keeping it for representational similarity analyses. I think it would be better to include the nose in all analyses or demonstrate this condition is indeed "noisy" and then remove it from all the analyses. Indeed, if the activity triggered by nose movement is unreliable, it should also affect multivariate.
  
  Following this comment, we re-ran all univariate analyses to include the nose, and updated throughout the main text and supplemental results and related figures. In short, adding the nose did not change the univariate results, apart from a now significant group x hemisphere interaction for the CoG of the tongue when comparing amputees and controls, matching better the trends for greater surface coverage in the deprived hand ROI of amputees. Full details are provided in our response to Reviewer 1 above.
  
  The rationale for not including the hand is maybe more convincing as it seems to induce activity in both controls and amputees but not in one-handers. First, it would be great to visualize this effect, at least as supplemental material to support the decision. Then, this brings the interesting possibility that enhanced invasion of hand territory by lips in one-handers might link to the possibility to observe hand-related activity in the presupposed hand region in this population. Maybe the authors may consider linking these.
  
  Thank you for this comment. As we explain in our response to Reviewer 1 above, we did not intend the thumb condition in one-handers for analysis, as the task given to one-handers (imagine moving a body part you never had before) is inherently different to that given to the other groups (move - or at least attempt to move - your (phantom) hand). As such, we could not pursue the analysis suggested by the reviewer here. To reduce the discrepancy and following Reviewer 1’s advice, we decided to remove the hand-face dissimilarity analysis which we included in our original manuscript, and might have sparked some of this interest. Upon reflection we agreed that this specific analysis does not directly relate to the question of remapping (but rather of shared representation), in addition to making the paper unbalanced. We will now feature this analysis in another paper that appears more appropriate in the context of referred sensations in amputees (Amoruso et al, 2022 MedRxiv).
  
  The use of the geodesic distance between the center of gravity in the Winner Take All (WTA) maps between each movement and a predefined cortical anchor is clever. More details about how the Center Of Gravity (COG) was computed on spatially disparate regions might deserve more explanations, however.
  
  We are happy to provide more detail on this analysis, which weights the CoG based on the cluster size (using the workbench command -metric-weighted-stats). Let’s consider the example shown here (Figure 1) for a single control participant, where each CoG is measured either without weighting (yellow vertices) or with cluster weighting (forehead CoG=red, lip CoG=dark blue, tongue CoG=dark red). When the movement produces a single cluster of activity (the lips in the non-dominant hemisphere, shown in blue), the CoG’s location was identical for both weighted (red) and unweighted (yellow) calculations. But other movements, such as the tongue (green), produced one large cluster (at the lateral end), with a few more disparate smaller clusters more medially. In this case, the larger cluster of maximal activity is weighted to a greater extent than the smaller clusters in the CoG calculation, meaning the CoG is slightly skewed towards it (dark red), relative to the smaller clusters.
  
  Figure 1. Centre-of-gravity calculation, weighted and unweighted by cluster size, in an example control participant. Here the winner-takes-all output for each facial movement (forehead=red, lips=blue, tongue=green) was used to calculate the centre-of-gravity (CoG) at the individual-level in both the dominant (left-hand side) and non-dominant (right-hand side) hemisphere, weighted by cluster size (forehead CoG=red, lip CoG=dark blue, tongue CoG=dark red), compared to an unweighted calculation (denoted by yellow dots within each movement’s winner-takes-all output).
  
  This is now explained in the methods (lines 760-765) as follows:
  
  “To assess possible shifts in facial representations towards the hand area, the centre-of-gravity (CoG) of each face-winner map was calculated in each hemisphere. The CoG was weighted by cluster size meaning that in the event of multiple clusters contributing to the calculation of a single CoG for a face-winner map, the voxels in the larger cluster are overweighted relative to those in the smaller clusters. The geodesic cortical distance between each movement’s CoG and a predefined cortical anchor was computed.”
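  
  To make the weighting concrete, the following is a minimal sketch of a cluster-size-weighted centre of gravity. It is not the workbench implementation referred to above: it assumes, for illustration only, that each vertex is weighted by the number of vertices in its cluster and that vertices are described by flat 2D coordinates, and the function names and toy data are hypothetical.

```python
from collections import Counter


def unweighted_cog(coords):
    """Plain centroid of the vertices in a winner map."""
    n = len(coords)
    dim = len(coords[0])
    return tuple(sum(p[d] for p in coords) / n for d in range(dim))


def weighted_cog(coords, cluster_ids):
    """Centroid in which each vertex is weighted by the size of its cluster.

    Assumption for this sketch: weight = number of vertices in the cluster,
    so vertices in larger clusters pull the CoG towards themselves.
    """
    sizes = Counter(cluster_ids)
    weights = [sizes[c] for c in cluster_ids]
    total = sum(weights)
    dim = len(coords[0])
    return tuple(sum(w * p[d] for w, p in zip(weights, coords)) / total
                 for d in range(dim))


# Toy map: one large lateral cluster (4 vertices) and one small medial cluster (1 vertex).
coords = [(10.0, 0.0), (10.0, 1.0), (11.0, 0.0), (11.0, 1.0), (0.0, 0.0)]
clusters = ["large", "large", "large", "large", "small"]

plain = unweighted_cog(coords)
skewed = weighted_cog(coords, clusters)
```

  With a single cluster the two calculations coincide, as in the lip example above; with several clusters the weighted CoG moves towards the largest one.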
  
  Moreover, imagine that for some reason the forefront region extends both dorsally and ventrally in a specific population (eg amputees), the COG would stay unaffected but the overlap between hand and forefront would increase. The analyses on the surface area within hand ROI for lips and forehead nicely complement the WTA analyses and suggest higher overlap for lips and lower overlap for forehead but none of the maps or graphs presented clearly show those results - maybe the authors could consider adding a figure clearly highlighting that there is indeed more lip activity IN the hand region.
  
  We agree with you on this limitation of the CoG and this is why we interpret all cortical distance analyses in tandem with the laterality indices. The laterality indices correspond to the proportion of surface area in the hand region for a given face part in the winner-maps.
  
  Nevertheless, to further convince the Reviewer, we extracted activity levels (beta values) within the hand region of congenitals and controls, and we ran (as for CoGs) a mixed ANOVA with the factors Hemisphere (deprived x intact) and Group (controls x one-handers).
  
  As expected from the laterality indices obtained for the Lips, we found a significant group x hemisphere interaction (F(1,41)=4.52, p=0.040, ηp²=0.099), arising from enhanced activity in the deprived hand region in one-handers compared to the non-dominant hand region in controls (t(41)=-2.674, p=0.011) and to the intact hand region in one-handers (t(41)=-3.028, p=0.004).
  
  Since this kind of analysis was the focus of previous studies (from which we are trying to get away) and since it is redundant with the proportion of face-winner surface coverage in the hand region, we decided not to include it in the paper. But we could add it as a Supplementary result if the Reviewer believes this strengthens our interpretation.
  
  In addition to overlap analyses between hand and other body parts, the authors may also want to consider doing some Jaccard similarity analyses between the maps of the 3 groups to support the idea that amputees are more alike controls than one-handers in their topographic activity, which again does not appear clear from the figures.
  
  We thank the reviewers for this clever suggestion. We now include the Jaccard similarity analysis, which quantified the degree of similarity (0=no overlap between maps; 1=fully overlapping) between winner-takes-all maps (which included the nose; akin to the revised univariate results) across groups. For each face part/amputee, the similarity with the 22 controls and 21 one-handers respectively was averaged. We utilised a linear mixed model which included fixed factors of Group (One-handers x Controls), Movement (Forehead x Nose x Lips x Tongue) and Hemisphere (Intact x Deprived) on Jaccard similarity values (similar to what we used for the RSA analysis). A random effect of participant, as well as a covariate of age, was also included in the model.
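  
  As a minimal sketch of the similarity measure described above, the Jaccard index of two binary maps is the size of their intersection divided by the size of their union. The example below represents each winner map as a set of vertex indices and averages one amputee's similarity over a reference group; the map contents, the variable names, and the convention of returning 0 for two empty maps are assumptions made for illustration only.

```python
def jaccard(map_a, map_b):
    """Jaccard index of two maps given as collections of vertex indices."""
    a, b = set(map_a), set(map_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention when both maps are empty
    return len(a & b) / len(union)

def mean_similarity(target_map, reference_maps):
    """Average Jaccard index between one map and each map of a reference group."""
    values = [jaccard(target_map, ref) for ref in reference_maps]
    return sum(values) / len(values)

# Hypothetical lip-winner maps (vertex indices) for one amputee and two controls.
amputee_lips = {1, 2, 3, 4}
control_lips = [{2, 3, 4, 5}, {1, 2, 3, 4}]

similarity_to_controls = mean_similarity(amputee_lips, control_lips)
```

  Each such averaged value (one per face part, hemisphere and reference group) would then be the dependent variable entered into the mixed model.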
  
  Results showed a significant group x hemisphere interaction (F(240.0)=7.70, p=0.006; controlled for age; Fig. 5), indicating that amputees’ maps showed different similarity values to controls’ and one-handers’ depending on the hemisphere. Post-hoc comparisons (corrected alpha=0.025; uncorrected p-values reported) revealed significantly higher similarity to controls’ than to one-handers’ maps in the deprived hemisphere (t(240)=-3.892, p<.001). Amputees’ maps also showed higher similarity to controls’ maps in the deprived relative to the intact hemisphere (t(240)=2.991, p=0.003). Amputees, therefore, displayed greater similarity of facial somatotopy in the deprived hemisphere to controls, again suggesting less evidence for cortical remapping in amputees.
  
  We added these results at the end of the univariate analyses (lines 335-351) and in the discussion (lines 464-465 and 497-500).
  
  This brings to another concern I have related to the claim that the change in the cortical organization they observe is mostly observed in one-handers. It seems that most of this conclusion relies on the fact that some effects are observed in one-handers but not in amputees when compared to controls, however, no direct comparisons are done between amputees and one-handers so we may be in an erroneous inference about the interaction when this is actually not tested (Nieuwenhuis, 11). For instance, the shift away from the hand/face border of the forehead is also (mildly) significant in amputees (as observed more strongly in one-handers) so the conclusion (eg from the subtitle of the results section) that it is specific to one-hander might not fully be supported by the data. Similar to the invasion of the hand territory from the lips which is significant in amputees in terms of surface area. All together this calls for toning down the idea that plasticity is restricted to congenital deprivation (eg last sentence of the abstract). Even if numerically stronger, if I am not wrong, there are no stats showing remapping is indeed stronger in one-handers than in amputees and actually, amputees show significant effects when compared to controls along the lines as those shown (even if more strongly) in one-handers.
  
  Thank you for this very important comment. We fully agree – the RSA across-groups comparison is highly informative but insufficient to support our claims. We did not compare the groups directly to avoid multiple comparisons (both for statistical reasons and to manage the size of the results section). But the reviewer’s suggestion to perform a Jaccard similarity analysis complements very nicely the univariate and multivariate results and allows for a direct (and statistically lean) comparison between groups, to assess whether amputees are more similar to controls or to congenital one-handers, taking into account all aspects of their maps (both spatial location/CoG and surface coverage). We added the Jaccard analysis to the main text, at the end of the univariate results (lines 335-385). The Jaccard analysis suggests that amputees’ maps in the deprived hemisphere were more similar to the maps of controls than to the ones of congenital one-handers. This allowed us to obtain significant statistical results to support the claim that remapping is indeed stronger in one-handers than in amputees (lines 346-351). We also compared both amputees and one-handers to the control group. In line with our univariate results, this revealed that the only face part for which controls were more similar to one-handers than to amputees was the tongue (lines 379-381). And that the forehead remapping observed at the univariate level in amputees (surface area), is likely to arise from differences in the intact hemisphere (lines 381-383).
  
  Finally, we also added the post-hoc statistics comparing amputees to congenitals in the RSA analysis (lines 425-427): “While facial information in the deprived hand area was increased in one-handers compared with amputees, this effect did not survive our correction for multiple comparisons (t(70.7)=-2.117, p=0.038).”
  
  Regarding the univariate results mentioned by the reviewer, we would like to emphasise that we had no significant effect for the lips in amputees, though we agree the surface area appears in between controls and one-handers. But this laterality index was not different from zero. This test is now added at lines 189-190. Regarding the forehead, we fully agree with the Reviewer, and we adjusted the subtitle accordingly (lines 241-242). For consistency, we also added the t-test vs zero for the forehead surface area (non-significant, lines 251-253).
  
  Also, maybe the authors could explore whether there is actually a link between the number of years without hand and the remapping effects.
  
  To address this question, we explored our data using a correlation analysis. The only body part that showed some suggestive remapping effects was the tongue, and so we explored whether we could find a relationship (Pearson’s correlation) between years since amputation and the laterality index of the Tongue in amputees (r = 0.007, p=0.980, 95% CI [-0.475, 0.475]). We also explored amputees’ global Jaccard similarity values to controls in the deprived hemisphere (r = -0.010, p=0.970, 95% CI [-0.488, 0.473]), and could not find any relationship. Considering there was no strong remapping effect to explain, we find this result too exploratory to include in our manuscript.
  
  One hypothesis generated by the data is that lips remap in the deprived hand area because lips serve compensatory functions. Actually, also in controls, lips and hands can be used to manipulate objects, in contrast to the forehead. One may thus wonder if the preferential presence of lips in the hand region is not latent even in controls as they both link in functions?
  
  We agree with the reviewer’s reasoning, and we think that the distributed representational content we recently found in two-handers (Muret et al, 2022) provides a first hint in this direction. It is worth noting that in that previous publication we did not find differences across face parts in the activity levels obtained in the hand region, except for slightly more negative values for the tongue. But we do think that such latent information is likely to provide a “scaffolding” for remapping. While the design of our face task does not allow us to assess information content for each face part (as done for the lips in Muret et al, 2022), this should be further investigated in follow-up studies.
  
  We added a sentence in the discussion to highlight this interesting notion: Lines 556-559: “Together with the recent evidence that lip information content is already significant in the hand area of two-handed participants (Muret et al, 2022), compensatory behaviour since developmental stages might further uncover (and even potentiate) this underlying latent activity.”
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.07.05.451126v2
www.biorxiv.org www.biorxiv.org

New submission 07/12/2022, 11:59:39

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Using data from extracellular recordings in mouse piriform cortex (PCx) by Bolding & Franks (2018), the authors examined the strength, timing, and coherence of gamma oscillations with respiration in awake mice. During "spontaneous" activity (i.e. without odor or light stimulation), they observed a large peak in gamma that was driven by respiration and aligned with the spiking of FBIs. TeLC, which blocks synaptic output from principal cells onto other principal cells and FBIs, abolishes gamma. Beta oscillations are evoked while gamma oscillations are induced. Odors strongly affect beta in PCx but have minimal (duration but not amplitude) effects on gamma. Unlike gamma, strong, odor-evoked beta oscillations are observed in TeLC. Using PCA, the authors found a small subset of neurons that conveyed most of the information about the odor (winner cells). Loser cells were more phase-locked to gamma, which matched the time course of inhibition. Odor decoding accuracy closely follows the time course of gamma power.
  
  We thank the reviewer for the accurate summary of our work.
  
  I think this is an interesting study that uses a publicly available dataset to good effect and advances the field elegantly, especially by selectively analyzing activity in identified principal neurons versus inhibitory interneurons, and by making use of defined circuit perturbations to causally test some of their hypotheses.
  
  We thank the reviewer for the positive appraisal.
  
  Major:
  
  The authors show odor-specificity at the time of the gamma peak and imply that the gamma coupling is important for odor coding. Is this because gamma oscillations are important or because gamma is strongest when activity in PCx is strongest (i.e. both excitatory and inhibitory activity, which would cancel each other in the population PSTH, which peaks earlier)? To make this claim, the authors could show that odor decoding accuracy - with a small (~10 ms) sliding window - oscillates at approx. gamma frequencies. As is, Fig. 5 just shows that cells respond at slightly different times in the sniff cycle. What time window was used for computing the Odor Specificity Index? Put another way, is it meaningful that decoding is most accurate when gamma oscillations are strongest, or is this just a reflection of total population activity, i.e., when activity is greatest there is more gamma power, and odor decoding accuracy is best?
  
  We thank the reviewer for the critical comment. Please note that the employed decoding strategy (supervised learning with cross-validation) prevents us from quantifying a time series of decoding accuracy. Nevertheless, to overcome this difficulty, we divided the spike data (0-500 ms following the inhalation start) according to the gamma cycle into four non-overlapping gamma phase bins. Then we tested whether odor decoding accuracy varied as a function of the gamma cycle phase. Using this approach, we found that decoding depended on the gamma phase, as shown below:
  
  (The bottom plot shows the modulation of decoding accuracy within the gamma cycle [Real MI] compared to a surrogate distribution [Surr MI, obtained by circularly shifting the gamma phases by a random amount]).
  
  We interpret this new result as indicative that gamma influences decoding accuracy directly and that our previous result was not only a reflection of total population activity. Moreover, please note that we only use the principal cell activity for computing the odor specificity index (Fig 5E) and decoding accuracy (Fig 7B). Both peak at ~150 ms following inhalation start, at a time window where the net principal cell activity is roughly similar to baseline levels (Fig 5A bottom panel).
  
  These new panels were added to revised Figure 7 and mentioned in the revised manuscript (page 8); we now also discuss the above considerations about maximal decoding not coinciding with the peak firing rate (page 10).
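  
  The phase-binning step and the surrogate control described above can be sketched as follows. This is only an illustration under stated assumptions: it assigns each spike's gamma phase (in radians) to one of four equal, non-overlapping bins, and builds a surrogate by rotating the phase sequence by a given number of samples. The exact shifting procedure and the modulation index used in the analysis are not specified here, so neither is reproduced; `phase_bin` and `circular_shift` are hypothetical helper names.

```python
import math

N_BINS = 4  # four non-overlapping gamma phase bins, as in the response

def phase_bin(phase):
    """Map a phase in radians to a bin index 0..N_BINS-1 over [0, 2*pi)."""
    wrapped = phase % (2.0 * math.pi)
    index = int(wrapped / (2.0 * math.pi / N_BINS))
    return min(index, N_BINS - 1)  # guard against floating-point edge cases

def circular_shift(phases, shift):
    """Rotate a phase sequence by `shift` samples (assumed surrogate scheme)."""
    if not phases:
        return []
    k = shift % len(phases)
    return phases[-k:] + phases[:-k] if k else list(phases)

# Hypothetical gamma phases at four spike times.
spike_phases = [0.1, 1.7, 3.3, 5.0]
bins = [phase_bin(p) for p in spike_phases]
surrogate = circular_shift(spike_phases, 1)
```

  Decoding would then be run separately on the spikes of each bin, and the spread of accuracy across bins compared with the same quantity computed from shifted phases.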
  
  Regarding the Odor Specificity Index computation, we apologize for not describing it appropriately in the corresponding Methods subsection. We employed the same sliding time window as in the population vector correlation and the decoding analyses (i.e., 100 ms window, 62.5 % overlap). This information has been added to the revised manuscript (page 15).
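  
  For concreteness, a 100 ms window with 62.5 % overlap advances by 100 × (1 − 0.625) = 37.5 ms per step. The short sketch below lists the resulting window edges; the 0-500 ms range is borrowed from the gamma-phase analysis purely as an illustrative assumption, and `sliding_windows` is a hypothetical helper rather than the code used in the study.

```python
def sliding_windows(t_start, t_end, width, overlap):
    """Return (start, end) pairs of windows fully contained in [t_start, t_end]."""
    step = width * (1.0 - overlap)  # 100 ms * 0.375 = 37.5 ms
    windows = []
    start = t_start
    while start + width <= t_end:
        windows.append((start, start + width))
        start += step
    return windows

# Times in milliseconds relative to inhalation start (assumed range).
windows = sliding_windows(0.0, 500.0, 100.0, 0.625)
```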
  
  The authors say, "assembly recruitment would depend on excitatory-excitatory interactions among winner cells occurring simultaneously during gamma activity." Can the authors test this prediction by examining the TeLC recordings, in which excitatory-excitatory connections are abolished?
  
  We thank the reviewer for the relevant comment. We followed the reviewer's suggestion and analyzed odor assemblies in TeLC recordings. Interestingly, we found a greater increase in the firing rate of winner cells in TeLC recordings (see figure below), which therefore does not support our previous interpretation that assembly recruitment would depend on excitatory-excitatory local interactions.
  
  Thus, this new result suggests a much more critical role than we previously considered for the OB projections in determining winner neurons.
  
  Moreover, we found significant differences in the properties of loser cells. In particular, the TeLC-infected piriform cortex showed a decreased number of losing cells, which were significantly less inhibited than their contralateral counterparts:
  
  Furthermore, the reduced inhibition of losing cells was associated with an increased correlation of assembly weights across odors for the affected hemisphere:
  
  Therefore, we believe these results highlight the role of gamma oscillations in segregating cell assemblies and generating a sparse orthogonal odor representation in the piriform cortex. These findings are now included as new panels of Figure 6 and discussed on page 8. Notably, to conform with them, we modified our speculative sentence (page 9) "assembly recruitment would depend on excitatory-excitatory interactions among winner cells occurring simultaneously during gamma activity" to “(…) the assembly recruitment would depend on OB projections determining which winner cells “escape” gamma inhibition, highlighting the relevance of the OB-PCx interplay for olfaction (Chae et al., 2022; Otazu et al., 2015).”
  
  The authors show that gamma oscillations are abolished in the TeLC condition and use this to claim that gamma arises in the PCx. However, PCx neurons also project back to the OB, where they form excitatory connections onto granule cells. Fukunaga et al (2012) showed that granule cells are essential for generating gamma oscillations in the bulb. Can the authors be sure that gamma is generated in the PCx, per se, rather than generated in the bulb by centrifugal inputs from the PCx, and then inherited from the bulb by the PCx?
  
  We thank the reviewer for the pertinent comment regarding gamma generation in the PCx. To address this point, we have performed current source density (CSD) analysis, which showed sinks and sources of low-gamma oscillations within the PCx and also a phase reversal:
  
  This result – shown as panel F in Figure 1 – suggests a local generation of gamma within the PCx. Along with the fact that PCx gamma tightly correlates with piriform FBI firing and that PCx gamma disappears in the TeLC ipsi hemisphere, which has intact OB projections, we deem it more parsimonious to assume that gamma does originate in the piriform circuit during feedback inhibition acting on principal cells and is not directly inherited from OB (though it depends on its drive). We have edited our text to incorporate the above figure panel (page 4). We now also relate our results with those of Fukunaga and colleagues for the OB gamma generation and discuss the alternative interpretation of inherited gamma (page 9).
  
  Reviewer #2 (Public Review):
  
  This is a very interesting paper, in which the authors describe how respiration-driven gamma oscillations in the piriform cortex are generated. Using a published data set, they find evidence for a feedback loop between local principal cells and feedback interneurons (FBIs) as the main driver of respiration-driven gamma. Interestingly, odour-evoked gamma bursts coincide with the emergence of neuronal assemblies that activate when a given odour is presented. The results argue in favour of a winner-take-all mechanism of assembly generation that has previously been suggested on theoretical grounds.
  
  We thank the reviewer for his/her work and accurate summary of our results.
  
  The article is well-written and the claims are justified by the data. Overall, the manuscript provides novel key insights into the generation of gamma oscillations and a potential link to the encoding of sensory input by cell assemblies. I have only minor suggestions for additional analyses that could further strengthen the manuscript:
  
  We thank the reviewer for the positive appraisal.
  
  1) The authors' analysis of firing rates of FFIs and FBIs combined with TeLC experiments make a compelling case for respiration-driven gamma being generated in a pyramidal cell-FBI feedback mechanism. This conclusion could be further strengthened by analyzing the gamma phase-coupling of the three neuronal populations investigated. One would expect strong coupling for FBIs but not FFIs (assuming that enough spikes of these populations could be sampled during the respiration-triggered gamma bursts). An additional analysis to strengthen this conclusion could be to extract FBI- and FFI spike-triggered gamma-filtered signals. One might expect an increase in gamma amplitude following FBI but not FFI spiking (see e.g., Pubmed ID 26890123).
  
  We thank the reviewer for the comment. To address this point, we first computed spike-coupling strength (by means of the Mean Vector Length – MVL) for each neuronal subtype. As shown below, we did not find major differences in MVL values across subtypes (if anything, the FBIs actually displayed the lowest MVL, though it should be cautioned that this metric is sensitive to sample size, which differed among subtypes):
  
  Of note, this result also translated to spike-triggered gamma-filtered signals, with FBIs having the lowest average. We don’t however believe these findings speak against a major role of FBIs in giving rise to field gamma, since it is expected that inhibited neurons will highly phase-lock to gamma (while more active neurons during gamma would show lower phase-locking). Nevertheless, we also computed the spike-triggered gamma amplitude envelope for all three neuronal subtypes. This analysis showed that gamma envelopes closely followed FBI spikes (and not FFIs or EXC cells), and thus this new result reinforces the idea that FBIs trigger gamma oscillations. This plot is now part of an inset of Figure 1G (described on page 5).
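  
  The Mean Vector Length mentioned above is the magnitude of the average unit vector of the spike phases, MVL = |(1/n) Σ exp(i·φ_k)|, ranging from 0 (no phase preference) to 1 (all spikes at the same phase). A minimal sketch follows, with hypothetical phase values; it also shows why the metric depends on sample size, since a single spike always yields an MVL of 1.

```python
import cmath
import math

def mean_vector_length(phases):
    """Magnitude of the mean resultant vector of phases given in radians."""
    if not phases:
        raise ValueError("at least one spike phase is required")
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(resultant)

# Hypothetical spike phases (radians).
locked = [0.5, 0.5, 0.5, 0.5]                           # perfectly phase-locked
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]   # evenly spread over the cycle
single = [2.0]                                          # one spike only

mvl_locked = mean_vector_length(locked)
mvl_spread = mean_vector_length(spread)
mvl_single = mean_vector_length(single)
```

  Because small samples bias the MVL upwards, comparisons across subtypes with different spike counts are often made after matching the number of spikes, which is one way to address the caution raised above.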
  
  2) The authors utilize the neurons' weight in the first PC to assign them to odour-related assemblies. This method convincingly extracts an assembly for each odour (when odours are used individually), and these seem to be virtually non-overlapping. It would be informative to test whether a similar clear separation of the individual assemblies could be achieved by running the analysis on all odours simultaneously, perhaps by employing a procedure of assembly extraction that allows to deal with overlapping assembly membership better than a pure PCA approach (as used for instance in the work cited on page 11, including the authors' previous work)? I do not doubt the validity of the authors' approach here at all, but the suggested additional analysis might allow the authors to increase their confidence that individual neurons contribute mostly to an assembly related to a single odour.
  
  We thank the reviewer for the pertinent comment. In order to address it, we ran the ICA-based approach to detect cell assemblies (Lopes-dos-Santos et al., 2013) using the spike time series of all odors concatenated. The concatenation included time windows around the gamma peak (100-400 ms after inhalation start). We chose this window to prevent the ICA from picking temporal features of the response as different ICs instead of the spiking variations caused by the different odors. As a reference, we also calculated ICA for each odor independently during the gamma peak.
  
  We found that the results obtained from ICA computed using concatenated data from all odors show important resemblances to those from the single ICA per odor approach. For instance, we get similar sparsity and cell assembly membership (Figure 6-figure supplement 1A), orthogonality (Figure 6-figure supplement 1B), and odor specificity (Figure 6-figure supplement 1C) in the IC loadings through both approaches. Notably, the average absolute IC correlation between the six odors (computed separately) and the first six ICs (computed from the combined odor responses) was similar across animals and showed no significant differences (Figure 6-figure supplement 1C).
  
  We also directly tested odor selectivity and separation in the concatenated data approach by computing each odor’s mean assembly activity (i.e., “IC projection”). Regarding the former, we found that most assemblies coded for 1 or 2 odors (Figure 6-figure supplement 1D). Regarding the diversity of representations for the sampled neurons, we assessed odor separation by examining to which odor each IC is activated the most. Under this framework, we get that, on average, the first 6 ICs encode three to five different odors (Figure 6-figure supplement 1E).
  
  We have included this result as a new Figure 6-figure supplement 1 and mention it on page 8. Of note, we have also performed all of our previous assembly analyses (i.e., Figure 6) using ICA instead of PCA to be consistent throughout the manuscript and allow the reader to compare with the new supplementary figure. This led to a new and enhanced version of Figure 6.
  
  3) Do the authors observe a slow drift in assembly membership as predicted from previous work showing slowly changing odour responses of principal neurons (Schoonover et al., 2021)? This could perhaps be quantified by looking at the expression strengths of assemblies at individual odour presentations or by running the PCA separately on the first and last third of the odour presentations to test whether the same neurons are still 'winners'.
  
  We thank the reviewer for calling our attention to this point. We note, however, that the representational drift observed by Schoonover et al. occurred over several days of recordings, i.e., at a much slower time scale than the single-day recordings we analyzed here (of note, Schoonover et al. observed no drift within the same day [their Fig 2a]). But irrespective of this, we believe that the data at hand do not allow for a confident analysis of possible drifts. This is because each odor was only presented ~12 times; so, further subdividing the data into subsets of only 4 trials would not render a reliable analysis, unfortunately.
  
  4) Does the winner-take-all scenario involve the recruitment of specific sets of FBIs during the activation of the individual odour-selective assemblies? The authors could address this by testing whether the rate of FBIs changes differently with the activation of the extracted assemblies.
  
  Within each recording session, the number of recorded FBIs is very low, on average 3.6 FBIs per recording session. Thus, unfortunately, such an interesting analysis cannot be confidently performed.
  
  5) Given the dependence on local gamma oscillations, one might expect that odour-selective assemblies do not emerge in the TeLC-expressing hemisphere. This could be directly tested in the existing data set.
  
  We are thankful for the comment. We followed the reviewer's suggestion and analyzed odor assemblies in TeLC recordings, comparing the ipsilateral hemisphere (infected) with the contralateral one. Interestingly, we find an increased correlation of assembly weights across odors, suggesting that the formation/segregation of odor-selective assemblies is hindered when the principal cell synapses are abolished. This assembly selectivity reduction co-occurred as the number of losing neurons decreased, and the inhibition of the latter was also reduced. Consequently, decoding accuracy significantly decreased during the 150-250 ms window in the infected TeLC hemisphere compared to the contralateral cortex.
  
  Therefore, we believe these new results support the role of gamma oscillations in segregating cell assemblies and generating a sparse orthogonal odor representation. These findings are now included as new panels of Figure 6 and Figure 7 and discussed on page 8.
  
  AuthorResponse
Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.04.24.489324v2
www.biorxiv.org www.biorxiv.org

New submission 28/11/2022, 13:30:51

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  By studying the effect of Treg depletion in a CD8+ T cell-dependent diabetes model, the group around Ondrej Stepanek described that in the absence of Treg cells, antigen-specific CD8+ OT-I T cells show an activated phenotype and accelerate the development of diabetes in mice. These cells - termed KILR cells - express CD8+ effector and NK cell gene signatures and are identified as CD49d- KLRK1+ CD127+ CD8+ T cells. The authors suggest that the generation of these cells is dependent on TCR stimulation and IL-2 signals, either provided due to the absence of Treg cells or by injection of IL-2 complexed to specific anti-IL-2 mAbs. In vivo, these cells show improved target cell killing properties, while the authors report improved anti-tumor responses of combination treatments with doxorubicin combined with IL-2/JES6 complexes. Finally, the authors identified a similar human subset in publicly available scRNAseq datasets, supporting the translational potential of their findings.
  
  The conclusions are mostly well supported, except for the following two considerations:
  
  We are happy for the positive overall evaluation of our manuscript by both reviewers and we are thankful for their specific insightful comments, which helped us to improve the manuscript.
  
  1) From Fig. 4A and B it is not conclusively shown, that Tregs limit IL-2 necessary for the expansion of OT-I cells and subsequent induction of diabetes. An IL-2 depletion experiment (e.g. with combined injection of the S4B6 and JES6-1 antibodies) would further strengthen this claim. Along these lines, the authors claim "IL-2Rα expression on T cells can be induced by antigen stimulation or by IL-2 itself in a positive feedback loop [20]. Accordingly, downregulation of IL-2Rα in OT-I T cells in the presence of Tregs might be a consequence of the limited availability of IL-2.". The cited reference 20 did observe CD25 upregulation by IL-2 on T cells but the observed effect might only be caused by upregulation of CD25 on Treg cells, which increases the MFI for the whole T cell population. Did the authors observe significant upregulation of CD25 on effector CD4+ and CD8+ T cells in their experiments with IL-2/S4B6 or IL-2/JES6 treatment?
  
  We added another reference to support our claim (Sereti, I., et al., Clin Immunol, 2000. 97(3): p. 266-76.). Along this line, we also observed that addition of IL-2 in vitro leads to IL-2Rα upregulation on CD8+ T cells (shown in Fig. 4C), and that the IL-2Rα level was lower if Tregs were present. We also observed upregulation of IL-2Rα in vivo upon the stimulation of OT-I T cells with OVA and IL-2ic, which is now shown in Fig. S6C of the revised manuscript.
  
  To further explore whether Tregs limit the expansion of OT-I cells and diabetes progression by limiting IL-2, we performed the proposed experiment using a combined injection of S4B6 and JES6-1 anti-IL-2 antibodies. Initially, we were skeptical that we could completely block IL-2 using this approach, for the following reasons. First, IL-2 is produced locally in the spleen and lymph nodes and might not be easily accessible to the antibodies for a complete block. Second, IL-2 has a relatively short turnover and is continuously produced, but the half-life of the injected antibodies is unknown, which calls into question the duration of such a block. Third, it is possible that some IL-2 molecules would be bound by only one of the two antibodies, which would form a hyper-stimulating immune complex instead of neutralizing the IL-2.
  
  Nevertheless, we were curious enough to perform this experiment. We used a condition that, based on our experience, leads to diabetes manifestation in Treg-depleted, but not in Treg-replete, mice (10 k OT-I T cells, OVA + LPS immunization). One additional group of Treg-depleted mice received a single dose of S4B6 and JES6-1 anti-IL-2 antibodies (200 µg of each antibody per mouse). We observed that this IL-2 blockade delayed, but did not prevent, the development of diabetes in most animals (Fig. 1 below).
  
  Overall, we believe that this experiment rather supports our conclusions concerning the importance of IL-2, although the effect is only partial. However, we decided not to include this experiment in the manuscript, because we do not have evidence of how efficient the IL-2 blocking was (see above), which makes the interpretation difficult. Because the reviews and the point-by-point response are public in eLife, we believe that showing the data here is appropriate.
  
  Figure 1. Role of IL-2 blocking on the development of experimental diabetes. Two independent experiments were performed. Statistical significance was calculated using Log-rank (Mantel-Cox) test for survival, and Kruskal-Wallis test for blood glucose (p-value is shown in italics).
  
  2) The anti-tumor efficacy of KILR cells is intriguing but currently, it is unclear if it is indeed mediated by KILR cells. Have KILR cells been identified by flow cytometry in the BCL1 and B16F10 models treated with doxorubicin and IL-2/JES6? Were specific KILR cell depletion studies conducted, e.g. with an anti-KLRK1 depleting antibody? Additional experiments addressing these questions would be desirable to further support the authors' claims.
  
  We are thankful to both reviewers for their similar comments concerning the analysis of CD8+ T cells in the tumor model. Addressing these comments led to very useful data and significantly improved our manuscript.
  
  We performed the analysis of splenic CD8+ T cells in the BCL1 leukemia model (spleen is the major site of the leukemic cells in this model). We observed that KLRK1+ T cells represented almost half of CD8+ T cells in mice treated with DOX+IL-2, which was a much higher frequency than in the control and DOX-only treated mice. Although not all KLRK1+ cells were bona fide KILR cells, the frequencies of KLRK1+ IL-7R+ and KLRK1+ CD49d- cells were also strongly elevated in the DOX+IL-2ic treated mice. Overall, the survival of DOX+IL-2ic treated mice correlated with the frequencies of KILR T cells and KLRK1+ T cells. Moreover, GZMB was almost exclusively expressed by KLRK1+ T cells. We are showing these data in Fig. 7C and Fig. S7B in the revised manuscript.
  
  In the B16 melanoma model, we analyzed CD8+ T cells in the spleens and also in the tumors. We observed a large population of KLRK1+ GZMB+ CD8+ T cells in the spleens of DOX+IL-2ic-treated mice, but not in the untreated or DOX-only treated mice (Fig. 7F). Both KLRK1+ CD49d+ and KLRK1+ CD49d- CD8+ T cells were substantially more frequent in the DOX+IL-2ic-treated mice than in the untreated or DOX-only treated mice (Fig. S7F). In the tumor, the KLRK1+ CD49d- CD8+ T cells were found in large numbers only in the DOX+IL-2ic-treated mice (Fig. 7G). Moreover, these KLRK1+ CD49d- CD8+ T cells expressed high levels of IL-7R and GZMB only in DOX+IL-2ic-treated mice, not in untreated or DOX-only treated mice (Fig. 7H).
  
  We believe that these new data provide evidence that the combination of immunogenic chemotherapy with IL-2 treatment induced KILR cells in the spleens and in the tumors and that this correlates with the better survival.
  
  Because the majority of non-naïve CD8+ T cells (and the vast majority of GZMB+ CD8+ T cells) in the spleens and tumors of the tumor-bearing mice treated with DOX+IL-2ic were KLRK1+, and because we have shown that the protective effect of the DOX+IL-2ic therapy is largely CD8+ T cell-dependent, we did not find it essential to perform the depletion of KLRK1+ T cells. We believe that it is almost inevitable that the depletion of KLRK1+ T cells would lead to increased tumor growth, as it would probably deplete the majority of antigen-specific CD8+ T cells, mimicking the overall CD8+ T cell depletion. Moreover, we do not have this protocol established.
  
  Reviewer #2 (Public Review):
  
  In this study, the authors determine the superior cell killing abilities of KLRK1+ IL7R+ (KILR) CD8+ effector T cells in experimental diabetes and tumor mouse model. They also provide evidence that Tregs suppress the formation of this previously uncharacterized subset of CD8+ effector T cells by limiting IL-2.
  
  Strength and Limitation
  
  This study focuses on the relationship between Tregs and CD8+ T cells. They used different experimental diabetes mouse models to reveal that Tregs suppress the CD8+ effector T cells by limiting IL-2. They also found a unique subset of KLRK1+ IL7R+ (KILR) CD8+ effector T cells with superior cell killing abilities through single-cell sequencing, but killing abilities could be inhibited by Tregs. They also tested their theory in in vivo tumor model. The data, in general, support the conclusions; however, some issues need to be fully addressed, as detailed below.
  
  We are happy for the positive overall evaluation of our manuscript by both reviewers and we are thankful for their specific insightful comments, which helped us to improve the manuscript.
  
  1) This study used the concentration of urine glucose as the standard for diabetes (≥ 1000 mg/dl for two consecutive days). However, multiple reasons may lead to a high level of urine glucose. As a type I diabetes mouse model, authors could use immunohistological analysis of islet to show the proportion of T cells and islet cells in islet, which can display the geographic distribution of immune cells, severity and histology structure of damaged pancreas islet directly. If possible, different subsets of immune cells, especially CD4 vs CD8+ cells should be stained for their location.
  
  We added the histological examination of the pancreas in control, DEREG-, and DEREG+ mice using contrast H&E staining and immuno-fluorescence (Fig. 1D-E in the revised manuscript). We observed that the high glucose and blood levels are preceded by the destruction of the pancreatic islets (morphology and decreased insulin production) as well as by the infiltration of the islets with immune cells including CD4+ and CD8+ T cells.
  
  2) This article shows that KILR effector CD8+ T cells have strong cytotoxic properties. However, they do not describe the potential proliferation ability vs apoptosis of this subset from islets.
  
  We analyzed the proliferation (KI67 expression) and apoptosis (Annexin V, cleaved Caspase 3) in T cells isolated from the pancreas of DEREG- and DEREG+ mice on day 4 after the induction of diabetes using flow cytometry (Figure 2 below). We did not observe any differences between DEREG- and DEREG+ mice or among different subsets of OT-I T cells in the DEREG+ mice. Essentially, all T cells were proliferative (KI67+) and there was a very low percentage of Annexin V or cleaved Caspase 3 positive cells.
  
  Figure 2. Lymphocytes were isolated from the pancreas of DEREG- RIP.OVA and DEREG+ RIP.OVA mice on day 4 after the induction of diabetes, and analyzed using flow cytometry. Two independent experiments were performed. Gated on OT-I T cells. Top: proliferation rate based on Ki-67 staining. Representative histogram and MFI (median is shown). Middle: Apoptosis rate based on Annexin V staining. Representative histogram shows Annexin V staining in three populations of OT-I T cells from DEREG+ mouse (“AE” - CD49d+ KLRK1-, “++” - CD49d+ KLRK1+, KILR - CD49d- KLRK1+), total OT-I T cells from DEREG-, and a positive control: WT CD8+ T cells treated with hydrogen peroxide. Middle right: Percentage of Annexin V+ cells and MFI (median is shown). Bottom: Apoptosis rate based on cleaved Caspase 3 staining. Representative dot plots show cleaved Caspase 3 staining of OT-I T cells from DEREG+, DEREG-, and a positive control: WT CD8+ T cells treated with hydrogen peroxide. Bottom right: percentage of cleaved Caspase 3+ cells (median is shown).
  
  However, we found the question concerning proliferation and apoptosis of KILR cells interesting and worth further investigation. For this reason, we assessed the proliferation, survival, and phenotypic stability of naïve, KILR, and effector T cells by their competitive transfer into CD3ε-/- mice. The phenotype of all three subsets remained stable for 4 days (Fig. 6F), documenting that KILR cells are not just a very transient stage. Moreover, the KILR cells were ~2 fold more abundant than effector cells 3 days after their 1:1 cotransfer into CD3ε-/- mice (Fig. 6G, Fig. 6SE). This was probably caused by their slight advantages in both proliferation and survival (Fig. 6SF-G).
  
  3) Figure 7 shows that the antitumor efficacy of IL-2 depends on CD8+ T cells. But in this part, there is no data to show the change of KLRK1+ IL7R+ CD8+ effector T cells in tumor tissue. Therefore, the article needs to add more data to verify that IL-2 enhances antitumor ability via KLRK1+ IL7R+ CD8+ effector T cells.
  
  We are thankful to both reviewers for their similar comments concerning the analysis of CD8+ T cells in the tumor model. Addressing these comments led to very useful data and significantly improved our manuscript.
  
  We performed the analysis of splenic CD8+ T cells in the BCL1 leukemia model (spleen is the major site of the leukemic cells in this model). We observed that KLRK1+ T cells represented almost half of CD8+ T cells in mice treated with DOX+IL-2, which was a much higher frequency than in the control and DOX-only treated mice. Although not all KLRK1+ cells were bona fide KILR cells, the frequencies of KLRK1+ IL-7R+ and KLRK1+ CD49d- cells were also strongly elevated in the DOX+IL-2ic treated mice. Overall, the survival of DOX+IL-2ic treated mice correlated with the frequencies of KILR T cells and KLRK1+ T cells. Moreover, GZMB was almost exclusively expressed by KLRK1+ T cells. We are showing these data in Fig. 7C and Fig. S7B in the revised manuscript.
  
  In the B16 melanoma model, we analyzed CD8+ T cells in the spleens and also in the tumors. We observed a large population of KLRK1+ GZMB+ CD8+ T cells in the spleens of DOX+IL-2ic-treated mice, but not in the untreated or DOX-only treated mice (Fig. 7F). Both KLRK1+ CD49d+ and KLRK1+ CD49d- CD8+ T cells were substantially more frequent in the DOX+IL-2ic-treated mice than in the untreated or DOX-only treated mice (Fig. S7F). In the tumor, the KLRK1+ CD49d- CD8+ T cells were found in large numbers only in the DOX+IL-2ic-treated mice (Fig. 7G). Moreover, these KLRK1+ CD49d- CD8+ T cells expressed high levels of IL-7R and GZMB only in DOX+IL-2ic-treated mice, not in untreated or DOX-only treated mice (Fig. 7H).
  
  We believe that these new data provide evidence that the combination of immunogenic chemotherapy with IL-2 treatment induced KILR cells in the spleens and in the tumors and that this correlates with the better survival.
  
  4) It is unclear why the authors chose Dox to combine with IL-2/JES6. The authors should provide a more rational introduction to bridge such a combination. Authors should also explain the reason why there is no antitumor effect of IL-2/JES6 treatment alone.
  
  The experiments with OT-I mice showed that the formation of KILR cells required both antigenic stimulation and IL-2 signals. We believe that there is only very weak antigenic stimulation by the tumor itself. For this reason, we combined the treatment with the chemotherapeutic doxorubicin, which is known to induce immunogenic cell death of the tumor cells (e.g., Casares et al. 2005, PMID: 16365148). We believe that doxorubicin induces the death of (some) tumor cells and the release and presentation of their tumor-specific antigens. Without it, the tumors are simply too “cold” to induce a sufficient T-cell response. We emphasized this in the revised version of the manuscript.
  
  Importantly, some of us observed a similar effect of IL-2ic in a combination with check-point blockade therapy (without chemotherapy) in a different tumor model, which documents that the chemotherapy is not essential for this effect (unpublished data).
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2021.11.10.467495v3
www.biorxiv.org www.biorxiv.org

Monkeys exhibit human-like gaze biases in economic decisions

1
1. Public_Reviews 17 Nov 2025
  
  in eLife
  
  Author Response
  
  Reviewer #1 (Public Review):
  
  Point 1: Many of the initial analyses of behavior metrics, for instance predicting reaction times, number of fixations, or fixation duration, use value difference as a regressor. However, given a limited set of values, value differences are highly correlated with the option values themselves, as well as the chosen value. For instance, in this task the only time when there will be a value difference of 4 drops is when the options are 1 and 5 drops, and given the high performance of these monkeys, this means the chosen value will overwhelmingly be 5 drops. Likewise, there are only two combinations that can yield a value difference of 3 (5 vs. 2 and 4 vs 1), and each will have relatively high chosen values. Given that value motivates behavior and attracts attention, it may be that some of the putative effects of choice difficulty are actually driven by value.
  
  To address this question, we have adapted the methods of Balewski and colleagues (Neuron, 2022) to isolate the unique contributions of chosen value and trial difficulty to reaction time and the number of fixations in a given trial (the two behaviors modulated by difficulty in the original paper). This new analysis reveals a double dissociation in which reaction time decreases as a function of chosen value but not difficulty, while the number of fixations in a trial shows the opposite pattern. Our interpretation is that reaction time largely reflects reward anticipation, whereas the number of fixations largely reflects the amount of information required to render a decision (i.e., choice difficulty). See lines 144-167 and Figure 2.
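The confound raised in Point 1 can be made concrete with a small enumeration. The sketch below is illustrative only: it assumes the five reward sizes (1 to 5 drops) are paired without repeats and that an ideal chooser always takes the larger option. Neither assumption is stated in the response, and the variable names are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Assumption: offers are unordered pairs of distinct values from 1 to 5 drops.
values = [1, 2, 3, 4, 5]
pairs = list(combinations(values, 2))  # 10 pairs, each as (low, high)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

differences = [high - low for low, high in pairs]
# Assumption: an ideal chooser always picks the larger option.
chosen_values = [high for low, high in pairs]

# Group the offers by value difference to see how few pairs share a difference.
pairs_by_difference = {}
for (low, high), diff in zip(pairs, differences):
    pairs_by_difference.setdefault(diff, []).append((low, high))

r = pearson(differences, chosen_values)
print(pairs_by_difference)
print(round(r, 3))
```

Under these assumptions a difference of 4 arises only from the pair (1, 5) and a difference of 3 only from (1, 4) and (2, 5), so value difference and chosen value are positively correlated by construction, which is why the two regressors need to be dissociated.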
  
  Point 2: Related to point 1, the study found that duration of first fixations increased with fixated values, and second (middle) fixation durations decreased with fixated value but increased with relative value of the fixated versus other value. Can this effect be more concisely described as an effect of the value of the first fixated option carrying over into behavior during the second fixation?
  
  This is a valid interpretation of the results. To test this directly, we now include an analysis of middle fixation duration as a function of the not-currently-viewed target. Note that the vast majority of middle fixations are the second fixation in the trial, and therefore the value of the unattended target is typically the one that was viewed first. The analysis showed a negative correlation between middle fixation duration and the value of the unattended target, which is consistent with the first fixated value carrying over to the second fixation. See lines 243-246.
  
  Point 3: Given that chosen (and therefore anticipated) values can motivate responses, often measured as faster reaction times or more vigorous motor movements, it seems curious that terminal non-decision times were calculated as a single value for all trials. Shouldn't this vary depending at least on chosen values, and perhaps other variables in the trial?
  
  In all sequential sampling model formulations we are aware of, nondecision time is considered to be fixed across trial types. Examples can be found for perceptual decisions (e.g., Resulaj et al., 2009) and in the “bifurcation point” approach used in the recent value-based decision study by Westbrook et al. (2020).
  
  To further investigate this issue, we asked whether other post-decision processes were sensitive to chosen value in our paradigm. To do so, we measured the interval between the center lever lift and the left or right lever press, corresponding to the time taken to perform the reach movement in each trial (reach latency). We then fit a mixed effects model explaining reach latency as a function of chosen value. While the results showed significantly faster reach latencies with higher chosen values, the effect size was very small, showing on average a ~3ms decrease per drop of juice. In other words, between the highest and lowest levels of chosen value (5 vs. 1), there is only a difference of approximately 12ms. In contrast, the main RT measure used in the study (the interval between target onset and center lever lift) is an order of magnitude more sensitive to chosen value, decreasing ~40ms per drop of juice. These results are shown in Author response image 1.
  
  Author response image 1.
  
  This suggests that post-decision processes (NDT in standard models and the additive stage in the Westbrook paper) vary only minimally as a function of chosen value. We are happy to include this analysis as a supplemental figure upon request.
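The comparison of effect sizes in the response to Point 3 reduces to simple arithmetic, sketched here. The slopes are the approximate values quoted above (about 3 ms and about 40 ms per drop); the variable names are hypothetical and the calculation ignores the uncertainty of the fitted slopes.

```python
# Approximate slopes quoted in the response (ms change per drop of juice).
reach_slope_ms_per_drop = 3    # reach latency: lift to left/right press
rt_slope_ms_per_drop = 40      # main RT: target onset to center lever lift

# Chosen value ranges from 1 to 5 drops, i.e. a span of 4 drops.
value_span_drops = 5 - 1

reach_span_ms = reach_slope_ms_per_drop * value_span_drops
rt_span_ms = rt_slope_ms_per_drop * value_span_drops

print(reach_span_ms)  # total reach-latency change across the value range
print(rt_span_ms)     # total RT change across the value range
print(rt_slope_ms_per_drop / reach_slope_ms_per_drop)  # sensitivity ratio
```

The roughly 13-fold ratio of the two slopes is what the response summarizes as "an order of magnitude".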
  
  Point 4: The paper aims to demonstrate similarities between monkey and human gaze behavior in value-based decisions, but focuses mainly on a series of results from one group of collaborators (Krajbich, Rangel and colleagues). Other labs have shown additional nuance that the present data could potentially speak to. First, Cavanaugh et al. (J Exp Psychol Gen, 2014) found that gaze allocation and value differences between options independently influence drift rates on different choices. Second, gaze can correlate with choice because attention to an option amplifies its value (or enhances the accumulation of value evidence) or because chosen options are attended more after the choice is implicitly determined but not yet registered. Westbrook et al. (Science, 2020) found that these effects can be dissociated, with attention influencing choice early in the trial and choice influencing attention later. The NDTs calculated in the present study allot a consistent time to translating a choice into a motor command, but as noted above don't account for potential influences of choice or value on gaze.
  
  The two-stage model of gaze effects put forth by Westbrook et al. (2020) is consistent with other observations of gaze behavior and choice (e.g., Thomas et al., 2019, Smith et al., 2018, Manohar & Husain, 2013). In this model, gaze effects early in the trial are best described by a multiplicative relationship between gaze and value, whereas gaze effects later in the trial are best described with an additive model term. To test the two-stage hypothesis, Westbrook and colleagues determined a ‘bifurcation point’ for each subject that represented the time at which gaze effects transitioned from multiplicative to additive. In our data, trial durations were typically very short (<1s), making it difficult to divide trials and fit separate models to them. We therefore took a different approach: We reasoned that if gaze effects transition from multiplicative to additive at the end of the trial, then the transition point could be estimated by removing data from the end of each trial and assessing the relative fit of a multiplicative vs. additive model. If the early gaze effects are predominantly multiplicative and late gaze effects are additive, the relative goodness of fit for an additive model should decrease as more data are removed from the end of the trial. To test this idea, we compared the relative model fit of additive vs. multiplicative models in the raw data, and for data in which successively larger epochs were removed from the end of the trial (50, 100, 150, 200, 300, and 400ms). The relative fit was assessed by computing the relative probability that each model accurately reflects the data. In addition, to identify significant differences in goodness of fit, we compared the WAIC values and their standard errors for each model (Supplemental File 3).
  As shown in Figure 4, the relative fit probability for both models is nonzero in the raw data (0 truncation), indicating that neither model provides a definitive best fit, potentially reflecting a mixture of the two processes. However, the relative fit of the additive model decreases sharply as data are removed, reaching zero at 100ms truncation. 100ms is also the point at which multiplicative models provide a significantly better fit, indicated by non-overlapping standard error intervals for the two models (Supplemental File 3). Together, this suggests that the transition between early- and late-stage gaze effects likely occurs approximately 100ms before the RT.
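One common way to turn WAIC values into relative model probabilities is through Akaike-type weights. The response does not say how its relative fit probabilities were computed, so the sketch below is an assumption, not the authors' procedure. The WAIC values and standard errors are invented for illustration, and the function names are hypothetical.

```python
import math

def waic_weights(waic_by_model):
    """Akaike-type weights from WAIC on the deviance scale (lower is better)."""
    best = min(waic_by_model.values())
    relative = {name: math.exp(-0.5 * (waic - best))
                for name, waic in waic_by_model.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

def intervals_overlap(waic_a, se_a, waic_b, se_b):
    """True if the WAIC +/- 1 SE intervals of two models overlap."""
    return (waic_a - se_a) <= (waic_b + se_b) and (waic_b - se_b) <= (waic_a + se_a)

# Hypothetical values, NOT taken from the study or its Supplemental File 3.
example = {"multiplicative": 1000.0, "additive": 1002.0}
weights = waic_weights(example)
print(weights)

# With hypothetical SEs of 0.5, the two intervals do not overlap.
print(intervals_overlap(1000.0, 0.5, 1002.0, 0.5))
```

Repeating this calculation at each truncation value would give a curve of the kind described for Figure 4, with the additive model's weight shrinking as more late-trial data are removed.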
  
  To minimize the influence of post-decision gaze effects, the main results use data truncated by 100ms. However, because 100ms is only an estimate, we repeated the main analyses over truncation values between 0 and 400ms, reported in Figure 6 - figure supplement 1 & Figure 7 - figure supplement 1. These show significant gaze duration biases and final gaze biases in data truncated by up to 200ms.
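The truncation step itself can be sketched as follows. This is a minimal illustration under assumed conventions: each fixation is a `(target, start_ms, end_ms)` tuple with times measured from target onset, and the function names and example trial are hypothetical, not the authors' code.

```python
def truncate_fixations(fixations, rt_ms, truncation_ms):
    """Remove gaze data from the final `truncation_ms` before the RT.

    fixations: list of (target, start_ms, end_ms), times relative to target onset.
    """
    cutoff = rt_ms - truncation_ms
    kept = []
    for target, start, end in fixations:
        if start >= cutoff:
            continue  # fixation lies entirely inside the removed epoch
        kept.append((target, start, min(end, cutoff)))  # clip at the cutoff
    return kept

def gaze_time_by_target(fixations):
    """Total fixation time per target, in ms."""
    totals = {}
    for target, start, end in fixations:
        totals[target] = totals.get(target, 0) + (end - start)
    return totals

# Hypothetical trial: RT of 700 ms with two fixations.
trial = [("left", 0, 300), ("right", 300, 700)]

truncated_100 = truncate_fixations(trial, rt_ms=700, truncation_ms=100)
truncated_400 = truncate_fixations(trial, rt_ms=700, truncation_ms=400)
print(truncated_100, gaze_time_by_target(truncated_100))
print(truncated_400, gaze_time_by_target(truncated_400))
```

In this example a 100 ms truncation shortens the final fixation, whereas a 400 ms truncation removes it entirely, which changes which fixation counts as the last one. This is one reason why repeating the analyses over a range of truncation values is informative.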
  
  Reviewer #2 (Public Review):
  
  Recommendation 1: The only real issue that I see with the paper is fairly obvious: the authors find that the last fixations are longer than the rest, which is inconsistent with a lot of the human work. They argue that this is due to the reaching required in this task, and they take a somewhat ad-hoc approach to trying to correct for it. Specifically, they take the difference between final and non-final, second fixations, and then choose the 95th percentile of that distribution as the amount of time to subtract from the end of each trial. This amounts to about 200 ms being removed from the end of each trial. There are several issues with this approach. First, it assumes that final and non-final fixations should be the same length, when we know from other work that final fixations are generally shorter. Second, it seems to assume that this 200ms is "the latency between the time that the subject commits to the movement and the time that the movement is actually detected by the experimenter". However, there is a mismatch between that explanation and the details of the task. Those last 200ms are before the monkey releases the middle lever, not before the monkey makes a left/right choice. When the monkey releases the middle lever, the stimuli disappear and they then have 500ms to press the left or right lever. But, the reaction time and fixation data terminate when the monkey releases the middle lever. Consequently, I don't find it very likely that the monkeys are using those last 200ms to plan their hand movement after releasing the middle lever.
  
  Thanks for the opportunity to clarify these points. There are three related issues:
  
  First, with regards to fixation durations, in the updated Figure 3 we now show durations as a function of both the absolute order in the trial (first, second, third, fourth, etc.) and the relative order (final/nonfinal). We find that durations decrease as a function of absolute order in the trial, an effect also seen in humans (see Manohar & Husain, 2013). At the same time, while holding absolute order constant, final fixations are longer than non-final fixations. To explain the discrepancy with human final fixation durations, we note that monkeys make many fewer fixations per trial (~2.5) than humans do (~3.7, computed from publicly available data from Krajbich et al., 2010.) This means that compared to humans, monkeys’ final fixations occur earlier in the trial (e.g., second or third), and are therefore comparatively longer in duration. Note that studies with humans have not independently measured fixation durations by absolute and relative order, and therefore would not have detected the potential interaction between the two effects.
  
  Second, the comment suggests that the final 200ms before lever lift is not spent planning the left/right movement, given that the monkeys have time after the lever lift in which to execute the movement (400 or 500ms, depending on the monkey). The presumption appears to be that 400/500ms should be sufficient to plan a left/right reach. However, we think that these two suggestions are unlikely, and that our original interpretation is the most plausible. First, the 400/500ms deadline between lift and left/right press was set to encourage the monkeys to complete the reach as fast as possible, to minimize deliberations or changes of mind after lifting the lever. More specifically, these deadlines were designed so that on ~0.5% of trials, the monkeys actually fail to complete the reach within the deadline and fail to obtain a reward. This manipulation was effective at motivating fast reaches, as the average reach latency (time between lift and press) was 165 ± 20ms (SEM) for Monkey K, and 290 ± 100ms (SEM) for Monkey C.
  
  Therefore, given the time pressure imposed by the task, it is very unlikely that significant reach planning occurs after the lever lift. In addition to these empirical considerations, the idea that the final moments before the RT are used for motor planning is a standard assumption in many theoretical models of choice (including sequential sampling models, see Ratcliff & McKoon 2008, for review), and is also well-supported by studies of motor control and motor system neurophysiology. Based on these, we think the assumption of some form of terminal NDT is warranted.
  
  Third, we have changed our method for estimating the NDT interval. In brief, we sweep through a range of NDT truncation values (0-400ms) and identify the smallest interval (100ms) that minimizes the contribution of “additive” gaze effects, which are thought to reflect late-stage, post-decision gaze processes. See the response to Point 4 for Reviewer 1 above, Figure 4 and lines 267-325 in the main text. In addition, we report all of the major study results over a range of truncation values between 0 and 400ms.
  
  AuthorResponse

Tags

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.02.24.481847v1

Public_Reviews

Annotations: 10,000

Joined: March 17, 2021

Top tags 25