  1. Jun 2024
    1. eLife assessment

      This potentially useful study introduces an orthogonal approach for detecting RNA modification, without chemical modification of RNA, which often results in RNA degradation and therefore loss of RNA molecules. The approach might be of particular interest for sites where modifications are rare. However, the false positive and false negative rates are currently unclear, leaving the evidence for broad applicability of the method incomplete.

    2. Reviewer #1 (Public Review):

      The detection sensitivity and accuracy are unclear.

      In this manuscript, Zhou et al. describe a deaminase and reader protein-assisted RNA m5C sequencing method. The general strategy is similar to DART-seq for m6A sequencing. The difference is that in DART-seq, m6A sites are always followed by a C, which can be deaminated by the fused APOBEC1 to provide high-resolution mapping of m6A sites, whereas no such conserved motif exists for m5C sites, so the detection resolution is much lower. In addition, the authors used two known m5C-binding proteins, ALYREF and YBX1, to guide the fused deaminases, but it is not clear whether these two proteins can bind most m5C sites and compete with other m5C-binding proteins.

      It is well known that two highly modified m5C sites exist in 28S rRNA and that many m5C sites exist in tRNA. The authors should first validate their method by detecting these known m5C sites and evaluating the possible false positives in rRNA and tRNA. In mRNA, the overlap between technical replicates is not clear. In Figures 4A and 4C, they detected more than 10K m5C sites, and most of them did not overlap with sites uncovered by other methods. These numbers are much larger than expected, and most of them are possibly false positives. Moreover, the detection sensitivity and accuracy are unclear, since the method is neither single-base resolution nor quantitative. There are no experiments showing that the detected m5C sites respond to writer proteins such as NSUN2 and NSUN6, and the motifs of these writer proteins were not determined.
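
      The replicate and cross-method overlap asked for here could be summarized along the following lines. This is only a minimal sketch: the site keys, the example calls, and the function name `site_overlap` are hypothetical, and because the method is not single-base resolution, a real comparison would probably need window-based matching rather than the exact matching used below.

```python
def site_overlap(sites_a, sites_b):
    """Return (number of shared sites, Jaccard index) for two site collections."""
    a, b = set(sites_a), set(sites_b)
    shared = a & b
    union = a | b
    jaccard = len(shared) / len(union) if union else 0.0
    return len(shared), jaccard


# Hypothetical site calls keyed as (chromosome, position, strand).
rep1 = {("chr1", 100, "+"), ("chr1", 250, "+"), ("chr2", 40, "-")}
rep2 = {("chr1", 100, "+"), ("chr2", 40, "-"), ("chr3", 7, "+")}

n_shared, jaccard = site_overlap(rep1, rep2)
print(n_shared, jaccard)  # 2 shared sites out of 4 distinct sites
```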

    3. Reviewer #2 (Public Review):

      The fledgling field of epitranscriptomics has encountered various technical roadblocks with implications for the validity of early epitranscriptomics mapping data. As a prime example, the low specificity of (supposedly) modification-specific antibodies for the enrichment of modified RNAs, has been ignored for quite some time and is only now recognized for its dismal reproducibility (between different labs), which necessitates the development of alternative methods for modification detection. Furthermore, early attempts to map individual epitranscriptomes using sequencing-based techniques are largely characterized by the deliberate avoidance of orthogonal approaches aimed at confirming the existence of RNA modifications that have been originally identified.

      Improved methodology, the inclusion of various controls, and better mapping algorithms as well as the application of robust statistics for the identification of false-positive RNA modification calls have allowed revisiting original (seminal) publications whose early mapping data allowed making hyperbolic claims about the number, localization and importance of RNA modifications, especially in mRNA. Besides the existence of m6A in mRNA, the detectable incidence of RNA modifications in mRNAs has drastically dropped.

      As for m5C, the subject of the manuscript submitted by Zhou et al., its identification in mRNA goes back to Squires et al., 2012 reporting on >10,000 sites in mRNA of a human cancer cell line, followed by intermittent findings reporting on pretty much every number between 0 and >100,000 m5C sites in different human cell-derived mRNA transcriptomes. The reason for such discrepancy is most likely of a technical nature. Importantly, all studies reporting on actual transcript numbers that were m5C-modified relied on RNA bisulfite sequencing, an NGS-based method that can discriminate between methylated and non-methylated Cs after chemical deamination of C but not m5C. RNA bisulfite sequencing has a notoriously high background due to deamination artifacts, which occur largely due to incomplete denaturation of double-stranded (denaturing-resistant) regions of RNA molecules. Furthermore, m5C sites in mRNAs have now been mapped to regions that have not only sequence identity but also structural features of tRNAs. Various studies revealed that the highly conserved m5C RNA methyltransferases NSUN2 and NSUN6 accept not only tRNAs but also other RNAs (including mRNAs) as methylation substrates, which in combination account for most of the RNA bisulfite-mapped m5C sites in human mRNA transcriptomes. Is m5C in mRNA only a result of the star activity of tRNA or rRNA modification enzymes, or is its low stoichiometry biologically relevant?

      In light of the shortcomings of existing tools to robustly determine m5C in transcriptomes, other methods, like DRAM-seq, that allow the mapping of m5C independently of ex situ RNA treatment with chemicals are needed to arrive at a more solid "ground state", from which it will be possible to state and test various hypotheses as to the biological function of m5C, especially in lowly abundant RNAs such as mRNA.

      Importantly, the identification of >10,000 m5C-containing sites through DRAM-Seq increases the number of potential m5C marks in human cancer cells from a couple of hundred (after rigorous post-hoc analysis of RNA bisulfite sequencing data) by orders of magnitude. This raises the question of whether the application of these editing tools results in editing artefacts that overstate the number of actual m5C sites in the human cancer transcriptome.

      Comments:

      (1) The use of two m5C reader proteins is likely a reason for the high number of edits introduced by the DRAM-Seq method. Both ALYREF and YBX1 are ubiquitous proteins with multiple roles in RNA metabolism including splicing and mRNA export. It is reasonable to assume that both ALYREF and YBX1 bind to many mRNAs that do not contain m5C.

      To substantiate the authors' claim that ALYREF or YBX1 binds m5C-modified RNAs to an extent that would allow distinguishing its binding to non-modified RNAs from binding to m5C-modified RNAs, it would be recommended to provide data on the affinity of these, supposedly proven, m5C readers to non-modified versus m5C-modified RNAs. To do so, this reviewer suggests performing experiments as described in Slama et al., 2020 (doi: 10.1016/j.ymeth.2018.10.020). However, using dot blots, as in so many published studies, to show modification-specific antibody or protein binding is insufficient as an argument because no antibody, nor protein, encounters nanograms to micrograms of a specific RNA identity in a cell. This issue remains a major caveat in all studies using so-called RNA modification reader proteins as bait for detecting RNA modifications in epitranscriptomics research. It becomes a pertinent problem if used as a platform for base editing similar to the work presented in this manuscript.

      (2) Since the authors use a system that results in transient overexpression of base editor fusion proteins, they might introduce advantageous binding of these proteins to RNAs. It is unclear which promoter is driving construct expression, but it stands to reason that part of the data is based on artifacts caused by overexpression. Could the authors attempt testing whether manipulating expression levels of these fusion proteins results in different editing levels at the same RNA substrate?

      (3) Using sodium arsenite treatment of cells as a means to change the m5C status of transcripts through the downregulation of the two major m5C writer proteins NSUN2 and NSUN6 is problematic and the conclusions from these experiments are not warranted. Sodium arsenite is a chemical that poisons every protein containing thiol groups. Not only do NSUN proteins contain cysteines but also the base editor fusion proteins. Arsenite will inactivate these proteins, hence the editing frequency will drop, as observed in the experiments shown in Figure 5, which the authors explain with fewer m5C sites to be detected by the fusion proteins.

      (4) The authors should move high-confidence editing site data contained in Supplementary Tables 2 and 3 into one of the main Figures to substantiate what is discussed in Figure 4A. However, the data needs to be visualized in another way than an Excel format. Furthermore, Supplementary Table 2 does not contain a description of the columns, while Supplementary Table 3 contains a single row with letters and numbers.

      (5) The authors state that "plotting the distribution of DRAM-seq editing sites in mRNA segments (5'UTR, CDS, and 3'UTR) highlighted a significant enrichment near the initiation codon (Figure 3F).", which is not true when this reviewer looks at Figure 3F.

      (6) The authors state that "In contrast, cells expressing the deaminase exhibited a distinct distribution pattern of editing sites, characterized by a prevalence throughout the 5'UTR.", which is not true when this reviewer looks at Figure 3F.

      (7) The authors claim in the final conclusion: "In summary, we developed a novel deaminase and reader protein assisted RNA m5C methylation approach...", which is not what the method entails. The authors deaminate As or Us close to m5C sites based on the binding of a deaminase-containing protein.

      (8) The authors claim that "The data supporting the findings of this study are available within the article and its Supplementary Information." However, no single accession number for the deposited sequencing data can be found in the text or the supplementary data. Without the primary data, none of the claims can be verified.

    1. eLife assessment

      In this manuscript, the authors describe a new AlphaFold2 pipeline called PAbFold that can represent a useful tool for identifying linear antibody epitopes (B-cell epitopes) for different antigens. This information can be used in the selection of different reagents in competitive ELISA assays, which can save time and reduce costs. Several questions, however, remain and the study is currently incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, "PAbFold: Linear Antibody Epitope Prediction using AlphaFold2", the authors generate a Python wrapper for the screening of antibody-peptide interactions using AlphaFold, and test the performance of AlphaFold on 3 antibody-peptide complexes. In line with previous observations regarding the ability of AlphaFold to predict antibody structures and antigen binding, the results are mixed. While the authors are able to use AlphaFold to identify and experimentally validate a previously characterized broad binding epitope with impressive precision, they are unable to consistently identify the proper binding registers for their control [Myc-tag, HA-tag] peptides. Further, it appears that the reproducibility and generality of these results are low, with new versions of AlphaFold negatively impacting the predictive power. However, if this reproducibility issue is solved, and the test set is greatly increased, this manuscript could contribute strongly towards our ability to predict antibody-antigen interactions.

      Strengths:

      Due to the high significance, but difficulty, of the prediction of antibody-antigen interactions, any attempts to break down these predictions into more tractable problems should be applauded. The authors' approach of focusing on linear epitopes (peptides) is clever, reducing some of the complexities inherent to antibody binding. Further, the ability of AlphaFold to narrow down a previously broadly identified experimental epitope is impressive. The subsequent experimental validation of this more precisely identified epitope makes for a nice data point in the assessment of AlphaFold's ability to predict antibody-antigen interactions.

      Weaknesses:

      Without a larger set of test antibody-peptide interactions, it is unclear whether or not AlphaFold can precisely identify the binding register of a given antibody to a given peptide antigen. Even within the small test set of 3 antibody-peptide complexes, performance is variable and, for unclear reasons, depends upon the scFv scaffold used. Lastly, the apparent poor reproducibility is concerning, and it is not clear why the results should rely so strongly on which multiple sequence alignment (MSA) version is used, when neither the antibody CDR loops nor the peptide are likely to strongly rely on these MSAs for contact prediction.

      Major Point-by-Point Comments:

      (1) The central concern for this manuscript is the apparent lack of reproducibility. The way the authors discuss the issue (lines 523-554) it sounds as though they are unable to reproduce their initial results (which are reported in the main text), even when previous versions of AlphaFold2 are used. If this is the case, it does not seem that AlphaFold can be a reliable tool for predicting antibody-peptide interactions.

      (2) Aside from the fundamental issue of reproducibility, the number of validating tests is insufficient to assess the ability of AlphaFold to predict antibody-peptide interactions. Given the authors' use of AlphaFold to identify antibody binding to a linear epitope within a whole protein (in the mBG17:SARS-CoV-2 nucleocapsid protein interaction), they should expand their test set well beyond Myc- and HA-tags using antibody-antigen interactions from existing large structural databases.

      (3) As discussed in lines 358-361, the authors are unsure if their primary control tests (antibody binding to Myc-tag and HA-tag) are included in the training data. Lines 324-330 suggest that even if the peptides are not included in the AlphaFold training data because they contain fewer than 10 amino acids, the antibody structures may very well be included, with an obvious "void" that would be best filled by a peptide. The authors must confirm that their tests are not included in the AlphaFold training data, or re-run the analysis with these templates removed.

      (4) The ability of AlphaFold to refine the linear epitope of antibody mBG17 is quite impressive and robust to the reproducibility issues the authors have run into. However, Figure 4 seems to suggest that the target epitope adopts an alpha-helical structure. This may be why the score is so high and the prediction is so robust. It would be very useful to see a per-residue plot of the predicted structure alongside the per-residue pLDDT plots. This would help to see if the high-confidence pLDDT is coming more from confidence in the docking of the peptide or confidence in the structure of the peptide.

      (5) Related to the above comment, pLDDT is insufficient as a metric for assessing antibody-antigen interactions. There is a chance (as is nicely shown in Figure S3C) that AlphaFold can be confident and wrong. Here we see two orange-yellow dots (fairly high confidence) that place the peptide center of mass (COM) far from the true binding region. While running the recommended larger validation above, the authors should also include a peptide RMSD or COM distance metric, to show that the peptide identity is confident, and the peptide placement is roughly correct. These predictions are not nearly as valuable if AlphaFold is getting the right answer for the wrong reasons (i.e. high pLDDT but peptide binding to a non-CDR loop region). Eventual users of the software will likely want to make point mutations or perturb the binding regions identified by the structural predictions (as the authors do in Figure 4).
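
      The two placement metrics suggested in this comment can be sketched as follows. This is a minimal illustration, not part of the software under review: it assumes the predicted and reference complexes have already been superposed on the antibody so that peptide coordinates share one frame, that atoms are matched one-to-one, and that an unweighted (geometric) center is an acceptable stand-in for the center of mass. The function names and coordinates are hypothetical.

```python
import numpy as np


def center_of_mass(coords, masses=None):
    """Mass-weighted center of an (N, 3) coordinate array (unweighted if masses is None)."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        return coords.mean(axis=0)
    masses = np.asarray(masses, dtype=float)
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()


def com_distance(pred, ref):
    """Distance between the centers of the predicted and reference peptides."""
    return float(np.linalg.norm(center_of_mass(pred) - center_of_mass(ref)))


def rmsd(pred, ref):
    """Root-mean-square deviation over matched atoms, without any re-fitting."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if pred.shape != ref.shape:
        raise ValueError("pred and ref must contain the same matched atoms")
    return float(np.sqrt(((pred - ref) ** 2).sum(axis=1).mean()))


# Hypothetical C-alpha trace (in angstroms) and a prediction shifted rigidly by 5 angstroms.
ref_peptide = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
pred_peptide = ref_peptide + np.array([0.0, 3.0, 4.0])

print(com_distance(pred_peptide, ref_peptide), rmsd(pred_peptide, ref_peptide))
```

      Reporting either value next to pLDDT would flag the "confident but misplaced" cases, since a high pLDDT with a large COM distance or RMSD indicates a peptide docked away from the true binding region.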

    3. Reviewer #2 (Public Review):

      Summary:

      The authors showed the applicability and usefulness of a new AlphaFold2 pipeline called PAbFold, which can predict linear antibody epitopes (B-cell epitopes) that can be helpful for the selection of reagents to be applied in competitive ELISA assays.

      Strengths:

      The authors showed that the pipeline correctly identifies the binding epitope for three different antibody-antigen systems (Myc, HA, and SARS-CoV-2 nucleocapsid protein). The design of scFvs from the Fabs of the three antibodies to speed up the analysis is extremely interesting.

      Weaknesses:

      The article justifies its findings well, and no major weaknesses are present. However, it would be useful for a broader audience to show in detail how pLDDT was calculated for both the Simple-Max approach (per-residue pLDDT) and the Consensus analysis (average pLDDT for each peptide), with the associated equations.
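
      To make the request concrete, the sketch below shows one plausible reading of the two aggregation schemes. It is an assumption for illustration only and has not been checked against the authors' definitions: the antigen is taken to be scanned with overlapping peptide windows, each window returns a per-residue pLDDT list, the per-residue variant keeps the maximum pLDDT any window assigns to an antigen position, and the per-peptide variant averages pLDDT within each window. All names and numbers are hypothetical.

```python
import numpy as np


def per_peptide_mean(window_plddt):
    """Average pLDDT of each peptide window, keyed by window start index."""
    return {start: float(np.mean(values)) for start, values in window_plddt.items()}


def per_residue_max(window_plddt, antigen_length):
    """Highest pLDDT assigned to each antigen residue by any window covering it."""
    best = np.full(antigen_length, np.nan)
    for start, values in window_plddt.items():
        for offset, value in enumerate(values):
            index = start + offset
            if np.isnan(best[index]) or value > best[index]:
                best[index] = value
    return best


# Hypothetical pLDDT values for three overlapping 3-residue windows on a 5-residue antigen.
windows = {0: [40, 50, 60], 1: [70, 80, 90], 2: [30, 30, 30]}

means = per_peptide_mean(windows)
residue_max = per_residue_max(windows, antigen_length=5)
best_window = max(means, key=means.get)
print(means, residue_max, best_window)
```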

    1. Author response:

      Reviewer #1 (Public Review):

      How does the brain respond to the input of different complexity, and does this ability to respond change with age?

      The study by Lalwani et al. tried to address this question by pulling together a number of neuroscientific methodologies (fMRI, MRS, drug challenge, perceptual psychophysics). A major strength of the paper is that it is backed up by robust sample sizes and careful choices in data analysis, translating into a more rigorous understanding of the sensory input as well as the neural metric. The authors apply a novel analysis method developed in human resting-state MRI data on task-based data in the visual cortex, specifically investigating the variability of neural response to stimuli of different levels of visual complexity. A subset of participants took part in a placebo-controlled drug challenge and functional neuroimaging. This experiment showed that increases in GABA have differential effects on participants with different baseline levels of GABA in the visual cortex, possibly modulating the perceptual performance in those with lower baseline GABA. A caveat is that no single cohort has taken part in all study elements, i.e., visual discrimination with drug challenge and neuroimaging. Hence the causal relationship is limited to the neural variability measure and does not extend to visual performance. Nevertheless, the consistent use of visual stimuli across approaches permits an exceptionally high level of comparability across modalities (the computational, behavioural, and fMRI analyses all draw from the same set of images). The conclusions that can be made on such a coherent data set are strong.

      The community will benefit from the technical advances, esp. the calculation of BOLD variability, in the study when described appropriately, encouraging further linkage between complementary measures of brain activity, neurochemistry, and signal processing.

      Thank you for your review. We agree that a future study with a single cohort would be an excellent follow-up.

      Reviewer #2 (Public Review):

      Lalwani et al. measured BOLD variability during the viewing of houses and faces in groups of young and old healthy adults and measured ventrovisual cortex GABA+ at rest using MR spectroscopy. The influence of the GABA-A agonist lorazepam on BOLD variability during task performance was also assessed, and baseline GABA+ levels were considered as a mediating variable. The relationship of local GABA to changes in variability in BOLD signal, and how both properties change with age, are important and interesting questions. The authors feature the following results: 1) younger adults exhibit greater task-dependent changes in BOLD variability and higher resting visual cortical GABA+ content than older adults, 2) greater BOLD variability scales with GABA+ levels across the combined age groups, 3) administration of a GABA-A agonist increased condition differences in BOLD variability in individuals with lower baseline GABA+ levels but decreased condition differences in BOLD variability in individuals with higher baseline GABA+ levels, and 4) resting GABA+ levels correlated with a measure of visual sensory ability derived from a set of discrimination tasks that incorporated a variety of stimulus categories.

      Strengths of the study design include the pharmacological manipulation for gauging a possible causal relationship between GABA activity and task-related adjustments in BOLD variability. The consideration of baseline GABA+ levels for interpreting this relationship is particularly valuable. The assessment of feature-richness across multiple visual stimulus categories provided support for the use of a single visual sensory factor score to examine individual differences in behavioral performance relative to age, GABA, and BOLD measurements.

      Weaknesses of the study include the absence of an interpretation of the physiological mechanisms that contribute to variability in BOLD signal, particularly for the chosen contrast that compared viewing houses with viewing faces.

      Whether any of the observed effects can be explained by patterns in mean BOLD signal, independent of variability would be useful to know.

      One of the first pre-processing steps of computing SDBOLD involves subtracting the block-mean from the fMRI signal for each task-condition. Therefore, patterns observed in BOLD signal variability are not driven by the mean-BOLD differences. Moreover, as noted above, to further confirm this, we performed additional mean-BOLD based analysis (see Supplementary Materials Pg 3). Results suggest that ∆⃗ MEANBOLD is actually larger in older adults vs. younger adults (∆⃗ SDBOLD exhibited the opposite pattern), but more importantly ∆⃗ MEANBOLD is not correlated with GABA or with visual performance. This is also consistent with prior research (Garrett et al., 2011, 2013, 2015, 2020) that found MEANBOLD to be relatively insensitive to behavioral performance.
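
      The block-mean subtraction described in this response can be illustrated with a minimal sketch. It assumes a single voxel, blocks supplied as plain lists of signal values, and a population standard deviation (`ddof=0`); the function name `sd_bold` and the example values are hypothetical, and any other normalization steps in the actual pipeline are omitted.

```python
import numpy as np


def sd_bold(blocks):
    """SD of one voxel's signal after removing each block's mean and concatenating blocks."""
    demeaned = [np.asarray(block, dtype=float) - np.mean(block) for block in blocks]
    return float(np.std(np.concatenate(demeaned)))


# Hypothetical blocks for one voxel; block means differ widely on purpose.
house_blocks = [[10, 12, 14], [20, 22, 24]]
face_blocks = [[10, 11, 12], [30, 31, 32]]

sd_house = sd_bold(house_blocks)
sd_face = sd_bold(face_blocks)
delta_sd = sd_house - sd_face  # house-minus-face modulation of variability
print(sd_house, sd_face, delta_sd)
```

      Because each block is centered before concatenation, the large difference between the two face-block means (11 versus 31) contributes nothing to the face estimate, which is the sense in which the variability measure is not driven by mean-BOLD differences.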

      The positive correlation between resting GABA+ levels and the task-condition effect on BOLD variability reaches significance at the total group level, when the young and old groups are combined, but not separately within each group. This correlation may be explained by age-related differences since younger adults had higher values than older adults for both types of measurements. This is not to suggest that the relationship is not meaningful or interesting, but that it may be conceptualized differently than presented.

      Thank you for this important point. The relationship between GABA and ∆⃗ SDBOLD shown in Figure 3 is also significant within each age-group separately (Line 386-388). The model used both age-group and GABA as predictors of ∆⃗ SDBOLD and found that both had a significant effect, while the Age-group x GABA interaction was not significant. The effect of age on ∆⃗ SDBOLD therefore does not completely explain the observed relationship between GABA and ∆⃗ SDBOLD because this latter effect is significant in both age-groups individually and in the whole sample even when variance explained by age is accounted for. The revision clarifies this important point (Ln 488-492). Thanks for raising it.
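
      The structure of the model described here (age group, GABA, and their interaction as predictors) can be sketched with ordinary least squares. This is an illustration only: the data are synthetic and noise-free, the variable names are hypothetical, GABA is mean-centered as a convenience, and significance testing, which the response relies on, is not shown.

```python
import numpy as np


def fit_interaction_model(gaba, is_older, outcome):
    """Least-squares fit of outcome ~ GABA + age group + GABA x age group."""
    gaba = np.asarray(gaba, dtype=float)
    is_older = np.asarray(is_older, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    gaba_c = gaba - gaba.mean()  # centre GABA so the age term is read at the mean GABA level
    design = np.column_stack([np.ones_like(gaba_c), gaba_c, is_older, gaba_c * is_older])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return dict(zip(["intercept", "gaba", "age_group", "gaba_x_age"], coef))


# Synthetic example: both main effects present, no interaction, no noise.
gaba = np.array([1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0])
is_older = np.array([1, 1, 1, 1, 0, 0, 0, 0])
outcome = 1.0 + 0.5 * (gaba - gaba.mean()) - 0.3 * is_older

fit = fit_interaction_model(gaba, is_older, outcome)
print(fit)
```

      In this layout, a GABA coefficient that remains non-zero with the age-group term in the model, together with an interaction coefficient near zero, corresponds to the pattern the response describes.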

      Two separate dosages of lorazepam were used across individuals, but the details of why and how this was done are not provided, and the possible effects of the dose are not considered.

      Good point. We utilized two dosages to maximize our chances of finding a dosage that had a robust effect. The specific dosage was randomly assigned across participants and the dosage did not differ across age-groups or baseline GABA levels. We also controlled for the drug-dosage when examining the role of drug-related shift in ∆⃗ SDBOLD. We have clarified these points in the revision and highlighted the analysis that found no effect of dosage on drug-related shift in ∆⃗ SDBOLD (Line 407-418).

      The observation of greater BOLD variability during the viewing of houses than faces may be specific to these two behavioral conditions, and lingering questions about whether these effects generalize to other types of visual stimuli, or other non-visual behaviors, in old and young adults, limit the generalizability of the immediate findings.

      We agree that examining the factors that influence BOLD variability is an important topic for future research. In particular, although it is increasingly well known that variability modulation itself can occur in a host of different tasks and research contexts across the lifespan (see Garrett et al., 2013; Waschke et al., 2021), to address the question of whether variability modulation occurs directly in response to stimulus complexity in general, it will be important for future work to examine a range of stimulus categories beyond faces and houses. Doing so is indeed an active area of research in Dr. Garrett’s group, where visual stimuli from many different categories are examined (e.g., for a recent approach, see Waschke et al., 2023, bioRxiv). Regardless, only face and house stimuli were available in the current dataset. We therefore exploited the finding that BOLD variability tends to be larger for house stimuli than for face stimuli (in line with the HMAX model output) to demonstrate that the degree to which a given individual modulates BOLD variability in response to stimulus category is related to their age, to GABA levels, and to behavioral performance.

      The observed age-related differences in patterns of BOLD activity and ventrovisual cortex GABA+ levels along with the investigation of GABA-agonist effects in the context of baseline GABA+ levels are particularly valuable to the field, and merit follow-up. Assessing background neurochemical levels is generally important for understanding individualized drug effects. Therefore, the data are particularly useful in the fields of aging, neuroimaging, and vision research.

      Thank you, we agree!

      Reviewer #3 (Public Review):

      The role of neural variability in various cognitive functions is one of the focal contentions in systems and computational neuroscience. In this study, the authors used a large-scale cohort dataset to investigate the relationship between neural variability measured by fMRI and several factors, including stimulus complexity, GABA levels, aging, and visual performance. Such investigations are valuable because neural variability, as an important topic, is by far mostly studied within animal neurophysiology. There is little evidence in humans. Also, the conclusions are built on a large-scale cohort dataset that includes multi-modal data. Such a dataset per se is a big advantage. Pharmacological manipulations and MRS acquisitions are rare in this line of research. Overall, I think this study is well-designed, and the manuscript reads well. I listed my comments below and hope my suggestions can further improve the paper.

      Strength:

      1) The study design is astonishingly rich. The authors used task-based fMRI, MRS technique, population contrast (aging vs. control), and psychophysical testing. I appreciate the motivation and efforts for collecting such a rich dataset.

      2) The MRS part is good. I am not an expert in MRS so cannot comment on MRS data acquisition and analyses. But I think linking neural variability to GABA in humans is in general a good idea. There has been a long interest in the cause of neural variability, and inhibition of local neural circuits has been hypothesized as one of the key factors.

      3) The pharmacological manipulation is particularly interesting as it provides at least evidence for the causal effects of GABA and deltaSDBOLD. I think this is quite novel.

      Weakness:

      1) I am concerned about the definition of neural variability. In electrophysiological studies, neural variability can be defined as Poisson-like spike count variability. In the fMRI world, however, there is no consensus on what neural variability is. There are at least three definitions. One is the variability (e.g., std) of the voxel response time series, as used here and in the resting fMRI world. The second is to regress out the stimulus-evoked activation and only calculate the std of residuals (e.g., background variability). The third is to calculate the trial-by-trial variability of beta estimates from general linear modeling. The relations between these three types of variability and other factors currently remain unclear. The links between neuronal variability and voxel variability also remain unclear. I don't think the computational principles discovered in neuronal variability also apply to voxel responses. I hope the authors can acknowledge and discuss these differences.
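
      The three definitions listed in this comment can be contrasted on a toy signal. The sketch below is illustrative only: it assumes a single voxel, a known stimulus template, and simple least-squares fits standing in for a full general linear model; the function names and values are hypothetical.

```python
import numpy as np


def sd_timeseries(signal):
    """Definition 1: SD of the raw voxel time series."""
    return float(np.std(np.asarray(signal, dtype=float)))


def sd_residual(signal, regressor):
    """Definition 2: SD of what remains after regressing out the stimulus-evoked response."""
    signal = np.asarray(signal, dtype=float)
    design = np.column_stack([np.ones(len(signal)), np.asarray(regressor, dtype=float)])
    beta, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return float(np.std(signal - design @ beta))


def sd_trial_betas(trials, template):
    """Definition 3: SD across trials of the per-trial response amplitude (beta)."""
    template = np.asarray(template, dtype=float)
    design = np.column_stack([np.ones(len(template)), template])
    betas = [np.linalg.lstsq(design, np.asarray(trial, dtype=float), rcond=None)[0][1]
             for trial in trials]
    return float(np.std(betas))


# Two hypothetical trials that follow the same template with amplitudes 1 and 3.
template = [0.0, 1.0, 1.0, 0.0]
trials = [[0.0, 1.0, 1.0, 0.0], [0.0, 3.0, 3.0, 0.0]]
signal = np.concatenate(trials)
regressor = np.tile(template, 2)

print(sd_timeseries(signal), sd_residual(signal, regressor), sd_trial_betas(trials, template))
```

      On this toy signal the three quantities take different values (about 1.22, 0.71, and 1.0), which illustrates that they capture different sources of variance rather than one interchangeable measure.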

      These are very important points, thank you for raising them. Although we agree that the majority of the single cell electrophysiology world indeed seems to prefer Poisson-like spiking variability as an easy and tractable estimate, it is certainly not the only variability approach in that field (e.g., entropy; see our most recent work in humans where spiking entropy outperforms simple spike counts to predict memory performance; Waschke et al., 2023, bioRxiv). In LFP, EEG/MEG and fMRI, there is indeed no singular consensus on what variability “is”, and in our opinion, that is a good thing. We have reported at length in past work about entire families of measures of signal variability, from simple variance, to power, to entropy, and beyond (see Table 1 in Waschke et al, 2021, Neuron). In principle, these measures are quite complementary, obviating the need to establish any single-measure consensus per se. Rather than viewing the three measures of neural variability that the reviewer mentioned as competing definitions, we prefer to view them as different sources of variance. For example, from each of the three sources of variance the reviewer suggests, any number of variability measures could be computed.

      The current study focuses on using the standard deviation of concatenated blocked time series separately for face and house viewing conditions (this is the same estimation approach used in our very earliest studies on signal variability; Garrett et al., 2010, JNeurosci). In those early studies, and nearly every one thereafter (see Waschke et al., 2021, Neuron), there is no ostensible link between SDBOLD (as we normally compute it) and average BOLD from either multivariate or GLM models; as such, we do not find any clear difference in SDBOLD results in past work whether or not average “evoked” responses are removed. This is perhaps also why removing ERPs from EEG time series rarely influences estimates of variability in our work (e.g., Kloosterman et al., 2020, eLife).

      The third definition the reviewer notes refers to variability of beta estimates over trials. Our most recent work has done exactly this (e.g., Skowron et al., 2023, bioRxiv), calculating the SD even over single time point-wise beta estimates so that we may better control the extraction of time points prior to variability estimation. Although direct comparisons have not yet been published by us, variability over single TR beta estimates and variability over the time series without beta estimation are very highly correlated in our work (in the .80 range; e.g., Kloosterman et al., in prep).

      Re: the reviewer’s point that “It also remains unclear the links between neuronal variability and voxel variability. I don’t think the computational principles discovered in neuronal variability also apply to voxel responses. I hope the authors can acknowledge their differences and discuss their differences.” If we understand correctly, the reviewer maybe asking about within-person links between single-cell neuronal variability (to allow Poisson-like spiking variability) and voxel variability in fMRI? No such study has been conducted to date to our knowledge (such data almost don’t exist). Or rather, perhaps the reviewer is noting a more general point regarding the “computational principles” of variability in these different domains? If that is true, then a few points are worth noting. First, there is absolutely no expectation of Poisson distributions in continuous brain imaging-based time series (LFP, E/MEG, fMRI). To our knowledge, such distributions (which have equivalent means and variances, allowing e.g., Fano factors to be estimated) are mathematically possible in spiking because of the binary nature of spikes; when mean rates rise, so too do variances given that activity pushes away from the floor (of no activity). In continuous time signals, there is no effective “zero”, so a mathematical floor does not exist outright. This is likely why means and variances are not well coupled in continuous time signals (see Garrett et al., 2013, NBR; Waschke et al., 2021, Neuron); anything can happen. Regardless, convergence is beginning to be revealed between the effects noted from spiking and continuous time estimates of variability. For example, we show that spiking variability can show a similar, behaviourally relevant coupling to the complexity of visual input (Waschke et al., 2023, bioRxiv) as seen in the current study and in past work (e.g., Garrett et al., 2020, NeuroImage). 
Whether such convergence reflects common computational principles of variability remains to be seen in future work, despite known associations between single cell recordings and BOLD overall (e.g., Logothetis and colleagues, 2001, 2002, 2004, 2008).
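      The statement that Poisson counts have equal mean and variance (a Fano factor of 1) can be checked numerically. The sketch below is illustrative only: the rates, sample size, and hand-written sampler are assumptions chosen for the example and are not taken from any of the cited studies.

```python
import math
import random
import statistics

def poisson_sample(rate, rng):
    """Draw one Poisson count with Knuth's multiplication method (small rates only)."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def fano_factor(counts):
    """Variance divided by mean of a set of spike counts."""
    return statistics.pvariance(counts) / statistics.fmean(counts)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
fano_by_rate = {}
for rate in (2.0, 5.0, 10.0):  # hypothetical mean spike counts per window
    counts = [poisson_sample(rate, rng) for _ in range(20000)]
    fano_by_rate[rate] = fano_factor(counts)
```

      Across rates the Fano factor stays near 1, so the variance rises with the mean. A continuous signal such as BOLD has no comparable floor at zero, so no such coupling is implied.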

      Given the intricacies of these arguments, we don’t currently include this discussion in the revised text. However, we would be happy to include aspects of this content in the main paper if the reviewer sees fit.

      2) If I understand it correctly, the positive relationship between stimulus complexity and voxel variability has been found in the author's previous work. Thus, the claims in the abstract in lines 14-15, and section 1 in results are exaggerated. The results simply replicate the findings in the previous work. This should be clearly stated.

      Good point. Since this finding was a replication and an extension, we reported these results mostly in the supplementary materials. The stimulus set used for the current study differs from that of Garrett et al. (2020), and therefore a replication is important. Moreover, we have extended these findings across young and older adults (previous work was based on older adults alone). We have modified the text to clarify which findings are replications and which are extensions or novel to the current study (Lines 14, 345 and 467). Thanks for the suggestion.

      3) It is difficult for me to comprehend the U-shaped account of baseline GABA and shift in deltaSDBOLD. If deltaSDBOLD per se is good, as evidenced by the positive relationship between brainscore and visual sensitivity as shown in Fig. 5b and the discussion in lines 432-440, why the brain should decrease deltaSDBOLD ?? or did I miss something? I understand that "average is good, outliers are bad". But a more detailed theory is needed to account for such effects.

      When GABA levels are increased beyond optimal levels, neuronal firing rates are reduced, effectively dampening neural activity and limiting dynamic range; in the present study, this resulted in reduced ∆⃗ SDBOLD. Thus, the observed drug-related decrease in ∆⃗ SDBOLD was most present in participants with already high levels of GABA. We have now added an explanation for the expected inverted-U (Line 523-546). The following figure tries to explain this with a hypothetical curve diagram and how different parts of Fig 4 might be linked to different points in such a curve.

      Author response image 1.

      Line 523-546 – “We found in humans that the drug-related shift in ∆⃗ SDBOLD could be either positive or negative, while being negatively related to baseline GABA. Thus, boosting GABA activity with drug during visual processing in participants with lower baseline GABA levels and low levels of ∆⃗ SDBOLD resulted in an increase in ∆⃗ SDBOLD (i.e., a positive change in ∆⃗ SDBOLD on drug compared to off drug). However, in participants with higher baseline GABA levels and higher ∆⃗ SDBOLD, when GABA was increased presumably beyond optimal levels, participants experienced no change or even a decrease in ∆⃗ SDBOLD on drug. These findings thus provide the first evidence in humans for an inverted-U account of how GABA may link to variability modulation.

      Boosting low GABA levels in older adults helps increase ∆⃗ SDBOLD, but why does increasing GABA levels lead to reduced ∆⃗ SDBOLD in others? One explanation is that higher than optimal levels of inhibition in a neuronal system can lead to dampening of the entire network. The reduced neuronal firing decreases the number of states the network can visit and decreases the dynamic range of the network. Indeed, some anesthetics work by increasing GABA activity (for example, propofol, a general anesthetic, modulates activity at GABA-A receptors), and GABA is known for its sedative properties. Previous research showed that propofol leads to a steeper power spectral slope (a measure of the “construction” of signal variance) in monkey ECoG recordings (Gao et al., 2017). Networks function optimally only when dynamics are stabilized by sufficient inhibition. Thus, there is an inverted-U relationship between ∆⃗ SDBOLD and GABA that is similar to that observed with other neurotransmitters.”

      4) Related to the 3rd question, can you show the relationship between the shift of deltaSDBOLD (i.e., the delta of deltaSDBOLD) and visual performance?

      We did not have data on visual performance from the same participants that completed the drug-based part of the study (Subset1 vs 3; see Figure 1); therefore, we unfortunately cannot directly investigate the relationship between the drug-related shift of ∆⃗ SDBOLD and visual performance. We have now highlighted this as a limitation of the current study (Line 589-592), where we state: “One limitation of the current study is that participants who received the drug-manipulation did not complete the visual discrimination task, thus we could not directly assess how the drug-related change in ∆⃗ SDBOLD impacted visual performance.”

      5) Are the dataset openly available?? I didn't find the data availability statement.

      An Excel sheet with all the processed data needed to reproduce the figures and results has been included in the source data submitted with the manuscript, together with a data dictionary explaining the columns. The raw MRI, MRS and fMRI data used in the current manuscript were collected as part of a larger (MIND) study and will eventually be made publicly available on completion of the study (around 2027). Before that time, the raw data can be obtained for research purposes upon reasonable request. Processing code will be made available on GitHub.

    2. eLife assessment

      This important study combines multiple complementary neuroscientific methods to understand the neural response to visual stimulus complexity in the human brain across the lifespan. Lalwani et al. provide solid evidence, drawing from appropriate and validated methodology. A weakness is that key information about methodological details and controls is still outstanding, as is a discussion on how generalizable the findings are. With these elements strengthened, the study would be of broad interest to neuroscientists and biologists interested in aging and sensory processing.

    3. Reviewer #1 (Public Review):

      How does the brain respond to the input of different complexity, and does this ability to respond change with age?

      The study by Lalwani et al. tried to address this question by pulling together a number of neuroscientific methodologies (fMRI, MRS, drug challenge, perceptual psychophysics). A major strength of the paper is that it is backed up by robust sample sizes and careful choices in data analysis, translating into a more rigorous understanding of the sensory input as well as the neural metric. The authors apply a novel analysis method developed in human resting-state MRI data on task-based data in the visual cortex, specifically investigating the variability of neural response to stimuli of different levels of visual complexity. A subset of participants took part in a placebo-controlled drug challenge and functional neuroimaging. This experiment showed that increases in GABA have differential effects on participants with different baseline levels of GABA in the visual cortex, possibly modulating the perceptual performance in those with lower baseline GABA. A caveat is that no single cohort has taken part in all study elements, i.e., visual discrimination with drug challenge and neuroimaging. Hence the causal relationship is limited to the neural variability measure and does not extend to visual performance. Nevertheless, the consistent use of visual stimuli across approaches permits an exceptionally high level of comparability across modalities (the computational, behavioural, and fMRI analyses all draw on the same set of images). The conclusions that can be made on such a coherent data set are strong.

      The community will benefit from the technical advances, esp. the calculation of BOLD variability, in the study when described appropriately, encouraging further linkage between complementary measures of brain activity, neurochemistry, and signal processing.

    4. Reviewer #2 (Public Review):

      Lalwani et al. measured BOLD variability during the viewing of houses and faces in groups of young and old healthy adults and measured ventrovisual cortex GABA+ at rest using MR spectroscopy. The influence of the GABA-A agonist lorazepam on BOLD variability during task performance was also assessed, and baseline GABA+ levels were considered as a mediating variable. The relationship of local GABA to changes in variability in BOLD signal, and how both properties change with age, are important and interesting questions. The authors feature the following results: 1) younger adults exhibit greater task-dependent changes in BOLD variability and higher resting visual cortical GABA+ content than older adults, 2) greater BOLD variability scales with GABA+ levels across the combined age groups, 3) administration of a GABA-A agonist increased condition differences in BOLD variability in individuals with lower baseline GABA+ levels but decreased condition differences in BOLD variability in individuals with higher baseline GABA+ levels, and 4) resting GABA+ levels correlated with a measure of visual sensory ability derived from a set of discrimination tasks that incorporated a variety of stimulus categories.

      Strengths of the study design include the pharmacological manipulation for gauging a possible causal relationship between GABA activity and task-related adjustments in BOLD variability. The consideration of baseline GABA+ levels for interpreting this relationship is particularly valuable. The assessment of feature-richness across multiple visual stimulus categories provided support for the use of a single visual sensory factor score to examine individual differences in behavioral performance relative to age, GABA, and BOLD measurements. Weaknesses of the study include the absence of an interpretation of the physiological mechanisms that contribute to variability in BOLD signal, particularly for the chosen contrast that compared viewing houses with viewing faces. Whether any of the observed effects can be explained by patterns in mean BOLD signal, independent of variability would be useful to know. The positive correlation between resting GABA+ levels and the task-condition effect on BOLD variability reaches significance at the total group level, when the young and old groups are combined, but not separately within each group. This correlation may be explained by age-related differences since younger adults had higher values than older adults for both types of measurements. This is not to suggest that the relationship is not meaningful or interesting, but that it may be conceptualized differently than presented. Two separate dosages of lorazepam were used across individuals, but the details of why and how this was done are not provided, and the possible effects of the dose are not considered. The observation of greater BOLD variability during the viewing of houses than faces may be specific to these two behavioral conditions, and lingering questions about whether these effects generalize to other types of visual stimuli, or other non-visual behaviors, in old and young adults, limit the generalizability of the immediate findings.

      The observed age-related differences in patterns of BOLD activity and ventrovisual cortex GABA+ levels along with the investigation of GABA-agonist effects in the context of baseline GABA+ levels are particularly valuable to the field, and merit follow-up. Assessing background neurochemical levels is generally important for understanding individualized drug effects. Therefore, the data are particularly useful in the fields of aging, neuroimaging, and vision research.

    5. Reviewer #3 (Public Review):

      The role of neural variability in various cognitive functions is one of the focal contentions in systems and computational neuroscience. In this study, the authors used a large-scale cohort dataset to investigate the relationship between neural variability measured by fMRI and several factors, including stimulus complexity, GABA levels, aging, and visual performance. Such investigations are valuable because neural variability, as an important topic, is by far mostly studied within animal neurophysiology. There is little evidence in humans. Also, the conclusions are built on a large-scale cohort dataset that includes multimodal data. Such a dataset per se is a big advantage. Pharmacological manipulations and MRS acquisitions are rare in this line of research. Overall, I think this study is well-designed, and the manuscript reads well. I listed my comments below and hope my suggestions can further improve the paper.

      Strength:
      (1) The study design is astonishingly rich. The authors used task-based fMRI, MRS technique, population contrast (aging vs. control), and psychophysical testing. I appreciate the motivation and efforts for collecting such a rich dataset.
      (2) The MRS part is good. I am not an expert in MRS so cannot comment on MRS data acquisition and analyses. But I think linking neural variability to GABA in humans is in general a good idea. There has been a long interest in the cause of neural variability, and inhibition of local neural circuits has been hypothesized as one of the key factors.
      (3) The pharmacological manipulation is particularly interesting as it provides at least evidence for the causal effects of GABA and deltaSDBOLD. I think this is quite novel.

      Weakness:
      (1) I am concerned about the definition of neural variability. In electrophysiological studies, neural variability can be defined as Poisson-like spike count variability. In the fMRI world, however, there is no consensus on what neural variability is. There are at least three definitions. One is the variability (e.g., std) of the voxel response time series as used here and in the resting fMRI world. The second is to regress out the stimulus-evoked activation and only calculate the std of residuals (e.g., background variability). The third is to calculate variability of trial-by-trial variability of beta estimates of general linear modeling. It currently remains unclear the relations between these three types of variability with other factors. It also remains unclear the links between neuronal variability and voxel variability. I don't think the computational principles discovered in neuronal variability also apply to voxel responses. I hope the authors can acknowledge their differences and discuss their differences.
      (2) If I understand it correctly, the positive relationship between stimulus complexity and voxel variability has been found in the author's previous work. Thus, the claims in the abstract in lines 14-15, and section 1 in results are exaggerated. The results simply replicate the findings in the previous work. This should be clearly stated.
      (3) It is difficult for me to comprehend the U-shaped account of baseline GABA and shift in deltaSDBOLD. If deltaSDBOLD per se is good, as evidenced by the positive relationship between brainscore and visual sensitivity as shown in Fig. 5b and the discussion in lines 432-440, why the brain should decrease deltaSDBOLD ?? or did I miss something? I understand that "average is good, outliers are bad". But a more detailed theory is needed to account for such effects.
      (4) Related to the 3rd question, can you show the relationship between the shift of deltaSDBOLD (i.e., the delta of deltaSDBOLD) and visual performance?
      (5) Are the dataset openly available ?? I didn't find the data availability statement.

    1. Author response:

      Reviewer #1 (Public Review):

      Reviewer #1, comment #1: The study is thorough and systematic, and in comparing three well-separated hypotheses about the mechanism leading from grid cells to hexasymmetry it takes a neutral stand above the fray which is to be particularly appreciated. Further, alternative models are considered for the most important additional factor, the type of trajectory taken by the agent whose neural activity is being recorded. Different sets of values, including both "ideal" and "realistic" ones, are considered for the parameters most relevant to each hypothesis. Each of the three hypotheses is found to be viable under some conditions, and less so in others. Having thus given a fair chance to each hypothesis, nevertheless, the study reaches the clear conclusion that the first one, based on conjunctive grid-by-head-direction cells, is much more plausible overall; the hypothesis based on firing rate adaptation has intermediate but rather weak plausibility; and the one based on clustering of cells with similar spatial phases in practice would not really work. I find this conclusion convincing, and the procedure to reach it, a fair comparison, to be the major strength of the study.

      Response: Thanks for your positive assessment of our manuscript.

      Reviewer #1, comment #2: What I find less convincing is the implicit a priori discarding of a fourth hypothesis, that is, that the hexasymmetry is unrelated to the presence of grid cells. Full disclosure: we have tried unsuccessfully to detect hexasymmetry in the EEG signal from vowel space and did not find any (Kaya, Soltanipour and Treves, 2020), so I may be ranting off my disappointment, here. I feel, however, that this fourth hypothesis should be at least aired, for a number of reasons. One is that a hexasymmetry signal has been reported also from several other cortical areas, beyond entorhinal cortex (Constantinescu et al, 2016); true, also grid cells in rodents have been reported in other cortical areas as well (Long and Zhang, 2021; Long et al, bioRxiv, 2021), but the exact phenomenology remains to be confirmed.

      Response: Thank you for the suggestion to add the hypothesis that the neural hexasymmetry observed in previous fMRI and intracranial EEG studies may be unrelated to grid cells. Following your suggestion, we have now mentioned at the end of the fourth paragraph of the Introduction that “the conjunctive grid by head-direction cell hypothesis does not necessarily depend on an alignment between the preferred head directions with the grid axes”. Furthermore, at the end of section “Potential mechanisms underlying hexadirectional population signals in the entorhinal cortex” (in the Discussion) we write: “However, none of the three hypotheses described here may be true and another mechanism may explain macroscopic grid-like representations. This includes the possibility that neural hexasymmetry is completely unrelated to grid-cell activity, previously summarized as the ‘independence hypothesis’ (Kunz et al., 2019). For example, a population of head-direction cells whose preferred head directions occur at offsets of 60 degrees from each other could result in neural hexasymmetry in the absence of grid cells. The conjunctive grid by head-direction cell hypothesis thus also works without grid cells, which may explain why grid-like representations have been observed (using fMRI) in regions outside the entorhinal cortex, where rodent studies have not yet identified grid cells (Doeller et al., 2010; Constantinescu et al., 2016). In that case, however, another mechanism would be needed that could explain why the preferred head directions of different head-direction cells occur at multiples of 60 degrees. Attractor-network structures may be involved in such a mechanism, but this remains speculative at the current stage.” We now also mention the results from Long and Zhang (second paragraph of the Introduction): “Surprisingly, grid cells have also been observed in the primary somatosensory cortex in foraging rats (Long and Zhang, 2021).”

      Regarding your EEG study, we have added a reference to it in the manuscript and state that it is an example for a study that did not find evidence for neural hexasymmetry (end of first paragraph of the Discussion): “We note though that some studies did not find evidence for neural hexasymmetry. For example, a surface EEG study with participants “navigating” through an abstract vowel space did not observe hexasymmetry in the EEG signal as a function of the participants’ movement direction through vowel space (Kaya et al., 2020). Another fMRI study did not find evidence for grid-like representations in the ventromedial prefrontal cortex while participants performed value-based decision making (Lee et al., 2021). This raises the question whether the detection of macroscopic grid-like representations is limited to some recording techniques (e.g., fMRI and iEEG but not surface EEG) and to what extent they are present in different tasks.”

      Reviewer #1, comment #3: Second, as the authors note, the conjunctive mechanism is based on the tight coupling of a narrow head direction selectivity to one of the grid axes. They compare "ideal" with "Doeller" parameters, but to me the "Doeller" ones appear rather narrower than commonly observed and, crucially, they are applied to all cells in the simulations, whereas in reality only a proportion of cells in mEC are reported to be grid cells, only a proportion of them to be conjunctive, and only some of these to be narrowly conjunctive. Further, Gerlei et al (2020) find that conjunctive grid cells may have each of their fields modulated by different head directions, a truly surprising phenomenon that, if extensive, seems to me to cast doubts on the relation between mass activity hexasymmetry and single grid cells.

      Response: We have revised the manuscript in several ways to address the different aspects of this comment.

      Firstly, we agree with the reviewer that our “Doeller” parameter for the tuning width is narrower than commonly observed. We have therefore changed the concentration parameter κ_c in the ‘realistic’ case from 10 rad⁻² (corresponding to a tuning width of 18°) to 4 rad⁻² (corresponding to a tuning width of 29°). We chose this value by referring to Supplementary Figure 3 of Doeller et al. (2010). In their figure, the tuning curves usually cover between one sixth and one third of a circle. Since stronger head-direction tuning contributes the most to the resulting hexasymmetry, we chose a value of κ_c = 4 for the tuning parameter, which corresponds to a tuning width (= half width) of 29° (full width of roughly one sixth of a circle). Regarding the coupling of the preferred head directions to the grid axes, the specific value of the jitter σc = 3 degrees that quantifies the coupling of the head-direction preference to the grid axes was extracted from the 95% confidence interval given in the third row of the Table in Supplementary Figure 5b of Doeller et al. (2010). We now better explain the origin of these values in our new Methods section “Parameter estimation” and provide an overview of all parameter values in Table 1.
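      The two quoted pairs of concentration and tuning width are consistent with the large-κ (Gaussian) approximation of a von Mises tuning curve, in which the width is 1/√κ radians. The sketch below checks this arithmetic. The formula is an assumption inferred from the quoted numbers, not a definition confirmed by the response, and the function name is hypothetical.

```python
import math

def tuning_width_deg(kappa):
    """Approximate tuning width in degrees, assuming width = 1/sqrt(kappa) radians."""
    return math.degrees(1.0 / math.sqrt(kappa))

# Concentration values quoted in the response, in rad^-2.
widths = {kappa: round(tuning_width_deg(kappa)) for kappa in (10, 4)}
# widths == {10: 18, 4: 29}
```

      Under this reading, lowering κ_c from 10 to 4 widens the tuning curve by a factor of √(10/4) ≈ 1.6.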

      Furthermore, in response to your comment, we have revised Figure 2E to show neural hexasymmetries for a larger range of values of the jitter (σc from 0 to 30 degrees), going way beyond the values that Doeller et al. suggested. We have also added a new supplementary figure (Figure 2 – figure supplement 1) where we further extend the range of tuning widths (parameter κ_c) to 60 degrees. This provides the reader with a comprehensive understanding of what parameter values are needed to reach a particular hexasymmetry.

      Regarding your comments on the prevalence of conjunctive grid by head-direction cells, we have revised the manuscript to make it explicit that the actual percentage of conjunctive cells with the necessary properties may be low in the entorhinal cortex (first paragraph of section “A note on our choice of the values of model parameters” of the Discussion): “Empirical studies in rodents found a wide range of tuning widths among grid cells ranging from broad to narrow (Doeller et al., 2010; Sargolini et al., 2006). The percentage of conjunctive cells in the entorhinal cortex with a sufficiently narrow tuning may thus be low. Such distributions (with a proportionally small number of narrowly tuned conjunctive cells) lead to low values in the absolute hexasymmetry. The neural hexasymmetry in this case would be driven by the subset of cells with sufficiently narrow tuning widths. If this causes the neural hexasymmetry to drop below noise levels, the statistical evaluation of this hypothesis would change.” In addition, in Figure 5, we have applied the coupling between preferred head directions and grid axes to only one third of all grid cells (parameter pc = ⅓ in Table 1), following the values reported by Boccara et al. (2010) and Sargolini et al. (2006). To strengthen the link between Figure 5 and Figure 2, we now state the hexasymmetry when using pc = ⅓ along with a ‘realistic’ tuning width and jitter for head-direction modulated grid cells in Figure 2H. Additionally, we performed new simulations where we observed a linear relationship (above the noise floor) between the proportion of conjunctive cells and the hexasymmetry. This should help the reader understand the effect of a reduced percentage of conjunctive cells on the absolute hexasymmetry values. We have added these results as a new supplementary figure (Figure 2 – figure supplement 2).

      Finally, regarding your comment on the findings by Gerlei et al. (2020), we now reference this study in our manuscript and discuss the possible implications (second paragraph of section “A note on our choice of the values of model parameters” of the Discussion): “Additionally, while we assumed that all conjunctive grid cells maintain the same preferred head direction between different firing fields, conjunctive grid cells have also been shown to exhibit different preferred head directions in different firing fields (Gerlei et al., 2020). This could lead to hexadirectional modulation if the different preferred head directions are offset by 60° from each other, but will not give rise to hexadirectional modulation if the preferred head directions are randomly distributed. To the best of our knowledge, the distribution of preferred head directions was not quantified by Gerlei et al. (2020), thus this remains an open question.”

      Reviewer #1, comment #4: Finally, a variant of the fourth hypothesis is that the hexasymmetry might be produced by a clustering of head direction preferences across head direction cells similar to that hypothesized in the first hypothesis, but without such cells having to fire in grid patterns. If head direction selectivity is so clustered, who needs the grids? This would explain why hexasymmetry is ubiquitous, and could easily be explored computationally by, in fact, a simplification of the models considered in this study.

      Response: We fully agree with you. We now explain this possibility in the Introduction where we introduce the conjunctive grid by head-direction cell hypothesis (fourth paragraph of the Introduction) and return to it in the Discussion (section “Potential mechanisms underlying hexadirectional population signals in the entorhinal cortex”). There, we now also explain that in such a case another mechanism would be needed to ensure that the preferred head directions of head-direction cells exhibit six-fold rotational symmetry.

      Reviewer #2 (Public Review):

      Reviewer #2, comment #1: Grid cells - originally discovered in single-cell recordings from the rodent entorhinal cortex, and subsequently identified in single-cell recordings from the human brain - are believed to contribute to a range of cognitive functions including spatial navigation, long-term memory function, and inferential reasoning. Following a landmark study by Doeller et al. (Nature, 2010), a plethora of human neuroimaging studies have hypothesised that grid cell population activity might also be reflected in the six-fold (or 'hexadirectional') modulation of the BOLD signal (following the six-fold rotational symmetry exhibited by individual grid cell firing patterns), or in the amplitude of oscillatory activity recorded using MEG or intracranial EEG. The mechanism by which these network-level dynamics might arise from the firing patterns of individual grid cells remains unclear, however.

      In this study, Khalid and colleagues use a combination of computational modelling and mathematical analysis to evaluate three competing hypotheses that describe how the hexadirectional modulation of population firing rates (taken as a simple proxy for the BOLD, MEG, or iEEG signal) might arise from the firing patterns of individual grid cells. They demonstrate that all three mechanisms could account for these network-level dynamics if a specific set of conditions relating to the agent's movement trajectory and the underlying properties of grid cell firing patterns are satisfied.

      The computational modelling and mathematical analyses presented here are rigorous, clearly motivated, and intuitively described. In addition, these results are important both for the interpretation of hexadirectional modulation in existing data sets and for the design of future experiments and analyses that aim to probe grid cell population activity. As such, this study is likely to have a significant impact on the field by providing a firmer theoretical basis for the interpretation of neuroimaging data. To my mind, the only weakness is the relatively limited focus on the known properties of grid cells in rodent entorhinal cortex, and the network level activity that these firing patterns might be expected to produce under each hypothesis. Strengthening the link with existing neurobiology would further enhance the importance of these results for those hoping to assay grid cell firing patterns in recordings of ensemble-level neural activity.

      Response: Thank you very much for reviewing our manuscript and your positive assessment. Following your comments, we have revised the manuscript to more closely link our simulations to known properties of grid cells in the rodent entorhinal cortex.

      Reviewer #3 (Public Review):

      Reviewer #3, comment #1: This is an interesting and carefully carried out theoretical analysis of potential explanations for hexadirectional modulation of neural population activity that has been reported in the human entorhinal cortex and some other cortical regions. The previously reported hexadirectional modulation is of considerable interest as it has been proposed to be a proxy for the activation of grid cell networks. However, the extent to which this proposal is consistent with the known firing properties of grids hasn't received the attention it perhaps deserves. By comparing the predictions of three different models this study imposes constraints on possible mechanisms and generates predictions that can be tested through future experimentation.

      Overall, while the conclusions of the study are convincing, I think the usefulness to the field would be increased if null hypotheses were more carefully considered and if the authors' new metric for hexadirectional modulation (H) could be directly contrasted with previously used metrics. For example, if the effect sizes for hexadirectional modulation in the previous fMRI and EEG data could be more directly compared with those of the models here, then this could help in establishing the extent to which the experimental hexadirectional modulation stands out from path hexasymmetry and how close it comes to the striking modulation observed with the conjunctive models. It could also be helpful to consider scenarios in which hexadirectional modulation is independent of grid firing, for example perhaps with appropriate coordination of head direction cell firing.

      Response: Thanks for reviewing our manuscript and for the overall positive assessment. The new Methods section “Implementation of previously used metrics” starts with the following sentences: “We applied three previously used metrics to our framework: the Generalized Linear Model (GLM) method by Doeller et al. 2010; the GLM method with binning by Kunz et al. 2015; and the circular-linear correlation method by Maidenbaum et al. 2018.” We have created a new supplementary figure (Figure 5 – figure supplement 4) in which we compare the results from these other methods to the results of our new method. Overall, the results are highly similar, indicating that all these methods are equally suited to test for a hexadirectional modulation of neural activity.

      In section “Implementation of previously used metrics” we then explain: “In brief, in the GLM method (e.g. used in Doeller et al., 2010), the hexasymmetry is found in two steps: the orientation of the hexadirectional modulation is first estimated on the first half of the data by using the regressors β1 cos(6θ_t) and β2 sin(6θ_t) on the time-discrete fMRI activity (Equation 9), with θt being the movement direction of the subject in time step t. The amplitude of the signal is then estimated on the second half of the data using a single regressor aligned with the estimated orientation. The hexasymmetry is then evaluated from this amplitude.

      The GLM method with binning (e.g. used in Kunz et al., 2015) uses the same procedure as the GLM method for estimating the grid orientation in the first half of the data, but the amplitude is estimated differently on the second half by a regressor that has a value of 1 if θt is aligned with a peak of the hexadirectional modulation (alignment is evaluated with the modulo operator) and a value of -1 if θt is misaligned. The hexasymmetry is then calculated from the amplitude in the same way as in the GLM method.

      The circular-linear correlation method (e.g. used in Maidenbaum et al., 2018) is similar to the GLM method in that it uses the regressors β1 cos(6θ_t) and β2 sin(6θ_t) on the time-discrete mean activity, but instead of using β1 and β2 to estimate the orientation of the hexadirectional modulation, the beta values are directly used to estimate the hexasymmetry.”
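      As an illustration only, the two-step GLM procedure described above can be sketched in Python. The sketch assumes the common form of this analysis, which the text here does not spell out in full: a sine regressor paired with the cosine one, an orientation estimate φ = atan2(β2, β1)/6, and a second-half regressor cos(6(θ_t − φ)). The function names and the synthetic data are hypothetical, and the fitted amplitude stands in for the hexasymmetry.

```python
import math
import random


def fit_two_regressors(x1, x2, y):
    """Ordinary least squares for y ~ b0 + b1*x1 + b2*x2; returns (b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centering every variable absorbs the intercept b0.
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    r = [v - my for v in y]
    saa = sum(u * u for u in a)
    sbb = sum(u * u for u in b)
    sab = sum(u * v for u, v in zip(a, b))
    say = sum(u * v for u, v in zip(a, r))
    sby = sum(u * v for u, v in zip(b, r))
    det = saa * sbb - sab * sab
    # Cramer's rule on the 2x2 normal equations.
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det


def glm_split_half(theta, activity):
    """Assumed split-half scheme: orientation from half 1, amplitude from half 2."""
    half = len(theta) // 2
    cos6 = [math.cos(6 * t) for t in theta[:half]]
    sin6 = [math.sin(6 * t) for t in theta[:half]]
    b1, b2 = fit_two_regressors(cos6, sin6, activity[:half])
    phi = math.atan2(b2, b1) / 6.0  # assumed orientation estimate
    # Second half: a single cosine regressor aligned with phi (assumption).
    x = [math.cos(6 * (t - phi)) for t in theta[half:]]
    y = activity[half:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    return phi, sxy / sxx


# Synthetic example (hypothetical data): sixfold modulation at 0.2 rad, amplitude 0.5.
rng = random.Random(0)
theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(2000)]
activity = [1.0 + 0.5 * math.cos(6 * (t - 0.2)) + rng.gauss(0.0, 0.1) for t in theta]
phi_hat, amp_hat = glm_split_half(theta, activity)
```

      In this toy setting the recovered orientation and amplitude should sit close to the values used to generate the data. If the orientation estimated on the first half did not match the second half, the fitted amplitude would shrink or change sign, which is consistent with the remark in the response that these metrics can take negative values.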

      For each of the three previously used metrics and our new method, we estimated the resulting hexasymmetry (new Figure 5 – figure supplement 4 in the manuscript). In the Methods section “Implementation of previously used metrics” we then continue with our explanations: “Regarding the statistical evaluation, each method evaluates the size of the neural hexasymmetry differently. Specifically, the new method developed in our manuscript compares the neural hexasymmetry to path hexasymmetry to test whether neural hexasymmetry is significantly above path hexasymmetry. For the two generalized linear model (GLM) methods, we compare the hexasymmetry to zero (using the Mann-Whitney U test) to establish significance. Hexasymmetry values can be negative in these approaches, allowing the statistical comparison against 0. Negative values occur when the estimated grid orientation from the first data half does not match the grid orientation from the second data half. Regarding the statistical evaluation of the circular-linear correlation method, we calculated a z-score by comparing each empirical observation of the hexasymmetry to hexasymmetries from a set of surrogate distributions (as in Maidenbaum et al., 2018). We then calculate a p-value by comparing the distribution of z-scores versus zero using a Mann-Whitney U test. We use the z-scores instead of the hexasymmetry for the circular-linear correlation method to match the procedure used in Maidenbaum et al. (2018). We obtained the surrogate distributions by circularly shifting the vector of movement directions relative to the time dependent vector of firing rates. For random walks, the vector is shifted by a random number drawn from a uniform distribution defined with the same length as the number of time points in the vector of movement directions. For the star-like walks and piecewise linear walks, the shift is a random integer multiplied by the number of time points in a linear segment. 
      Circularly shifting the vector of movement directions scrambles the correlations between movement direction and neural activity while preserving their temporal structure.”
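      The surrogate procedure described in this passage can be sketched as follows. This is a minimal illustration, not the authors' implementation: the hexasymmetry statistic used here (the magnitude of the sixfold Fourier component of the mean-subtracted activity) is a hypothetical stand-in, and the names `surrogate_zscore` and `segment_len` are assumptions. Setting `segment_len=1` corresponds to the random-walk case; a larger value restricts shifts to whole linear segments, as described for star-like and piecewise linear walks.

```python
import math
import random
import statistics


def hexasymmetry_stat(theta, activity):
    """Stand-in statistic: magnitude of the sixfold component of the activity."""
    n = len(theta)
    mean = sum(activity) / n
    c = sum((a - mean) * math.cos(6 * t) for t, a in zip(theta, activity)) / n
    s = sum((a - mean) * math.sin(6 * t) for t, a in zip(theta, activity)) / n
    return math.hypot(c, s)


def circular_shift(values, k):
    """Rotate a list by k positions, wrapping around the end."""
    k %= len(values)
    return values[-k:] + values[:-k]


def surrogate_zscore(theta, activity, n_surrogates=200, segment_len=1, rng=None):
    """z-score of the observed statistic against circularly shifted surrogates."""
    rng = rng or random.Random()
    observed = hexasymmetry_stat(theta, activity)
    n_segments = len(theta) // segment_len
    null = []
    for _ in range(n_surrogates):
        # Shift by a random whole number of segments (one time step if segment_len == 1).
        shift = rng.randrange(n_segments) * segment_len
        null.append(hexasymmetry_stat(circular_shift(theta, shift), activity))
    return (observed - statistics.mean(null)) / statistics.stdev(null)


# Synthetic example (hypothetical data) with a clear sixfold modulation.
rng = random.Random(1)
theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(500)]
activity = [1.0 + 0.5 * math.cos(6 * t) + rng.gauss(0.0, 0.1) for t in theta]
z = surrogate_zscore(theta, activity, n_surrogates=200, rng=rng)
```

      Only the direction vector is shifted, so each surrogate keeps the time course of the activity intact while breaking its alignment with movement direction.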

      The results of these simulations, i.e. the comparison of our new method to previously used metrics, are summarized in Figure 5 – figure supplement 4 and show qualitatively identical findings when using the different methods. We have added this information also to the manuscript in the third paragraph of section “Quantification of hexasymmetry of neural activity and trajectories” of the Methods: “Empirical (fMRI/iEEG) studies (e.g. Doeller et al., 2010; Kunz et al., 2015; Maidenbaum et al., 2018) addressed this problem of trajectories spuriously contributing to hexasymmetry by fitting a Generalized Linear Model (GLM) to the time discrete fMRI/iEEG activity. In contrast, our new approach to hexasymmetry in Equation (12) quantifies the contribution of the path to the neural hexasymmetry explicitly, and has the advantage that it allows an analytical treatment (see next section). Comparing our new method with previous methods for evaluating hexasymmetry led to qualitatively identical statistical effects (Figure 5 – figure supplement 4).” We have also added a pointer to this new supplementary figure in the caption of Figure 5 in the manuscript: “For a comparison between our method and previously used methods for evaluating hexasymmetry, see Figure 5 – figure supplement 4.”

    2. eLife assessment

      This computational work represents a valuable and long overdue assessment of the potential mechanisms associating patterns of activity of entorhinal grid cells, recorded mostly in rodents, with the population property of hexasymmetry detected in non-invasive human studies. The methodic comparison of alternative hypotheses is compelling, and the conclusions are important for the future design of experiments assessing the neural correlates of human navigation across physical, virtual, or conceptual spaces.

    3. Reviewer #1 (Public Review):

      The study is thorough and systematic, and in comparing three well-separated hypotheses about the mechanism leading from grid cells to hexasymmetry it takes a neutral stand above the fray which is to be particularly appreciated. Further, alternative models are considered for the most important additional factor, the type of trajectory taken by the agent whose neural activity is being recorded. Different sets of values, including both "ideal" and "realistic" ones, are considered for the parameters most relevant to each hypothesis. Each of the three hypotheses is found to be viable under some conditions, and less so in others. Having thus given a fair chance to each hypothesis, nevertheless, the study reaches the clear conclusion that the first one, based on conjunctive grid-by-head-direction cells, is much more plausible overall; the hypothesis based on firing rate adaptation has intermediate but rather weak plausibility; and the one based on clustering of cells with similar spatial phases in practice would not really work. I find this conclusion convincing, and the procedure to reach it, a fair comparison, to be the major strength of the study.

      What I find less convincing is the implicit a priori discarding of a fourth hypothesis, that is, that the hexasymmetry is unrelated to the presence of grid cells. Full disclosure: we have tried unsuccessfully to detect hexasymmetry in the EEG signal from vowel space and did not find any (Kaya, Soltanipour and Treves, 2020), so I may be ranting off my disappointment, here. I feel, however, that this fourth hypothesis should be at least aired, for a number of reasons. One is that a hexasymmetry signal has been reported also from several other cortical areas, beyond entorhinal cortex (Constantinescu et al, 2016); true, also grid cells in rodents have been reported in other cortical areas as well (Long and Zhang, 2021; Long et al, bioRxiv, 2021), but the exact phenomenology remains to be confirmed. Second, as the authors note, the conjunctive mechanism is based on the tight coupling of a narrow head direction selectivity to one of the grid axes. They compare "ideal" with "Doeller" parameters, but to me the "Doeller" ones appear rather narrower than commonly observed and, crucially, they are applied to all cells in the simulations, whereas in reality only a proportion of cells in mEC are reported to be grid cells, only a proportion of them to be conjunctive, and only some of these to be narrowly conjunctive. Further, Gerlei et al (2020) find that conjunctive grid cells may have each of their fields modulated by different head directions, a truly surprising phenomenon that, if extensive, seems to me to cast doubts on the relation between mass activity hexasymmetry and single grid cells.

      Finally, a variant of the fourth hypothesis is that the hexasymmetry might be produced by a clustering of head direction preferences across head direction cells similar to that hypothesized in the first hypothesis, but without such cells having to fire in grid patterns. If head direction selectivity is so clustered, who needs the grids? This would explain why hexasymmetry is ubiquitous, and could easily be explored computationally by, in fact, a simplification of the models considered in this study.

    4. Reviewer #2 (Public Review):

      Grid cells - originally discovered in single-cell recordings from the rodent entorhinal cortex, and subsequently identified in single-cell recordings from the human brain - are believed to contribute to a range of cognitive functions including spatial navigation, long-term memory function, and inferential reasoning. Following a landmark study by Doeller et al. (Nature, 2010), a plethora of human neuroimaging studies have hypothesised that grid cell population activity might also be reflected in the six-fold (or 'hexadirectional') modulation of the BOLD signal (following the six-fold rotational symmetry exhibited by individual grid cell firing patterns), or in the amplitude of oscillatory activity recorded using MEG or intracranial EEG. The mechanism by which these network-level dynamics might arise from the firing patterns of individual grid cells remains unclear, however.

      In this study, Khalid and colleagues use a combination of computational modelling and mathematical analysis to evaluate three competing hypotheses that describe how the hexadirectional modulation of population firing rates (taken as a simple proxy for the BOLD, MEG, or iEEG signal) might arise from the firing patterns of individual grid cells. They demonstrate that all three mechanisms could account for these network-level dynamics if a specific set of conditions relating to the agent's movement trajectory and the underlying properties of grid cell firing patterns are satisfied.

      The computational modelling and mathematical analyses presented here are rigorous, clearly motivated, and intuitively described. In addition, these results are important both for the interpretation of hexadirectional modulation in existing data sets and for the design of future experiments and analyses that aim to probe grid cell population activity. As such, this study is likely to have a significant impact on the field by providing a firmer theoretical basis for the interpretation of neuroimaging data. To my mind, the only weakness is the relatively limited focus on the known properties of grid cells in rodent entorhinal cortex, and the network level activity that these firing patterns might be expected to produce under each hypothesis. Strengthening the link with existing neurobiology would further enhance the importance of these results for those hoping to assay grid cell firing patterns in recordings of ensemble-level neural activity.

    5. Reviewer #3 (Public Review):

      This is an interesting and carefully carried out theoretical analysis of potential explanations for hexadirectional modulation of neural population activity that has been reported in the human entorhinal cortex and some other cortical regions. The previously reported hexadirectional modulation is of considerable interest as it has been proposed to be a proxy for the activation of grid cell networks. However, the extent to which this proposal is consistent with the known firing properties of grids hasn't received the attention it perhaps deserves. By comparing the predictions of three different models this study imposes constraints on possible mechanisms and generates predictions that can be tested through future experimentation.

      Overall, while the conclusions of the study are convincing, I think the usefulness to the field would be increased if null hypotheses were more carefully considered and if the authors' new metric for hexadirectional modulation (H) could be directly contrasted with previously used metrics. For example, if the effect sizes for hexadirectional modulation in the previous fMRI and EEG data could be more directly compared with those of the models here, then this could help in establishing the extent to which the experimental hexadirectional modulation stands out from path hexasymmetry and how close it comes to the striking modulation observed with the conjunctive models. It could also be helpful to consider scenarios in which hexadirectional modulation is independent of grid firing, for example perhaps with appropriate coordination of head direction cell firing.

    1. Author response:

      Reviewer #1 (Public Review):

      Metabotropic glutamate receptors (mGluRs) play a key role in regulating neuronal activity and related behaviors. In different brain regions these receptors can be expressed presynaptically and postsynaptically in different classes of neurons. Therefore, it is difficult to predict the effects of systemically applied drugs that act on these receptors. Here, the authors harness the power of photopharmacology, applying modulators that can be activated or inactivated by light with spatial precision, to address this problem. Their stated goal is to determine the role of mGluRs in regulating pain behaviors, and the circuit mechanisms driving this regulation. Their findings suggest that mGluRs acting in medial prefrontal cortex and thalamus drive antinociception in animals with neuropathic pain, whereas these receptors drive pronociception when acting in the amygdala. Their circuit analysis suggests that, in the amygdala, mGluRs act by decreasing feedforward inhibition of the output neurons. These findings have the potential to affect the development of targeted treatment for pain and related disorders. The elegant photopharmacological approaches will likely inform future studies attempting to distinguish the action of neuroactive drugs in different brain regions.

      We thank the reviewer for the insightful evaluation of our study.

      Reducing the impact of these studies are several methodological, analytical, and interpretation issues.

      The authors report that "the effect of optical manipulations of photosensitive mGlu5 NAMs in individual brain regions in pain models has been studied before". It is, therefore, not immediately clear what is novel in the present study.

      We have clarified this in the following statement (page 3, lines 15‐17): “It remains to be determined if region‐specific actions play a role in the overall analgesic activity of mGlu5 receptor NAMs, considering that opposite actions have been reported”. The subsequent paragraph nicely explains the novelty of our approach, which is based on the combined use of a drug activated by light (JF‐NP‐26) and another drug inactivated by light (alloswitch‐1) to determine which region is sufficient and/or necessary for the analgesic effect of systemic mGlu5 receptor NAMs. In the Discussion (page 7) we state that “To the best of our knowledge, this is the first study to employ photopharmacological tools to compare and contrast distinct roles of mGlu5 receptors in different regions of the pain matrix”.

      The reliance only on reflexive measures of pain, especially in a study that examines the role of "affective and cognitive aspects of pain and pain modulation".

      The main endpoint of the study was not to examine the cognitive and affective aspects of pain, although some of the regions examined are involved in these aspects of pain besides the regulation of sensory aspects (pain thresholds). However, we followed the kind suggestion and measured depression‐like and risk‐taking (anxiety‐like) behaviors in mice. To optimize the number of mice and still be consistent with the number of mice approved by the regulatory agency, we used the following groups of mice for the evaluation of risk‐taking behavior with the light‐dark box: (i) sham‐operated mice treated with vehicle; (ii) CCI mice treated with vehicle; (iii) CCI mice treated with JF‐NP‐26 without light activation; and (iv) CCI mice treated with JF‐NP‐26 and irradiated with activating light (the test cannot be performed in the same mice before and after light activation to avoid habituation). Depression‐like behavior with the tail suspension test was performed in two separate groups of mice: (i) CCI mice treated with JF‐NP‐26 with no light; and (ii) CCI mice treated with JF‐NP‐26 and light activation. All mice had been implanted with optic fibers in the basolateral amygdala.

      Data are shown in the new Supplementary Fig. S4 and reported in the Results section (page 5) as follows: “Knowing that mGlu5 receptors in the BLA shape susceptibility to stress and fear in rodents (35, 36), we also measured depression‐like and risk‐taking behavior after light‐induced activation of JF‐NP‐26 in the BLA of neuropathic mice. Light‐induced activation of JF‐NP‐26 decreased risk‐taking, and hence increased anxiety‐like behavior, in CCI mice, as shown by the decreased number of entries into, and reduced time spent in, the light compartment of the light‐dark box (Fig. S4a‐c). Depression‐like behavior assessed with the tail‐suspension test was unchanged in CCI mice after light‐induced activation of JF‐NP‐26 in the BLA (Fig. S4d).”

      The inclusion of only males is unfortunate because of known, significant sex differences in neuronal circuits driving pain conditions, in both preclinical models (including from work by the authors) and in clinical populations.

      We are aware that there are important sex differences in the pain neuraxis, but this study was not about sex differences. The goal was to evaluate any region‐specific actions of systemically administered compounds (mGlu5 NAMs) and the contribution and requirement of specific brain regions to the observed drug effects, using photopharmacology and drugs activated or inactivated/reactivated by light. This analysis would have been less straightforward in female mice given for example that it is known that mGlu5 receptors interact with estrogen receptors. This aspect could be addressed in a future project. The present study provides the basis for comparative studies in females.

      The elegant slice experiments (especially Fig. 3) were designed to probe circuit mechanisms through which mGluRs act in different brain regions. These experiments also provide a control to assess whether the photopharmacological compounds act as advertised. Surprisingly, the effect sizes produced by these compounds on neuronal activity are rather small (and, at times, seem driven by outliers). How these small effects affect the interpretation of the behavioral findings is not clear.

      These small effect sizes should also be considered when interpreting the circuit actions studied here.

      We greatly appreciate your insightful comments and constructive feedback on our findings. The mean effect sizes observed in certain experiments are quite small, but the effects or changes were very consistent. We now illustrate this by including lines that connect individual data points for the same neuron in the modified Figure 3 (f, g, n, o) to show the consistent changes observed in the EPSC and IPSC graphs. We would like to add that it is not quite clear how neuronal effects translate into behavioral consequences, or how large a change in individual neurons or in a population of neurons is sufficient and required. These are all interesting questions, but the results of our behavioral and electrophysiological data match quite nicely, including differential or opposing drug effects.

      Some of the sample sizes are as small as n=3. Without an a priori power analysis, it is difficult to assess the validity of the analyses.

      The authors present intriguing data on changes in InsP levels in some (but not all) animals after injury, but not in sham animals. They also report an increase in the expression of mGluRs in some, but not all, brain regions. These findings are not discussed. It is not clear how these selective changes in mGluR expression and activity might affect the interpretation of the photopharmacological results.

      We performed new experiments to increase sample size in PI experiments in the infralimbic and prelimbic cortices where the n was low. Now the data are more solid. New statistical values are reported in the legend of Fig. 1. We also added a discussion of the signaling data (page 9) as follows:

      “We found that mGlu5 receptor‐mediated PI hydrolysis was significantly amplified in all subregions of the contralateral mPFC and in the contralateral amygdala after induction of neuropathic pain whereas mGlu5 receptor protein levels were significantly increased only in the contralateral infralimbic cortex of neuropathic mice. This suggests that, at least in the anterior cingulate cortex, prelimbic cortex, and basolateral amygdala, mGlu5 receptors become hyperactive after induction of pain. It remains to be determined if this is mediated by an enhanced coupling of mGlu5 receptors to Gq/11 proteins, increased expression of phospholipase‐C or other mechanisms. Interestingly, mGlu5 receptor signaling was down‐regulated in the thalamus of neuropathic mice, but mGlu5 blockade in the thalamus still had antinociceptive effects (see below). Downregulation of mGlu5 receptor signaling in the thalamus might represent a compensatory mechanism aimed at mitigating pain in neuropathic mice.”

      The behavioral data seem to represent discrete, and not continuous variables. The statistical tests applied are likely inappropriate for these analyses.

      The behavioral values reported here represent measurements of force (g) required to elicit a reflex (i.e., reflex thresholds) and can be considered continuous variables. The statistical tests used for the behavioral experiments included either a t‐test to determine if the difference between two groups was statistically significant or a one‐way ANOVA (repeated measures when appropriate) to determine if there were any statistically significant differences between the means of three or more groups. This form of analysis for the outcome measures in this study is well‐established in the literature.

      The authors assume (and state in the abstract) that they can selectively stimulate BLA afferents to the neocortex. This is technically highly unlikely.

      We appreciate the reviewer's insightful comment regarding the technical challenges associated with the selective stimulation of BLA afferents to the neocortex. We are aware that electrical stimulation does not allow the exclusive stimulation of a specific pathway, though BLA afferents form the major component of afferent fibers running in layer IV of the infralimbic cortex on their way to targets in layer II/III and layer V of infra‐ and pre‐limbic cortices.

      Our previous work (Kiritoshi et al., 2016) directly compared electrical and optogenetic stimulation in the mPFC and found that they match, suggesting that electrical stimulation provides a reliable means to activate BLA input in the mPFC. We acknowledge the technical limitations of selective BLA activation with electrical stimulation, though we are confident that our approach allowed the investigation of mGlu5 manipulations in the BLA‐mPFC circuitry. We have modified the abstract to read as follows: “Electrophysiological analysis showed that alloswitch‐1 increased excitatory synaptic responses in prelimbic pyramidal neurons evoked by stimulation of presumed BLA input, and decreased BLA‐driven feedforward inhibition of amygdala output neurons”.

      The results from the experiment on rostroventral medulla (RVM) neurons are less than convincing because only a "trend" towards decreased excitation is reported. As above, without consideration of effect size, it is hard to appreciate the significance of these findings. The absence of a demonstration of a classical ON Cell firing pattern is also unfortunate.

      We appreciate this observation. Based on the Reviewer’s suggestion, we report below the effect size of optical modulation in the prelimbic cortex on RVM activity, according to Cohen’s d calculation from t‐tests (now shown in Table 1). This information is also included in Results (page 6).
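      For readers unfamiliar with converting a t statistic into Cohen's d, the standard textbook conversions can be sketched as below. The response does not state which variant was applied, so both are shown; the function names are hypothetical, and the independent-samples form assumes a pooled-variance (Student's) t-test.

```python
import math


def cohens_d_independent(t, n1, n2):
    """Cohen's d from an independent two-sample t statistic (pooled variance)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def cohens_d_paired(t, n):
    """Cohen's d_z from a paired (or one-sample) t statistic with n pairs."""
    return t / math.sqrt(n)
```

      With equal groups of 8, for example, a t value of 2.0 corresponds to d = 1.0 under the independent-samples form.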

      Moreover, in this study we classified ON‐ or OFF‐cells based on their firing patterns relative to nocifensive withdrawal responses (H.L. Fields and M.M. Heinricher 1985). As ON‐cells with high basal firing can be easily misclassified as NEUTRAL‐cells (N.M. Barbaro, M.M. Heinricher, H.L. Fields, 1986), potential NEUTRAL‐cells with continuous spontaneous activity were verified by giving a brief bolus of anesthetic to the point that the withdrawal reflex was abolished. Indeed, firing of spontaneously active ON‐cells slows or stops with this manipulation, which unmasks reflex‐related responses. This is now reported and explained in Methods (page 14).

    2. Reviewer #1 (Public Review):

      Metabotropic glutamate receptors (mGluRs) play a key role in regulating neuronal activity and related behaviors. In different brain regions these receptors can be expressed presynaptically and postsynaptically in different classes of neurons. Therefore, it is difficult to predict the effects of systemically applied drugs that act on these receptors. Here, the authors harness the power of photopharmacology, applying modulators that can be activated or inactivated by light with spatial precision, to address this problem. Their stated goal is to determine the role of mGluRs in regulating pain behaviors, and the circuit mechanisms driving this regulation. Their findings suggest that mGluRs acting in medial prefrontal cortex and thalamus drive antinociception in animals with neuropathic pain, whereas these receptors drive pronociception when acting in the amygdala. Their circuit analysis suggests that, in the amygdala, mGluRs act by decreasing feedforward inhibition of the output neurons. These findings have the potential to affect the development of targeted treatment for pain and related disorders. The elegant photopharmacological approaches will likely inform future studies attempting to distinguish the action of neuroactive drugs in different brain regions.

      Reducing the impact of these studies are several methodological, analytical, and interpretation issues.

      - The authors report that "the effect of optical manipulations of photosensitive mGlu5 NAMs in individual brain regions in pain models has been studied before". It is, therefore, not immediately clear what is novel in the present study.
      - The reliance only on reflexive measures of pain, especially in a study that examines the role of "affective and cognitive aspects of pain and pain modulation".
      - The inclusion of only males is unfortunate because of known, significant sex differences in neuronal circuits driving pain conditions, in both preclinical models (including from work by the authors) and in clinical populations.
      - The elegant slice experiments (especially Fig. 3) were designed to probe circuit mechanisms through which mGluRs act in different brain regions. These experiments also provide a control to assess whether the photopharmacological compounds act as advertised. Surprisingly, the effect sizes produced by these compounds on neuronal activity are rather small (and, at times, seem driven by outliers). How these small effects affect the interpretation of the behavioral findings is not clear.
      - These small effect sizes should also be considered when interpreting the circuit actions studied here.
      - Some of the sample sizes are as small as n=3. Without an a priori power analysis, it is difficult to assess the validity of the analyses.
      - The authors present intriguing data on changes in InsP levels in some (but not all) animals after injury, but not in sham animals. They also report an increase in the expression of mGluRs in some, but not all, brain regions. These findings are not discussed. It is not clear how these selective changes in mGluR expression and activity might affect the interpretation of the photopharmacological results.
      - The behavioral data seem to represent discrete, and not continuous variables. The statistical tests applied are likely inappropriate for these analyses.
      - The authors assume (and state in the abstract) that they can selectively stimulate BLA afferents to the neocortex. This is technically highly unlikely.
      - The results from the experiment on rostroventral medulla (RVM) neurons are less than convincing because only a "trend" towards decreased excitation is reported. As above, without consideration of effect size, it is hard to appreciate the significance of these findings. The absence of a demonstration of a classical ON Cell firing pattern is also unfortunate.

    3. eLife assessment

      In this interesting study, the authors have used light-sensitive mGlu5 negative allosteric modulators to determine the role of these receptors in a chronic pain model. These findings could be useful to the pain field, but the evidence supporting these claims is incomplete.

    4. Reviewer #2 (Public Review):

      In this study, Notartomaso et al. used optical activation of systemic JF-NP-26, a caged, baseline inactive, negative allosteric modulator (NAM) of mGlu5 receptors, in cingulate, prelimbic and infralimbic cortices, thalamus, and BLA to investigate the roles of these receptors in various brain regions in pain processing. They found that alloswitch-1, an intrinsically active mGlu5 receptor NAM, caused analgesia, but this analgesic effect was reversed by light-induced drug inactivation in the prelimbic and infralimbic cortices, and thalamus. In contrast, these pharmacological effects were reversed in the BLA. They further found that alloswitch-1 increased excitatory synaptic responses in prelimbic pyramidal neurons evoked by stimulation of BLA input, and decreased feedforward inhibition of amygdala output neurons by BLA. They thus concluded that mGlu5 receptors had differential effects in distinct brain regions. mGlu5 receptors are important receptors in pain processing, and their regional specificity has not been studied in detail. Further, this is an interesting study regarding the use of optical activation of pro-drugs, and the findings are timely. The combination of in vivo pharmacology, biochemistry, and slice EP provides complementary results.

    5. Reviewer #3 (Public Review):

      In this manuscript, Notartomaso, Antenucci et al. use two different light-sensitive metabotropic glutamate receptor negative allosteric modulators (NAMs) to determine how mGlu5 receptor signaling in distinct brain regions of mice influences mechanical sensitivity in the chronic constriction injury (CCI) model of neuropathic pain. This is an extension of their previous work using photocaged mGlu5 negative allosteric modulators and incorporates a systemically active NAM that can be locally photoswitched off and on with violet and green light, respectively. The authors found that NAM signaling in the thalamus and prefrontal cortical regions consistently reduced mechanical hypersensitivity. However, they observed divergent effects on these measures in the basolateral amygdala. The authors attempted to solve the discrepancy in behavioral measurements between mGlu5 signaling in the basolateral amygdala by determining how NAMs modulate synaptic transmission or in vivo firing and found that these effects were projection-dependent.

      Strengths:

      This study demonstrates the importance of local signaling by mGlu5 receptors across multiple pain-processing circuits in the brain, and the use of optical activation and inactivation strategies is very intriguing.

      Weaknesses:

      One major limitation is the lack of sham surgery groups and vehicle/light-only controls in behavior and physiology experiments, though the authors did test mechanical sensitivity in the contralateral paw. The reliance on a single behavioral measure in these groups is also problematic. Many of these brain regions are known to influence distinct aspects of somatosensory processing or other behaviors entirely, which may be interpreted as hypersensitivity (e.g. fear or anxiety-like behaviors in the basolateral amygdala). Details on the light intensities used are also absent, and it is important to test whether violet light had any unintended effects on these cells/mice.

      While the effort to provide some mechanistic understanding using slice physiology and in vivo recordings is appreciated, they ignore any effects that these NAMs have directly on the excitability of the recorded output neuron. In the models, mGlu5 is proposed to exist on some upstream inhibitory (mPFC) or excitatory (BLA) projection, but no evidence of a direct effect on these synaptic inputs was observed. Given the widespread distribution of mGlu5 in these brain regions, the proposed model seems unlikely. Perhaps CCI induces changes in functional coupling of mGlu5 in different cell types, and this could be revealed with appropriate control experiments.

      Overall the broad profiling taken here across multiple brain regions lacks controls and some cohesion, making it challenging to conclude how mGlu5 signaling is acutely impacting these circuits and/or specific cell types.

    1. Author response:

      Reviewer #2 (Public Review):

      (1) The groups of patients with endometrial cancer in the manuscript are classified according to age greater than/less than 60. Please explain why 60 years old is chosen as the boundary value of age.

      Thank you for your recommendation. We have modified the discussion section of the manuscript in accordance with your suggestion.

      (2) Among the patients with endometrial cancer selected in the manuscript, AFP outliers accounted for a relatively small proportion. The authors chose the clinical detection outliers of CA-125, CA19-9, AFP and CEA as the dividing line, instead of re-selecting the optimal cut-off value in this population, which should be classified and the prognostic value explored.

      Thank you for your recommendation. We have modified the discussion section of the manuscript in accordance with your suggestion.

      (3) In cancer research, stage is an important prognostic factor to guide the treatment of patients in clinical work. Patients with different stages of endometrial cancer have obvious prognostic differences. The authors constructed a new prognostic risk score based on serum level of AFP, CEA and CA125; the prognostic value of the risk score should be validated in patients with endometrial cancer at different stages.

      Thank you for your recommendation. We have modified the discussion section of the manuscript in accordance with your suggestion.

    2. eLife assessment

      This study presents a valuable finding on prognostic values of serum CA125, CEA, and AFP for predicting patient outcomes of endometrial cancer. The evidence supporting the claims of the authors is solid, although inclusion of detailed discussion of present results with prior documented findings would have strengthened the study. The work will be of interest to scientists working on endometrial cancer.

    3. Reviewer #1 (Public Review):

      Article strengths:

      (1) Detailed data: The authors provided a large amount of clinical data as support, making the analysis results more persuasive and credible.<br /> (2) Scientific method: Appropriate statistical methods were used to analyze the data, which can accurately reflect the internal laws and trends of the data.<br /> (3) Clear conclusions: The conclusions drawn in the article are clear and explicit, easy for readers to understand and accept.<br /> (4) High practicality: The research results have important guiding significance for obstetrics and gynecology clinical practice, helping to improve patient treatment outcomes and quality of life.

      Article weaknesses:

      Limitations of research methods: Although the authors used statistical methods to analyze the data, they may be limited by factors such as data sources and sample size, leading to some limitations in the research results. It is recommended that the authors further expand the data sources and increase the sample size in subsequent studies to improve the accuracy and reliability of the research.

    4. Reviewer #2 (Public Review):

      This prospective study advances our understanding of the predictive value of preoperative serum CA125, CA19-9, CA72-4, CEA, and AFP in endometrial cancer. The evidence supporting the conclusions is convincing with rigorous analysis of the association between prognostic values of several serum markers with the clinical data of endometrial cancer patients. However, there are a few areas in which the article may be improved through further validation of the prognostic value of the risk score in patients with endometrial cancer at different stages. Moreover, the authors should provide a more detailed explanation of the choice of statistical methods in the manuscript. The work will be of broad interest to clinicians, medical researchers and scientists working in endometrial cancer.

      (1) The groups of patients with endometrial cancer in the manuscript are classified according to age greater than/less than 60. Please explain why 60 years old is chosen as the boundary value of age.<br /> (2) Among the patients with endometrial cancer selected in the manuscript, AFP outliers accounted for a relatively small proportion. The authors chose the clinical detection outliers of CA-125, CA19-9, AFP and CEA as the dividing line, instead of re-selecting the optimal cut-off value in this population, which should be classified and the prognostic value explored.<br /> (3) In cancer research, stage is an important prognostic factor to guide the treatment of patients in clinical work. Patients with different stages of endometrial cancer have obvious prognostic differences. The authors constructed a new prognostic risk score based on serum level of AFP, CEA and CA125; the prognostic value of the risk score should be validated in patients with endometrial cancer at different stages.

    5. Reviewer #3 (Public Review):

      The authors of this study aimed to enhance the prognostic assessment of endometrial cancer (EC) by identifying and validating a set of serum tumor markers (CA125, CEA, and AFP) that could reliably predict progression-free survival (PFS) and overall survival (OS) in patients. By employing a robust methodology that included the use of LASSO Cox regression analysis to construct a predictive model, the study sought to provide a more nuanced tool for clinical decision-making in the management of EC.

      Major Strengths:

      Methodological Rigor: The study's use of advanced statistical methods to analyze a large dataset of EC patients stands out. The inclusion of a validation cohort enhances the credibility of the prognostic model developed.

      Clinical Relevance: The identification of CA125, CEA, and AFP as independent prognostic factors and the creation of a risk score based on these markers offer valuable tools for clinicians. The predictive accuracy of this model could significantly impact patient management and treatment planning.

      Weaknesses:

      Generalizability: The study is based on a cohort from a single institution, which may limit the applicability of the findings across different populations and healthcare settings.

      Loss to Follow-Up: As acknowledged by the authors, the loss to follow-up of some patients introduces a potential source of bias, possibly affecting the study's conclusions.

      Achievement of Aims and Support for Conclusions:

      The study successfully achieves its aim of developing a prognostic model for EC that integrates serum levels of CA125, CEA, and AFP. The evidence presented supports the authors' conclusions that this model is a robust tool for predicting patient outcomes, evidenced by its performance in both the training and validation cohorts.

      Impact and Utility:

      This work is poised to make a significant contribution to the field of gynecological oncology, particularly in the management of endometrial cancer. The study's findings provide a practical approach to stratifying patients based on their risk, which could be instrumental in tailoring individualized treatment plans. Moreover, the model's ability to predict PFS and OS with considerable accuracy offers a promising avenue for further research and application in clinical settings.

      Additional Context:

      Understanding the role of tumor markers in cancer prognosis is a rapidly evolving area of oncology research. This study's focus on combining multiple serum markers into a comprehensive risk score model represents a significant step forward in the quest for more personalized cancer care. Future studies could expand on this work by exploring the integration of such markers with other clinical and molecular data to further refine prognostic models.

    1. Author response:

      Reviewer #1 (Public Review):

      The authors tested the hypothesis that protein consumption decreases with decreasing mass-specific growth during development. This hypothesis is firmly grounded in the logical premise that as animals progress from periods of reduced activity and rapid growth to phases of increased activity and reduced mass-specific growth during their development, they are likely to adjust their nutrient intake, reducing protein and increasing carbohydrate consumption accordingly. The authors tested their hypothesis using the South American locust Schistocerca cancellata, combining field observations with laboratory experiments. This approach allowed them to discern how variations in activity history and metabolism between field- and laboratory-raised locusts influenced their nutrient requirements.

      Their findings indeed reveal the predicted shift from high protein:carbohydrate consumption to lower protein:carbohydrate intake from the first instar to the adult locust, a decline that correlated strongly with a decrease in mass-specific growth rate. Their comparison between field- and laboratory-raised locusts showed that protein demand did not differ; however, the carbohydrate consumption rate was >50% higher in the field locusts. These results add depth and significance to the study, shedding light on how environmental factors influence nutrient requirements. What truly amplifies the strength and novelty of the authors' hypothesis is their anticipation that this observed trend in Schistocerca cancellata could extend to all animals. This anticipation is rooted in the expectation that growth rates scale hypometrically across various body sizes and developmental stages, introducing a universal dimension to their findings that holds great promise for broader ecological and evolutionary understanding.

      However, while the study is commendable in its methodology and core findings, there is room for improvement in clarifying the implications of the results. The current lack of clarity is evident in the somewhat shallow questions outlined in lines 358 to 363. For instance, the practice of administering age-specific diets has been commonplace in human and livestock management for ages. Thus, its continued utility may not be the most stimulating question. Instead, a more thought-provoking inquiry might delve into whether variations in global protein availability play a pivotal role in driving niche specialization and the biogeography of animal body sizes and ontogeny, especially considering the potential impacts of climate change. Such inquiries would further elevate the significance of the authors' work and its broader implications in the field.

      Thanks for the suggestions. We have added sentences to the discussion on how size-related changes in protein:carbohydrate consumption may affect the physiology and ecology of animals.

      Reviewer #2 (Public Review):

      How and why nutritional requirements and intake targets change over development and differ between species are significant questions with wide-ranging implications spanning ecology to health. In this manuscript, Talal et al. set out to address these questions in laboratory and field experiments with grasshoppers and in a comparative analysis of different species.

      The authors conclude that the target intake of protein to non-protein energy (in this case carbohydrate) (P:C) falls over developmental stages and that this occurs because of a decline in mass-specific intake of protein whereas mass-specific carbohydrate intake remains more constant. The decrease in mass-specific protein consumption rate is tightly correlated with a decline in specific growth rate. Hence, protein consumption directly reflects requirements for growth, with hypometric scaling of protein intake serving as a useful relationship in nutritional ecology.

      The laboratory experiments on the locust, Schistocerca cancellata, provide an elegant dataset in which different instars have been provided with one of two nutritionally complementary food pairings differing in protein to carbohydrate (P: C) content, and their self-selected protein to carbohydrate "intake target" measured.

      These lab locust results were then compared with independently collected field data for late instar nymphs of the same locust species, and the conclusion is drawn that field insects ingested similar protein but 50-90% more carbohydrate (with only 23% increased mass-specific resting oxygen consumption rates). Numerous uncontrolled variables between the lab and field studies make meaningful conclusions difficult to draw from this observation.

      Thank you for this comment. We have revised the text to better explain that very few studies have directly compared lab and field intake target data, and that our goal was to test whether lab intake targets predicted those for field-collected animals. We have also revised the discussion to describe the many possible reasons that intake targets for field-collected animals may diverge from those of lab-reared locusts.

      A graph is then provided showing comparative data across a selection of species, making the case that protein consumption scales similarly both developmentally and across taxa. Questions need to be addressed for this to be convincing, including which criteria were used to select the examples in the graph and how comprehensively do these represent the available literature.

      We now provide further detail in the Methods on how we conducted our literature search.

      Reviewer #3 (Public Review):

      The main goal of this study was to test how and why the intake of two important macronutrients, protein and carbon, often changes with ontogeny and body size. To do this, the authors examined protein and carbon intake in a laboratory locust population, across each instar and the adult stage. Then, the authors examined how the optimal balance of carbon and protein intake in a wild locust population corresponded to that observed in the laboratory population. Results of these experiments showed that with ontogenic growth, locusts decreased protein while increasing carbohydrate intake. The authors concluded that this decrease in protein:carbohydrate intake may result from reductions in specific growth rates (growth within each instar). The protein:carbohydrate intake in the lab population appeared to be consistent with that observed in a wild locust population. Finally, the authors combined their data with data from the literature to examine how protein intake scales with body mass throughout development, within and across different species.

      Strengths:

      To determine how locusts balance protein: carbohydrate intake, authors applied the Geometric Framework (GF) of nutrition, which is a powerful approach for studying effects of nutrition and understanding the rules of compromise associated with balancing dietary unbalances.

      Captivity can change behavior and physiology of most organisms, making it difficult to establish the relevance of laboratory experiments to what happens in the real world. A strength of this paper is that it compares behavior/physiology of lab vs. wild locusts. Finally, this study takes a step further by proposing a new scaling rule based on this study's results and data from the literature on various species.

      Weaknesses:

      Although the paper has strengths, there seem to be several methodological issues that obscure the interpretation/conclusions presented in the manuscript.

      It appears that the authors are not actually estimating "Intake Targets", as stated throughout the manuscript. According to the geometric framework, the intake target (IT) is estimated as the point in the nutritional landscape under which performance/fitness is optimized. The geometric framework also predicts that animals can reach their intake targets by feeding selectively when given a choice of diets that differ in nutrient amounts, which is what the authors did here. However, because the relationship between fitness/performance and diet was not established, in the choice experiments the authors seem to be assuming (but not testing) that locusts are reaching their intake target.

      The reviewer is correct that we have not tested whether the intake target selected by each instar maximizes growth or some other measure of fitness. This is a nontrivial task, as there are many possible indices of fitness for juvenile instars, including growth rate, developmental time, resistance to disease/stress, as well as effects on adult reproduction. We use intake target as defined by Raubenheimer and Simpson (2018), “the intake target (IT) is a geometric representation of the nutrient mixture that the regulatory systems target through foraging and feeding.” As we explain above, we followed the protocols used by most investigators to measure intake targets, including in many papers on locusts.

      You estimated a mass-specific protein intake for each instar. It is not clear why mass-specific intake and not just intake of protein was used for analysis. While mass (or size) of an individual may influence food consumption, it seems like authors calculated mass-specific consumption using each instar's final mass, which would make mass a result of protein consumption (and not the opposite). Importantly, the comparison between mass-specific protein consumption and specific growth rate may be problematic, as both variables seem to be estimated using final mass.

      Thank you for this important comment. We agree and have therefore changed Figure 2 and the related analyses, using protein consumption rate corrected for initial rather than final mass.

    2. eLife assessment

      How and why nutritional requirements change over development and differ between species are significant questions with wide-ranging implications spanning ecology to health. In this manuscript, Talal et al. set out to address these questions in laboratory and field experiments with grasshoppers and in a comparative analysis of different species. The laboratory experiments are convincing but the field and comparative aspects are not sufficiently well developed. In general, the study offers some evidence of a universal shift from high protein to high carbohydrate intake during ontogeny in animals, but the methods are not clear and/or appropriate to support the goals and conclusions of the manuscript as it is.

    3. Reviewer #1 (Public Review):

      The authors tested the hypothesis that protein consumption decreases with decreasing mass-specific growth during development. This hypothesis is firmly grounded in the logical premise that as animals progress from periods of reduced activity and rapid growth to phases of increased activity and reduced mass-specific growth during their development, they are likely to adjust their nutrient intake, reducing protein and increasing carbohydrate consumption accordingly. The authors tested their hypothesis using the South American locust Schistocerca cancellata, combining field observations with laboratory experiments. This approach allowed them to discern how variations in activity history and metabolism between field- and laboratory-raised locusts influenced their nutrient requirements.<br /> Their findings indeed reveal the predicted shift from high protein:carbohydrate consumption to lower protein:carbohydrate intake from the first instar to the adult locust, a decline that correlated strongly with a decrease in mass-specific growth rate. Their comparison between field- and laboratory-raised locusts showed that protein demand did not differ; however, the carbohydrate consumption rate was >50% higher in the field locusts. These results add depth and significance to the study, shedding light on how environmental factors influence nutrient requirements.<br /> What truly amplifies the strength and novelty of the authors' hypothesis is their anticipation that this observed trend in Schistocerca cancellata could extend to all animals. This anticipation is rooted in the expectation that growth rates scale hypometrically across various body sizes and developmental stages, introducing a universal dimension to their findings that holds great promise for broader ecological and evolutionary understanding.<br /> However, while the study is commendable in its methodology and core findings, there is room for improvement in clarifying the implications of the results. 
The current lack of clarity is evident in the somewhat shallow questions outlined in lines 358 to 363. For instance, the practice of administering age-specific diets has been commonplace in human and livestock management for ages. Thus, its continued utility may not be the most stimulating question. Instead, a more thought-provoking inquiry might delve into whether variations in global protein availability play a pivotal role in driving niche specialization and the biogeography of animal body sizes and ontogeny, especially considering the potential impacts of climate change. Such inquiries would further elevate the significance of the authors' work and its broader implications in the field.

    4. Reviewer #2 (Public Review):

      How and why nutritional requirements and intake targets change over development and differ between species are significant questions with wide-ranging implications spanning ecology to health. In this manuscript, Talal et al. set out to address these questions in laboratory and field experiments with grasshoppers and in a comparative analysis of different species.

      The authors conclude that the target intake of protein to non-protein energy (in this case carbohydrate) (P:C) falls over developmental stages and that this occurs because of a decline in mass-specific intake of protein whereas mass-specific carbohydrate intake remains more constant. The decrease in mass-specific protein consumption rate is tightly correlated with a decline in specific growth rate. Hence, protein consumption directly reflects requirements for growth, with hypometric scaling of protein intake serving as a useful relationship in nutritional ecology.

      The laboratory experiments on the locust, Schistocerca cancellata, provide an elegant dataset in which different instars have been provided with one of two nutritionally complementary food pairings differing in protein to carbohydrate (P: C) content, and their self-selected protein to carbohydrate "intake target" measured.

      These lab locust results were then compared with independently collected field data for late instar nymphs of the same locust species, and the conclusion is drawn that field insects ingested similar protein but 50-90% more carbohydrate (with only 23% increased mass-specific resting oxygen consumption rates). Numerous uncontrolled variables between the lab and field studies make meaningful conclusions difficult to draw from this observation.

      A graph is then provided showing comparative data across a selection of species, making the case that protein consumption scales similarly both developmentally and across taxa. Questions need to be addressed for this to be convincing, including which criteria were used to select the examples in the graph and how comprehensively do these represent the available literature.

    5. Reviewer #3 (Public Review):

      The main goal of this study was to test how and why the intake of two important macronutrients, protein and carbon, often changes with ontogeny and body size. To do this, the authors examined protein and carbon intake in a laboratory locust population, across each instar and the adult stage. Then, the authors examined how the optimal balance of carbon and protein intake in a wild locust population corresponded to that observed in the laboratory population. Results of these experiments showed that with ontogenic growth, locusts decreased protein while increasing carbohydrate intake. The authors concluded that this decrease in protein:carbohydrate intake may result from reductions in specific growth rates (growth within each instar). The protein:carbohydrate intake in the lab population appeared to be consistent with that observed in a wild locust population. Finally, the authors combined their data with data from the literature to examine how protein intake scales with body mass throughout development, within and across different species.

      Strengths:<br /> To determine how locusts balance protein: carbohydrate intake, authors applied the Geometric Framework (GF) of nutrition, which is a powerful approach for studying effects of nutrition and understanding the rules of compromise associated with balancing dietary unbalances.

      Captivity can change behavior and physiology of most organisms, making it difficult to establish the relevance of laboratory experiments to what happens in the real world. A strength of this paper is that it compares behavior/physiology of lab vs. wild locusts. Finally, this study takes a step further by proposing a new scaling rule based on this study's results and data from the literature on various species.

      Weaknesses:<br /> Although the paper has strengths, there seem to be several methodological issues that obscure the interpretation/conclusions presented in the manuscript.<br /> It appears that the authors are not actually estimating "Intake Targets", as stated throughout the manuscript. According to the geometric framework, the intake target (IT) is estimated as the point in the nutritional landscape under which performance/fitness is optimized. The geometric framework also predicts that animals can reach their intake targets by feeding selectively when given a choice of diets that differ in nutrient amounts, which is what the authors did here. However, because the relationship between fitness/performance and diet was not established, in the choice experiments the authors seem to be assuming (but not testing) that locusts are reaching their intake target.

      You estimated a mass-specific protein intake for each instar. It is not clear why mass-specific intake and not just intake of protein was used for analysis. While mass (or size) of an individual may influence food consumption, it seems like authors calculated mass-specific consumption using each instar's final mass, which would make mass a result of protein consumption (and not the opposite). Importantly, the comparison between mass-specific protein consumption and specific growth rate may be problematic, as both variables seem to be estimated using final mass.

    1. eLife assessment

      This study presents a useful examination of dense neuroanatomy in human postmortem medial entorhinal cortex, using a large number of small electron microscopy image volumes sampled from multiple cortical layers and individuals. The authors use solid experimental and annotation techniques, demonstrating the suitability of postmortem tissue reconstructions for analysis and presenting careful, detailed measurements of synapse properties and overall tissue composition. However, there is inadequate support connecting these findings to claims about general connectivity in medial entorhinal cortex, since factors affecting interpretability like noise, the spatial scales examined, and relationships between structural properties and connectivity are not characterized. With a more thorough contextualization, this work would be of interest for studies of cellular neuroanatomy or brain network organization.

    2. Reviewer #1 (Public Review):

      In this work, Plaza-Alonso et al. present a collection of volume electron microscopy (EM) reconstructions of human postmortem medial entorhinal cortex (MEC), and they measure properties of MEC cytoarchitecture and synapses as a function of neuroanatomical subdivision. The authors generate a sampling of 9 smaller (≲10 µm/side) EM reconstructions per subdivision to avoid prohibitively large (petabyte) EM volumes, using 3 reconstructions for each of 3 brain donors to control for inter-individual variability. Conducting in-depth analyses for 7 subdivisions (63 reconstructions total), the authors find little significant inter-subdivision variability in structural composition (volume fractions of cell bodies vs. neuropil vs. blood vessels) and multiple synapse properties (spatial distribution, density, area, shape, excitatory/inhibitory type, and postsynaptic cell compartment). They conclude that human MEC connectivity is largely homogeneous, with synapses arranged in a generally random spatial distribution and a large fraction of synapses being asymmetric (putatively excitatory). Their other findings include that asymmetric synapses are larger than symmetric/putatively inhibitory synapses; that asymmetric synapses prefer dendritic spines whereas symmetric synapses prefer dendritic shafts; and that a small fraction of synapses have larger, complex shapes that may suggest increased synaptic efficacy. They note that inhomogeneities may include inter-subdivision variation in asymmetric synapse area and complex-shaped synapse prevalence, and for some reconstructions (12/63), possible substructure in synapse distributions.

      Strengths:<br /> The authors have carefully conducted this work, using reasonable methods and comparing their findings with previous volume EM reconstructions where possible. It represents a substantial effort, given the challenges of producing and annotating volume EM data and of collecting human postmortem tissue. They have thus contributed a brain-region-specific characterization of human postmortem tissue with value as both a data resource and an examination of postmortem EM reconstruction quality, given that postmortem tissue is less-studied with volume EM but could be an important source of human brain samples (for example in regions that are surgically inaccessible). Further, some of the authors' measurements may be of added value, as they suggest functional correlates for less-studied synapse structures (such as the differing sizes of complex and simple "macular" synapses formed onto dendritic spines vs. shafts).

      Weaknesses:<br /> Despite these strengths, the analysis in this work may be impacted by multiple sources of experimental variability that may have contributed to the observed lack of structural variability, and the potential contributions of these should be addressed in making their claims.

      (1) The authors' approach to tissue sampling may have resulted in under-sampling, which may have reduced the detection power of their tests. More specifically, each reconstructed EM volume measured ~10 µm x 7 µm x 6 µm (360 - 502 µm^3) and contained ~300-400 synapses (Lines 211-212, 772-773). Per donor, this amounts to a sampling volume of ~1500 µm^3 for each MEC subdivision or ~1x10^4 µm^3 total. By contrast, the volume of the adult human MEC is ~1x10^12 µm^3, roughly 1x10^8 times larger [1]. Thus, while these EM reconstructions reflect a substantial effort, it is likely that they represent an under-sampling of MEC structure, especially since multiple excitatory and inhibitory neuron types are likely interspersed throughout (the authors also note this possibility in Lines 640-659).

      (2) The authors' measurements are combined across three donors who are biologically diverse (Table S11), including in terms of characteristics that themselves may impact neuronal connectivity. Without controlling for these variables, the possible reduction in stochastic, biological inter-individual variability that could be achieved by combining data across donors may be offset by increases in phenotype-related variability, which could reduce the detectability of true, conserved connectivity variations across MEC subdivisions. Specifically, these donors represent a mix of males and females; a mix of ages (40, 53, and 66 years) that suggest differing degrees of aging-related changes in neuronal connectivity (according to previous work, a majority of people >55 years of age are estimated to have Alzheimer's-associated neurofibrillary tangles, regardless of whether they have dementia symptomatology; see for instance [1]); and one death from metastatic cancer, indicating that for one donor cellular/neuronal abnormalities associated either with cancer itself or related therapies could be present.

      These two factors could substantially increase the dispersion of the authors' measurements in each MEC subdivision and lead to a situation with no detectable differences between subdivisions. It would be important to address these impacts when determining whether to interpret a lack of significant differences as true biological homogeneity for human MEC.

      One helpful approach would be to explicitly show the variance of each measurement obtained for each EM reconstruction. For example, error bars showing the interquartile range could be added to each data point in Fig. 3C, to show how much synapse areas vary per reconstruction and to allow some comparison across donors and MEC subdivisions.

      (3) A third potential source of variability relates to the authors' approach for synapse annotation. They appear to annotate active zones and postsynaptic densities by thresholding synapse images at some user-defined pixel intensity value, taking only pixels darker than that threshold as their annotations (Lines 806 - 812). This technique seems like it could be prone to producing noisy annotations, particularly since in the EM images provided (Figs. S11-16) the pixel intensities of active zones/postsynaptic densities and surrounding neuropil do not appear to be highly distinct.

      It would be important for the authors to support their findings by quantifying the variability that may be associated with this technique.
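
      The sampling arithmetic in point (1) can be checked with a short calculation. The sketch below uses only the approximate figures quoted in this review; the variable names are illustrative, and rounding the per-donor total to 1x10^4 µm^3 follows the reviewer's estimate rather than any measurement.

```python
# Back-of-envelope check of the sampling fraction discussed in point (1).
# All numbers are the approximate values quoted in the review, not new data.

# Nominal dimensions of one reconstructed EM volume, in micrometres.
dims_um = (10, 7, 6)
volume_per_stack_um3 = dims_um[0] * dims_um[1] * dims_um[2]

# Range of stack volumes quoted in the review (um^3).
reported_range_um3 = (360, 502)

# Approximate totals quoted in the review (um^3).
sampled_per_donor_um3 = 10**4
adult_mec_volume_um3 = 10**12

# How many times larger the whole MEC is than the sampled tissue.
fold_difference = adult_mec_volume_um3 // sampled_per_donor_um3
# Fraction of the MEC volume covered by the reconstructions of one donor.
sampled_fraction = sampled_per_donor_um3 / adult_mec_volume_um3

print(f"nominal stack volume: {volume_per_stack_um3} um^3")
print(f"MEC volume / sampled volume: {fold_difference:.0e}")
print(f"fraction of MEC sampled per donor: {sampled_fraction:.0e}")
```

      The nominal stack volume falls inside the quoted 360 - 502 µm^3 range, and the fold difference matches the roughly 1x10^8 figure given above.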

      [1] Price, C.C. et al., J. Int. Neuropsychol. Soc., (2010), doi: 10.1017/S135561771000072X.

    3. Reviewer #2 (Public Review):

      Plaza-Alonso et al. characterize synaptic properties of the human medial entorhinal cortex across layers and, importantly, across multiple individuals. Using an impressive collection of post-mortem autopsy samples, they generate high-resolution 3D FIB-SEM volumes across layers and MEC subregions and measure features such as synapse density, spatial distribution, size, shape and target location. The use of volumes permits a richer local context to synaptic reconstructions, and the methods used to count and quantify synapses appear thorough and convincing, although with limited descriptions at times. The core findings suggest few differences in most properties across either layers or individuals, with some modest exceptions in layers 1 and 6. A particular strength of the dataset is the large number of high-quality synaptic contact reconstructions.

      However, because the volumes have no specific labels and are too small to associate axons or dendrites with individual cells or cell types, it is not clear how to extrapolate these findings to new insights toward the stated goal of a better understanding of the networks and connectivity characteristics of the MEC. Broadly speaking, the paper would benefit from a better explanation of why these specific parameters were chosen and what the authors hoped to gain from them. It might be useful to think of what would need to be the case to see something substantially different. Many of the measures here reflect the properties of dendrites passing through a small volume, which depends on the number of cells of different cell types, the length and thickness of their dendritic arbors, synapse density distributions, local and long-range afferents, and more. One interpretation of these results is that these neuropil volumes across layers and individuals are effectively fully packed with dendrites, with a similar ratio of excitatory and inhibitory neurons, dendrites with roughly similar thickness and synaptic input density and local E/I balance. Can the authors disentangle these cellular-scale contributions or constrain their inter-individual variability? The lack of variability is perhaps the main observation here, and understanding this more clearly could be useful for thinking about larger volumes where fewer replicates are currently possible.

    1. eLife assessment

      This important manuscript used state-of-the-art techniques and employed relevant animal models to provide both convincing and solid evidence supporting the regulatory role of microRNA cluster 221/222 in rheumatoid arthritis synovial fibroblast. The findings of this work offer significant advances to current knowledge which will be interesting to a wide range audience in the rheumatology and bone research fields. However, whereas models, techniques, and analyses are solid, certain concepts related to the role of immune and bone cells are limited.

    2. Reviewer #1 (Public Review):

      The current manuscript investigates the role of microRNA cluster 221/222 (miR221/222) in rheumatoid arthritis synovial fibroblasts (RA SFs) prompted by previous evidence that this cluster is upregulated in these cells. The authors employed multiple genetic mouse models and genomic approaches demonstrating that global overexpression of miR221/222 in huTNFtg polyarthritic mice further expanded SF proliferation and exacerbated RA, whereas global deletion reduced SF proliferation and dampened RA. Mechanistically, the authors provide sufficient evidence that these effects are mediated through the regulation of cell cycle inhibitors (p27 and p57) and the epigenetic regulator Smarca1. In general, these studies offer strong evidence that miR221/222 contributes to the pathogenic mechanisms underlying SF function in RA and provide new critical information to advance the understanding of RA pathology. However, certain important aspects are not addressed. Specifically, limited information related to the immune and inflammatory nature of this mechanism is offered, which is further complicated by limitations of using global overexpression and knockout. For example, the extent of the contribution by immune and inflammatory cells remains unknown, as do the SF-derived effectors that propagate tissue damage and erosion.

    3. Reviewer #2 (Public Review):

      This study focuses on the role of miR221/222 in the pathogenesis of rheumatoid arthritis. Through the use of different murine models and genome-wide techniques, the authors individuate a miR221/222 elicited mechanism leading to synovial fibroblast hyperproliferation. These discoveries may provide a rationale for future targeted therapies for RA treatment.

      miR-221 and miR-222 have been linked with arthritis in previous studies from this and other laboratories: miR-221 and miR-222 have been found upregulated in SFs derived from the huTNFtg mouse model and RA patients, where their expression correlates with disease activity. The novelty of the present study resides in the analysis of the role of miR-221/miR-222 in an in vivo system and provides insight into cellular and molecular mechanisms linking miR-221/222 to RA progression.

    4. Reviewer #3 (Public Review):

      In this study, Roumelioti et al demonstrate the role of miR-221/222 in synovial fibroblasts (SFs) in inflammatory arthritis, applying a plethora of methods in three transgenic mouse models (huTNFtg, TgColVI-miR-221/222, huTNFtg;TgColVI-miR-221/222). miR-221/222 is upregulated in SFs, upon stimulation with TNF, both in early and established disease, while its gene is activated, as shown by scATAC-seq data. Using RNA sequencing and KEGG pathway analysis, authors showed that overexpression of miR-221 and miR-222 exacerbates arthritis, mainly due to SFs proliferation, driven by cell cycling inhibition and extracellular matrix remodeling. Although the authors suggest the potential utility of miR-221/222 targeting in inflammatory arthritis treatment, this was only examined through miR-221/222 -/- mice generation and not by direct silencing of miR-221/222 by administering a miR-221/222 antagonist.

    1. eLife assessment

      This is a valuable investigation of how type 5 metabotropic receptor signaling contributes to regulation of striatal circuit dynamics, that focuses on its role in direct pathway striatal projection neurons. The range of methods deployed and levels of analysis undertaken are key strengths but concerns remain that make the conclusions incomplete at present. This study will be of great interest for its unique demonstration of metabotropic receptor regulation of striatal circuit dynamics, physiology and behavior.

    2. Reviewer #1 (Public Review):

      Summary:

      Marshall and coworkers describe the effects of altering metabotropic glutamate receptor 5 activity on locomotion and related activity of D1 receptor expressing spiny projection neurons in dorsolateral striatum. The authors also examine effects of dSPN-specific constitutive mGlu5 deletion in several motor tests. Effects of inhibiting the degradation of the endocannabinoid 2-arachidonoyl glycerol are also examined. Overall, this study provides intriguing new information with relevance to movement disorders and possibly psychosis. However, there are questions about the interpretation of dSPN activity in relation to movement, as well as the analysis approach. Some aspects of the study are also incomplete.

      Strengths:

      A nice combination of in vivo cellular calcium imaging, pharmacology, receptor knockout and sophisticated movement analysis is used. The authors conclude that mGlu5 expressed in dSPNs contributes to movement through effects on clustered spatial coactivity of dSPNs. Some data suggesting the story may be different in the other major SPN subpopulation (iSPNs) are also presented. The authors also suggest that mGlu5 stimulation of endocannabinoid signaling may play a role in the receptor effects. Overall, this study provides intriguing new information with relevance to movement disorders and possibly psychosis.

      Weaknesses:

      Major Comments:

      (1) The relationship between coactivity and movement in this and the previous study from this group is intriguing. Can the authors offer a hypothesis as to how decreased coactivity promotes increased movement velocity (e.g. as indicated by Figures 2l and 3m, and in the previous study)? Is coactivity during rest part of a "movement preparation" SPN program, or is it simply the case that the actual activity of individual dSPNs starts to contribute to different aspects of movement as velocity increases (given that the majority of neurons appear to show increased event rate during movement)?

      (2) The authors focus on dSPNs until very late in the study and then provide a little intriguing data suggesting that iSPNs show no difference in coactivity in the mGlu5 cKO mice. However, the basic characterization of the relationship between iSPN coactivity and movement is missing, although Figure 5g does seem to suggest a relationship between coactivity and proximity similar to dSPNs. It would be helpful to include the type of analysis shown in Figure 1 for iSPNs.

      (3) The use of the Jaccard similarity index in this study is not intuitive and not fully explained by the methods or the diagram in Figure 1. The more detailed explanations in the previous papers from this group seem to indicate cells are listed as "coactive" if they both show an above-threshold fluorescence increase during a one-second time frame after converting signals to a binary "on" or "off" status. However, it seems unlikely that the activity of the neurons would be perfectly or even strongly correlated, as there is bound to be variability in the exact traces from cell to cell. Furthermore, it is not clear how many frames need to show suprathreshold signals for two neurons to be considered coactive (or does this determine the magnitude of the normalized coactivity y-axis, e.g. in Figure 1i?). Thus, while the technique appears to capture some index of coactivity, it does not appear to reveal the true temporal correlations in activity that could be obtained with techniques that use all data points to assess correlations. While this technique may be well suited to determining coactivity based on action potentials, or another all-or-none type of biological event, it may not be as optimal for relating calcium transients that have more nuanced features.

      Another question is how the one-second time frame was chosen. Did the authors run a sensitivity analysis to determine the effect of changing the frame duration on coactivity estimates? This might help determine if the analysis was too conservative in identifying coactive neurons.

      These comments may reflect a lack of understanding of the approach on the part of this reviewer. Perhaps a more detailed explanation of the method, maybe including examples of the types of calcium transients that are listed as reflecting coactivity or lack thereof, would clarify the suitability of this technique.

      (4) The analysis of a possible 2-AG role in the mGlu5 mediated processes is incomplete and does not add much to the story. As the authors admit, inhibiting MGL globally will have widespread effects on many striatal synapses. Perhaps a dSPN-targeted approach, such as knocking out DAG lipase in dSPNs, would be more informative. For example, one might expect that this knockout would prevent the effects of the JNJ mGlu5 PAM on both movement and dSPN activity. The authors also do not provide any evidence of 2-AG involvement in the synaptic changes they report, although admittedly the role of endocannabinoids in DHPG-induced synaptic depression has been reported in several previous studies.

      (5) It would seem to be a simple experiment to examine effects of the mGlu5 NAM in the dSPN mGlu5 cKO mice. If effects of the two manipulations occluded one another this would certainly support the hypothesis that the drug effects are mediated by receptors expressed in dSPNs. A similar argument can be made for examining effects of the JNJ PAM in the cKO mice.
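
      The binarize-then-compare procedure questioned in comment (3), and the sensitivity analysis it asks for, can be illustrated with a minimal sketch. This is not the authors' pipeline: the threshold, the frame rate, the rule that a time bin counts as "on" if any frame in it exceeds the threshold, and all function names are assumptions made for illustration, and the traces are toy values.

```python
def binarize(trace, threshold, frames_per_bin):
    """Collapse a fluorescence trace into on/off states, one per time bin.

    Assumed rule: a bin is 'on' if any frame in it exceeds the threshold.
    """
    states = []
    for start in range(0, len(trace), frames_per_bin):
        window = trace[start:start + frames_per_bin]
        states.append(any(value > threshold for value in window))
    return states


def jaccard(a, b):
    """Jaccard index of two binary sequences: bins where both cells are on,
    divided by bins where at least one cell is on."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def coactivity_by_bin_width(trace_a, trace_b, threshold, widths):
    """Recompute the Jaccard index for several bin widths (in frames)."""
    return {
        width: jaccard(
            binarize(trace_a, threshold, width),
            binarize(trace_b, threshold, width),
        )
        for width in widths
    }


# Toy traces at an assumed 4 frames per second (so 4 frames = one second).
cell_1 = [0.1, 0.9, 0.2, 0.1, 0.1, 0.1, 0.1, 0.8, 0.1, 0.1, 0.1, 0.1]
cell_2 = [0.1, 0.1, 0.1, 0.7, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

sensitivity = coactivity_by_bin_width(cell_1, cell_2, threshold=0.5, widths=[1, 2, 4])
for width, score in sensitivity.items():
    print(f"bin width {width} frames: Jaccard = {score:.2f}")
```

      In this toy case the two cells never exceed the threshold in the same frame, yet they appear fully coactive once the bin spans four frames, which is the kind of dependence on frame duration that a sensitivity analysis would expose.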

      Minor Comments:

      (i) The use of CsF-based whole-cell internal solutions has caused concern in some past studies due to possible interference with G-protein, phosphatase and channel function (https://www.sciencedirect.com/science/article/abs/pii/S1044743104000296, https://www.jneurosci.org/content/jneuro/6/10/2915.full.pdf). It is reassuring the DHPG-induced LTD was still observable with this solution. However, it might be worth examining this plasticity with a different internal to ensure that the magnitude of the agonist effect is not altered by this manipulation.

      (ii) The Kreitzer and Malenka 2007 paper may not be the best to cite in the context of dSPN-related synaptic plasticity, as these authors claimed that DHPG-induced LTD was restricted to iSPNs (an observation that has not generally been supported by subsequent work in several laboratories).

    3. Reviewer #2 (Public Review):

      Strengths are that the topic is of significant interest and understudied and the combination of both genetic and pharmacological approaches. However, while there is great enthusiasm for the need to better understand mGluR5 roles in striatal circuitry, in its present form, there are three overarching concerns that significantly limit the impact of this study. First, while a Jaccard method is used to measure the spatiotemporal dynamics of dSPN activity, collectively the data herein do not support the authors' interpretation that mGluR5 is a modulator of spatiotemporal dSPN dynamics. Specifically, pharmacological and genetic manipulations of mGluR5 do not differentially/preferentially modulate the activity of proximal vs distal dSPNs; therefore, it could also be interpreted that mGluR5 is blanketly boosting/suppressing all dSPN activity as opposed to differential proximal/distal spatial relationships. While this is acknowledged in the manuscript (Figure 2i), it leaves open to question the extent to which mGluR5 is modulating other aspects of dSPN activity independent of the spatiotemporal relationship across dSPNs (i.e. amplitude, firing probability, etc.). Second, while it is a strength that mGluR5 NAM, PAM, and D1 Cre mGluR5-cKO were used to bidirectionally manipulate mGluR5 signaling, the manuscript lacks a clear model of where mGluR5 is acting to affect dSPN activity. This concern can be readily addressed by treating D1 Cre mGluR5-cKO mice with the mGluR5 NAM (as described in Ln. 413-416) to determine the extent to which other sources of mGluR5 are contributing to dSPN activity. The authors' working model predicts that the NAM would have no significant effects on the D1 Grm5 cKO model. Third, there are some concerns, detailed below, about the statistical basis for the conclusions drawn; addressing them will strengthen the rigor of the conclusions. Addressing these suggestions should strengthen the mechanistic understanding and further allow the authors to present a clearer working model for their findings.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Marshall et al. investigates the role of mGluR5 in modulating the coactivity of D1 spiny projection neurons (dSPNs) in the dorsolateral striatum through calcium imaging and pharmacological i.p. injections or targeted deletion of mGluR5 in dSPNs. They show a bidirectional modulation of dSPN coactivity by negative and positive allosteric modulators, respectively (mainly at rest); the increase in coactivity caused by the negative modulator was qualitatively similar to that caused by the deletion of mGluR5 in dSPNs.

      Strengths:

      Overall the study is well written and easy to read, with the data supporting (most of the time) the conclusion. It brings a new perspective on the role of mGluR5 in the modulation of dSPNs coactivity and its correlation with movement.

      Weaknesses:

      Some additional experiments would make the study more solid by providing further information, and the claims in the main text should be verified against the statistics in the figure legends.

    1. eLife assessment

      This study provides valuable insights into the mechanistic basis of neurological manifestations of RNA polymerase III-related disease by creating a mutant mouse to dissect transcriptional changes. The data are solid and provide compelling evidence for disease progression initiated by a global reduction in tRNA levels leading to integrated stress and innate immune responses and neuronal loss. These observations notwithstanding, additional studies will be necessary to separate the direct and indirect effects of diminished RNA polymerase III transcription on cellular function and neurodegeneration in this valuable mouse model. The work will be of interest to those engaged in the study of chromosome biology, developmental biology and neurodegeneration.

    1. Reviewer #1 (Public Review):

      The authors design an automated 24-well Barnes maze with 2 orienting cues inside the maze, then model what strategies the mice use to reach the goal location across multiple days of learning. They consider a set of models and conclude that the animals begin with a large proportion of random choices (choices irrespective of the goal location), which over days of experience becomes a combination of spatial choices (choices targeted around the goal location) and serial choices (successive stepwise choices in a given direction). Moreover, the authors show that after the animal has many days of experience in the maze, they still often began each trial with a random choice, followed by spatial or serial choices.

      This study is written concisely and the results are presented concisely. The best fit model provides valuable insight into how the animals solve this task, and therefore offers a quantitative foundation upon which tests of neural mechanisms of the components of the behavioral strategy can be performed. These tests will also benefit from the automated nature of the task.

    2. eLife assessment

      This study presents a valuable new behavioral apparatus aimed at differentiating the strategies animals use to orient themselves in an environment. The evidence supporting the claims is solid, with statistical modeling of animal behavior. Overall, this study will attract the interest of researchers exploring spatial learning and memory.

    3. Reviewer #2 (Public Review):

      This paper uses a novel maze design to explore mouse navigation behaviour in an automated analogue of the Barnes maze. A major strength is the novel and clever experimental design which rotates the floor and intramaze cues before the start of each new trial, allowing the previous goal location to become the next starting position. The modelling sampling a Markov chain of navigation strategies is elegant, appropriate and solid, appearing to capture the behavioural data well. This work provides a valuable contribution and I'm excited to see further developments, such as neural correlates of the different strategies and switches between them.

    4. Reviewer #3 (Public Review):

      The development of an automated Barnes maze allows for more naturalistic and uninterrupted behavior, facilitating the study of spatial learning and memory, as well as the analysis of the brain's neural networks during behavior when combined with neurophysiological techniques. The system's design has been thoughtfully considered, encompassing numerous intricate details. These details include the incorporation of flexible options for selecting start, goal, and proximal landmark positions, the inclusion of a rotating platform to prevent the accumulation of olfactory cues, and careful attention given to automation, taking into account specific considerations such as the rotation of the maze without causing wire shortage or breakage. When combined with neurophysiological manipulations or recordings, the system provides a powerful tool for studying the spatial navigation system.

      The behavioral experiment protocols, along with the analysis of animal behavior, are conducted with care, and the development of behavioral modeling to capture the animal's search strategy is thoughtfully executed. It is intriguing to observe how the integration of these innovative stochastic models can elucidate the evolution of mice's search strategy within a variant of the Barnes maze.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors design an automated 24-well Barnes maze with 2 orienting cues inside the maze, then model what strategies the mice use to reach the goal location across multiple days of learning. They consider a set of models and conclude that the animals begin with a large proportion of random choices (choices irrespective of the goal location), which over days of experience becomes a combination of spatial choices (choices targeted around the goal location) and serial choices (successive stepwise choices in a given direction). Moreover, the authors show that after the animal has many days of experience in the maze, they still often began each trial with a random choice, followed by spatial or serial choices.

      This study is written concisely and the results are presented concisely. The best fit model provides valuable insight into how the animals solve this task, and therefore offers a quantitative foundation upon which tests of neural mechanisms of the components of the behavioral strategy can be performed. These tests will also benefit from the automated nature of the task.

      Reviewer #2 (Public Review):

      This paper uses a novel maze design to explore mouse navigation behaviour in an automated analogue of the Barnes maze. A major strength is the novel and clever experimental design which rotates the floor and intramaze cues before the start of each new trial, allowing the previous goal location to become the next starting position. The modelling sampling a Markov chain of navigation strategies is elegant, appropriate and solid, appearing to capture the behavioural data well. This work provides a valuable contribution and I'm excited to see further developments, such as neural correlates of the different strategies and switches between them.

      Reviewer #3 (Public Review):

      Strength:

      The development of an automated Barnes maze allows for more naturalistic and uninterrupted behavior, facilitating the study of spatial learning and memory, as well as the analysis of the brain's neural networks during behavior when combined with neurophysiological techniques. The system's design has been thoughtfully considered, encompassing numerous intricate details. These details include the incorporation of flexible options for selecting start, goal, and proximal landmark positions, the inclusion of a rotating platform to prevent the accumulation of olfactory cues, and careful attention given to automation, taking into account specific considerations such as the rotation of the maze without causing wire shortage or breakage. When combined with neurophysiological manipulations or recordings, the system provides a powerful tool for studying the spatial navigation system.

      The behavioral experiment protocols, along with the analysis of animal behavior, are conducted with care, and the development of behavioral modeling to capture the animal's search strategy is thoughtfully executed. It is intriguing to observe how the integration of these innovative stochastic models can elucidate the evolution of mice's search strategy within a variant of the Barnes maze.

      Comments on revised version:

      The authors have addressed all the points I outlined in the previous round of review, resulting in significant improvements to the manuscript. However, I have one remaining comment. Given the updated inter-animal analysis (Supplementary Figure 8), it appears that male and female mice develop strategies differently across days. Male mice seem to progressively increase their employment of spatial strategy across days, at the expense of the random strategy. Conversely, female mice exhibit both spatial and serial strategies at their highest levels on day 2, with minimal changes observed on the subsequent days.

      These findings could alter the interpretation of Figure 5 and the corresponding text in the section "Evolution of search strategy across days".

      For instance, this statement on page 6 doesn't hold for female mice: "The spatial strategy was increased across days, ... largely at the expense of the random strategy."

      We agree with the reviewer. While the text on page 6 is still valid for the male-female pooled data, we have clarified in the next section describing male-female differences that this trend is not observed in female mice. Furthermore, we adjusted the relevant part of the discussion in the following manner:

      “A shift in the proportion of random, spatial and serial strategies was observed across days. Several factors might contribute to this shift, including learning of the environment and goal location, changes in motivation for exploration versus goal-directed navigation, and the evaluation of each strategy’s benefit via reinforcement learning. The spatial strategy progressively increased, mostly at the expense of the random strategy. This trend suggests a diminishing interest in exploration and an increasing benefit from employing the spatial strategy as the mice became more familiar with the environment and goal location. Consistent with this hypothesis, the development of the spatial strategy approximately matched the development of spatial maps in the hippocampus37 and the growth pattern of hippocampal feedforward inhibitory connectivity62, both showing progressive increases that reached plateaus after a week. In contrast, the serial strategy showed a sudden increase from day 1 to day 2, indicating that this goal-directed strategy is associated with rapid learning and could already be reinforced on day 2. However, the strategy shift was not uniform across the mouse population, as male and female mice showed distinct trends. Female mice showed no progressive increase in spatial strategy and initially relied more on the spatial strategy while using the random strategy less compared to male mice. This difference might be explained by faster learning of goal location and/or a stronger inclination towards goal-directed navigation over exploration in female mice.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor points:

      (1) The following sentence in the abstract is not grammatical: "The processes randomly selected vestibules based on either uniform (random) or biased (serial and spatial) probability distributions; closely matched experimental data across a range of statistical distributions characterizing the length, distribution, step size, direction, and stereotypy of vestibule sequences; and revealed a shift from random to spatial and serial strategies over time, with a strategy switch occurring approximately every 6 vestibule visits."

      One possible revision is: "The processes randomly selected vestibules based on either uniform (random) or biased (serial and spatial) probability distributions; [they] closely matched experimental data across a range of statistical distributions characterizing the length, distribution, step size, direction, and stereotypy of vestibule sequences, [revealing] a shift from random to spatial and serial strategies over time, with a strategy switch occurring approximately every 6 vestibule visits."

      We followed the reviewer’s suggestion.

      (2) There is a missing word in the following sentence in the last paragraph of the discussion: "Our tools might be combined in the future with optogenetic and/or pharmacogenetic [missing word here] to investigate the neural mechanisms underlying strategy selection"

      We added the word ‘manipulations’: ‘… optogenetic, pharmacogenetic manipulations …’

      Reviewer #2 (Recommendations For The Authors):

      I have two minor suggestions:

      (1) Results - Automated Maze section: It would be beneficial to clarify here that the floor and cues rotate, allowing automation by chaining start/end positions together. This information is key to the reader understanding the task, and currently they would only know this by studying Fig. 1 or delving into the methods.

      As suggested by the reviewer, we have added the following text in the Results - Automated Maze section:

      “The maze consists of an enclosed arena with an array of 24 doors evenly spaced along the periphery, and two home boxes moving around the arena perimeter. Start positions are changed by rotating the arena and the home boxes (Fig. 1b). Furthermore, the arena has a tinted cover that prevents mice from seeing room cues while still allowing for infrared tracking of mouse trajectories.”

      (2) I still find the authors' decision to exclude days from some of the line plots, e.g. days 3, 4 and 5 from Fig. 2, a little odd, as this makes the reader wary. I appreciate their argument about clarity, but this can still be achieved while partitioning all of the data rather than excluding certain days. NB: I do not find the heat map distributions in the far panel a particularly good substitute for this, as pixel intensities are far less interpretable.

      We appreciate the reviewer’s comment. We want to point out that line plots for all individual days are actually displayed in Supplementary Figure 7a.

      Reviewer #3 (Recommendations For The Authors):

      Although the difference between females and males is clear in Figure S8b, please note that the statistics in panels C and D might not be appropriate, as many of them may become insignificant if adjusted for multiple comparisons.

      If we understand correctly, a Bonferroni correction would need to consider the 3 day intervals in Figure S8c and the 2 day groups in Figure S8d. This would mean a significance threshold of 0.05/3 = 0.016667 for Figure S8c and 0.05/2 = 0.025 for Figure S8d, after Bonferroni correction. As it stands, all comparisons that are not labelled ’ns’ in Figure S8c-d remain significant even after applying the Bonferroni correction.
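
      The thresholds quoted in this response follow from dividing the nominal significance level by the number of comparisons. A minimal Python sketch of that arithmetic, assuming a nominal level of 0.05 as stated in the response; the function names and the example p-values are hypothetical and are not taken from the manuscript:

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that stays at or below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Figure S8c compares 3 day intervals; Figure S8d compares 2 day groups.
threshold_s8c = bonferroni_threshold(0.05, 3)
threshold_s8d = bonferroni_threshold(0.05, 2)

# Hypothetical p-values, for illustration only (not the authors' data).
example_flags = significant_after_bonferroni([0.01, 0.02, 0.04])
```

      With three comparisons, only p-values at or below roughly 0.0167 remain significant, so a raw p-value of 0.02 would no longer pass in this illustration.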

    1. eLife assessment

      This important manuscript describes experimental evolution experiments using a novel genetic system in yeast, showing that solute carrier transporters can incorporate additional functionality through the introduction of point mutations to either the ligand binding site or gating helices. These findings provide convincing evidence to establish that for Amino Acid transporters of the APC-type family, evolution to recognize new substrates passes through generalist intermediates that can transport most amino acids.

    2. Reviewer #1 (Public Review):

      Summary:

      The evolution of transporter specificity is currently unclear. Did solute carrier systems evolve independently in response to a cellular need to transport a specific metabolite in combination with a specific ion or counter metabolite, or did they evolve specificity from an ancestral protein that could transport and counter transport most metabolites? The present study addresses this question by applying selective pressure to Saccharomyces cerevisiae and studying the mutational landscape of two well characterised amino acid transporters. The data suggest that AA transporters likely evolved from an ancestral transporter and then specific sub families evolved specificity depending on specific evolutionary pressure.

      Strengths:

      The work is based on sound logic and the experimental methodology is well thought through. The data appear accurate, and where ambiguity is observed (as in the case of citrulline uptake by AGP1), in vitro transport assays are carried out to verify transport function.

      Weaknesses:

      The revisions have substantially strengthened the conclusions based on the results of this study. Follow up studies will no doubt try to rationalise/identify if specific mutational hot-spots exist within the APC fold that explain the specialisation observed in mammals (neurotransmitter vs. metabolic) for example.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper describes evolution experiments performed on yeast amino acid transporters aiming at the enlargement of the substrate range of these proteins. Yeast cells lacking 10 endogenous amino acid transporters and thus being strongly impaired to feed on amino acids were again complemented with amino acid transporters from yeast and grown on media with amino acids as the sole nitrogen source.

      In the first set of experiments, complementation was done with seven different yeast amino acid transporters, followed by measuring growth rates. Despite most of them having been described before in other experimental contexts, the authors show that many of them have a broader substrate range than initially thought.

      Moving to the evolution experiments, the authors used the OrthoRep system to perform random mutagenesis of the transporter gene while it is actively expressed in yeast. The evolution experiments were conducted such that the medium would allow for poor/slow growth of cells expressing the wt transporters, but much better/faster growth if the amino acid transporter would mutate to efficiently take up a poorly transported (as in case of citrulline and AGP1) or non-transported (as in case of Asp/Glu and PUT4) amino acid.

      This way and using Sanger sequencing of plasmids isolated from faster-growing clones, the authors identified a number of mutations that were repeatedly present in biological replicates. When these mutations were re-introduced into the transporter using site-directed mutagenesis, faster growth on the said amino acids was confirmed. Growth phenotypes were confirmed by uptake experiments using radioactive amino acids; corresponding correlation plots show that the assays based on growth rates versus radioactive uptake assays indeed can explain the effect of the mutations to a large extent.

      When mapped to AlphaFold prediction models of the transporters, the mutations mapped to the substrate permeation site, which suggests that the changes allow for more favorable molecular interactions with the newly transported amino acids.

      Finally, the authors compared growth rates of the evolved transporter variants with those of the wt transporter and found that some variants exhibit a somewhat diminished capacity to transport their original range of amino acids, while other variants were as fit as the wt transporter in terms of uptake of their original range of amino acids.

      Based on these findings, the authors conclude that transporters can evolve novel substrates through generalist intermediates, either by increasing a weak activity or by establishing a new one.

      Strengths:

      The study provides evidence in favour of an evolutionary model, wherein a transporter can "learn" to translocate novel substrates without "forgetting" what it used to transport before. This evolutionary concept has been proposed for enzymes before, and this study shows that it also can apply to transporters. The concept behind the study is easy to understand, i.e. improving growth by uptake of more amino acids as nitrogen source. In addition, the study contains a large and extensive characterization of the transporter variants, including growth assays and radioactive uptake measurements. The authors performed experiments as part of the revision to show that the studied mutations do not greatly change surface expression of the transporters. Further they showed that in the absence of the evolutionary pressure, overexpression of the mutants versus the wildtype transporters does not affect growth rates, which is important to assess. Finally, the authors make careful conclusions saying that in real life, the evolutionary landscape is way more complex than under these "reductive" laboratory conditions with a strain lacking ten natively expressed amino acid transporters and being selected on a single amino acid in a defined medium.

      Weaknesses:

      The authors took a genetic gain-of-function approach based on random mutagenesis of the transporter. While this experimental approach is suited to find some gain-of-function variants for some of the amino acids, it has also its inherent limitations, the most important being that loss-of-function mutants are not sampled (though they might be interesting) and that mutagenesis is entirely random, thus not targeted. These weaknesses cannot be easily overcome other than by restarting the entire study and conducting for example deep mutational scanning experiments. The authors have done what they could do within the scope of this study to make this manuscript as complete and rigorous as possible.

    4. Reviewer #3 (Public Review):

      The goal of the current manuscript is to investigate how changes in transporter substrate specificity emerge in response to a novel selective pressure. The authors investigate the APC family of amino acid transporters, a large family with many related transporters that together cover the spectrum of amino acid uptake in yeast.

      The authors use a clever approach for their experimental evolutions. By deleting 10 amino acid uptake transporters in yeast, they develop a strain that relies on amino acid import by introduced APC transporters under nitrogen limiting conditions. They can thus evolve transporters towards transport of new substrates if no other nitrogen source is available. The main takeaway from the paper is that it is relatively easy for the spectrum of substrates in a particular transporter of this family to shift, as a number of single mutants are identified that modulate substrate specificity. In general, transporters evolved towards gain-of-function mutations (better or new activities) also confer transport promiscuity, expanding the range of amino acids transported.

      The data in the paper support the conclusions, and the outcomes (evolution towards promiscuity) agree with the literature available for soluble enzymes. The authors do a good job in the discussion of relating the lessons of the current study to natural evolution.

    5. Author response:

      The following is the authors’ response to the original reviews.

      (1) The authors should show i) whether the variants exhibit the same surface expression as wildtype and ii) whether changes of surface expression (e.g. wt transporter expressed low and high) alters growth rates under conditions where growth depends on amino acid uptake. The authors say that the uptake of radioactive substrate and the overall fitness coincide (Figures 5 and 6), but it would be good to quantify the correlation, perhaps by using a scatterplot and linear regression.

      We thank the reviewer for the questions and proposals. The comparison of the surface expression between the transporter-expressing variants was added to the manuscript (Figure 3- Figure supplement 1 and 2). In the case of the AGP1 variants it was calculated that surface expression between the evolved mutants and the wild-type is similar, indicating that the transporter overexpression has no impact on the growth rate per se. The same analysis for the PUT4 variants showed significant difference, with the PUT4-S variant seemingly expressed more than the wild-type. However, that does not seem to affect the uptake effect of the mutation in the cases of the original substrates of Ala, Gly and GABA, since in those cases the transporter activity for the evolved variant is substantially decreased (Figure 5). Thus, the variation on the surface expression between the mutant and the wild-type, which could be attributed to the small sample size and the inherent limitations of the analysis (imaging of a culture with cells in different planes), is not expected to interfere with the reported results.

      Additionally, a scatterplot accompanied with a linear regression curve describing the connection between the overall fitness and uptake of 2 mM radioactive substrates was added to the manuscript, as advised (Figure 5- Figure supplement 2). In both cases of 2 mM Phe or Glu, the regression model explains 60-70% of the variation observed in the uptake rate of the amino acids by the different variants if changes in the uptake rate are dependent on changes in the fitness.
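
      The "60-70% of the variation explained" figure corresponds to the coefficient of determination (R²) of a least-squares line. A minimal Python sketch of how such a value is obtained, assuming fitness as the predictor and uptake rate as the response, as described in this response; the function name and the toy numbers are hypothetical and are not the authors' data:

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line y = intercept + slope * x, with R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each contain some variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Toy values for illustration only: relative fitness (x) and uptake rate (y).
toy_fitness = [1.0, 2.0, 3.0, 4.0]
toy_uptake = [2.0, 1.0, 4.0, 3.0]
slope, intercept, r_squared = linear_fit_r2(toy_fitness, toy_uptake)
```

      An R² of 0.6 to 0.7 would mean that the fitted line accounts for 60-70% of the variance in uptake rate across variants, with the remainder left unexplained by fitness alone.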

      (2) The authors should further investigate to what extent the (over)expression of wildtype versus variant transporters impacts growth rates. I would recommend such experiments being done under conditions where nitrogen uptake does not depend on amino acid uptake. I could imagine that some of the fitness data are confounded by the general effects of mutations on growth rates. More concretely, I could imagine that overexpression of e.g. the AGP1-G variant is less of a burden for the yeast cells and would allow to grow them better in general. This could explain why its overall fitness is close to wt, whereas other variants exhibit diminished fitness (Fig. 4A).

      The growth curves of all transporter variant cultures in the absence of selection for amino acid uptake have been presented in Figure 4 - Supplement figure 1. As proposed, the growth rates of the variants in medium with ammonium as nitrogen source were calculated and presented in Figure 3- Supplement figure 1 and 2. For both cases of AGP1 and PUT4 expressing variants, statistical analysis showed no significant difference between the mutants and the wild-type.

      (3) It is quite remarkable that the PUT4-S variant has such a dramatically enlarged substrate spectrum. In addition, the fitness losses for Alanine and GABA are rather small. This striking finding asks the question of why yeast has not evolved this much better/more efficient variant in the first place?

      We thank the reviewer for this very good question. We now included an explanation in the Discussion, but to give a short answer here: One should keep in mind that we used a 10-gene deletion strain to select for given mutants. Wild-type cells have a wide spectrum of substrates through the use of many amino acid transporters, and their regulation is intricately tuned to achieve optimum transport under any environmental circumstance. Broadening the spectrum of a single transporter thus would not lead to increased fitness. On the contrary, it would probably throw off this fine balance.

      (4) It would be generally interesting which types of selections (transporter/amino acid combinations) were tried (maybe as part of the methods section). I could imagine that the examples that are shown in the paper are the "tip of the iceberg", and that many other trials may have failed either because the cultures died, or the identified clones would grow faster due to mutations outside of the plasmid. It would be helpful for researchers planning such experiments in the future to be made aware of potential stepping stones.

      The issues raised here are spot-on, as we actually did test the evolution of PUT4 towards transport of other amino acids than the two mentioned in the report. Aside from the successful Asp and Glu, we ran parallel cultures selecting for transport of Gln, Thr, Trp, Tyr, and Cit. None of these evolution regimes led to increased growth phenotypes that were linked to the evolved gene, and we did not investigate these cultures further. At this point, we cannot fully explain this result, which is why we decided to omit it from the report. The L207S variant of PUT4 was later shown to indeed support growth on Gln, Thr, and Cit. Therefore, we speculate that the reason for not evolving this mutant in the respective evolution cultures was that the fitness gain in these amino acids was not large enough to be sufficiently enriched in the course of the evolution trial. Given that the Δ10AA strain still harbors nine amino acid transporter genes in its genome, it is conceivable that upregulation of some of these genes causes growth in some amino acids, prohibiting the selection of mutations in PUT4 (e.g., by mutations outside the plasmid, as the reviewer aptly suggested). We deemed these (negative) results not appropriate for the manuscript, as our main focus was characterizing the fitness effects of single mutations, not the laboratory evolution process of obtaining the mutants.

      (5) The authors took a genetic gain-of-function approach based on random mutagenesis of the transporter. In such approaches, it is difficult to know which mutation space is finally covered/tested, and information that can be gained from loss-of-function analyses is missed. Accordingly, the outcome is somewhat anecdotal. To provide an idea of the mutational landscape accessible, the authors could perform NGS of cultures without any selective pressure, and report the distribution of missense variants in the population.

      We very much appreciate the interest in the details of the mutagenesis. Based on the information given in the original OrthoRep publications (e.g., Ravikumar et al., DOI: 10.1016/j.cell.2018.10.021; mutation rate approx. 10⁻⁵ per generation and nucleotide), we calculated the expected number of mutations per passage in our experiments. For AGP1, it is about 5000 mutational events per passage (10 mL culture volume and 1:200 dilution), and for PUT4, it is about 1000 mutational events per passage (2 mL culture volume and 1:100 dilution). At a gene length of about 2000 bp, we expect to cover most single mutations already in the first or second passage (in the absence of selection). This is reflected in the result that the strongly beneficial mutation L207S in PUT4 was recovered in every selection on Asp or Glu we tested. We included this information in the Methods section.

      That said, the present study was consciously designed to research gain-of-function mutations, as we wanted to know if and how membrane transporters can evolve new substrate specificities without losing the original functions. Our approach was chosen to reflect as close as possible a natural scenario where a microorganism encounters a new ecological niche (a new nutrient to be transported). At the same time, we included selective pressure to keep the capacity to thrive in the original niche (to assimilate an ancestral nutrient). This approach is designed to specifically select against any loss-of-function mutations, which is in line with most modern theories about evolution of protein function (excellently reviewed in Soskine and Tawfik, DOI: 10.1038/nrg2808). We find that this approach gives a good idea how transporters could evolve new functions in a natural setting. By engineering single mutations in the wild-type background of the transporters, we show the fitness effects of different single mutations - this finding thus does not depend on the mutational landscape that is covered in the experiment.

      (6) The authors do not discuss the impact of these mutations on transport rates/kinetics, which are known to play a role in substrate selection in solute carriers (https://www.nature.com/articles/s41467-023-39711-y). Do the authors think ligand binding/recognition is more important than kinetic selection in the evolution of function?

      Indeed, the observed phenotypes can stem from both changes in transport rate and changes in substrate binding. In our opinion, both are perfectly possible explanations for the behavior of evolved transporter variants. We are not discussing this in the manuscript as the weak transport of the novel substrates in the wild-type transporters did not allow us to unambiguously assign one or the other. Yet, we can lend minor circumstantial evidence pointing towards substrate affinity being the more important factor in evolving a new activity in transporters: Overall transport rate (for original substrates) declined in most evolved transporters. Therefore, it is a bit less likely that improved transport rate allowed novel substrates to be used as a nutrient. However, this is not to say that both processes cannot occur (even side by side).

      (7) Ultimately, what are the selective pressures that drive transporter function? The authors pose this question but don't fully develop the idea. Would promiscuous variants still be selected for if the limiting nitrogen source was taken up by the cell via a different pathway (i.e. ammonium or perhaps arginine)?

      Evolution and regulation of transporters is a very complex system, and we simplify this system in our single-transporter/single-amino acid approach. In nature, the selective forces are assumed to be much smaller than in our system, and multiple selective pressures might occur at the same time (maybe even in opposite directions). Therefore, such predictions are beyond the scope of the present study. To put it shortly, yeasts (and other organisms) have evolved the capacity to transport all natural amino acids. Yet, to actually allow fine-tuned regulation of transport of each individual amino acid, narrow- and broad-range transporters have evolved, including a lot of redundancy. This means that the question posed cannot be answered by yes or no, but by “it depends”.

      (8) Amino acids are a special class of metabolites, in that they all have the same basic structure. Thus, transport systems really only need to recognize the amino and carboxyl groups with high fidelity, and can modulate the side chain binding site to increase specificity. This was demonstrated in a bacterial APC transporter (https://www.nature.com/articles/s41467-018-03066-6#Sec2). Is this why the APC fold is largely responsible for AA uptake in biology?

      Indeed, typically, APC-type amino acid transporters bind the amino and carboxyl groups in the same position by backbone interactions. Therefore, this might be an ancestral feature of the APC superfamily and explain why this group represents the main group of amino acid transporters.

      (9) There isn't much discussion on the location of the mutations with respect to binding site vs. gating helices. Are there hotspots of mutations within the APC, and areas where variation is poorly tolerated? It would be helpful to briefly review what is known about mutations that change amino acid specificity in the APC family. My impression is that other studies applying rational mutagenesis have also shown that single-site mutations in the binding pocket alter substrate specificity - are these analogous to the L207 in PUT4? PUT4: I64T comes up in 3 of 5 selections. Did the authors consider a closer analysis of this mutation, and if not, why?

      We agree that it would be helpful to determine hotspots of mutations in APC transporters that lead to changes in selectivity. However, we feel that the current literature does not lend enough data to support an extended analysis of such hotspots. Conversely, the natural sequences of APC transporters are not similar enough to determine which residues are responsible for a certain selectivity profile. There are however some studies on site-directed mutagenesis, as mentioned by the reviewer. A short summary of those is discussed in the revised paper. Interpretation of the previous studies in light of our results suggests that the evolved sites identified in our work play a significant role in substrate selectivity and transporter function within the superfamily of the APC transporters.

      As to the question why we did not include the I64T mutation in our experiments: this mutation lies within the poorly defined N-terminus of the protein, which is not part of the transmembrane core. We therefore deemed this residue as probably not connected to the specificity of the protein; it might be related to the protein’s stability in the cell, as the termini of transporters are known to be important for post-translational regulation, especially vacuolar degradation.

      (10) What do we learn about the APC fold that informs our understanding of where substrate specificity arises in this fold? Do the authors think all SLC folds are equally capable of adaption, or are some more evolutionary-ready than others? An evolutionary analysis of these transporters to gain insights into whether the identified substitutions also occurred during natural evolution under real-life conditions would further strengthen the manuscript. Could the authors provide a sense of how similar the 18 yeast amino acid transporters are, such as sequence alignments or a matrix of pairwise sequence identity/similarity? Are they very diverged, or is the complement of amino acid substrates covered by a rather conserved suite of transporters?

      We do not want to make bold statements about adaptive evolution in other SLC folds, but we consider it not unlikely that a similar approach will lead to similar conclusions in other transporters.

      As advised, a pairwise identity matrix was added to the manuscript (Figure 1–figure supplement 2).

      As to the proposed analysis focusing on natural occurrence of the mutations we found: we have indeed looked into this, but have not found evidence of such mutations. This is actually expected, as our selection regime puts “unnatural” selective pressures on a single transporter in isolation, which in reality co-evolved with a whole suite of other transporters that already have the capacity to transport all amino acids. Therefore, it is unlikely that the same mutations would happen in a natural setting. Our study is designed to capture evolution where a completely novel substrate is encountered, for which no transport mechanism has evolved yet.

      (11) Throughout: some of the bar graphs show individual data points, but others do not (Figure 3, Figure 5). These should be shown for all experiments.

      We thank the reviewer for the comment. In the revised version of the manuscript, we included individual data points in all bar graphs.

      (12) For bar graphs in which no indication of significance is shown, does this mean that p>0.05? Comparisons that are not significant (p>0.05) should be indicated as such.

      We thank the reviewer for the comment. In the revised version of the manuscript, we indicated in the legends that in cases of no significant difference (p > 0.05) between the wild-type and the evolved variants, no asterisks are shown.

      (13) Figure 5, Figure 6: Are the three confocal images just three different fields of view? It might be useful to include a zoom-in on a single representative cell, as it is hard for the reader to see to evaluate the membrane localization.

      In the revised version of the manuscript, we clarified that the three confocal images represent three different cultures, as each variant was tested in triplicate. We also included a zoom-in of a representative cell, as suggested.

      (14) In the main text, page 9, the conditions used for each experimental evolution are not clear ("nitrogen limiting mixture of amino acids (1 mM final concentration)"). I think this is an important detail, since the mixtures are quite different for the more promiscuous vs. the more selective transporter, and it would be helpful if this was described more clearly in the main text.

      We thank the reviewer for the comment. We have included further clarification in the revised manuscript.

      (15) Figure 1-Supplement 1 and Figure 4 Supplement 4 - can't read the figure labels. Try labeling columns and rows rather than individual plots.

      We have taken the proposal into account and revised the proposed Figures accordingly.

      (16) Page 9: "The transporter gene was sequenced and re-introduced into Delta-10AA cells." Was the plasmid isolated, sequenced, and re-introduced, or was the gene cut-and-pasted into a new vector backbone?

      In the revised manuscript we have clarified that the gene was sequenced and then cloned into the expression vector and re-introduced into naïve Δ10AA cells.

    1. eLife assessment

      The manuscript examined the potential modulatory effects of nitric oxide (NO) on the response properties of mouse retinal ganglion cells (RGCs) using two-photon calcium imaging and multi-electrode arrays (MEA). The data identifying a group of RGCs affected by NO are solid but fall short on the precise nature of the effects and their physiological implications. The findings that there can be cell-specific adaptation effects provide useful new information for the field, and more experiments and MEA analysis are encouraged.

    2. Reviewer #1 (Public Review):

      Summary:

      Nitric oxide (NO) has been implicated as a neuromodulator in the retina. Specific types of amacrine cells (ACs) produce and release NO in a light-dependent manner. NO diffuses freely through the retina and can modulate intracellular levels of cGMP, or directly modify and modulate proteins via S-nitrosylation, leading to changes in gap-junction coupling, synaptic gain, and adaptation. Although these system-wide effects have been documented, it is not well understood how the physiological function of specific neuronal types is affected by NO. This study aims to address this gap in our knowledge.

      Strengths:

      NO was expected to produce small effects, and considerable effort was expended in validating the system to ensure that any effects of NO would not be confounded by changes in the state of the preparation. The authors used a paired stimulus protocol to control for changes in the sensitivity of the retina during the extended recording periods. The approach potentially increases the sensitivity of the measurements and allows more subtle effects to be observed.

      Neural activity was initially measured by Ca-imaging. Responsive ganglion cells were grouped into 32 types using a clustering analysis. Initial control experiments demonstrated that the cell-types revealed here largely recapitulate those from their earlier landmark study using the same approach (Fig. 2).

      Application of NO to the retina strongly modulated responses of a single cluster of cells, labeled G32, while having little effect on the remaining 31 clusters. This result is evident in Fig. 3e.

      Separate experiments measured ganglion cell spiking activity on a multi-electrode array (MEA). Clustering analysis of the peri-stimulus spike-time histograms (PSTHs) obtained from the MEA data also revealed 32 clusters. The PSTHs for each cluster were aligned to the Ca-imaging data using a convolution approach. The higher temporal resolution of the MEA recordings indicated that NO increased the speed of sub-cluster 2 responses but had no effect on receptive field size. The physiological significance of the small change in kinetics remains unclear.

      Weaknesses:

      The G32 cluster was further divided into three sub-types using Bayesian Information Criterion (BIC) based on the temporal properties of the Ca-responses. This sub-clustering result seems questionable due to the small difference in the BIC parameter between 2 and 3 clusters. Three sub-clusters of the G32 cluster were also revealed for the PSTH data; however, the BIC analysis was not applied to further validate this result.
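
      To make the concern about a small BIC difference concrete, the following Python sketch applies the standard definition BIC = k·ln(n) − 2·ln(L) to candidate models and reports each model's gap to the best one. It is only an illustration: the log-likelihoods, parameter counts, and sample size are made-up placeholders rather than values from the manuscript, and the function names are hypothetical.

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian Information Criterion; lower values indicate a preferred model."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def rank_models(fits, n_samples):
    """Sort models by BIC and attach each model's difference to the best BIC.

    fits maps a label to (maximised log-likelihood, number of free parameters).
    """
    scores = {label: bic(ll, k, n_samples) for label, (ll, k) in fits.items()}
    best = min(scores.values())
    return sorted(
        ((label, score, score - best) for label, score in scores.items()),
        key=lambda item: item[1],
    )


# Placeholder fits for illustration only (not the authors' results).
hypothetical_fits = {
    "1 cluster": (-250.0, 2),
    "2 clusters": (-200.0, 5),
    "3 clusters": (-192.5, 8),
}
ranked = rank_models(hypothetical_fits, n_samples=100)
for label, score, delta in ranked:
    print(f"{label}: BIC = {score:.2f}, difference to best = {delta:.2f}")
```

      In this toy case the three-cluster model has the lowest BIC, but the two-cluster model trails it by only about one unit, a gap that is often read as weak support for the extra cluster.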

      The alignment of sub-clusters 1, 2, and 3 identified in the Ca-imaging and the MEA recordings seemed questionable, because the temporal properties of clusters did not align well, nor did the effects of NO.

      The title of the paper indicates that nitric oxide modulates contrast suppression in a subset of mouse retinal ganglion cells; however, this result appears to be inferred from previous results showing that G32 is identified as a "suppressed-by-contrast" cell. The present study does not explicitly evaluate the amount of contrast-suppression in G32 cells.

      In its current form, the work is likely to have limited impact, since the morphological and functional properties of the affected sub-cluster remain unknown. The finding that there can be cell-specific adaptation effects during experiments on in vitro retina is important new information for the field.

    3. Reviewer #2 (Public Review):

      Neuromodulators are important for circuit function, but their roles in the retinal circuitry are poorly understood. This study by Gonschorek and colleagues aims to determine the modulatory effect of nitric oxide on the response properties of retinal ganglion cells. The authors used two photon calcium imaging and multi-electrode arrays to classify and compare cell responses before and after applying a NO donor DETA-NO. The authors found that DETA-NO selectively increases activity in a subset of contrast-suppressed RGC types. In addition, the authors found cell-type specific changes in light response in the absence of pharmacological manipulation in their calcium imaging paradigm. While this study focuses on an important question and the results are interesting, the following issues need further clarification for better interpretation of the data.

      (1) Design of the calcium imaging experiments: the control-control pair has a different time course from the control-drug pair (Fig 1e). First, the control-control pair has a 10 minute interval while the control-drug pair has a 25 minute interval. Second, Control 1 Field 2 was imaged 10 min later than Control 1 Field 1 since the start of the calcium imaging paradigm.

      Given that the control dataset is used to control for time-dependent adaptational changes throughout the experiment, I wonder why the authors did not use the same absolute starting time of imaging and the same interval between the first and second round of imaging for both the control-control and the control-drug pairs. This can be readily done in one of two ways: 1. In a set of experiments, add DETA/NO between "Control 1 Field 1" and "Control 2 Field 1" in Fig. 1e as the drug group; or 2. Omit DETA/NO in the Fig. 1e protocol as the control group to monitor the time course of adaptational changes.

      Related to the concern above, to determine NO-specific effects, the authors used the criterion that "the response changes observed for control (ΔR(Ctrl2−Ctrl1)) and NO (ΔR(NO−Ctrl1)) were significantly different". This criterion assumes that without DETA-NO, imaging data obtained at the time points of "Control 1 Field 2" and "DETA/NO Field 2" would give the same value of ΔR as ΔR(Ctrl2−Ctrl1) for all RGC types. It is not obvious to me why this should be the case, because of the unknown time-dependent trajectory of the adaptational change for each RGC type. For example, an RGC type could show a stable response in the first 30 min and then change significantly in the following 30 min. DETA/NO may counteract this adaptational change, leading to the same ΔR as the control condition (false negative). Alternatively, DETA/NO may have no effect, but the nonlinear time-dependent response drift can give false positive results.

      I also wonder why washing-out, a standard protocol for pharmacological experiments, was not done for the calcium protocol since it was done in the MEA experiments. A reversible effect by washing in and out DETA/NO in the calcium protocol would provide a much stronger support that the observed NO modulation is due to NO and not to other adaptive changes.

      (2) Effects of Strychnine: In lines 215-219, "In the light-adapted retina, On-cone BCs boost light-Off responses in Off-cone BCs through cross-over inhibition (83, 84) and hence, strychnine affects Off-response components in RGCs - in line with our observations (Fig. S2)". However, Fig. S2 doesn't seem to show a difference in the Off-response components. Rather, the On response is enhanced with strychnine. In addition, suppressed-by-contrast cells are known to receive glycinergic inhibition from VGluT3 amacrine cells (Tien et al., 2016). However, the G32 cluster in Fig. S2 doesn't seem to show a change with strychnine. More explanation of these discrepancies would be helpful.

      (3) This study uses DETA-NO as an NO donor for enhancing NO release. However, a previous study by Thompson et al., Br J Pharmacol. 2009 reported that DETA-NO can rapidly and reversibly induce a cation current independent of NO release at the 100 µM used in the current study, which could potentially cause the effects observed in the G32 cluster, such as reduced contrast suppression and increased activity. This potential caveat should at least be discussed, and ideally excluded by showing the absence of DETA-NO effects in nNOS knockout mice, and/or by using another pharmacological reagent such as the NO donor SNAP or the nNOS inhibitor l-NAME.

      (4) Clarification of methods: In the Methods, lines 1119-1127, the authors describe the detrending, baseline subtraction, and averaging. Then, in line 1129, "the mean activity r(t) was computed and then traces were normalized such that: max_t(|r(t)|) = 1". How is the normalization done? Is it over the entire recording (control and wash in) for each ROI? Or is it normalized based on the mean trace under each imaging session (i.e. twice for each imaging field)?
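
      To make the two readings of this question concrete, here is a minimal sketch in Python. It is not the authors' code: the function name `normalize_trace` and the toy traces are assumptions made purely for illustration. It shows that normalizing each session separately can remove an amplitude difference that joint normalization preserves.

```python
def normalize_trace(trace):
    """Scale a trace so that max_t |r(t)| = 1 (hypothetical helper)."""
    peak = max(abs(x) for x in trace)
    if peak == 0:
        return list(trace)  # an all-zero trace is returned unchanged
    return [x / peak for x in trace]

# Toy mean traces for one ROI: the drug session is the control scaled by 2.
control = [0.0, 0.5, -1.0, 0.25]
drug = [0.0, 1.0, -2.0, 0.5]

# Reading A: normalize each imaging session on its own.
per_session_control = normalize_trace(control)
per_session_drug = normalize_trace(drug)

# Reading B: normalize both sessions by one shared peak.
joint_peak = max(abs(x) for x in control + drug)
joint_control = [x / joint_peak for x in control]
joint_drug = [x / joint_peak for x in drug]
```

      Under reading A the two normalized traces are identical, so a pure gain change between sessions would be invisible; under reading B the control trace peaks at half the amplitude of the drug trace.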

      As for the clustering of RGC types, I assume that each ROI's cluster identity remains unchanged through the comparison. If so, it may be helpful to emphasize this in the text.

    4. Author response:

      We thank the reviewers for appreciating our study and for providing valuable comments and recommendations.

      We are convinced that by carefully addressing the reviewers' comments and questions, we will be able to improve the manuscript’s quality.  

      Specifically, we aim to provide further analysis to validate the subdivision of G32 RGCs into sub-clusters.

      In that context, we will improve the alignment of the RGC sub-types between the calcium imaging and MEA datasets.  

      To give the reader all information about our analysis, we will improve the methods section and explain the normalization of the calcium traces and the clustering in more detail.

      Furthermore, we will also address the concerns regarding the design of the calcium imaging experiments, potential false-negative effects, and why we did not include a wash-out condition in our experimental protocol.  

      Finally, we will revise the discussion about potential NO mechanisms and expand it on how the effects we observed may relate to known or potentially novel mechanisms.

      In particular, we will also deepen our discussion and interpretation of the strychnine dataset.  

      Again, we would like to thank the reviewers for their valuable comments.

    1. eLife assessment

      This important study describes a new mathematical method to analyze clonal composition of tissues using fluorescent reporters and to estimate the number of precursor cells contributing to tissue homeostasis and regeneration based on statistical variance. The evidence provided is convincing, with rigorous measurement of hematopoietic cell labeling during steady state and regenerative hematopoiesis following insult. It could be further strengthened by exploring the limitations of the binomial assumption, using tools to measure clonality and considering the possible effects of the inducing agent (tamoxifen) on precursor cells. The manuscript not only presents a compelling approach to better understand tissue dynamics, it also challenges some ideas in pathological hematopoiesis, opens new research directions and is thus of broad interest to stem cell and developmental biologists.

    2. Reviewer #1 (Public Review):

      Previous studies have used a randomly induced label to estimate the number of hematopoietic precursors that contribute to hematopoiesis. In particular, the McKinney-Freeman lab established a measurable range of precursors of 50-2500 cells using random induction of one of the 4 fluorescent proteins (FPs) of a Confetti reporter in the fetal liver to show that hundreds of precursors establish lifelong hematopoiesis. In the presented work, Liu and colleagues aim to extend the measurable range of precursor numbers previously established and enable measurement in a variety of contexts beyond embryonic development. To this end, the authors investigated whether the random induction of a given Confetti FP follows the principles of binomial distribution such that the variance inversely correlates with the precursor number. They tested their hypothesis using a simplified 2-color in vitro system, paying particular attention to minimizing sources of experimental error (elimination of outliers, sample size, events recorded, etc.) that may obscure the measurement of variance. As a result, the data generated are robust and show that the measurable range of precursors can be extended up to 10^5 cells. They use tamoxifen-inducible Scl-CreER, which is active in hematopoietic stem and progenitor cells (HSPCs) to induce Confetti labeling, and investigated whether they could extend their model to cell numbers below 50 with in vivo transplantation of high versus low numbers of Confetti total bone marrow (BM) cells. The premise of binomial distribution requires that the number of precursors remains constant within a group of mice. The rare frequency of HSPCs in the BM means that the experimentally generated "low" number recipient animals showed some small variability of seeding number, which does not follow the requirement for binomial distribution. While variance due to differences in precursor numbers still dominates, it is unclear how accurate estimated numbers are when precursor numbers are low (<10).
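
      The inverse relation between variance and precursor number described above can be illustrated with a short Python sketch. This is a textbook binomial argument, not necessarily the exact estimator used in the manuscript: if each of N precursors independently carries a given FP with probability p, the labeled fraction f across mice has variance p(1-p)/N, so N can be estimated as mean(f)(1-mean(f))/var(f). The function names and the simulated cohort below are assumptions for illustration only.

```python
import random
import statistics

def estimate_precursors(fractions):
    """Estimate N from per-mouse labeled fractions, assuming a binomial model.

    Uses N ~ p(1 - p) / Var(f), with p taken as the mean fraction and
    Var(f) as the sample variance across mice (an assumed choice).
    """
    p = statistics.mean(fractions)
    var = statistics.variance(fractions)
    return p * (1 - p) / var

def simulate_fractions(n_precursors, p, n_mice, rng):
    """Simulate the labeled fraction in each mouse of a hypothetical cohort."""
    fractions = []
    for _ in range(n_mice):
        labeled = sum(rng.random() < p for _ in range(n_precursors))
        fractions.append(labeled / n_precursors)
    return fractions

rng = random.Random(0)  # fixed seed so the sketch is reproducible
cohort = simulate_fractions(n_precursors=500, p=0.3, n_mice=200, rng=rng)
estimate = estimate_precursors(cohort)
```

      With few mice per group the sample variance is itself noisy, so the estimate carries wide uncertainty; this is one reason replicate number matters for this kind of analysis.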

      The authors then apply their model to estimate the number of hematopoietic precursors that contribute to hematopoiesis in a variety of contexts including adult steady state, fetal liver, following myeloablation, and a genetic model of Fanconi anemia. Their modeling shows:

      -thousands of precursors (~2400-2600) contribute to adult myelopoiesis, which is in line with results from a previous study (Sun et al, 2014).
      -myeloablation (single dose 5-FU), while reducing precursor numbers of myeloid progenitors and HSPCs, was not associated with a reduction in precursor numbers of LT-HSCs.
      -no major expansion of precursor number in the fetal liver derived from labeling at E11.5 versus E14.5, consistent with recent findings from Ganuza et al, 2022.
      -normal precursor numbers in Fancc-/- mice at steady state and from competitive transplantation of young Fancc-/- BM cells, suggesting that reduced Fancc-/- cell proliferation may underlie the reduced chimerism upon transplantation.
      -reduced number of lymphoid precursors following transplantation of BM cells from 9-month-old Fancc-/- animals (beyond this age animals have decreased survival).

      Although this system does not permit the tracing of individual clones, the modeling presented allows measurements of clonal activity covering nearly the entire HSPC population (as recently estimated by Cosgrove et al, 2021) and can be applied to a wide range of in vivo contexts with relative ease. The conclusions are generally sound and based on high-quality data. Nevertheless, some results could benefit from further explanation or discussion:

      -The estimated number of LT-HSCs that contribute to myelopoiesis is not specifically provided, but from the text, it would be calculated to be 1958/5 = ~391. Data from Busch et al, 2015 suggest that the number of differentiation-active HSCs is 5.2 x 10^3, which is considered the maximum limit. There is nevertheless a more than 10-fold difference between these two estimates, and it is unclear how this discrepancy arises.
      -Similarly, in Figure 3E, the estimated number of precursors is highest in MPP4, a population typically associated with lymphoid potential and transient myeloid potential, whereas the numbers of MPP3, traditionally associated with myeloid potential, tend to be higher but are not significantly different than those found in HSCs.
      -The requirement for estimating precursor numbers at stable levels of Confetti labeling is not well explained. As a result, it is unclear how accurate the estimates of B cell precursors upon transplantation of Fancc-/- cells are. In previous experiments on normal Confetti mice (Figure 3B), the authors do not estimate precursors of lymphopoiesis because Confetti labeling of B cells is not saturated, and this appears to be the case in Fancc-/- animals as well (Fig. 5B).
      -Do 9-month-old Fancc-/- animals have reduced lymphoid precursors as well?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript by Liu et al. uses Confetti labeling of hematopoietic stem and progenitor cells in situ to infer the clonal dynamics of adult hematopoiesis. The authors apply a new mathematical framework to analyze the data, allowing them to increase the range of applicability of this tool up to tens of thousands of precursors. With this tool, they (1) provide evidence for the large polyclonality of adult hematopoiesis, (2) offer insights on the expansion dynamics in the fetal liver stage, (3) assess the clonal dynamics in a Fanconi anemia model (Fancc), which has engraftment defects during transplantation.

      Strengths:

      The manuscript is well written, with beautiful and clear figures, and both methods and mathematical models are clear and easy to understand.

      Since 2017, Mikel Ganuza and Shannon McKinney-Freeman have been using these Confetti approaches that rely on calculating the variance across independent biological replicates as a way to infer clonal dynamics. This is a powerful tool and it is a pleasure to see it being implemented in more labs around the world. One of the cool novelties of the current manuscript is using a mathematical model (based on a binomial distribution) to avoid directly regressing the Confetti labeling variance with the number of clones (which only has linearity for a small range of clone numbers). As a result, this current manuscript of Liu et al. methodologically extends the usability of the Confetti approach, allowing them more precise and robust quantification.

      They then use this model to revisit some questions from various Ganuza et al. papers, validating most of their conclusions. The application to the clonal dynamics of hematopoiesis in a model of Fanconi anemia (Fancc mice) is very much another novel aspect, and shows the surprising result that clonal dynamics are remarkably similar to the wild-type (in spite of the defect that these Fancc HSCs have during engraftment).

      Overall, the manuscript succeeds at what it proposes to do, stretching out the possibilities of this Confetti model, which I believe will be useful for the entire community of stem cell biologists, and possibly make these assays available to other stem cell regenerating systems.

      Weaknesses:

      My main concern with this work is the choice of CreER driver line, which then relates to some of the conclusions made. Scl-CreER succeeds at being as homogenous as possible in labeling HSC/MPPs... however it is clear that it also labels a subcompartment of HSC clones that become dominant with time... This is seen as the percentage of Confetti-recombined cells never ceases to increase during the 9-month chase of labeled cells, suggesting that non-labeled cells are being replaced by labeled cells. The reason why this is important is that then one cannot really make conclusions about the clonal dynamics of the unlabeled cells (e.g. for estimating the total number of clones, etc.).

      I am not sure about the claims that the data shows little precursor expansion from E11 to E14. First, these experiments are done with fewer than 5 replicates, and thus they have much higher error, which is particularly concerning for distinguishing differences of such a small number of clones. Second, the authors do see a ~0.5-1 log difference between E11 and E14 (when looking at months 2-3). When looking at months 5+, there is already a clear decline in the total number of clones in both adult-labeled and embryonic-labeled, so these time points are not as good for estimating the embryonic expansion. In any case, the number of precursors at E11 (which in the end defines the degree of expansion) is always overestimated (and thus, the expansion underestimated) due to the effects of lingering tamoxifen after injection (which continues to cause Confetti allele recombination as stem cells divide). Thus, I think these results are still compatible with expansion in the fetal liver (the degree of which still remains uncertain to me).

    4. Reviewer #3 (Public Review):

      Summary:

      Liu et al. focus on a mathematical method to quantify active hematopoietic precursors in mice using Confetti reporter mice combined with Cre-lox technology. The paper explores the hematopoietic dynamics in various scenarios, including homeostasis, myeloablation with 5-fluorouracil, Fanconi anemia (FA), and post-transplant environments. The key findings and strengths of the paper include (1) precursor quantification: The study develops a method based on the binomial distribution of fluorescent protein expression to estimate precursor numbers. This method is validated across a wide dynamic range, proving more reliable than previous approaches that suffered from limited range and high variance outside this range; (2) dynamic response analysis: The paper examines how hematopoietic precursors respond to myeloablation and transplantation; (3) application in disease models: The method is applied to the FA mouse model, revealing that these mice maintain normal precursor numbers under steady-state conditions and post-transplantation, which challenges some assumptions about FA pathology. Despite the normal precursor count, a diminished repopulation capability suggests other factors at play, possibly related to cell proliferation or other cellular dysfunctions. In addition, the FA mouse model showed a reduction in active lymphoid precursors post-transplantation, contributing to decreased repopulation capacity as the mice aged. The authors are aware of the limitation of the assumption of uniform expansion. The paper assumes a uniform expansion from active precursors to progeny for quantifying precursor numbers. This assumption may not hold in all biological scenarios, especially in disease states where hematopoietic dynamics can be significantly altered. If non-uniformity is high, this could affect the accuracy of the quantification.

      Overall, the study underscores the importance of precise quantification of hematopoietic precursors in understanding both normal and pathological states in hematopoiesis, presenting a robust tool that could significantly enhance research in hematopoietic disorders and therapy development. The following concerns should be addressed.

      Major Points:

      • The authors have shown a wide range of seeded cells (1 to 1e5) (Figure 1D) that follow the linear binomial rule. As the standard deviation converges eventually with more seeded cells, the authors need to address this limitation by seeding the number of cells at which the assumption fails.
      • Line 276: This suggests myelopoiesis is preferred when very few precursors are available after irradiation-mediated injury. Did the authors see more myeloid progenitors at 1 month post-transplantation with low precursor number? The authors need to show this data in a supplement.

      Minor Points:

      • Please cite a reference for line 40: a rare case where a single HSPC clone supports hematopoiesis.
      • Line 262-263: "This discrepancy may reflect uneven seeding of precursors to the BM throughout the body after transplantation and the fact that we only sampled a part of the BM (femur, tibia, and pelvis)." Consider citing this paper (https://doi.org/10.1016/j.cell.2023.09.019) that explores HSPC migration across different bones.
      • Lines 299 and 304. Misspellings of RFP.
      • The title is misleading as the paper's main focus is the precursor number estimator using the binomial nature of fluorescent tagging. Using a single-copy cassette of Confetti mice cannot be used to measure clonality.

    1. eLife assessment

      This important work substantially advances our understanding of nocturnal animal navigation and the ways that animals use polarized light. The evidence supporting the conclusions is convincing, with elegant behavioural experiments in actively navigating ants. The work will be of interest to biologists working on animal navigation or sensory ecology.

    2. Reviewer #1 (Public Review):

      Freas et al. investigated whether the exceedingly dim polarization pattern produced by the moon can be used by animals to guide a genuine navigational task. The sun and moon have long been celestial beacons for directional information, but they can be obscured by clouds, canopy, or the horizon. However, even when hidden from view, these celestial bodies provide directional information through the polarized light patterns in the sky. While the sun's polarization pattern is famously used by many animals for compass orientation, until now it has never been shown that the extremely dim polarization pattern of the moon can be used for navigation. To test this, Freas et al. studied nocturnal bull ants by placing a linear polarizer, rotated 45 degrees relative to the moon's natural polarization pattern, over the homing path of freely navigating ants. They recorded the homing direction of an ant before entering the polarizer, under the polarizer, and again after leaving the area covered by the polarizer. The results very clearly show that ants walking under the linear polarizer change their homing direction by about 45 degrees relative to their homing direction under the natural polarization pattern, and change it back after leaving the area covered by the polarizer. These results are reproducible throughout the lunar month, showing that bull ants can use the moon's polarization pattern even under crescent moon conditions. Finally, the authors show that the degree to which the ants change their homing direction depends on the length of their home vector, just as it does for the solar polarization pattern.

      The behavioral experiments are very well designed, and the statistical analyses are appropriate for the data presented. The authors' conclusions are nicely supported by the data and clearly show that nocturnal bull ants use the dim polarization pattern of the moon for homing, in the same way many animals use the sun's polarization pattern during the day. This is the first proof of the use of the lunar polarization pattern in any animal.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to understand whether polarised moonlight could be used as a directional cue for nocturnal animals homing at night, particularly at times of night when polarised light is not available from the sun. To do this, the authors used nocturnal ants, and previously established methods, to show that the walking paths of ants can be altered predictably when the angle of polarised moonlight illuminating them from above is turned by a known angle (here +/- 45 degrees).

      Strengths:

      The behavioural data are very clear and unambiguous. The results clearly show that when the angle of downwelling polarised moonlight is turned, ants turn in the same direction. The data also clearly show that this result is maintained even for different phases (and intensities) of the moon, although during the waning cycle of the moon the ants' turn is considerably less than may be expected.

      Weaknesses:

      The final section of the results - concerning the weighting of polarised light cues into the path integrator - lacks clarity and should be reworked and expanded in both the Methods and the Results (also possibly with an extra methods figure). I was really unsure of what these experiments were trying to show or what the meaning of the results actually are.

      Impact:

      The authors have discovered that nocturnal bull ants, while homing back to their nest holes at night, are able to use the dim polarised light pattern formed around the moon for path integration. Even though similar methods have previously shown the ability of dung beetles to orient along straight trajectories for short distances using polarised moonlight, this is the first evidence of an animal that uses polarised moonlight in homing. This is quite significant, and their findings are well supported by their data.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript presents a series of experiments aimed at investigating orientation to polarized lunar skylight in a nocturnal ant, the first report of its kind that I am aware of.

      Strengths:

      The study was conducted carefully and is clearly explained here.

      Weaknesses:

      I have only a few comments and suggestions, that I hope will make the manuscript clearer and easier to understand.

      Time compensation or periodic snapshots

      In the introduction, the authors compare their discovery with that in dung beetles, which have only been observed to use lunar skylight to hold their course, not to travel to a specific location as the ants must. It is not entirely clear from the discussion whether the authors are suggesting that the ants navigate home by using a time-compensated lunar compass, or that they update their polarization compass with reference to other cues as the pattern of lunar skylight gradually shifts over the course of the night - though in the discussion they appear to lean towards the latter without addressing the former. Any clues in this direction might help us understand how ants adapted to navigate using solar skylight polarization might adapt use to lunar skylight polarization and account for its different schedule. I would guess that the waxing and waning moon data can be interpreted to this effect.

      Effects of moon fullness and phase on precision

      As well as the noted effect on shift magnitudes, the distributions of exit headings and reorientations also appear to differ in their precision (i.e., mean vector length) across moon phases, with somewhat shorter vectors for smaller fractions of the moon illuminated. Although these distributions are a composite of the two distributions of angles subtracted from one another to obtain these turn angles, the precision of the resulting distribution should be proportional to the original distributions. It would be interesting to know whether these differences result from poorer overall orientation precision, or more variability in reorientation, on quarter moon and crescent moon nights, and to what extent this might be attributed to sky brightness or degree of polarization.

      N.B. The Watson-Williams tests for difference in mean angle are also sensitive to differences in sample variance. This can be ruled out with another variety of the test, also proposed by Watson and Williams, to check for unequal variances, for which the F statistic is F = ((n2-1)*(n1-R1)) / ((n1-1)*(n2-R2)) or its inverse, whichever is >1.
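
      As an illustration of the statistic just described, here is a minimal Python sketch. It assumes that R1 and R2 are the resultant vector lengths of the two samples (the length of the sum of unit vectors at each heading) and that n1 and n2 are the sample sizes; the function names and example headings are hypothetical. The reference distribution and degrees of freedom are not covered here and should be taken from the original description of the test.

```python
import math

def resultant_length(angles_deg):
    """Length R of the sum of unit vectors pointing at each angle (degrees)."""
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.hypot(c, s)

def concentration_f(sample1, sample2):
    """F = ((n2-1)(n1-R1)) / ((n1-1)(n2-R2)), or its inverse if that is larger.

    Undefined when a sample has n - R = 0 (all headings identical).
    """
    n1, n2 = len(sample1), len(sample2)
    r1, r2 = resultant_length(sample1), resultant_length(sample2)
    f = ((n2 - 1) * (n1 - r1)) / ((n1 - 1) * (n2 - r2))
    return f if f >= 1 else 1 / f

# Hypothetical headings (degrees): one tightly clustered, one more dispersed.
tight = [40, 45, 50]
wide = [0, 45, 90]
f_stat = concentration_f(tight, wide)
```

      Dividing R by n gives the mean vector length mentioned in the preceding comment, so the same helper can be used to compare orientation precision across moon phases.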

    1. eLife assessment

      The microRNA lin-4, originally discovered in C. elegans, has a key role in controlling developmental timing across species, but how its expression is developmentally regulated is poorly understood. Here, the authors provide convincing evidence that two MYRF transcription factors are essential positive regulators of lin-4 during early C. elegans larval development. These results provide important insight into the molecular control of developmental timing that could have significant implications for understanding these processes in more complex systems.

    1. Author response:

      Reviewer #1 (Public Review):

      This is an important and very well conducted study providing novel evidence on the role of zinc homeostasis for the control of infection with the intracellular bacterium S. typhimurium also disentangling the underlying mechanisms and providing clear evidence on the importance of spatio-temporal distribution of (free) zinc within the cell.

      We thank the reviewer for the positive comments.

      1) It would be important to provide more information on the genotype of mice.

      As suggested by the reviewer, we have added the detailed genotype of Slc30a1flagEGFP/+ and Slc30a1fl/flLysMCre mice to the revised supplementary Figure supplement 10.

      2) It is rather unlikely that C57Bl6 mice survive up to two weeks after i.p. injection of 1x10E5 bacteria.

      In response to the reviewer's comment, we tested the survival rate using a group of our experimental animals and C57BL/6 wild-type mice.

      The Salmonella strain is a gift from our friend, Professor Ge Bao-xue. We sent this strain for genetic characterisation, which showed 100% identity to Salmonella enterica Typhimurium, with many of the matching strains originating from poultry. One of them is Salmonella enterica subsp. enterica serovar Typhimurium strain MeganVac1 (Accession: CP112994.1), a live attenuated strain. We hope that this helps explain the relationship between the high infectious dose and mouse survival.

      Author response image 1.

      (A) Survival rate of Slc30a1fl/fl and Slc30a1fl/flLysMCre mice (n = 14-15/group) and (B) survival rate of C57BL/6 wild-type mice (n = 8) for two weeks after Salmonella infection. (C) A full-length sequence (1,478 bases) of the 16S rDNA gene of the Salmonella strain and (D) the sequencing electropherogram.

      3) To be sure that macrophages from Slc30A1 fl/fl LysMcre mice really have an impaired clearance of bacteria, it would be important to rule out an effect of Slc30A1 deletion on bacterial phagocytosis and containment (e.g. evaluation of bacterial numbers after 30 min of infection).

      As the reviewer advised, we have repeated the experiment and measured the bacterial numbers after 30 min of infection (dashed line in A). The results show that there is no statistical difference in the bacterial numbers after 30 min between Slc30a1fl/flLysMCre and Slc30a1fl/fl BMDMs. Therefore, the reduction of bacterial numbers after 24 hours occurs due to the impairment of intracellular pathogen-killing capacity as the reviewer pointed out.

      Author response image 2.

      (A) Time course of the intracellular pathogen-killing capacity of Salmonella-infected Slc30a1fl/flLysMCre and Slc30a1fl/fl BMDMs measured in colony-forming units per ml (n = 5). (B) Fold change in Salmonella survival (CFU/mL) at different time points from A. (C) Representative images of Salmonella colonies on solid agar medium at 24 hours. Data are represented as mean ± SEM. P values were determined using 2-tailed unpaired Student’s t-test. *P<0.05, **P<0.01, and ns, not significant.

      4) Does the addition of zinc to macrophages negatively affect iNOS transcription as previously observed for the divalent metal iron and is a similar mechanism also employed (CEBPß/NF-IL6 modulation) (Dlaska M et al. J Immunol 1999)?

      The reviewer has raised an important point here, since free zinc also plays a role at multiple levels of cellular signaling (Kambe et al., 2015). Dlaska and Weiss reported that NF-IL6, a protein responsible for iNOS transcription, is negatively regulated by iron perturbation under IFN-γ/LPS stimulation in macrophages (Dlaska and Weiss, 1999). As the reviewer suggested, our results showed that zinc supplementation decreases iNOS expression in macrophages after Salmonella infection, suggesting that free zinc might play a role in iNOS regulation.

      However, in Slc30a1fl/flLysMCre macrophages, despite the increase in intracellular free zinc, the lack of Slc30a1 also induces Mt1, a zinc reservoir that might negatively affect NO production (Schwarz et al., 1995) or, alternatively, inhibit iNOS through the NF-kB pathway (Cong et al., 2016), as reported by previous studies. Therefore, we could not rule out the possibility that defects in Salmonella clearance due to iNOS/NO inhibition are caused by a complex combination of excess free zinc and overexpression of the zinc reservoir. To test this hypothesis, further studies using a specific target model, for example Mtfl/fliNOSfl/flLysMCre, might be needed to investigate the precise mechanism.

      Author response image 3.

      RT-qPCR analysis of mRNA encoding Nos2 in BMDMs after infection with Salmonella or Salmonella plus ZnSO4 (20 μM) for 4 h.

      Reference:

      Dlaska M, Weiss G. 1999. Central role of transcription factor NF-IL6 for cytokine and iron-mediated regulation of murine inducible nitric oxide synthase expression. The Journal of Immunology. 162:6171-6177. PMID: 10229861

      Kambe T, Tsuji T, Hashimoto A, Itsumura N. 2015. The physiological, biochemical, and molecular roles of zinc transporters in zinc homeostasis and metabolism. Physiological Reviews. 95:749-784. https://doi.org/10.1152/physrev.00035.2014, PMID: 26084690

      Schwarz MA, Lazo JS, Yalowich JC, Allen WP, Whitmore M, Bergonia HA, Tzeng E, Billiar TR, Robbins PD, Lancaster JR Jr, et al. 1995. Metallothionein protects against the cytotoxic and DNA-damaging effects of nitric oxide. Proceedings of the National Academy of Sciences of the United States of America. 92:4452-4456. https://doi.org/10.1073/pnas.92.10.4452, PMID: 7538671

      Cong W, Niu C, Lv L, Ni M, Ruan D, Chi L, Wang Y, Yu Q, Zhan K, Xuan Y, Wang Y, Tan Y, Wei T, Cai L, Jin L. 2016. Metallothionein prevents age-associated cardiomyopathy via inhibiting NF-κB pathway activation and associated nitrative damage to 2-OGD. Antioxidants & Redox Signaling. 25:936-952. https://doi.org/10.1089/ars.2016.6648, PMID: 27477335

      5) How does Zinc or TPEN supplementation to bacteria in LB medium affect the log growth of Salmonella?

      We found that zinc supplementation at both low (20 µM) and high (640 µM) concentrations negatively affects Salmonella growth in broth culture medium, especially during log phase and stationary phase, whereas TPEN (20 µM) supplementation does not. This indicates that high-zinc conditions at the cellular level, such as those within phagosomes (Botella et al., 2011), can limit bacterial growth.

      Author response image 4.

      Growth curve (optical density, OD 600 nm) of Salmonella in LB medium at different concentrations of ZnSO4 and/or TPEN. The bar graph indicates Salmonella growth at specific time points. Each value is the mean of triplicates for each test, and significance was determined using a two-tailed unpaired Student’s t-test. *P<0.05, **P<0.01, ***P<0.001 and ns, not significant.

      Reference:

      Botella H, Peyron P, Levillain F, Poincloux R, Poquet Y, Brandli I, Wang C, Tailleux L, Tilleul S, Charrière GM, Waddell SJ, Foti M, Lugo-Villarino G, Gao Q, Maridonneau-Parini I, Butcher PD, Castagnoli PR, Gicquel B, de Chastellier C, Neyrolles O. 2011. Mycobacterial p(1)-type ATPases mediate resistance to zinc poisoning in human macrophages. Cell Host Microbe. 10:248-59. https://doi.org/10.1016/j.chom.2011.08.006, PMID: 21925112

      Reviewer #2 (Public Review):

      This paper explores the importance of zinc metabolism in host defense against the intracellular pathogen Salmonella Typhimurium. Using conditional mice with a deletion of the Slc30a1 zinc exporter, the authors show a critical role for zinc homeostasis in the pathogenesis of Salmonella. Specifically, mice deficient in the Slc30a1 gene in LysM+ myeloid cells are hypersusceptible to Salmonella infection, and their macrophages show altered phenotypes in response to Salmonella. The study adds important new information on the role metal homeostasis plays in microbe-host interactions. Despite the strengths, the manuscript has some weaknesses. The authors conclude that lack of slc30a1 in macrophages impairs nos2-dependent anti-Salmonella activity. However, this idea is not tested experimentally. In addition, the research presented on Mt1 is preliminary. The text related to Figure 7 could be deleted without affecting the overall impact of the findings.

      We thank the reviewer for his/her positive comments and constructive suggestions.

      Reviewer #3 (Public Review):

      Na-Phatthalung et al observed that transcripts of the zinc transporter Slc30a1 were upregulated in Salmonella-infected murine macrophages and in human primary macrophages; therefore, they sought to determine if, and how, Slc30a1 could contribute to the control of bacterial pathogens. Using a reporter mouse, the authors show that Slc30a1 expression increases in a subset of peritoneal and splenic macrophages of Salmonella-infected animals. Specific deletion of Slc30a1 in LysM+ cells resulted in a significantly higher susceptibility of mice to Salmonella infection which, counter to the authors' conclusions, is not explained by the small differences in the bacterial burden observed in vivo and in vitro. Although loss of Slc30a1 resulted in reduced iNOS levels in activated macrophages, the study lacks experiments that mechanistically link loss of NO-mediated bactericidal activity to Salmonella survival in Slc30a1-deficient cells. The additional deletion of Mt1, another zinc-binding protein, resulted in even lower nitrite levels in activated macrophages but only modest effects on Salmonella survival. By combining genetic approaches with molecular techniques that measure variables in macrophage activation and the labile zinc pool, Na-Phatthalung et al successfully demonstrate that Slc30a1 and metallothionein 1 regulate zinc homeostasis in order to modulate effective immune responses to Salmonella infection. The authors have done a lot of work and the information that Slc30a1 expression in macrophages contributes to control of Salmonella infection in mice is a new finding that will be of interest to the field. Whether the mechanism by which SLC30A1 controls bacterial replication and/or lethality of infection involves nitric oxide production by macrophages remains to be shown.

      We very much appreciate the reviewer’s detailed evaluation and suggestions. The manuscript has been revised thoroughly according to the reviewer’s advice.

    1. Author response:

      Reviewer #2 (Public Review):

      The manuscript by Chan et al reports results of a systematic mutagenesis approach to study the surface expression and APP+ transport mechanism of serotonin transporter. They complement this experimental evidence with large-scale molecular simulations of the transporter in the presence of APP+. The use of deep mutagenesis and large-scale adaptive sampling simulations is impressive and could be very exciting contributions to the field.

      On the whole, the results appear to provide a fascinating insight into the effects of mutations on transport mechanisms, and how those interrelate with the structural fold and biophysical properties of a dynamic protein and its substrate pathways. A weakness of the conclusions based on the molecular simulation is that it relies on comparison with previously-published work involving non-identical simulation systems (i.e. different protonation states).

      As we explain further below, this is because a preprint of previous MD simulations used a different protonation state for Glu508. However, the final published article (Chan, et al., Biophysical Journal. 121, 715–730, 2022) and new simulations we present here are consistent in having Glu508 protonated.

      Conclusions in this work about the origins of the sodium:serotonin 1:1 stoichiometry should also be considered in the context of the fact that there are two sodium ions bound in the structures of SERT, and more work is needed to explain why this ion is not also released/co-transported.

      We do not have any direct evidence as to why Na+ in the Na1 site is not also symported, except to say that in our simulations it remains bound while 5-HT/APP+ is imported. Only Na+ in the Na2 site is displaced into the cytosol, consistent with the known stoichiometry for transport and consistent with works by others. For example, the Na2 site is conserved as a functionally relevant site in distantly related secondary transporters (Cheng & Bahar, Structure. 2015; 23: 2171-2181; Stolzenberg et al., J. Biol. Chem. 2017; 292: 7372-7384; Koldsø et al., PLoS Comput. Biol. 2011; 7: e1002246; Khafizov et al., Proc. Natl. Acad. Sci. U S A. 2012; 109: E3035-E3044); please see further elaboration in the manuscript on lines 450-462. Nonetheless, it could be inferred from our data that Na+ in the Na2 site is the symported ion because it, rather than Na+ in the Na1 site, shares the exit pathway with substrate (interactions with the displaced Na+ ion are replaced by the amine of the substrate as it moves into the exit pathway).

    1. Author response:

      Reviewer #1 (Public Review):

      The authors report a high-quality genome assembly for a member of Xenacoelomorpha, a taxon that is at the center of the last remaining great controversies in animal evolution. The taxon and the species in question have "jumped around" the animal tree of life over the past 25 years, and seemed to have found their place as a sister-group to all remaining bilaterians. This hypothesis posits that the earliest split within Bilateria includes Xenacoelomorpha on the one hand and a clade known as Nephrozoa (Protostomia + Deuterostomia) on the other, and is thus referred to as the Nephrozoa hypothesis. Nephrozoa is supported by phylogenomic evidence, by a number of synapomorphic morphological characters in the Nephrozoa (namely, the presence of nephridia) and lack of some key bilaterian characters in Xenacoelomorpha, and by the presence of unique miRNAs in Nephrozoa.

      The Nephrozoa hypothesis has been challenged several times by the authors' groups who alternatively suggest placing Xenacoelomorpha within Deuterostomia as a sister group to a clade known as Ambulacraria. This hypothesis (the Xenambulacraria hypothesis) is supported by alternative phylogenomic datasets and by the shared presence of a number of unique molecular signatures. In this contribution, the authors aim to strengthen their case by providing full genome data for Xenoturbella bocki.

      The actual sequencing and analysis are technically and methodologically excellent. Some of the analyses were done several years ago using approaches that may now seem obsolete, but there is no reason not to include them. As a detailed report of a newly sequenced genome, the manuscript meets the highest standards.

      The authors emphasize a number of key findings. One is the fact that the genome is not as simple as one might expect from a "basal" taxon, and is on par with other bilaterian genomes and even more complex than the genome of secondarily simplified bilaterians. There is an implicit expectation here that the sister group to all Bilateria would represent the primitive state. This is of course not true, and the authors are aware of this, but it sometimes feels as though they are using this implicit assumption as a straw dog argument to say that since the genome is not as simple as expected, X. bocki must be nested within Bilateria. The authors get around this by acknowledging that their finding is consistent with a "weak version of the Nephrozoa hypothesis", which is essentially the Nephrozoa phylogenetic hypothesis without implicit assumptions of simplicity.

      We were NOT suggesting that Xenacoels are ‘basal’ though others have certainly done so. We were testing, instead, whether their supposed simplicity is reflected in the composition of the genome.

      Another finding is a refutation of the miRNA data supporting Nephrozoa. This is an important finding although it is somewhat flogging a dead horse, since there is already a fair amount of skepticism about the validity of the miRNA data (now over 20 years old) for higher-level phylogenetics.

      The absence of bilaterian microRNAs was one of the early pieces of evidence excluding the Xenacoelomorpha from Nephrozoa. Our new data are an important refutation of this source of evidence and add to the picture that this phylum is not lacking characters of Bilateria as had been suggested (missing microRNAs and Hox genes were explicitly interpreted in this way).

      The finding that the authors feel is most important is gene presence-absence data that recovers a topology in which X. bocki is sister to Ambulacraria. The problem is that the same tree does not support the monophyly of Xenacoelomorpha. This may be an artifact of fast-evolving acoel genomes, as the authors suggest, but it still raises questions about the robustness of the data.

      In sum, the authors' results and analyses leave an open window for the Xenambulacraria hypothesis, but do not refute the Nephrozoa hypothesis. The manuscript is a valuable contribution to the debate but does not go a significant way towards its resolution.

      The manuscript has gone through several rounds of review and revision on a preprint server and is thus fairly clear of typos, inconsistencies and lack of clarity. The authors are honest and open in their interpretation of the results and their strengths.

      We thank the reviewer for their assessment of our manuscript. We have responded to some of the points they make above. As there were no specific points to edit or change raised by reviewer 1, we are replying in detail only to reviewer 2. We would like to note that we have modified the text, and thus the focus of our manuscript, in accordance with what we think reviewer 1 is suggesting in the last two paragraphs of their review.

      Reviewer #2 (Public Review):

      The manuscript describes the genome assembly and analysis of Xenoturbella bocki, a worm that bears many morphological features ascribed to basal bilateria. The authors aim to analyse this genome in an attempt to determine the phylogenetic position of X. bocki as a representative of Xenacoelomorpha and its associated acoelomorphs. In doing so, they want to inform the debate as to whether xenacoelomorph belong among, or is in fact paraphyletic to all bilaterians.

      This paper presents a high-quality assembly of the X. bocki genome. By virtue of the phylogenetic position of this species, this genome has considerable scientific interest. This assembly appears to be highly complete and is a strength of the paper. The further characterisation of the genome is well executed and presented. Solid results from this paper include a comprehensive description of the Hox genes, miRNA and neuropeptide repertoire, as well as a description of the linkage groups and how they relate to the ancestral linkage groups.

      Where this paper is weaker is in its central claims and questions, i.e., the phylogenetic position of xenacoelomorphs and whether X. bocki is a slowly evolving, but otherwise representative, member of this clade, which remain insufficiently resolved.

      The authors have achieved the goal of describing the X. bocki genome very well. By contrast, it is unclear, based on the presented evidence, whether xenacoelomorph is truly a monophyletic group. The balance of the evidence seems to suggest that the X. bocki genome belongs within the bilateria group. However, it is unclear as to what is driving the position of the other acoels. Assuming that X. bocki and the other two species in that group are monophyletic, then the evidence will favour the authors' conclusion (but without clearly rejecting the alternatives).

      This paper will likely further animate the debate regarding this basal species, and also questions related to the ancestral characters of bilateria as a whole. In particular the results from the HOX and paraHOX clusters, may provide an interesting counterpoint to the previous results based on the acoels.

      We thank the Reviewer for their extended comments on our manuscript. We would firstly like to point out that our work was not aiming to resolve the phylogenetic position of X. bocki. We discussed this question at length, as it was and is a major and important question in evolutionary biology; however, we think that we had phrased any conclusions in this regard very cautiously, as we are well aware of the limitations of our data in resolving the conundrum.

      In this revision we have further modified our text, specifically in the Introduction and Abstract, to make it clear that we are contributing to the understanding of the evolution and biology of a fascinating organism that cannot easily be cultured in the laboratory.

      In addition, we have supplied more explanation on why Xenacoelomorpha are generally seen as a monophyletic group and which lines of evidence point to this. Again, it should be noted here that colleagues who regard the Nephrozoa hypothesis as true, do not doubt the monophyly of Xenacoelomorpha.

    1. Author response:

      Reviewer #1 (Public Review):

      This manuscript presents an exciting new method for separating insulin secretory granules using insulator-based dielectrophoresis (iDEP) of immunolabeled vesicles. The method has the advantage of being able to separate vesicles by subtle biophysical differences that do not need to be known by the experimenter, and hence could in principle be used to separate any type of organelle in an unbiased way. Any individual organelle ("particle") will have a characteristic ratio of electrokinetic to dielectrophoretic mobilities (EKMr) that will determine where it migrates in the presence of an electric field. Particles with different EKMr will migrate differently and thus can be separated. The present manuscript is primarily a methods paper to show the feasibility of the iDEP technique applied to insulin vesicles. Experiments are performed on cultured cells in low or high glucose, with the conclusion that there are several distinct subpopulations of insulin vesicles in both conditions, but that the distributions in the two conditions are different. As it is already known that glucose induces release of mature insulin vesicles and stimulates new vesicle biosynthesis and maturation, this finding is not necessarily new, but is intended as a proof of principle experiment to show that the technique works. This is a promising new technology based on solid theory that has the possibility to transform the study of insulin vesicle subpopulations, itself an emerging field. The technique development is a major strength of the paper. Also, cellular fractionation and iDEP experiments are performed well, and it is clear that the distribution of vesicle populations is different in the low and high glucose conditions. However, more work is needed to characterize the vesicle populations being separated, leaving open the possibility that the separated populations are not only insulin vesicles, but might consist of other compartments as well. It is also unclear whether the populations might represent immature and mature vesicles, distinct pools of mature vesicles such as the readily releasable pool and the reserve pool, or vesicles of different age. Without a better characterization of these populations, it is not possible to assess how well the iDEP technique is doing what is claimed.

      Major comments:

      1) There is no attempt to relate the separated populations of vesicles to known subpopulations of insulin vesicles such as immature and mature vesicles, or the more recently characterized Syt9 and Syt7 vesicle subpopulations that differ in protein and lipid composition (Kreutzberger et al. 2020). Given that it is unclear exactly what populations of vesicles will be immunolabeled (see point #2 below), it is also possible that some of the "subpopulations" are other compartments being separated in addition to insulin vesicles. It will be important to examine other markers on these separated populations or to perform EM to show that they look like insulin vesicles.

      We thank the reviewer for this comment and have added the following to the discussion:

      “The intensity peaks we observed at specific EKMr values likely correspond to some of the previously described insulin vesicle subpopulations [34, 54-57]. Larger particles are expected to have a smaller EKMr value compared to smaller particles [50]. Subpopulations containing larger insulin vesicles, such as a mature pool [34, 54], synaptotagmin IX-positive vesicles [57], or docked vesicles near the plasma membrane [34] may have lower EKMr values than smaller immature vesicles. Additionally, phosphatidylcholine lipids increase the zeta potential of tristearoylglycerol crystals [58]. This effect may extend to insulin vesicle subpopulations containing more phosphatidylcholine, such as young insulin vesicles [55], which could lead to higher EKMr values. Taken together, these two properties may be used to predict the EKMr values of known insulin vesicle subpopulations. For example, insulin vesicles with EKMr values of 1-2×10⁹ V/m² (Fig. 4C) may represent a synaptotagmin IX-positive subpopulation due to their larger radii and depletion under glucose stimulation. Additionally, young insulin vesicles may have EKMr values between 5 and 7.5×10⁹ V/m² (Fig. 4C) due to higher amounts of phosphatidylcholine present in this subpopulation [55]. In this EKMr range, we observed a higher intensity for glucose-treated cells which may suggest biosynthesis of new vesicles. Immature insulin vesicles are likely to have higher EKMr values due to their smaller size [34], such as an EKMr value between 1.5-1.6×10¹⁰ V/m² (Fig. 4C). Here we demonstrated the capabilities of DC-iDEP to separate insulin vesicle subpopulations in an unbiased manner. Future experiments using chemical probes to label subpopulations will be useful to accurately define the EKMr values associated with specific subpopulations.” pages 7-8, lines 176-191

      Furthermore, we have conducted additional experiments using a modified INS-1 cell line with a GFP-tagged C-peptide (hPro-CpepSfGFP, GRINCH cells RRID:CVCL_WH61) in order to visualize a more complete population of insulin vesicles. By using this cell line, we have performed confocal microscopy, transmission electron microscopy, and cryo-electron microscopy experiments, demonstrating that the isolated vesicles resemble insulin vesicles and contain GFP-tagged C-peptide (Fig. 1-S3). While we acknowledge that further investigation using a more detailed labeling strategy of known insulin vesicle populations with DC-iDEP would be informative, we believe it is beyond the scope of our initial proof-of-concept experiments.

      The following text was added to the results section to describe our additional microscopy analysis:

      “To verify that the insulin vesicles were intact prior to DC-iDEP, we imaged a modified INS-1E cell line that contains a human insulin and green fluorescent protein-tagged C peptide (hPro-CpepSfGFP) [49]. This GFP tag allowed for quick visual verification of intact vesicles using fluorescence confocal microscopy. We observed distinct puncta rather than a diffuse GFP signal which indicated that the vesicles were intact and not ruptured. Further analysis of isolated vesicles was done using EM. We observed intact vesicles with the expected size and shape using both transmission electron microscopy (TEM) and cryo-electron microscopy (cryo-EM) (Fig. 1—figure supplement 3).” Page 5, lines 104-109.

      2) An antibody to synaptotagmin V is used to immunolabel vesicles, but there has been confusion between synaptotagmins V and IX in the literature and it isn't clear what exactly is being recognized by this antibody (this reviewer actually thinks it is Syt 9). If it is indeed recognizing Syt 9, it might already be labeling a restricted population of insulin vesicles (Kreutzberger et al. 2020). The specificity of this antibody should be clarified. Furthermore, Figure 2 is not convincing at showing that this synaptotagmin antibody specifically labels insulin vesicles nor is there convincing colocalization of this synaptotagmin antibody with insulin vesicles. In the image shown, several cells show very weak or no staining of both insulin and the synaptotagmin. The highlighted cell appears to show insulin mainly in a perinuclear structure (probably the Golgi) rather than in mature vesicles (which should be punctate), and insulin is not particularly well-colocalized with the synaptotagmin. Other cells in the image appear to have even less colocalization of insulin and synaptotagmin, and there is no quantification of colocalization. It seems possible that this antibody is recognizing other compartments in the cell, which would change the interpretation of the populations measured in the iDEP experiments. It would also be good to perform synaptotagmin staining under glucose-stimulating conditions, in case this alters the localization.

      We thank the reviewer for bringing this issue to our attention. The antibody originally used in Figure 2 recognizes the 386 aa isoform of synaptotagmin, which is called Syt 9 in the paper mentioned above (Kreutzberger et al. 2020). We have edited our manuscript to label this antibody as “Synaptotagmin IX” to match the existing literature. This antibody, therefore, likely labels only a subset of insulin vesicles. We believe that populations measured in the iDEP experiments consist solely of insulin vesicles, as supported by Western blot and dynamic light scattering results (Fig. 1—figure supplement 2B-C), as well as EM images (Fig. 1—figure supplement 3). Even with a subset of insulin vesicles, these results show the potential of this method, as iDEP analysis reveals heterogeneity within the population of Syt 9-positive insulin vesicles. We have replaced the original immunofluorescence images in Figure 2 with images that are more representative of INS-1E cells. We recognize that immuno-labeling did not yield perfect co-localization, which was expected. However, these experiments do provide valuable insights into the promise of using DC-iDEP for more in-depth separation analysis. Future work will use a modified INS-1 cell line or mouse model with a GFP-tagged C-peptide (hPro-CpepSfGFP, GRINCH cells RRID:CVCL_WH61) in order to visualize a less restricted set of insulin vesicles, avoiding the limitations associated with antibodies confined to a specific insulin vesicle subpopulation.

      3) The EKMr values of the vesicle populations between the low and high glucose conditions don't seem to precisely match. It is unclear if this is just a technical limitation in comparing between experiments or instead suggests that glucose stimulation does not just change the proportion of vesicles in the subpopulations (i.e. the relative fluorescent intensities measured), but rather the nature of the subpopulations (i.e. they have distinct biophysical characteristics). This again gets to the issue of what these vesicle subpopulations represent. If glucose stimulation is simply converting immature to mature vesicles, one might expect it to change the proportion of vesicles, but not the biophysical properties of each subpopulation.

      We thank the reviewer for this question. We agree that glucose likely shifts the proportion of vesicles within a specific EKMr value rather than impacting the overall biophysical characteristics of all vesicles. We have performed new statistical analysis as suggested and rewritten this section to better explain the differences between conditions.

      “Visual inspection of the collected data revealed generally similar patterns of vesicles collected at specific EKMr values (Fig. 4). However, at 1200 V we achieved adequate separation of vesicle populations to discern unique populations of vesicles from cells treated with glucose compared to no treatment. Using a two-way ANOVA, we found a statistically significant interaction between the effect of treatment on vesicles collected at each EKMr value for data collected only at 1200 V [F(8, 45) = 3.61, p = 0.003]. A Bonferroni post hoc test revealed a significant difference in the intensity or quantity of vesicles collected between treated and untreated samples at 1.10×10⁹ V/m² (p=0.0249), 5.35×10⁹ V/m² (p=0.0469), and 7.45×10⁹ V/m² (p=0.0369). These differences reflect a shift in the populations of insulin vesicles upon glucose stimulation.” Page 7, lines 158-165
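
      For readers unfamiliar with the post hoc step named in the passage above, the following is a minimal sketch of a Bonferroni adjustment. The raw p-values are hypothetical placeholders, not values from this study, and the function name `bonferroni_adjust` is an assumption made for illustration; the response does not state which software was used or whether the reported p-values are already adjusted.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons (illustration only).
raw = [0.01, 0.04, 0.5]
adjusted = bonferroni_adjust(raw)

# A comparison is called significant if its adjusted p-value is below alpha.
alpha = 0.05
significant = [p < alpha for p in adjusted]
```

      The adjustment is deliberately conservative: the more EKMr bins are compared, the smaller a raw p-value must be to remain significant.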

      We have also now directly addressed the potential identities of the different populations in the discussion section. This was addressed in our response to major comment #1 and on pages 7-8, lines 176-191 of the manuscript.

      4) The title of the paper promises "isolation" of insulin vesicles, but the manuscript only presents separation and no isolation of the separated populations. Isolation of the separated populations is important to be able to better define what these populations are (see point #1 above). Isolation is also critical if this is to be a valuable technique in the future. Yet the paper is unclear on whether it is actually technically feasible to isolate the populations separated by iDEP. In line 367, it states "this method provides a mechanism for the isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis (imaging, proteomics, lipidomics, etc.)." However, in line 361 it says "developing the capability to port the collected individual boluses will enable downstream analyses such as mass spectrometry or electron microscopy," suggesting that true isolation of these populations is not yet feasible. This should be clarified.

      We thank the reviewer for pointing this out. We have modified the text and title to put more focus on our ability to separate vesicles rather than isolate. We agree that the isolation and further biophysical characterization of these subpopulations will be critical to understanding them. However, this capability is still in development. We have made the following change to clarify that a way to isolate these subpopulations once iDEP-assisted separation has occurred is currently being developed.

      Title: “Insulator-based dielectrophoresis-assisted separation of insulin secretory vesicles”

      “this method serves as a stepping stone towards isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis…” page 9, line 230-232.

      Reviewer #2 (Public Review):

      This manuscript used DC-iDEP, a technology previously used on other organelle preparations to isolate insulin secretory granules from INS1 cells based on differences in dielectrophoretic and electrokinetic properties of synaptotagmin V positive insulin granules.

      The major motivation presented for this work is to provide a methodology to allow for more sensitive isolation of subpopulations of granules allowing better understanding of the biochemical composition of these populations. This manuscript clearly demonstrates the ability of this technology to separate these subpopulations which will allow for future biochemical characterizations of insulin granules in future studies.

      After proving these subpopulations can be observed, this method was then utilized to show there are shifts in these subpopulations when granules are isolated from glucose stimulated cells. Overall the method of isolation is novel and could provide a tool for further characterization of purified secretory granules.

      The observation of glucose stimulation causing shifts in subpopulations is unsurprising. Glucose stimulation could cause a depletion of insulin and other secretory content from a subset of granules. It would be expected that this loss of content would cause a shift in electrochemical properties of the granules, but this is a nice confirmation that the isolation method has the sensitivity to delineate these changes.

      Major comments:

      1) It is unclear what Synaptotagmin isoform is being looked at. Synaptotagmin V and IX have been repetitively interchanged in the literature. See note in syt IX section of "Moghadam and Jackson 2013 Front. Endocrinology" or read "Fukuda and Sagi-Eisenberg Calcium Bind Proteins 2008".

      The 386 aa isoform that is abundant in PC12 cells has been robustly observed in INS1 cells in multiple studies and has been frequently referred to as syt IX. The sequence the antibody was raised against should be obtained from the company where it was purchased, mapped by sequence to the corresponding Synaptotagmin isoform, and clarified in the text.

      We thank the reviewer for this comment. The supplier (Thermo Fisher Scientific) calls this antibody “Synaptotagmin V.” As it recognizes the 386 aa synaptotagmin isoform, we have changed references to this antibody to call it “Synaptotagmin IX” to match the existing literature.

      2) Immunofluorescence of insulin and syt V is confusing. The example images do not appear to show robust punctate structures that are characteristic of secretory granules (in both the insulin and syt V stain).

      We appreciate the reviewer bringing this point to our attention. We agree that the immunofluorescence images in Figure 2 are not representative of typical INS-1E cells and have replaced the original image for Figure 2 with new images that show punctate structures that are more characteristic of secretory granules. These images also have better colocalization of insulin and synaptotagmin V (now labeled synaptotagmin IX) than the original image, with Pearson’s R values of 0.66 and 0.64.
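
      As a brief illustration of the colocalization metric quoted above, Pearson's R between two channels can be computed from paired pixel intensities. This is a minimal sketch only: the intensity lists are hypothetical, the function name `pearson_r` is an assumption, and the response does not state which software or image regions were used to obtain 0.66 and 0.64.

```python
import math


def pearson_r(channel_a, channel_b):
    """Pearson correlation between paired pixel intensities of two channels."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pixels")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    # Sums of products of deviations from each channel's mean intensity.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant-intensity channel has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical flattened pixel intensities for two channels (illustration only).
insulin = [10, 40, 35, 80, 5, 60]
syt9 = [12, 30, 45, 70, 20, 40]
r = pearson_r(insulin, syt9)
```

      Values near 1 indicate that bright pixels in one channel coincide with bright pixels in the other; values near 0 indicate no linear relationship.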

      3) In the discussion it says, "Finally, this method provides a mechanism for the isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis (imaging, proteomics, lipidomics, etc.) that otherwise would not be possible given the low-abundance components of these subpopulations."

      It would help to elaborate more on the yield and concentrations of isolated granules. This would give a better sense of what level of biochemical characterization could be performed on subpopulations of granules.

      We thank the reviewer for this comment. This line has been changed to clarify the current capabilities of iDEP, as subpopulations cannot presently be removed from the channel.

      “this method serves as a stepping stone towards isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis…” page 9 line 230-232.

      Once it is possible to isolate subpopulations from the channel, we expect to obtain sufficient sample for further characterization. We anticipate that biophysical characterization such as imaging will be highly feasible, and small-scale proteomics could also be possible. However, currently we have not measured the concentration of isolated vesicles due to complications in the isolation steps. If the quantity of isolated subpopulations proves inadequate for proteomic analysis, we plan to scale up our cell culture to generate enough insulin vesicles for further biochemical characterization. However, these experiments are out of scope for our current work, so we removed details on this idea in the Introduction and Discussion.

      Reviewer #3 (Public Review):

      The manuscript from Barekatain et al. is investigating heterogeneity within the population of insulin vesicles from an insulinoma cell line (INS-1E) in response to glucose stimulation. Prevailing dogma in the beta-cell field suggests that there are distinct pools of mature insulin granules, such as ready-releasable and a reserve pool, which contribute to distinct phases of insulin release in response to glucose stimulation. Whether these pools (and others) are distinct in protein/lipid composition or other aspects is not known, but has been suggested. In this manuscript, the authors use density gradient sedimentation to enrich for insulin vesicles, noting the existence of a number of co-purifying contaminants (ER and mitochondrial markers). Following immunolabeling with synaptotagmin V and fluorescent-conjugated secondary antibodies, insulin vesicles were applied to a microfluidic device and separated by dielectrophoretic and electrokinetic forces following an applied voltage. The equilibrium between these opposing forces was used to physically separate insulin granules. Here some differences were observed in the insulin (Syt V positive) granule populations, when isolated from cells that were either non-stimulated or stimulated with glucose, which has been suggested previously by other studies as noted by the authors; however in the current manuscript, the inclusion of a number of control experiments may provide a better context for what the data reveal about these changes.

      The major strength of the paper is in the use of the novel, highly sophisticated methodology to examine physical attributes of insulin granules and thus begin to provide some insight into the existence of distinct insulin granule populations within a beta-cell. These include insulin granules that are maturing, membrane-docked (i.e. readily releasable), in reserve, newly-synthesized, aged, etc. Whether physical differences exist between these various granule pools is not known. In this capacity, the technical abilities of the current manuscript may begin to offer some insight into whether these perceived distinctions are physical.

      The major weakness of the manuscript is that the study falls short in terms of linking the biology to the sophisticated changes observed and primarily focuses on differences in response to glucose. Without knowing what the various populations of granules are, it is challenging to understand what the changes in response to glucose mean.

      Specific concerns are as follows:

      1) There is confusion on what the DC-iDEP separation between unstimulated and stimulated cells reveals. Do these changes reflect maturation state of granules, nascent vs. old granules? Ready-releasable vs. reserve pool? The comments in the text seem to offer all possibilities.

      We thank the reviewer for this comment. Additional experiments will be useful to concretely define the physical nature of these subpopulations. Our primary goal in this study is to assess the utility of DC-iDEP in reproducibly separating these subpopulations. Our current results reflect variations in the amounts of subpopulations described in the literature and/or in currently uncharacterized subpopulations. As addressed in Reviewer #1 question #1, we have added to the discussion to review these possibilities (Page 7-8, lines 176-191).

      2) It is unclear what we can infer regarding the physical changes of granules between the stimulated states of the cells. Without an understanding of the magnitude of the effect, it is unclear how biologically significant these changes are. For example, what degree of lipid or protein remodeling would be necessary to give a similar change?

      We thank the reviewer for this question. Separation by iDEP is sufficiently sensitive to distinguish particles with minimal differences between them. For example, we could successfully separate wild type GFP from a point mutation variant of GFP. We anticipate that this method is capable of distinguishing vesicles with greater physical differences between them resulting in more distinct EKMr values. However, significant future experiments are likely necessary to determine the extent of lipid and protein remodeling between each subpopulation to define the biological significance of each subpopulation.

      3) The reliance on a single vesicle marker, Syt V, is concerning given that granule remodeling is the focus.

      We appreciate the reviewer’s concern. The current manuscript focuses on synaptotagmin V (IX)-positive insulin vesicles. The results of these experiments demonstrate the capabilities of iDEP to reveal heterogeneity in a seemingly similar set of particles. In future experiments we plan to use the modified INS-1 cell line with a GFP-tagged C-peptide (hPro-CpepSfGFP, GRINCH cells RRID:CVCL_WH61). All insulin vesicles from this cell line contain GFP-tagged C-peptide, and therefore would allow for the detection of a more complete set of insulin vesicles. The results from the current manuscript provide the proof-of-concept validation that this method is promising for understanding vesicle remodeling in more detail in the future.

      4) Additional confirmation that the isolated vesicles are in fact insulin granules would be helpful. As noted, granules were gradient enriched, but did carry contaminants. Note that the microscopy image provided does not provide any real validation for this marker.

      Further confirmation that the immune-isolated vesicles are in fact insulin granules should be included. EM with immunogold labeling post-SytV enrichment would be a potential methodology to confirm.

      We thank the reviewer for this comment. We have performed new immunofluorescence imaging to demonstrate the overlap of insulin and synaptotagmin (Fig 2). Additionally, we have performed microscopy experiments with a modified INS-1 cell line with a GFP-tagged C-peptide (hPro-CpepSfGFP, GRINCH cells RRID:CVCL_WH61) in order to provide evidence of these granules’ identity. Fluorescence microscopy revealed that the isolated granules contain GFP-tagged C-peptide (Fig. 1—figure supplement 3A), while transmission electron microscopy and cryo-electron microscopy confirmed that these vesicles have radii within the correct range to be considered insulin vesicles (Fig 1—figure supplement 3B-C). We added the following text in the results section to describe the new results included:

      “To verify that the insulin vesicles were intact prior to DC-iDEP, we imaged a modified INS-1E cell line that contains a human insulin and green fluorescent protein-tagged C peptide (hPro-CpepSfGFP).49 This GFP tag allowed for quick visual verification of intact vesicles using fluorescence confocal microscopy. We observed distinct puncta rather than a diffuse GFP signal which indicated that the vesicles were intact and not ruptured. Further analysis of isolated vesicles was done using EM. We observed intact vesicles with the expected size and shape using both transmission electron microscopy (TEM) and cryo-electron microscopy (cryo-EM) (Fig. 1—figure supplement 3).” Page 5, lines 104-109.

      5) It would be useful to understand if the observed effects are specific to the INS-1E cell line or are a more universal effect of glucose on beta-cells.

      We agree with the reviewer that it would be interesting to study these effects in primary beta cells. While we expect to see similar results in these cells, there may be differences in the population variations or EKMr values. However, working with beta cells is currently beyond the scope of this study, as our primary focus is on validating this approach.

    1. Author response:

      Reviewer #1 (Public Review):

      Authors propose a mechanism where actin polymerization in the dendritic shaft plays a key role in trapping AMPAR vesicles around the stimulated site, promoting the preferential insertion of AMPAR into the potentiated synapse. This dendritic mechanism is novel and may be important for phenomena. Authors also developed a sophisticated method to observe the endogenous behavior of AMPAR using the HITI system.

      However, there are some major issues that need to be addressed to support the authors' claims. Also, overall, it is hard to follow. It could be better written.

      We thank the reviewer for carefully reading our text and for the helpful recommendations. We have performed additional experiments and analysis to address the raised issues (detailed below). In addition, we have streamlined and shortened the text to improve its clarity and focus on the biological story.

      Reviewer #2 (Public Review):

      In this study, Wong and colleagues investigate mechanisms leading to input-specificity of LTP. They focus on the trafficking of AMPA receptors as the surface accumulation of AMPARs is one of the key features of potentiated synapses. They employ an elegant strategy to label endogenous GluA1 with a HaloTag using CRISPR-based technology and succeed in finding a targeting site that does not interfere with the receptor's trafficking or function. This allowed them to visualize and track single receptors in endosomes as well as at the plasma membrane of primary rat hippocampal neurons. They develop and extend particle tracking and molecule counting algorithms to analyze active transport and diffusion of AMPARs and, as expected, find that neuronal activation leads to increased surface expression of labelled AMPARs. Interestingly, they also observe a strong decrease in long-range motion of AMPAR-containing vesicles upon induction of chemical LTP. From this point, the manuscript focuses on explaining this observation. The authors switch from a global activation protocol to glutamate uncaging to induce LTP at individual synapses. In this setting as well, they measure a reduction in the mobile vesicle fraction within an approximately 30 µm long dendritic segment containing the activated spine. In search of an explanation, they investigate activity-dependent actin polymerization as a possible confinement factor that could change the motility of organelles in dendrites. Their hypothesis is based on pre-existing literature demonstrating the role of F-actin in trapping and stalling dendritic endolysosomes, as well as a similar role of F-actin in non-neuronal cells. Indeed, the authors convincingly show that pharmacological depolymerization or stabilization of F-actin bidirectionally impacts the trafficking behavior of AMPAR-containing vesicles in the dendritic shaft.

      To directly visualize the effects of structural LTP at individual synapses on the dendritic actin cytoskeleton, they employ the F-actin-binding probe Tractin. Here they find that cLTP results in the formation of dendritic F-actin fibers and bundles arranged in a network. The spatial extent of such a network correlates with an area where AMPAR vesicles exhibit decreased motility. Although this makes sense, I have some concerns about these experiments.

      Tractin has been previously published as an F-actin marker but, like several other binding probes (e.g. Lifeact), it affects F-actin structure and dynamics. The large number of F-actin bundles is not very typical for dendrites of hippocampal neurons and might be an artifact of Tractin overexpression. It is difficult to judge whether this is the case because there is no comparison with the endogenous situation where F-actin is labelled directly. The final series of experiments focuses on the role of processive myosins in stalling and exocytosis of AMPAR vesicles. To address this point, the authors employ a mixture of three different myosin inhibitors and show that although myosins are not responsible for increased vesicle confinement, they facilitate exocytosis of AMPARs. What I find somewhat missing are data and examples of AMPAR trafficking into dendritic spines. Here too, stronger experimental support could benefit the conclusions.

      Overall, the authors achieved the aims of their study. They demonstrated that synapse-specific potentiation results in signaling which triggers actin polymerization in the dendritic shaft beneath the activated input. This leads to trapping and accumulation of AMPAR-containing endosomes, which then have a higher probability of being delivered and secreted at activated dendritic spines. In addition to the conceptual advance of this work, several state-of-the-art labeling and analysis techniques were developed in this project and they will likely be used by other groups.

      We thank the reviewer for raising these important issues with regard to the use of tractin as a marker for actin polymerization. We have performed additional experiments (detailed below) using phalloidin and also dominant negative inhibitors of myosin Va, Vb, and VI in order to strengthen our conclusions. We find that inducing synaptic activity with cLTP increases phalloidin labeling and the appearance of F-actin fibers. Moreover, inhibition of myosin Va and Vb (but not VI) using their dominant negative C-terminal domains recapitulates the effects of pharmacological inhibition on both the motion states and directional bias of GluA1-HT vesicles in response to cLTP.

      With regards to AMPAR trafficking into spines, we and others have found that GluA1-containing vesicles rarely enter dendritic spines (see response to Reviewer #2, comment 3). Furthermore, exocytic events occur largely at extrasynaptic sites, such as on the dendritic shaft (Figure 5-video 1-3; Lin et al., 2007; Makino et al., 2009; Patterson et al., 2010). Consequently, we believe vesicles are concentrated proximal to synaptic activity in the dendritic shaft rather than in the dendritic spine itself, creating a larger reservoir of intracellular AMPARs that can exocytose during synaptic activity. Others have demonstrated that surface bound AMPARs diffuse across the cell membrane into stimulated synapses where they are captured (Choquet and Opazo, 2022).

      We also thank the reviewers for acknowledging the conceptual and technical advances in this work.

      Reviewer #3 (Public Review):

      Wong et al. developed a new versatile approach with a robust signal to track protein dynamics by inserting a tag into the endogenous loci and different properties of fluorescent dyes for conjugation. Using this approach, the authors monitor the trafficking of Fluorescent dye and Halo-tagged GluA1 with time-lapse imaging and found that neuronal stimulation induces GluA1 accumulation surrounding stimulated synapses on dendritic shafts and actin polymerization at synapses and dendrites. Furthermore, combining with pharmacological manipulations of actin polymerization or myosin activity, the authors found that actin polymerization facilitates exocytosis of GluA1 near activated synapses. The new approach may provide broad impacts upon appropriate control experiments, and the practical application of this approach to GluA1 trafficking upon neuronal activation is significant. However, there are several weaknesses, including confirmation of activity of the tagged receptors and receptor specificity mimicking endogenous LTP machinery. If the receptor tagged by the new robust approach reflects endogenous machinery, this approach will provide a big opportunity to the community as a versatile method to visualize a protein not visualized previously.

      Although we use methods previously demonstrated to stimulate LTP, we do not ourselves demonstrate LTP using electrophysiological methods, and consequently we have changed the text to focus on synaptic plasticity (specifically structural plasticity). Furthermore, we confirm the activity of HaloTag knock-in receptors by expressing GluA1-HT and GluA1-HT-SEP in HEK293T cells and performing whole-cell patch clamp experiments. We find that GluA1-HT and GluA1-HT-SEP respond to glutamate in a similar manner to untagged GluA1.

      We also thank the reviewer for acknowledging the novelty of our strategy.

    1. eLife assessment

      This important manuscript uses a machine-learning approach to predict and annotate cis-regulatory elements across insect genomes, helping to address a much-needed gap in comparative genomics. This method does not rely on sequence alignments, thereby allowing functional genomics studies of more distant species, including emerging model organisms. There are nuanced views on the strength of the evidence from the predictions: the pipeline appears to be based on solid evidence, but the methods could be better described. We suggest the manuscript would be much more robust if the code used was accessible for review and validated further.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors provide a genome annotation resource of 33 insects using a motif-blind prediction method for tissue-specific cis-regulatory modules. This is a welcome addition that may facilitate further research in new laboratory systems, and the approach seems to be relatively accurate, although it should be combined with other sources of evidence to be practical.

      Strengths:

      The paper clearly presents the resource, including the testing of candidate enhancers identified from various insects in Drosophila. This cross-species analysis, and the inherent suggestion that training datasets generated in flies can predict cis-regulatory activity in distant insects, is interesting. While I cannot be sure this approach will prevail in the future, for example against approaches that leverage the prediction of TF binding motifs, the SCRMshaw tool is certainly useful and worth consideration for the large community of genome scientists working on insects.

      Weaknesses:

      While the authors made the effort to provide access to the SCRMshaw annotations via the REDfly database, the usefulness of this resource is somewhat limited at the moment. First, it is possible to generate tables of annotated elements with coordinates, but it would be more useful to allow downloads of the 33 genome annotations in GFF (or equivalent) format, with SCRMshaw predictions appearing as a new feature. Also, I should note that, unlike most species, some annotations seem to have issues in the current REDfly implementation. For example, Vcar and Jcoen return empty results.

    3. Reviewer #2 (Public Review):

      Summary:

      The ability of researchers to identify and compare enhancers across different species is an important facet of understanding gene regulation across development and evolution. Many traditional methods of enhancer identification involve sequence alignments and manual annotations, limiting the ability to expand the scope of regulatory investigations into many species. In order to overcome this obstacle, the authors apply a previously published machine learning method called SCRMshaw to predict enhancers across 33 insect species, using D. melanogaster as a reference. SCRMshaw operates through the selection of a few dozen training loci in a reference genome, marking genomic loci in other species that are significantly enriched with similar k-mer distributions relative to randomly selected genomic backgrounds. Upon identification of predicted enhancer regions, the authors perform a post-processing filtering step and identify the most likely predicted enhancer candidates based on the proximity of an orthologous target gene. They then perform reporter gene analysis to validate selected predicted enhancers from other species in D. melanogaster. The analysis of the expression patterns returned variable results across the selected predicted regions.
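      To make the idea of k-mer enrichment relative to a background more concrete, here is a minimal sketch of one generic way to score genomic windows. It is not the SCRMshaw implementation, whose scoring schemes are described in its own publications; the function names, the log-ratio weighting, the pseudocount, and the toy sequences are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def log_ratio_weights(training, background, k, pseudocount=1.0):
    """Weight each k-mer by log(frequency in training / frequency in background)."""
    train = kmer_counts(training, k)
    back = kmer_counts(background, k)
    vocab = set(train) | set(back)
    train_total = sum(train.values()) + pseudocount * len(vocab)
    back_total = sum(back.values()) + pseudocount * len(vocab)
    return {
        kmer: math.log((train[kmer] + pseudocount) / train_total)
        - math.log((back[kmer] + pseudocount) / back_total)
        for kmer in vocab
    }


def score_window(window, weights, k):
    """Average weight of the k-mers in a window; unseen k-mers contribute zero."""
    window = window.upper()
    n = len(window) - k + 1
    if n <= 0:
        return 0.0
    return sum(weights.get(window[i:i + k], 0.0) for i in range(n)) / n


def scan(sequence, weights, k, window, step):
    """Slide a fixed-size window along the sequence and score each position."""
    return [
        (start, score_window(sequence[start:start + window], weights, k))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Hypothetical toy data: "training enhancers", "background", and a target sequence.
training = ["ACGTACGTACGT", "TACGTACG"]
background = ["TTTTTTTTTTTT", "GGGGTTTTGGGG"]
weights = log_ratio_weights(training, background, k=3)
target = "TTTTTTTT" + "ACGTACGTACGT" + "TTTTTTTT"
hits = scan(target, weights, k=3, window=12, step=4)
best_start, best_score = max(hits, key=lambda hit: hit[1])
print(best_start, round(best_score, 3))
```

      Because the weights depend entirely on which training and background sequences are supplied, a sketch like this also illustrates why the choice of training regions, raised in the weaknesses below, matters for the resulting predictions.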

      Strengths:

      The authors provide annotations of predicted regions across dozens of insect species, with the intention of expanding and refining the annotations for use by the scientific field. This is useful, as researchers will be able to use the identified annotations for their own work or as a benchmark for future methods. This work also showcases the flexible and versatile nature of SCRMshaw, which can readily obtain predictions using training sets of genomic loci requiring only a few dozen annotations as input. SCRMshaw does not require sequence alignments of the enhancers and can operate without prior knowledge of the cis-regulatory sequence rules such as transcription factor binding motifs, making it a useful tool to explore the evolution of enhancers in further distant and less well-studied species.

      Weaknesses:

      This work provides predicted enhancer annotations across many insect species, with reporter gene analysis being conducted on selected regions to test the predictions. However, the code for the SCRMshaw analysis pipeline used in this work is not made available, making reproducibility of this work difficult. Additionally, while the authors claim the predicted enhancers are available within the REDfly database, the predicted enhancer coordinates are currently not downloadable as Supplementary Material or from a linked resource.

      The authors do not validate or benchmark the application of SCRMshaw against other published methods, nor do they seek to apply SCRMshaw under a variety of conditions to confirm the robustness of the returned predicted enhancers across species. Since SCRMshaw relies on an established k-mer enrichment of the training loci, its performance is presumably highly sensitive to the selection of training regions as well as the statistical power of the given k-mer counts. The authors do not justify their selection of training regions by which they perform predictions.

      While there is an attempt made to report and validate the annotated predicted enhancers using previously published data and tools, the validation lacks the depth to conclude with confidence that the predicted set of regions across each species is of high quality. In vivo reporter assays were conducted to anecdotally confirm the validity of a few selected regions experimentally, but even these results are difficult to interpret. There is no large-scale attempt to assess the conservation of enhancer function across all annotated species.

      Lastly, it is suggested that predicted regions are derived from the shared presence of sequence features such as transcription factor binding motifs, detected through k-mer enrichment via SCRMshaw. This assumption has not been examined, although there are public motif discovery tools that would be appropriate to discover whether SCRMshaw is assigning predicted regions based on previously understood motif grammar, or due to other sequence patterns captured by k-mer count distributions. Understanding the sequence-derived nature of what drives predictions is within the scope of this work and would boost confidence in the predicted enhancers, even if it is limited to a few training examples for the sake of clarity of interpretation.

    4. Reviewer #3 (Public Review):

      Summary:

      In this ambitious paper, the authors develop an unparalleled community resource of insect genome regulatory annotations spanning five insect orders. They employ their previously-developed SCRMshaw method for computational cross-species enhancer prediction, drawing on available training datasets of validated enhancer sequence and expression from Drosophila melanogaster, which had been previously shown to perform well across select holometabolous insects (representing 160-345MY divergence). In this work, they expand regulatory sequence annotation to 33 insect genomes spanning Holometabola and Hemiptera, which is even more distantly related to the fly model. They perform multiple downstream analyses of sets of predicted enhancers to assess the true-positive rate of predictions; the independent comparisons of real predictions with simulated predictions and with chromatin accessibility data, as well as the functional validation through reporter gene analysis, strengthen their conclusions that their annotation pipeline achieves a high true-positive rate and can be used across long divergence times to computationally annotate regulatory genome regions, an ability that has been previously inaccessible for non-model insects and now is possible across the many newly-sequenced insect scaffold-level genomes.

      Strengths:

      This work fills a large gap in current methods and resources for predicting regulatory regions of the genome, a task that has long lagged behind that of coding region prediction and analysis.

      Despite technical constraints in working outside of well-developed model insect systems, the authors creatively draw on existing resources to scaffold a pipeline and independently assess the likelihood of prediction validity.

      The established database will be a welcome community resource in its current state, and even more so as the authors continue to expand their annotations to more insect genomes as they indicate. Their available analysis pipeline itself will be useful to the community as well for research groups that may want to undertake their own regulatory genome annotation.

      Weaknesses:

      The rates of predicted true positive enhancer identification vary widely across the genomes included here based on the simulations and comparison to datasets of accessible chromatin in a manner that doesn't map neatly onto phylogenetic distance. At this point, it is unclear why these patterns may arise, although this may become more clear as regulatory annotation is undertaken for more genomes.

      Functional assessment of predicted enhancers was performed through reporter gene assays primarily in Drosophila melanogaster imaginal discs, a system amenable to transgenics. Unfortunately, this mode of canonical imaginal disc development is only representative of a subset of all holometabolous insects; therefore, it is difficult to interpret reporter gene expression in a fly imaginal disc as evidence of a true positive enhancer that would be active in its native species whose adult appendages develop differently through the larval stage (for example, Coleopteran and Lepidopteran legs). However, the reporter gene assays from other tissues do offer strong evidence of true positive enhancer detection, and constraints on transgenic experiments in other systems mean that this approach is the best available.

    5. Author response:

      We thank the reviewers for their thoughtful and insightful comments. We were pleased to see that the reviewers and editors consider our work a “welcome addition” that “fills a large gap” in comparative genomics methods and provides “an unparalleled community resource of insect genome regulatory annotations.”

      Many of the reviewers’ comments reflect weaknesses in our description of the methodology. As the basic SCRMshaw methodology has been published previously, we had opted for brevity over detail in the current manuscript. We recognize now that we went too far in that direction, and we will include more methodological detail in our revised submission, along with easier access to the code we used. The reviewers also offered some helpful suggestions regarding data availability which we intend to address, including direct download of the results in GFF format and adding to the results database several species that were inadvertently omitted.
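      As an illustration of what a GFF-format export of predicted elements could look like, the sketch below formats a coordinate record as a nine-column GFF3 line. It does not describe the REDfly export or the authors' planned implementation; the function name, the `training_set` attribute, the choice of `regulatory_region` as the feature type, and the example record are assumptions made for illustration.

```python
from urllib.parse import quote


def to_gff3_line(seqid, start, end, score, name, training_set, source="SCRMshaw"):
    """Format one predicted element as a GFF3 line (1-based, inclusive coordinates)."""
    if start < 1 or end < start:
        raise ValueError("GFF3 requires 1 <= start <= end")
    # Attribute values are percent-encoded so ';', '=', ',' and tabs cannot break column 9.
    attributes = ";".join([
        f"ID={quote(str(name), safe='')}",
        f"training_set={quote(str(training_set), safe='')}",
    ])
    columns = [
        seqid,                # column 1: sequence or scaffold identifier
        source,               # column 2: program that produced the feature
        "regulatory_region",  # column 3: feature type (assumed ontology term)
        str(start),           # column 4: start
        str(end),             # column 5: end
        f"{score:.3f}",       # column 6: prediction score
        ".",                  # column 7: strand unknown
        ".",                  # column 8: phase not applicable
        attributes,           # column 9: key=value attributes
    ]
    return "\t".join(columns)


# Hypothetical record; real coordinates and scores would come from the prediction tables.
line = to_gff3_line("scaffold_1", 100, 600, 12.5, "crm_0001", "mapping1.wing")
print("##gff-version 3")
print(line)
```

      Keeping the training set as an attribute rather than as a separate feature type would let a genome browser display all predictions on one track while still allowing them to be filtered.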

      Reviewer 2 expressed concerns about benchmarking SCRMshaw against other methods. We respectfully feel this lies outside the scope of the current study, which focuses on application of SCRMshaw to generate a multi-species annotation resource rather than on an attempt to show that SCRMshaw is superior to other approaches. We provide evidence in this manuscript, as well as in previous publications, that supports the effectiveness of SCRMshaw as an approach for regulatory element discovery that is suitable for the task at hand. Benchmarking for regulatory element discovery brings many challenges, as there are no comprehensive “truth” sets to serve as a comparison baseline. We therefore do not attempt strong claims here about the relative merits of SCRMshaw vs. other methods (although we have explored this in previous publications). Note that we also previously demonstrated commonality of transcription factor binding sites in cross-species SCRMshaw predictions, in particular in Kazemian et al. 2014 (Genome Biol. Evol. 6:2301).

      Finally, because it has important implications for understanding our results, we would like to point out a small misconception in Reviewer 2’s Summary of our study. The reviewer states that we “identify the most likely predicted enhancer candidates based on the proximity of an orthologous target gene.” We stress, however, that putative target gene assignments and identities have no impact at all on our prediction of regulatory sequences. Predictions are solely based on sequence-dependent SCRMshaw scores, with no regard to the nature or identities of nearby annotated features. Putative target genes are mapped to Drosophila orthologs purely as a convenience to aid in interpreting and prioritizing the predicted regulatory elements. We will take care to clarify this important point in our revised submission.

    1. eLife assessment

      Hartman et al.'s important research examines six commonly utilized imaging-based multiplexed transcriptomic techniques and introduces a novel specificity metric, "MECR," to streamline platform evaluations. The authors highlight the crucial influence of cell segmentation methodologies on outcomes, offering insight into the field. Nonetheless, the substantiation for the principal assertions remains incomplete, as the comparisons across platforms seem uneven due to variations in gene panels.

    2. Reviewer #1 (Public Review):

      Summary:

      Hartman and Satija's manuscript constitutes a significant contribution to the field of imaging-based spatial transcriptomics (ST) through their comprehensive comparative analysis of six multiplexed in situ gene expression profiling technologies. Their findings provide invaluable insights into the practical considerations and performance of these methods, offering robust evidence for researchers seeking optimal ST technologies. However, given the simultaneous availability of similar preprints, readers should exercise caution when comparing findings to ensure reliable information. Therefore, the authors should revise their manuscript to ensure consistency among all ST technologies compared, considering findings from other preprints as well if possible.

      Strengths:

      (1) The manuscript offers a comprehensive and systematic comparison of six in situ gene expression profiling technologies, including both commercially available and academically developed methods, which is the most extensive study in this field.

      (2) Novel metrics have been proposed by the authors to mitigate molecular artifacts and off-target signals, enhancing the accuracy of sensitivity and specificity comparisons across datasets. By emphasizing the significance of evaluating both sensitivity and specificity, the study addresses the challenge of comparing standard metrics like the number of unique molecules detected per cell, given variations in panel composition and off-target molecular artifacts. This feature is directly connected to their development of novel cell segmentation methods to improve the specificity.

      (3) As a result of the analysis performed earlier, the authors illustrate how molecular false positives can distort spatially-aware differential expression analysis, underscoring the necessity for caution in interpreting downstream results.

      (4) Offering guidance for the selection, processing, and interpretation of in situ spatial technologies, the study equips researchers in the field with valuable insights.

      Weaknesses:

      (1) Although focusing on mouse brain datasets broadens the comparison of technologies, it confines the study to a single biological context. Discussing the potential limitations of this approach and advocating for future studies in diverse tissue types would enrich the manuscript, especially for clinical FFPE applications.

      (2) Providing more explicit details on the criteria used to select datasets for each technology would ensure a fair and unbiased comparison. Otherwise, it may look like the Hall of Fame for champion data sets to advertise a certain commercial product.

      (3) The discussion would be enriched by addressing the origins of non-specific signals and molecular artifacts, alongside the challenges of cell segmentation across different tissue types and cell morphologies. Note that all of these datasets were obtained from thin mouse brain slices, which are still three-dimensional even at a thickness of 10-20 µm. As a result, cells may partially overlap along the z-axis, potentially leading to transcript mixing. Additionally, many cells are probably cut, so their measured transcriptomes are inherently partial, which makes direct comparison to scRNA-seq unfair. These aspects should be discussed for the comparison to be fair.

      (4) Expanding on the potential implications of the findings for developing new computational methods to address non-specific biases in downstream analyses would augment the manuscript's impact and relevance.

    3. Reviewer #2 (Public Review):

      Summary:

      In the manuscript, Hartman et al. present a detailed comparison of 6 distinct multiplexed in situ gene expression profiling technologies, including both academic and commercial systems.

      The main concept of the study is to evaluate publicly accessible mouse brain datasets provided by the platforms' developers, where optimal performance in showcasing their technologies is expected. The authors stress the difficulty of making a comparison with standard metrics, e.g., the count of total molecules per cell, considering the differences in gene panel sizes across platforms. To make a fair comparison, the authors conceived a metric of specificity performance, which is called "MECR", an average of mutually exclusive gene co-expression rates in the sample. The authors found that the rate mainly depends on the choice of cell segmentation method, thus reanalyzed 5 of these datasets (excluding STARmap PLUS, due to the lack of molecule location information) with an independent cell segmentation algorithm (i.e., Baysor). Based on the reanalysis, the authors clearly suggest the best-performing platform at the end of the manuscript.
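
      As a rough illustration of the metric described above, the following Python sketch computes an MECR-like value from a cells-by-genes count table. The per-pair rate used here (cells expressing both genes divided by cells expressing at least one of them) and all function and variable names are assumptions made for illustration; the manuscript's exact definition may differ.

```python
def mecr(counts, gene_names, exclusive_pairs):
    """Average co-expression rate over gene pairs expected to be mutually exclusive.

    counts: list of per-cell count lists, ordered as in gene_names (hypothetical layout).
    exclusive_pairs: list of (gene_a, gene_b) tuples, e.g. chosen from a reference atlas.
    """
    index = {gene: i for i, gene in enumerate(gene_names)}
    rates = []
    for gene_a, gene_b in exclusive_pairs:
        a, b = index[gene_a], index[gene_b]
        # Cells in which at least one gene of the pair is detected.
        either = sum(1 for cell in counts if cell[a] > 0 or cell[b] > 0)
        # Cells in which both genes are detected (unexpected for an exclusive pair).
        both = sum(1 for cell in counts if cell[a] > 0 and cell[b] > 0)
        if either:
            rates.append(both / either)
    # Assumed aggregation: plain mean over pairs with at least one expressing cell.
    return sum(rates) / len(rates) if rates else float("nan")


# Toy example: 4 cells, 2 genes; only the third cell expresses both genes.
toy_counts = [[1, 0], [0, 2], [3, 1], [0, 0]]
toy_value = mecr(toy_counts, ["GeneA", "GeneB"], [("GeneA", "GeneB")])
```

      Because the rate is normalized by the number of cells expressing either gene, a value of this kind is less tied to panel size than raw molecule counts per cell, which is the motivation the summary gives for the metric.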

      Strengths:

      I consider that the paper is a valuable contribution to the community, for the following two reasons:

      (1) As the authors mentioned, I fully agree that the spatial transcriptomics community indeed needs better metrics in terms of comparison across technologies, rather than traditional metrics, e.g., molecule counts per cell. In that regard, I believe introducing a new metric, MECR, is quite valuable.

      (2) This work highlights the differences in results based on the choice of cell segmentation used for each platform, which suggests a need for trying out different segmentation algorithms to derive the right results. I believe this is an urgent warning that should be widespread in the community as soon as possible.

      Weaknesses:

      I disagree with the conclusion of the manuscript where the authors compare the technologies and suggest the best-performing ones, because of the following major points:

      (1) As the authors mentioned, MECR is a measure of "specificity" not "sensitivity". Still, the comparison of sensitivity was done with the mean counts per cell (Figure 3e). However, I strongly disagree with using the mean counts per cell as a measure of sensitivity because the comparison was done with different gene panels. The counts per cell can be highly dependent on the choice of genes, especially due to optical crowding.

      (2) The authors compared sensitivity based on the Baysor cell segmentation, but in fact, Baysor uses spatial gene expression for cell segmentation, which depends on the sensitivity of the platform. Thus, a comparison of sensitivity based on an algorithm that is based on sensitivity seems to be nonsensical.

    4. Author response:

      We thank both reviewers for their constructive feedback. We were grateful to see that both reviewers found our work to be valuable to the field, and agreed that new metrics (including our introduced MECR) were important for dataset evaluation. We briefly respond to two main points from the reviewers.

      (1) Key findings from our manuscript. While we do evaluate publicly available datasets in our manuscript, the focus/conclusion of our work is not to return a definitive ranking of in-situ technologies. As reviewers point out, our comparative evaluation is only in a single biological context, and we further note that many of these in situ platforms are rapidly evolving with new chemistries and gene panels. 

      Instead, the conclusion and purpose of our manuscript was to emphasize the importance and need for new metrics when evaluating spatial datasets. We propose an option, and demonstrate how cell segmentation can affect not only technical metrics but also downstream biological analysis of in-situ datasets.

      (2) Comparing technologies with different gene panels. The reviewers correctly point out that comparing technologies that use different gene panels is not a perfect benchmark. We agree that differences in molecular counts could arise due to biological differences in the abundance of targeted genes.

      We did address this in Supplementary Figure 4, where we perform pairwise comparisons of each technology and compute these using only overlapping genes that were measured by both technologies. Our results are consistent with the analysis of full gene sets.
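
      The idea of restricting a pairwise comparison to overlapping genes can be sketched as follows. This is a minimal illustration under assumed data structures (a dictionary mapping gene name to per-cell counts); the function names are hypothetical and this is not the analysis code used for Supplementary Figure 4.

```python
def mean_counts_per_cell(panel, genes):
    """Mean number of molecules per cell, summed over the given genes."""
    n_cells = len(next(iter(panel.values())))
    total = sum(sum(panel[gene]) for gene in genes)
    return total / n_cells


def compare_on_shared_genes(panel_a, panel_b):
    """Compare two datasets using only genes measured by both panels."""
    shared = sorted(set(panel_a) & set(panel_b))
    if not shared:
        raise ValueError("the two panels have no genes in common")
    return (shared,
            mean_counts_per_cell(panel_a, shared),
            mean_counts_per_cell(panel_b, shared))


# Toy data: the gene "Extra" is only in panel A and is ignored in the comparison.
panel_a = {"Gad1": [2, 0], "Slc17a7": [0, 4], "Extra": [10, 10]}
panel_b = {"Gad1": [1, 1], "Slc17a7": [1, 3]}
shared_genes, mean_a, mean_b = compare_on_shared_genes(panel_a, panel_b)
```

      In the toy data, the panel-specific gene dominates the raw counts of panel A, but once it is excluded the two datasets have the same mean count per cell.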

      While we believe that regenerating in-situ datasets with identical gene panels is beyond the scope of this work (and is likely technically infeasible), we hope that our findings are still valuable and informative to the growing spatial community.

    1. eLife assessment

      This is an important study that brings insight into mechanisms that underlie regulation of GABAergic transmission in response to changes in activity. The authors present solid data supporting the premise that action potential firing rather than excitatory synaptic strength is a key determinant of GABAergic synaptic inputs.

    2. Reviewer #3 (Public Review):

      This paper concerns whether synaptic scaling (or homeostatic synaptic plasticity; HSP) occurs similarly at GABA and Glu synapses and comes to the surprising conclusion that these can be regulated independently. In fact, under the conditions used in this study, only the GABAergic synapses show HSP and the glutamatergic synapses don't change. This is surprising because these were thought to be co-regulated during HSP and in fact, the major mechanisms thought to underlie downscaling (TTX or CNQX driven), retinoic acid and TNF, have been shown to regulate both GABARs and AMPARs directly. Thus, the main result, that GABA HSP is dissociable from Glu HSP, is novel and exciting. This suggests either different mechanisms underlie the two processes, or that under certain conditions, another mechanism is engaged that scales one type of synapse and not the other. Given that glutamatergic synapses are unchanged in their conditions, the latter seems more likely - a novel form of HSP exists that only scales GABAergic synapses. Whether glutamatergic and GABAergic synapses scale independently during HSP affecting both types of synapses remains to be addressed. It would be necessary to demonstrate the dissociation in the same system, under conditions where both types of synapses are changing. But because the form of HSP studied here appears different than that studied in Fong et al., the authors should be careful when comparing the two results. There seems to be an implicit underlying assumption that there is a simple form of HSP, when the overall literature (and the two studies from this lab) supports the idea of many forms of HSP.

      The homeostatic changes at GABAergic synapses do seem to be more consistent in amplitude across the bulk of the synapses, which does suggest that true scaling (a proportional change to all synapses on a cell) is occurring. This may represent a major difference in how homeostatic changes occur at the two types of synapses.

      The second finding is that this form of HSP seems more regulated by action potential firing than conventional HSP - previous work from this lab had shown that restoring AP firing during AMPA receptor blockade did not prevent scaling of glutamatergic synapses (it should be noted these experiments were done in rat cultures, not mouse, used a higher concentration of CNQX, and used a different optogenetic stimulation paradigm). Restoring AP firing rates under the conditions used here (and thus the form of HSP only affecting GABA synapses), on the other hand, did prevent the homeostatic response. This suggests that this GABA-only form of HSP is more attuned to spiking rates than other forms.

      However, details in the data may suggest that spiking is not the (or the only) homeostat, as TTX and CNQX cause identical changes in mIPSC amplitude but have different effects on spiking (although TTX may be driving a different form of HSP). Further, in Fig 5, CTZ had a minimal effect on spiking but a large effect on mIPSCs. Similar issues appear in Fig 6, where the induction of increased spiking is highly variable, with many cells showing control levels or lower spiking rates. Yet the synaptic changes are robust, across all cells. Overall, more will need to be done to conclude that spiking is the homeostat for GABA synapses.

      The paper also suggests that the GABA changes are leading to the recovery of the spiking rates, but while they have the time course of the spiking changes and recovery, they only have the 24h time point for synaptic changes. It is not yet possible to conclude how the time courses align without more data, nor can we assume that cells that did not recover to control firing rates would do so eventually.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study assesses homeostatic plasticity mechanisms driven by inhibitory GABAergic synapses in cultured cortical neurons. The authors report that up- or down-regulation of GABAergic synaptic strength, rather than excitatory glutamatergic synaptic strength, is critical for homeostatic regulation of neuronal firing rates. The reviewers noted that the findings are potentially important, but they also raised questions. In particular, the evidence supporting the findings is currently incomplete and demonstration of independent regulation of mEPSCs and mIPSCs is a necessary experiment to support the major claims of the study. 

      We appreciate the detailed, thoughtful assessment of our paper by the reviewers and editors and now submit a revised version that addresses the reviewers’ comments as detailed below in response to each concern. We include a more open discussion of alternative possibilities and have added experiments demonstrating that AMPAergic scaling in our mouse cortical cultures is triggered differently than GABAergic scaling. We treated the cultured neurons exactly as described for triggering GABAergic scaling (20µM CNQX for 24 hours), however this did not trigger AMPAergic upscaling (new Figure 7), even though it did reduce spiking/bursting activity. Below we explain the result further, but ultimately this does demonstrate independent regulation of mEPSCs and mIPSCs as requested by the editor/reviewer (spike reductions induced by CNQX reduced mIPSC amplitude, but had no effect on mEPSC amplitude).

      Reviewer #1 (Public Review):

      While the paper is ambitious in its rhetorical scope and certainly presents intriguing findings, there are several serious concerns that need to be addressed to substantiate the interpretations of the data. For example, the CTZ data do not support the interpretations and conclusions drawn by the authors. Summarily, the authors argue that GABAergic scaling is measuring spiking (at the time scale of the homeostatic response, which they suggest is a key feature of a homeostat) yet their data in figure 5B show more convincingly that CTZ does not influence spiking levels - only one out of four time points is marginally significant (also, I suspect that the bootstrapping method mentioned in line 454-459 was conducted as a pairwise comparison of distributions. There is no mention of multiple comparisons corrections, and I have to assume that the significance at 3h would disappear with correction).

      We certainly understand the criticism here (similar to reviewer 2’s third point). We now discuss these complications in a more detailed description in the manuscript (CTZ section of results and at end of the discussion). First, we are presenting our entire dataset to be as transparent as possible. Unlike most synaptic scaling studies (including our own) that apply drugs to alter activity and assess mPSC amplitude at the final time point, here we are actually showing CTZ’s effect on spiking activity within the culture over time. This is critical because it has informed us of the drug’s true effect on spiking, the variability that is associated with these perturbations, and the ability and timing of the cultured network to homeostatically recover initial levels. This was important because it revealed that the drugs do not always influence activity in the way we assume, and this provides greater context to our results. Second, we are showing all of our data, and presenting it using estimation statistics which go beyond the dichotomy of a simple p value yes or no (Ho J, Tumkaya T, Aryal S, Choi H, Claridge-Chang A. 2019. Moving beyond P values: data analysis with estimation graphics. Nat Methods 16: 565-66). Estimation statistics have become a more standard statistical approach in the last 15 years and are the preferred method for the Society for Neuroscience’s eNeuro Journal. This method shows the effect size and the confidence interval of the distribution. For the 3 hr time point in Fig. 5B the CTZ/ethanol vs. ethanol data points exhibit very little overlap and the effect size demonstrates a near doubling of spike frequency, and the confidence interval shows a clear separation from 0. This was a pairwise comparison as we compared values at each time point after the addition of ethanol or ethanol/CTZ. Third, the plots illustrate an upward trend in spike frequency at 1 and 6 hrs, but there is also clear variability.
      It is important to note that these are multiunit recordings and not purely excitatory principal neurons that we target for mPSC recordings. This complication along with the variability inherent in these cultures could make simple comparisons difficult to interpret and we now discuss this (end of discussion). Regardless, we do see some increase in spiking with CTZ and we clearly see increases in mIPSC amplitude, thus providing some support for the idea that spiking could be a critical player in terms of GABAergic scaling, particularly when put in the context of all of our findings. Future work will be necessary to determine how alterations in spiking lead to changes in mIPSC amplitude and we now discuss this (2nd to last paragraph in discussion).
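
      For readers less familiar with estimation statistics, the sketch below shows the general idea of reporting an effect size (here, a difference in means) with a bootstrap confidence interval. It is a generic percentile bootstrap written for illustration, with made-up numbers and hypothetical names; it is not the procedure or software used for the figures in the manuscript.

```python
import random
from statistics import mean


def bootstrap_mean_diff(control, treated, n_boot=5000, alpha=0.05, seed=0):
    """Return (observed mean difference, CI lower bound, CI upper bound)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(treated) - mean(control)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the group sizes.
        resampled_control = [rng.choice(control) for _ in control]
        resampled_treated = [rng.choice(treated) for _ in treated]
        diffs.append(mean(resampled_treated) - mean(resampled_control))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, lower, upper


# Made-up normalized spike frequencies, for illustration only.
vehicle = [1.0, 1.2, 0.9, 1.1, 1.0]
drug = [2.0, 2.1, 1.9, 2.2, 2.0]
effect, ci_low, ci_high = bootstrap_mean_diff(vehicle, drug)
```

      An interval that excludes 0, as in this toy example, corresponds to the "clear separation from 0" language used above. Note that this sketch handles a single comparison and does not address correction across multiple time points.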

      Then, the fact that TTX applied on top of CTZ drives an increase in mIPSC amplitude is interpreted as a conclusive demonstration that GABAergic scaling is sensing spiking. It is inevitable, however, that TTX will also severely reduce AMPAR activation - a very plausible alternative explanation is that the augmentation of AMPAR activation caused by CTZ is not sufficient to overcome the dramatic impact of TTX. All together, these data do not provide substantial evidence for the conclusion drawn by the authors.

      We believe that the most parsimonious explanation for our results is that spiking activity, not AMPAR activation, triggers GABAergic downscaling. GABAergic scaling is no different when comparing 24hr TTX treatment vs TTX+CTZ, and optogenetic restoration of spiking activity while continuing to block AMPAR activation was able to restore GABAergic mPSC amplitudes to control levels. It is important to emphasize that our results with TTX vs. TTX+CTZ are different for GABAergic scaling (no difference in this study) and AMPAergic scaling (CTZ diminished upward scaling in previous study – Fong et al., 2015 - PMID: 25751516) suggesting different triggers for the two forms of scaling. While we strongly believe we have demonstrated that GABAergic downscaling is dependent on spiking (not AMPAergic transmission), we now acknowledge that we cannot rule out the possibility that upward GABAergic scaling may be influenced by AMPAR activation (2nd paragraph discussion), although we have no evidence in support of this.

      Specific points:

      - The logic of the basis for the argument is somewhat flawed: A homeostat does not require a multiplicative mechanism, nor does it even need to be synaptic. Membrane excitability is a locus of homeostatic regulation of firing, for example. In addition, synapse-specific modulation can also be homeostatic. The only requirement of the homeostat is that its deployment subserves the stabilization of a biological parameter (e.g., firing rate). 

      We largely agree with the reviewer and should not have implied that this was a necessary requirement for a spike rate homeostat. What we should have said was that historically this definition has been applied to AMPAergic scaling, which is thought to be a spike rate homeostat. We have now corrected this (introduction and discussion).

      - Line 63 parenthetically references an important, but contradictory study as a brief "however". Given the tone of the writing, it would be more balanced to give this study at least a full sentence of exposition. 

      Agreed, and we have now done this.

      - The authors state (line 11) that expression of a hyperpolarizing conductance did not trigger scaling. More recent work ('Homeostatic synaptic scaling establishes the specificity of an associative memory') does this via expression of DREADDs and finds robust scaling.

      The purpose of citing this study was to argue that the spike rate homeostat hypothesis doesn’t make sense for AMPAergic scaling based on a study that hyperpolarized an individual cell while leaving the rest of the network unaltered and therefore leaving network activity and neurotransmission largely normal. In this previous study scaling was not triggered, suggesting reduced spike rate within an individual cell was insufficient to trigger scaling in that cell. The more recent study mentioned by the reviewer achieved scaling by hyperpolarizing a majority of cells in the network. Importantly, this approach alters neurotransmission throughout the network, making it challenging to isolate the specific contributions of spiking vs. receptor activation. Unlike the previous study, which focused on the impact within individual cells, this newer study involves global alterations in network activity, complicating the interpretation of the role of spiking versus receptor activation in triggering scaling.

      - Supplemental figure 1 looks largely linear to me? Out of curiosity, wouldn't you expect the left end to be aberrant because scaling up should theoretically increase the strength of some synapses that would have been previously below threshold for detection?

      We agree that the scaling ratio plot is largely linear. To be clear, the linearity of the ratio plot was not our point here, rather that there was a positive slope meaning ratios (CNQX mEPSC amplitudes/control mEPSC amplitudes) got bigger for the larger CNQX-treated mEPSCs. Alternatively, a multiplicative relationship where mEPSCs are all increased by a single factor (e.g. 2X) would be a flat line with 0 slope at the multiplicative value (e.g. 2). In terms of the left side of the plot, we do see values that rise abruptly from 1 - this was partially obstructed by the Y axis in this figure and we have adjusted this. This left part of the plot is likely due to the CNQX-induced increases in mEPSC amplitudes of minis that were below our detection threshold of 5 pA, as suggested by the reviewer. Therefore, minis that were 4 pA could now be 5 pA after CNQX treatment and these are then divided by the smallest control mEPSCs which are 5 pA (ratio of 1). We tried to do a better job describing this in the resubmission (1st paragraph of results).
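
      The logic of the ratio plot can be illustrated with a short sketch: amplitudes from each condition are rank-ordered, divided rank by rank, and the ratios are examined as a function of amplitude. The code assumes equal sample sizes in both conditions and uses invented amplitudes and hypothetical function names; it only illustrates why uniform multiplicative scaling gives a flat line, and is not the analysis code behind the figures.

```python
def scaling_ratios(control, treated):
    """Rank-order both samples and return (sorted control, treated/control ratios)."""
    if len(control) != len(treated):
        raise ValueError("this sketch assumes equal sample sizes")
    sorted_control = sorted(control)
    sorted_treated = sorted(treated)
    ratios = [t / c for c, t in zip(sorted_control, sorted_treated)]
    return sorted_control, ratios


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


control_pa = [5.0, 10.0, 20.0, 40.0]   # invented amplitudes in pA

# Uniform multiplicative scaling (factor 2): every ratio is 2, so the slope is 0.
x_uniform, r_uniform = scaling_ratios(control_pa, [10.0, 20.0, 40.0, 80.0])
slope_uniform = slope(x_uniform, r_uniform)

# Non-uniform scaling: larger events grow more, so the ratios rise with amplitude.
x_skewed, r_skewed = scaling_ratios(control_pa, [5.0, 12.0, 30.0, 80.0])
slope_skewed = slope(x_skewed, r_skewed)
```

      A detection threshold would additionally pin the smallest ratios near 1, as described above, which this simplified sketch does not model.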

      - Given that figure 2B also shows warping at the tail ends of similar distributions, how is this to be interpreted? 

      The left side of the ratio plot shows evidence consistent with the idea that mIPSCs are dropping into the noise after CNQX treatment (smallest GABA mIPSCs that don’t fall into noise are 5 pA and this is divided by the smallest control GABA mPSCs of 5 pA and therefore the ratio is 1). The rest of the distribution will then approach the scaling factor (50% in this case). On the right side of the ratio plot the values appear to slightly increase. We are not sure why this is happening, but it may be that a small percentage of mIPSCs are not purely multiplicative at 0.5, however the biggest mPSCs can vary to a great degree from one cell to the next and in other cases we do not see this (Figure 4B, Figure 5E). We tried to do a better job describing this in the resubmission (results describing Figure 2).

      - The readability of the figures is poor. Some of them have inconsistent boundary boxes, bizarre axes, text that appears skewed as if the figures were quickly thrown together and stretched to fit. 

      We have adjusted the figures to be more consistent throughout the manuscript.

      - I'm concerned about the optogenetic restoration of activity experiment. Cortical pyramidal neuron mean firing rates are log normally distributed and span multiple orders of magnitude. The stimulation experiments can only address the total firing at a network-level - given that a network level "mean" is meaningless in a lognormal distribution, how are we to think about the effect of this manipulation when it comes to individual neurons homeostatically stabilizing their own activities? In essence, the argument is made at the single-neuron level, but the experiment is conducted with a network-level resolution.

      As described above, we do not have the capacity to know what the actual firing rate of a particular neuron was before and after perturbing the system, and certainly not for the specific cells we recorded from to obtain mPSC amplitudes, and so we cannot say that we have perfectly restored the original firing rates of neurons. However, there is reason to believe that this is achieved to some extent. Our optogenetic stimulation is only 50-100 ms long activating a subset of neurons. This is sufficient to provide a synaptic barrage that then triggers a full blown network burst where the majority of spikes occur, but this is after the light is off. In other words, the optogenetic light pulse only initiates what becomes a relatively normal network burst that fortunately allows the individual cells to express their relatively normal (pre-drug) activity pattern. In our previous study using optogenetic activity restoration (Fong et al., 2015) we were able to show that this was the case for individual units - the spiking of an individual unit during a burst is similar before and after CNQX/optogenetic stimulation (see Figure 4b and Suppl. Fig 4 in Fong et al. 2015). We are not claiming that we have restored spiking to exactly the pre-drug state, but bring it back toward those levels and we see this is associated with a return of the mIPSC amplitude to near control levels. We now include a brief description of this in the manuscript (results describing Figure 3).

      - Line 198-99: multiplicativity is not a requirement of a homeostatic mechanism.

      - Line 264-265 - again, neither multiplicativity nor synaptic mechanisms are fundamentally any more necessary for a homeostatic locus than anything else that can modulate firing rate via negative feedback.

      As mentioned above, the multiplicative nature of scaling has been a historical proposal for AMPAergic scaling and we have now found such a relationship for GABAergic scaling. This is important for understanding how this plasticity works, but we agree that it is not necessary for a homeostat and we have adjusted the manuscript accordingly.

      - 277: do you mean AMPAR? 

      We were not clear enough here. We actually do mean GABAR. The idea was that CTZ increases network activity and thus increases both AMPAergic and GABAergic transmission. We have rewritten this part of the discussion to avoid any confusion (2nd paragraph discussion).

      - Example: Figure 1A is frustratingly unreadable. The axes on the raster insets are microscopic, the arrows are strangely large, and it seems unnecessary to fill so much real estate with 4 rasters. Only one is necessary to show the concept of a network burst. The effect of time+CNQX on the frequency of burst is shown in B and C.

      - Example: Figure 2 appears warped and hastily assembled. Statistical indications are shown within and outside of bounding boxes. Axes are not aligned. Labels are not aligned. Font sizes are not equal on equivalent axes. 

      These figures were generated by the estimation statistics website and text may have been resized inappropriately. We have tried to adjust this and now have attempted to standardize the axes text to the best of our ability.

      - The discussion should include mention of the limitations and/or constraints of drawing general conclusions from cell culture. 

      We have added this consideration at the end of the discussion. Further, this is why we cited studies that argue GABAergic neurons have a particularly important role in homeostatic regulation of firing following sensory deprivations in vivo.

      - The discussion should include mention of the role of developmental age in the expression of specific mechanisms. It is highly likely that what is studied at ~P14 is specific to early postnatal development. 

      We now discuss caveats of cortical cultures at the end of the discussion.

      It is essential to ensure that the data presented in the paper adequately supports the conclusions drawn. A more cautious approach in interpreting the results may lead to a stronger argument and a more robust understanding of the underlying mechanisms at play. 

      We have broadened our discussion of alternative interpretations throughout the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      While I am hesitant to judge a paper based on its tone, I would personally recommend revision of some of the subjective words and statements, as the manuscript undermines its own effectiveness by making unnecessarily strong statements. The text repeatedly paints an "either A or B" picture, and if there's any general lesson in biology, it's that it's always A and B. Global, multiplicative glutamatergic scaling could quite conceivably occur alongside GABAergic scaling, as well as synapse-specific homeostatic modifications. It seems that it would be wise to acknowledge that, while the data presented here point in one direction, in vivo results in an adult brain (for example) might present an entirely different set of patterns. This will not only enhance the readability of the paper but also ensure that the scientific community can engage with the work in a constructive and collaborative manner. Again, I present this as only a constructive and supportive suggestion. I am a big fan of work from this laboratory, and I would love to see this paper in an improved form - it's an important set of ideas and I do believe that these data are rigorously collected. 

      We have attempted to provide a more comprehensive interpretation of our results. We agree that a homeostat can come in many flavors, but do believe that GABAergic scaling is a strong candidate, whereas AMPAergic scaling does not currently fit such a role. We do now discuss caveats with our work and are open to other interpretations that need to be fleshed out in future work.

      Reviewer #2 (Public Review):

      Major points:

      (1) The reason why CNQX does not completely eliminate spiking is unclear (Fig. 1). What is the circuit mechanism by which spiking continues, although at lower frequency, in the absence of AMPA-mediated transmission and what is the mechanism by which spiking frequency grows back after 24h (still in the absence of AMPA transmission)?

      Is it possible that NMDA-mediated transmission takes over and triggers a different type of network plasticity?

      The bursting in AMPAR blockade is due to the remaining NMDA receptor-mediated transmission. We showed this in our previous study in Suppl. Figure 2 and 6 of Fong et al., 2015 (PMID: 25751516). Our ability to optically induce normal looking bursts of spikes was also dependent on NMDAR activation (Fong et al 2015 and Figure 6 Newman et al., 2015 - PMID: 26140329). Further, in Dr Fong’s PhD dissertation it was shown that the bursting activity was abolished when AMPA and NMDA receptors were both blocked. There are likely many factors that contribute to the recovery of activity, and certainly one of them is likely to be the weakening of inhibitory GABAergic currents as we had mentioned. We have now added the point about NMDARs mediating the remaining bursts in the manuscript (results associated with Figure 1). We are not clear on what the reviewer has in mind in terms of “NMDA-mediated transmission takes over and triggers a different kind of network plasticity”, but we do discuss the possibility that spiking triggers GABAergic scaling through its effect on NMDAergic transmission, which we cannot rule out, but also have no evidence in support of this idea (3rd and 5th paragraph of discussion). We do plan on addressing this in a future work.

      (2) A possible activation of NMDARs should be considered. One would think that experiments involving chronic glutamatergic blockade could have been conducted in the presence of NMDAR blockers. Why was this not the case?

      Unfortunately, it was not possible to optogenetically restore normal bursting in the presence of NMDAR blockade (even when AMPAergic transmission was intact), as NMDARs appeared to be critical for the optical restoration of the normal duration and form of the burst in rat cortical cultures (see Suppl. Figure 6 Fong et al., 2015 Nat Comm and Figure 6 Newman et al., 2015). Even high concentrations of CNQX (40µM) prevented us from restoring spiking in mouse cultures in the current study, which is why we moved to 20µM CNQX for this study. The reviewer raises an excellent point about a possible NMDAR contribution to altered synaptic strength, however. It is likely that NMDAR signaling is reduced in the presence of CNQX since burst frequency was dramatically reduced along with AMPAR-mediated depolarizations. We cannot rule out the possibility that NMDAR signaling could contribute to the alterations in GABAergic mIPSCs and discuss this in the resubmission (3rd and 5th paragraph of the discussion). We had not considered this previously because prior work suggested that 24/48 hour block of NMDARs (APV) did not trigger AMPAergic scaling in cortical or hippocampal cultures (see Figure 1 Turrigiano et al., 1998 Nature and Suppl. Figure 4 Sutton et al., 2006 Cell); moreover, our previous study showed that restoring NMDAergic transmission optogenetically, at least to some extent, had no influence on AMPAergic scaling (Fong et al., 2015).

      Also, experiments with global ChR2 stimulation with coincident pre and postsynaptic firing might also activate NMDARs and result in additional effects that should be taken into consideration for the global scaling mechanism.

      To be clear, our optical stimulation was of short duration (duration 50-100 ms) and was turned off before the vast majority of spiking that occurred in the bursts. So the light flash was a trigger that allowed a relatively normal looking burst to occur after the light was off (see lower panel of Figure 3B optogenetic stimulation – short duration only at onset of burst – we now make this clearer in resubmission). Therefore, we were unlikely to trigger significant synchronous activation that does not normally occur in network bursts.

      (3) Cultures exposed to CTZ to enhance AMPA receptors generated variable results (Fig. 5), somewhat increasing spiking activity in a non-significant manner but, at the same time, strengthening mIPSC amplitude. This result seems to suggest that spiking might be involved in GABAergic scaling, but it does not seem to prove it. Then, addition of TTX that blocked spiking reduced mIPSC amplitude. It was concluded here that the ability of CTZ to enhance GABAergic currents was primarily due to spiking, rather than the increase in AMPA-mediated currents. However, in addition to blocking action potentials, TTX would also prevent activation of AMPARs in the presence of CTZ due to the lack of glutamatergic release. Therefore, under these conditions, an effect of glutamatergic activation on GABAergic scaling cannot be ruled out.

      These concerns were very similar to reviewer 1’s first comments (see above). To be clear we are going a step beyond most scaling studies by assessing MEA-wide firing rate, but this still provides an incomplete picture of the particular cells that we target for patch recordings in terms of their firing before and after a drug. Further, we see considerable variability in effect on firing rate from culture to culture, which we now discuss in the resubmission (final paragraph discussion). The fact that mIPSCs are no different after TTX treatment vs CTZ+TTX treatment suggests that AMPAergic transmission is not so influential on GABAergic downscaling. While the CTZ results are not conclusive by themselves, taken together with the optogenetic results, where restoration of spiking in AMPAR blockade reverses scaling, they are most consistent with the idea that GABAergic scaling is triggered by spiking rather than AMPAR activation, and they place GABAergic scaling as a strong candidate as a spike rate homeostat. Although we do feel that we have demonstrated that downward GABAergic scaling is dependent on spiking, we cannot rule out the possibility that upward GABAergic scaling could be influenced by AMPAR activation to some extent. We now acknowledge this possibility (2nd paragraph discussion).

      (4) The sample size is not mentioned in any figure. How many cells/culture dishes were used in each condition?

      The individual dots represent either individual cells for mIPSC amplitude or individual cultures in MEA experiments. Number of cultures and cells are now stated in the figure legends.

      (5) Cortical cultures may typically contain about 5-10% GABAergic interneurons and 90-95 % pyramidal cells. One would think that scaling mechanisms occurring in pyramidal cells and interneurons could be distinct, with different impact on the network. Although for whole-cell recordings the authors selected pyramidal looking cells, which might bias recordings towards excitatory neurons, naked eye selection of recording cells is quite difficult in primary cultures. Some of the variability in mIPSC amplitude values (Fig. 2A for example) might be attributed to the cell type? One could use cultures where interneurons are fluorescently labeled to obtain an accurate representation. The issue of the possible differential effects of scaling in pyramidal cells vs. interneurons and the consequences in the network should be discussed.

      We now include this discussion in the resubmission (final paragraph discussion). Briefly, we chose large cells, which will be predominantly glutamatergic neurons as suggested by the reviewer. Ultimately, even among glutamatergic principal cells there may be variability in the response to drug application. All of these issues could contribute to variability and we have expanded our description of the variability in our results, including that based on cellular heterogeneity. 

      Reviewer #2 (Recommendations For The Authors):

      Minor comments –

      Fig S3: Please quantify changes in frequency

      We have done this (Supplemental Figure 5).

      Fig 2: please choose colors with higher contrast for CNQX/TTX

      We have done this.

      Fig. 3C: Why doesn't CNQX+PhotoStim reach control levels of bursting at 2h?

      The program was designed to follow and maintain total spike frequency and so it does a better job at this than maintaining burst frequency.

      Fig. 5A: please include a comparison between control and Ethanol

      We now do this in Figure 5C. Both are around 26 pA.

      Fig. 5C: where is the Etoh condition?

      We have made this figure more clear in terms of controls (Figure 5C & D).

      Reviewer #3 (Public Review):

      This paper concerns whether scaling (or homeostatic synaptic plasticity; HSP) occurs similarly at GABA and Glu synapses and comes to the surprising conclusion that these are regulated separately. This is surprising because these were thought to be co-regulated during HSP and in fact, the major mechanisms thought to underlie downscaling (TTX or CNQX driven), retinoic acid and TNF, have been shown to regulate both GABARs and AMPARs directly. (As a side note, it is unclear that the manipulations used in Joseph and Turrigiano represent HSP, and so might not be relevant). Thus the main result, that GABA HSP is dissociable from Glu HSP, is novel and exciting. This suggests either different mechanisms underlie the two processes, or that under certain conditions, another mechanism is engaged that scales one type of synapse and not the other.

      However, strong claims require strong evidence, and the results presented here only address GABA HSP, relying on previous work from this lab on Glu HSP (Fong, et al., 2015). But the previous experiments were done in rat cultures, while these experiments are done in mice and at somewhat different ages (DIV). Even identical culture systems can drift over time (possibly due to changes in the components of B27 or other media and supplements). Therefore it is necessary to demonstrate in the same system the dissociation. To be convincing, they need to show the mEPSCs for Fig 4, clearly showing the dissociation. Doing the same for Fig 5 would be great, but I think Fig 4 is the key.

      We understand the concern of the reviewer as we do see significant variability within our cultures and they were plated in different places, by different people, in different species (rat vs mouse). Therefore, we have attempted to redo the study on AMPAergic scaling on these mouse cortical neurons. Surprisingly, we found that 20µM CNQX did not trigger AMPAergic upscaling (new Figure 7), even though it did reduce spiking activity and was able to produce GABAergic downscaling. We did not carry out the optogenetic restoration of activity, because we did not trigger upscaling. The result does, however, show that the reductions in spiking/bursting that trigger GABAergic downscaling did not trigger AMPAergic upscaling and therefore dissociate the 2 forms of scaling in these mouse cultures. We do not know why 20 µM CNQX did not trigger scaling in these cultures since it does reduce spiking and AMPAR activation. In the Fong study we used 40µM CNQX because intracellular recordings from rat cortical neurons suggested this was required to completely block AMPAergic currents. Our initial studies in the current manuscript examining GABAergic scaling in mouse cortical cultures used 40µM CNQX; however, this concentration of CNQX prevented us from restoring spiking through optogenetic activation, so we reduced our concentration to 20µM CNQX, which did trigger GABAergic downscaling and allowed the restoration of spiking. We now show and discuss this result (Figure 7 and 3rd paragraph discussion).

      The paper also suggests that only receptor function or spiking could control HSP, and therefore if it is not receptor function then it must be spiking. This seems like a false dichotomy; there are of course other options. Details in the data may suggest that spiking is not the (or the only) homeostat, as TTX and CNQX causes identical changes in mIPSC amplitude but have different effects on spiking. Further, in Fig 5, CTZ had a minimal effect on spiking but a large effect on mIPSCs. Similar issues appear in Fig 6, where the induction of increased spiking is highly variable, with many cells showing control levels or lower spiking rates. Yet the synaptic changes are robust, across all cells. Overall, this is not persuasive that spiking is necessarily the homeostat for GABA synapses.

      Together our results argue against AMPAR or GABAR activation as a trigger for GABAergic scaling and that this is different than our results for AMPAergic scaling. These points alone are important to recognize. While changes in spiking do not perfectly follow the changes in GABAergic scaling they do always trend in the right direction. As mentioned above, total spiking activity is only one measure of spiking. It is possible that these drugs alter the pattern of spiking, which translates into altered calcium transients that may be important for triggering the plasticity. Further, we acknowledge that we cannot rule out a role for NMDARs contributing to GABAergic scaling (3rd and 5th paragraph of discussion). Based on the variability that we observe and the nature of our MEA recordings we cannot precisely determine how the total activity or pattern of activity changes with drug application in the specific cells that we target for whole cell recordings, and this is now discussed (final paragraph of discussion). Again, it is important to note that we are going a step beyond most homeostatic plasticity studies that add a drug and simply assume it is having an effect on spiking (e.g. CNQX was initially thought to completely abolish spiking, but clearly does not). However, we believe that the most parsimonious explanation of our results supports our proposal that GABAergic scaling is a strong candidate as a spike rate homeostat. Regardless, in the resubmission we have included a broader discussion about these possibilities, and recognize that we cannot rule out the possibility that AMPAergic transmission could contribute to upward GABAergic scaling (2nd paragraph discussion).

      The paper also suggests that the timing of the GABA changes coincides with the spiking changes, but while they have the time course of the spiking changes and recovery, they only have the 24h time point for synaptic changes. It is impossible to conclude how the time courses align without more data.

      We can only say that by the 24 hour CNQX time point, when overall spiking is recovered in some but not all cultures and bursts have not recovered, that GABAergic scaling has already occurred. We now state this more clearly in the resubmission (near the end of the 2nd paragraph of the discussion).

      Reviewer #3 (Recommendations For The Authors):

      The statistics are inadequately described. The full information including actual p values should be given, particularly for the non-significant trends reported.

      We have done this in Figure legends.

      The abstract and introduction give the impression that GABA and Glu HSP are independent, though most work links them as occurring simultaneously and in a coordinated fashion to achieve homeostasis.

      While it is true that many studies have triggered both forms of scaling with activity or transmission blockade, these studies have not addressed whether these forms of scaling are actually triggered in the same way mechanistically, except potentially for the one study that we mentioned (Joseph et al.). Our results suggest they are independent. We now do mention the idea that these two forms of scaling have been assumed to be commonly triggered (3rd paragraph introduction).

      The data in Fig 6 is presented as if BIC treatment is a novel result, although BIC/Gabazine/PTX have been used to induce down-scaling in many previous papers. While it's good to have the results, they should be put in proper context. As suggested in the paper, testing if decreased GABAR function would lead to upscaling does not make sense given all the previous data. 

      Figure 6 shows GABAergic upscaling in response to GABAR block (bicuculline), but we are aware of only two other studies that looked at GABAergic scaling after treating with a GABAR blocker and they found upscaling but this was in hippocampal cultures, not cortical cultures (Peng et al., 2010 - PMID: 21123568, Pribiag et al., 2014 - PMID: 24753587). We now mention this in the results section describing Figure 6. While many studies have blocked GABARs and find AMPAergic downscaling, we are addressing the triggers for GABAergic scaling in Figure 6.

      Is Fig S4B mislabeled? The title says spike rate but the graph axis says burst frequency.

      The reviewer is correct and we have now adjusted this.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2 (Public Review):

      Weaknesses:

      There are, however, substantial concerns about the interpretation of the findings and limitations to the current analysis. In particular, analysis of single unit activity is absent, making interpretation of population clusters and decoding less interpretable. These concerns should be addressed to make sure that the results can be interpreted clearly in an active field that already contains a number of confusing and possibly contradictory findings.

      We addressed this important point (which was also made by reviewer #1) in our previous revision. Specifically, we included additional analyses that operate at the level of single units rather than the population level, as requested by the reviewer. For example, we assessed, separately for each recorded neuron, whether there was a statistically significant difference in the magnitude of neural activity between hit and miss trials. This approach allowed us to fully balance the numbers of hit and miss trials at each sound level that were entered into the analysis. The results revealed that a large proportion (close to 50%) of units were task modulated, i.e. had significantly different response magnitudes between hit and miss trials, and that this proportion was not significantly different between lesioned and non-lesioned mice. It is therefore no longer correct to say that “analysis of single unit activity is absent”, and we would be grateful if this statement could be changed.  
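      The per-unit comparison described above can be sketched as follows. This is a minimal illustration, not the authors' code: the response does not name the statistical test, so a two-sided permutation test on the difference in mean response magnitude is assumed here, and the function names (`balance_trials`, `permutation_p`) and the data layout (a dictionary mapping each sound level to a list of response magnitudes) are hypothetical.

```python
import random
from statistics import mean


def balance_trials(hit, miss, rng):
    """Subsample so each sound level contributes equal numbers of hit and miss trials.

    hit, miss: dict mapping sound level -> list of response magnitudes
    (hypothetical layout chosen for this sketch).
    """
    hits, misses = [], []
    for level in sorted(set(hit) & set(miss)):
        n = min(len(hit[level]), len(miss[level]))
        hits += rng.sample(hit[level], n)
        misses += rng.sample(miss[level], n)
    return hits, misses


def permutation_p(hits, misses, rng, n_perm=2000):
    """Two-sided permutation p-value for the hit/miss difference in mean response.

    Assumes both lists are non-empty. The choice of a permutation test is an
    assumption of this sketch, not a detail taken from the manuscript.
    """
    observed = abs(mean(hits) - mean(misses))
    pooled = hits + misses
    k = len(hits)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign trial labels
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

      A unit would then be counted as task modulated when its p-value falls below the chosen significance threshold, and the proportion of such units could be compared between lesioned and non-lesioned mice.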

      Reviewer #2 (Recommendations For The Authors): 

      The authors have done a good job addressing the main concerns from the previous review. There are a few additional points that hopefully do not require substantial additional edits. 

      Figure 5/supplements. While the authors provide compelling evidence that clusters and overall activity patterns are similar for lesioned and control animals, there do appear to be some differences. For instance, the hit/miss difference for cluster 3 (the "auditory" cluster) appears to be absent for lesioned mice (Fig 5S3 D). Can the hit-miss difference be quantified? 

      We agree that there are some differences between the activity profiles of lesioned and non-lesioned mice: Inspection of panels A and C of Figure 5 – figure supplement 3, for instance, indicates that there is a relatively high proportion of neurons in cluster 3 of the non-lesioned mice that exhibit prolonged elevated activity in hit trials and a relatively lower proportion of those neurons in cluster 3 of lesioned mice. This likely explains the difference in the average response profiles of cluster 3 between the two groups pointed out by the reviewer. Furthermore, there is a slightly larger pre-stimulus dip in hit trial activity for lesioned than non-lesioned mice in cluster 1, a more pronounced short latency peak in hit trial activity for lesioned mice in cluster 2 as well as differences in other clusters. However, these differences are not inconsistent with our interpretation of these data in that we describe the activity profiles as being “similar” and exhibiting a “close correspondence” (rather than as being identical). Having considered this carefully, we do not believe that attempting to quantify these small differences would add much value here or help the reader with the interpretation of these data, especially given that the activity profiles of all neurons that make up each cluster are plotted in panels A and C.  

      Could the mice have been using somatosensory information to perform the task? A wideband click presented from a free-field speaker could have energy in a low frequency range that triggers a whisker response. Given the moderate but not insignificant somatosensory input into the IC shell, this doesn't seem like a trivial concern, and it could substantially impact interpretation of the results. Without wanting to complicate things too much, the authors might consider one or more of these questions: What's the frequency content of the click? Can a deaf mouse perform the task? Can an AC-lesioned mouse learn/perform the task with close-field acoustic stimulation? Or for a high-frequency tone target rather than a click?

      This is an interesting suggestion. We have, in the context of another study, trained mice in our lab to detect somatosensory stimulation (a brush stroke to their whiskers) and consistently found that it takes them much longer (often two weeks or more) to learn to respond to a stimulation of their whiskers than to the presentation of a sound. The brush strokes applied to the whiskers in those experiments were 50-150 ms in duration and were thus orders of magnitude greater in both their duration and amplitude and considerably more salient than any somatosensory stimulus that could potentially arise from the clicks presented here. Therefore, we consider it highly unlikely that mice learned to use somatosensory information potentially picked up by their whiskers to perform the click detection task.  

      L. 63. The authors might want to cite some recent work from the Apostolides lab on the properties of AC-IC projections as well as non-auditory signals in the IC. 

      There are two recent papers from the Apostolides lab that are relevant to our study. We already cite Quass et al., 2023. We have now added Ford et al., 2024 as well.

      Changes to manuscript:

      Line 81: “This raises the possibility that these context-dependent effects may be inherited from the auditory cortex (Ford et al., 2024)”.

      L. 220. "sound-responsive neurons" Is it possible to report the representation of sound-responsive neurons in the different clusters? This might help tease apart what processes contribute to their respective activity. Not a big problem if the samples can't be registered easily.

      Sound-driven neurons were identified on the basis of a subset (those trials in which sounds were presented at levels from 53 dB SPL to 65 dB SPL) of the trials used for the clustering analysis so the analyses are not directly comparable.

      p. 603. "quieter stimuli" What sound level was actually used in the 2p experiments? Was it fixed at a single level per animal?

      Sound level was not fixed at a single level. A total of nine different sound levels were used per mouse. We apologize that this was not made clear previously.  

      Changes to manuscript:

      Line 603: “Once the mice had achieved a stable level of performance (typically two days with d’ > 1.5), quieter stimuli (41-71 dB SPL) were introduced. For each mouse a total of 9 different sound levels were used and the range of sound levels was adjusted to each animal’s behavioral performance to avoid floor and ceiling effects and could, therefore, differ from mouse to mouse.”
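      For readers unfamiliar with the performance criterion quoted above, the sensitivity index d' is conventionally computed as the difference between the z-transformed hit rate and false-alarm rate. The sketch below assumes that convention; the log-linear correction for extreme rates and the helper names (`d_prime`, `is_stable`) are illustrative choices and are not taken from the manuscript.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = Z(hit rate) - Z(false-alarm rate).

    A log-linear correction (an assumption of this sketch) keeps both rates
    strictly between 0 and 1 so the inverse normal CDF stays finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def is_stable(daily_d_primes, threshold=1.5, n_days=2):
    """True if the last n_days sessions all exceed the d' threshold."""
    recent = daily_d_primes[-n_days:]
    return len(recent) == n_days and all(d > threshold for d in recent)
```

      Under these assumptions, a mouse with 90 hits, 10 misses, 10 false alarms and 90 correct rejections in a session would exceed the d' > 1.5 criterion, whereas equal hit and false-alarm rates give a d' of zero.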

      L. 747. Something is not right with this formula. It appears that it will always reduce to a value of 1/2.

      Thanks for spotting this. There are two typos in this formula. This has been fixed and now reads (line 749):  

    2. eLife assessment

      This study demonstrates that neurons receiving inputs from auditory cortex in the inferior colliculus widely encode the outcome of a sound detection task independent of the presence of auditory cortex. This valuable study based on imaging of transsynaptically labelled neurons provides convincing evidence that auditory cortex is necessary neither for sound detection, nor to channel information related to behavioral outcome to the subcortical auditory system. This study will be of wide interest for sensory neuroscientists.

    3. Reviewer #1 (Public Review):

      The inferior colliculus (IC) is the central auditory system's major hub. It integrates ascending brainstem signals to provide acoustic information to the auditory thalamus. The superficial layers of the IC ("shell" IC regions as defined in the current manuscript) also receive a massive descending projection from the auditory cortex. This auditory cortico-collicular pathway has long fascinated the hearing field, as it may provide a route to funnel "high-level" cortical signals and impart behavioral salience upon an otherwise behaviorally agnostic midbrain circuit.

      Accordingly, IC neurons can respond differently to the same sound depending on whether animals engage in a behavioral task (Ryan and Miller 1977; Ryan et al., 1984; Slee & David, 2015; Saderi et al., 2021; De Franceschi & Barkat, 2021). Many studies also report a rich variety of non-auditory responses in the IC, far beyond the simple acoustic responses one expects to find in a "low-level" region (Sakurai, 1990; Metzger et al., 2006; Porter et al., 2007). A tacit assumption is that the behaviorally relevant activity of IC neurons is inherited from the auditory cortico-collicular pathway. However, this assumption has never been tested, owing to two main limitations of past studies:

      (1) Prior studies could not confirm if data were obtained from IC neurons that receive monosynaptic input from the auditory cortex.

      (2) Many studies have tested how auditory cortical inactivation impacts IC neuron activity; the consequence of cortical silencing is sometimes quite modest. However, all prior inactivation studies were conducted in anesthetized or passively listening animals. These conditions may not fully engage the auditory cortico-collicular pathway. Moreover, the extent of cortical inactivation in prior studies was sometimes ambiguous, which complicates interpreting modest or negative results.

      Here, the authors' goal is to directly test if the auditory cortex is necessary for behaviorally relevant activity in IC neurons. They conclude that, surprisingly, task-relevant activity in cortico-recipient IC neurons persists in the absence of auditory cortico-collicular transmission. To this end, a major strength of the paper is that the authors combine a sound-detection behavior with clever approaches that unambiguously overcome the limitations of past studies.

      First the authors inject a transsynaptic virus into the auditory cortex, thereby expressing a genetically encoded calcium indicator in the auditory cortex's postsynaptic targets in the IC. This powerful approach enables 2-photon Ca2+ imaging from IC neurons that unambiguously receive monosynaptic input from auditory cortex. Thus, any effect of cortical silencing should be maximally observable in this neuronal population. Second, they abrogate auditory cortico-collicular transmission using lesions of auditory cortex. This "sledgehammer" approach is arguably the most direct test of whether cortico-recipient IC neurons will continue to encode task-relevant information in absence of descending feedback. Indeed, their method circumvents the known limitations of more modern optogenetic or chemogenetic silencing, e.g. variable efficacy.

      The authors have revised their manuscript and adequately addressed the major concerns. Although more in depth analyses of these rich datasets are definitely possible, the current results nevertheless stand on their own. Indeed, the work serves as a beacon to move away from the idea that cortico-collicular projections function primarily to impart behavioral relevance upon auditory midbrain neurons. This knowledge inspires a search for alternative explanations as to the role of auditory cortico-collicular synapses in behavior.

    4. Reviewer #2 (Public Review):

      Summary:

      This study takes a new approach to studying the role of corticofugal projections from auditory cortex to inferior colliculus. The authors performed two-photon imaging of cortico-recipient IC neurons during a click detection task in mice with and without lesions of auditory cortex. In both groups of animals, they observed similar task performance and relatively small differences in the encoding of task-response variables in the IC population. They conclude that non-cortical inputs to the IC can provide substantial task-related modulation, at least when AC is absent.

      Strengths:

      This study provides valuable new insight into big and challenging questions around top-down modulation of activity in the IC. The approach here is novel and appears to have been executed thoughtfully. Thus, it should be of interest to the community.

      Weaknesses:

      Analysis of single unit activity is limited in its scope.

    5. Reviewer #3 (Public Review):

      Summary:

      This study aims to demonstrate that cortical feedback is not necessary to signal behavioral outcome to shell neurons of the inferior colliculus during a sound detection task. The demonstration is achieved in a very clear manner by the observation of the activity of cortico-recipient neurons in animals which have received lesions of the auditory cortex. The experiment shows that neither behavior performance nor neuronal responses are significantly impacted by cortical lesions except for the case of partial lesions which seem to have a disruptive effect on behavioral outcome signaling.

      Strengths:

      The demonstration of the main conclusions is based on state-of-the-art, carefully controlled methods and is highly convincing. There is an in depth discussion of the different effects of auditory cortical lesions on sound detection behavior.

      Weaknesses:

      The description of feedback signals could be more detailed although it is difficult to achieve good temporal resolution with the calcium imaging technique necessary for targeting cortico-recipient neurons.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Through an unbiased genomewide KO screen, the authors identified loss of DBT to suppress MG132-mediated death of cultured RPE cells. Further analyses suggested that DBT reduces ubiquitinated proteins by promoting autophagy. Mechanistic studies indicated that DBT loss promotes autophagy via AMPK and its downstream ULK and mTOR signaling. Furthermore, loss of DBT suppresses polyglutamine- or TDP-43-mediated cytotoxicity and/or neurodegeneration in fly models. Finally, the authors showed that DBT proteins are increased in ALS patient tissues, compared to non-neurological controls. 

      Strengths: 

      The idea is novel, the evidence is convincing, and the data are clean. The findings have implications for human diseases. 

      Weaknesses: 

      None. 

      Reply: We thank the reviewer for the supportive comments.

      Reviewer #2 (Public Review): 

      Summary: 

      Hwang, Ran-Der et al utilized a CRISPR-Cas9 knockout in human retinal pigment epithelium (RPE1) cells to evaluate for suppressors of toxicity by the proteasome inhibitor MG132 and identified that knockout of dihydrolipoamide branched chain transacylase E2 (DBT) suppressed cell death. They show that DBT knockout in RPE1 cells does not alter proteasome or autophagy function at baseline. However, with MG132 treatment, they show a reduction in ubiquitinated proteins but with no change in proteasome function. Instead, they show that DBT knockout cells treated with MG132 have improved autophagy flux compared to wildtype cells treated with MG132. They show that MG132 treatment decreases ATP/ADP ratios to a greater extent in DBT knockout cells, and in accordance causes activation of AMPK. They then show downstream altered autophagy signaling in DBT knockout cells treated with MG132 compared to wild-type cells treated with MG132. Then they express the ALS mutant TDP43 M337V or expanded polyglutamine repeats to model Huntington's disease and show that knockdown of DBT improves cell survival in RPE1 cells with improved autophagic flux. They also utilize Drosophila models and show that utilizing either an RNAi or CRISPR-Cas9 knockout of DBT improves eye pigment in TDP43M337V and polyglutamine repeat-expressing transgenic flies. Finally, they show evidence for increased DBT in postmortem spinal cord tissue from patients with ALS via both immunoblotting and immunofluorescence. 

      Strengths: 

      This is a mechanistic and well-designed paper that identifies DBT as a novel regulator of proteotoxicity via activating autophagy in the setting of proteasome inhibition. Major strengths include careful delineation of a mechanistic pathway to define how DBT is protective. These conclusions are well-justified. 

      Weaknesses: 

      None 

      Reply: We thank the reviewer for the supportive comments.

      Recommendations for the authors: 

      Reviewer #1 (Recommendations For The Authors): 

      The authors have addressed my concerns. I have two more suggestions: 

      (1) Since the authors found that MG132 inhibits autophagy, which is inconsistent with previous findings that it promotes autophagy (e.g., PMID: 26648402, 30647455, 28674081), they should discuss this discrepancy in the Discussion.

      Reply: We thank the reviewer for raising this point. We agree with the reviewer that it has been well known in the literature that MG132 can lead to activation of autophagy. Indeed, we have observed in this study that MG132 itself can lead to time-dependent increases in LC3II levels in the first 8 hours of the MG132 treatment (Fig. S5B). These observations reflect the adaptive response of the cell to activate autophagy following proteasomal inhibition. However, as the MG132-mediated proteasomal inhibition persists, it is expected that the accumulation of misfolded protein substrates may overwhelm protein degradation systems, including the autophagy-lysosome pathway. Indeed, we have observed a reduction of the autophagic flux after 48 hours of the MG132 treatment (Fig. 3). Importantly, the DBT KO cells were able to maintain significantly higher levels of autophagic activities than the WT cells at this time point, consistent with their resistance to MG132-induced cell death. As suggested, we have added more discussion on the dynamic changes in the autophagic activities following proteasomal inhibition.

      (2) A grammar issue: consider removing some of the article "the," e.g.: 

      page 6: "the increase in cleaved PARP1 "-->"an increase in cleaved PARP1";  "the loss of DBT "-->"loss of DBT" 

      page 7: "the loss of DBT "-->"loss of DBT"; "The ubiquitin modification"-->"Ubiquitin modification" 

      Reply: We thank the reviewer for the supportive comments. We have corrected these grammar issues in the article.

      Reviewer #2 (Recommendations For The Authors): 

      The authors have addressed my concerns. 

      Reply: We thank the reviewer for the supportive comments.

    2. eLife assessment

      This important study discovered DBT as a novel gene implicated in the resistance to MG132-mediated cytotoxicity and potentially also in the pathogenesis of ALS and FTD, two fatal neurodegenerative diseases. The authors provided convincing evidence to support a mechanism by which loss of DBT suppresses MG132-mediated toxicity via promoting autophagy. This work will be of interest to cell biologists and biochemists, especially in the FTD/ALS field.

    3. Reviewer #1 (Public Review):

      Summary:

      Through an unbiased genome-wide KO screen, the authors found that loss of DBT suppresses MG132-mediated death of cultured RPE cells. Further analyses suggested that loss of DBT reduces ubiquitinated proteins by promoting autophagy. Mechanistic studies indicated that DBT loss promotes autophagy via AMPK and its downstream ULK and mTOR signaling. Furthermore, loss of DBT suppresses polyglutamine- or TDP-43-mediated cytotoxicity and/or neurodegeneration in fly models. Finally, the authors showed that DBT proteins are increased in ALS patient tissues, compared to non-neurological controls.

      Strengths:

      The idea is novel, the evidence is convincing, and the data are clean. The findings have implications for human diseases.

      Weaknesses:

      None.

    4. Reviewer #2 (Public Review):

      Summary:

      Hwang, Ran-Der et al utilized a CRISPR-Cas9 knockout screen in human retinal pigment epithelium (RPE1) cells to identify suppressors of toxicity by the proteasome inhibitor MG132 and found that knockout of dihydrolipoamide branched chain transacylase E2 (DBT) suppressed cell death. They show that DBT knockout in RPE1 cells does not alter proteasome or autophagy function at baseline. However, with MG132 treatment, they show a reduction in ubiquitinated proteins but with no change in proteasome function. Instead, they show that DBT knockout cells treated with MG132 have improved autophagy flux compared to wild-type cells treated with MG132. They show that MG132 treatment decreases ATP/ADP ratios to a greater extent in DBT knockout cells, and in accordance causes activation of AMPK. They then show downstream altered autophagy signaling in DBT knockout cells treated with MG132 compared to wild-type cells treated with MG132. Then they express the ALS mutant TDP43 M337V or expanded polyglutamine repeats to model Huntington's disease and show that knockdown of DBT improves cell survival in RPE1 cells with improved autophagic flux. They also utilize Drosophila models and show that either an RNAi knockdown or a CRISPR-Cas9 knockout of DBT improves eye pigment in TDP43 M337V and polyglutamine repeat-expressing transgenic flies. Finally, they show evidence for increased DBT in postmortem spinal cord tissue from patients with ALS via both immunoblotting and immunofluorescence.

      Strengths:

      This is a mechanistic and well-designed paper that identifies DBT as a novel regulator of proteotoxicity via activating autophagy in the setting of proteasome inhibition. Major strengths include careful delineation of a mechanistic pathway to define how DBT is protective. These conclusions are well-justified.

      Weaknesses:

      None

    1. eLife assessment

      Building on previous toolboxes to distinguish 1/f noise from oscillatory activity, this study introduces an important advancement in neural signal analysis to identify oscillatory activity in electrophysiological data that refines the accuracy of identifying non-sinusoidal neural oscillations. Extensive validation, using synthetic and various empirical data, provides convincing evidence for the accuracy of the method and outlines practical implications for relevant scientific problems in the field.

    2. Reviewer #1 (Public Review):

      Summary:

      The study introduces and validates the Cyclic Homogeneous Oscillation (CHO) detection method to precisely determine the duration, location, and fundamental frequency of non-sinusoidal neural oscillations. Traditional spectral analysis methods face challenges in distinguishing the fundamental frequency of non-sinusoidal oscillations from their harmonics, leading to potential inaccuracies. The authors implement an underexplored approach, using the auto-correlation structure to identify the characteristic frequency of an oscillation. By combining this strategy with existing time-frequency tools to identify when oscillations occur, the authors strive to solve outstanding challenges involving spurious harmonic peaks detected in time-frequency representations. Empirical tests using electrocorticographic (ECoG) and electroencephalographic (EEG) signals further support the efficacy of CHO in detecting neural oscillations.
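
      The auto-correlation idea summarized above can be illustrated with a minimal sketch. This is not the authors' CHO implementation: the function names, the synthetic 10 Hz signal with 20 Hz and 30 Hz harmonics, and the 4-40 Hz search range are all assumptions chosen for illustration. For a periodic but non-sinusoidal signal, harmonics add spectral peaks at integer multiples of the fundamental, yet the auto-correlation function still has its largest peak at the lag of one fundamental period.

```python
import math


def autocorrelation(x, max_lag):
    """Biased sample auto-correlation for lags 0..max_lag, scaled so r[0] == 1."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    var = sum(v * v for v in centred)
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


def fundamental_from_autocorr(x, fs, f_min, f_max):
    """Return the frequency (Hz) whose period has the largest auto-correlation
    among candidate periods between 1/f_max and 1/f_min seconds."""
    lag_min = int(fs / f_max)  # shortest candidate period, in samples
    lag_max = int(fs / f_min)  # longest candidate period, in samples
    r = autocorrelation(x, lag_max)
    best_lag = max(range(lag_min, lag_max + 1), key=lambda k: r[k])
    return fs / best_lag


# Illustrative non-sinusoidal oscillation: 10 Hz fundamental plus two harmonics.
fs = 1000  # sampling rate in Hz (assumed)
t = [i / fs for i in range(2 * fs)]  # two seconds of samples
signal = [
    math.sin(2 * math.pi * 10 * ti)
    + 0.5 * math.sin(2 * math.pi * 20 * ti)
    + 0.33 * math.sin(2 * math.pi * 30 * ti)
    for ti in t
]

f0 = fundamental_from_autocorr(signal, fs, f_min=4, f_max=40)
print(f"Estimated fundamental frequency: {f0:.1f} Hz")
```

      A spectrum of the same signal would show separate peaks at 10, 20 and 30 Hz; the auto-correlation search attributes all three to a single oscillation with a 100 ms period. The biased estimator is used deliberately, because its slight down-weighting of longer lags favours one period over integer multiples of it.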

      Strengths:

      The paper puts important emphasis on the 'identity' question of oscillatory identification. The field primarily identifies oscillations through frequency, space (brain region), and time (length, and relative to task or rest). However, more tools that claim to further characterize oscillations by their defining/identifying traits are needed, in addition to data-driven studies about what the identifiable traits of neural oscillations are beyond frequency, location, and time. Such tools are useful for potentially distinguishing between circuit mechanistic generators underlying signals that may not otherwise be distinguished. This paper states this problem well and puts forth a new type of objective for neural signal processing methods.

      The paper uses synthetic data and multimodal recordings at multiple scales to validate the tool, suggesting CHO's robustness and applicability in various real-data scenarios. The figures demonstrate how CHO works on such synthetic and real examples, depicting it in both time and frequency domains. The synthetic data are well-designed and capable of producing transient oscillatory bursts with non-sinusoidal characteristics within 1/f noise. Using both non-invasive and invasive signals exposes CHO to conditions which may differ in the extent and quality of harmonic signal structure. An interesting follow-up question is whether the utility demonstrated here holds for MEG signals, as well as source-reconstructed signals from non-invasive recordings.

      This study is accompanied by open-source code and data for use by the community.

      Weaknesses:

      The criteria that the authors use for neural oscillations embody some operating assumptions underlying their characteristics, perhaps informed by immediate use cases intended by the authors (e.g., hippocampal bursts). The extent to which these assumptions hold in all circumstances should be investigated. For instance, the notion of consistent auto-correlation breaks down in scenarios where instantaneous frequency fluctuates significantly at the scale of a few cycles. Imagine an alpha-beta complex without harmonics (Jones 2009). If oscillations change phase position within a timeframe of a few cycles, it would be difficult for a single peak in the auto-correlation structure to elucidate the complex time-varying peak frequency in a dynamic fashion. Likewise, it is unclear whether bounding boxes with a pre-specified overlap can capture complexes that manoeuvre across peak frequencies.

      This method appears to lack the implementation of statistical inferential techniques for estimating and interpreting auto-correlation and spectral structure. In standard practice, auto-correlation functions and spectral measures can be subjected to statistical inference to establish confidence intervals, often helping to determine the significance of the estimates. Doing so would be useful for expressing the likelihood that an oscillation and its harmonic have the same auto-correlation structure and fundamental frequency, or more robustly identifying harmonic peaks in the presence of spectral noise. Here, the authors appear to use auto-correlation and time-frequency decomposition more as a deterministic tool rather than an inferential one. Overall, an inferential approach would help differentiate between true effects and those that might spuriously occur due to the nature of the data. Ultimately, a more statistically principled approach might estimate harmonic structure in the presence of noise in a unified manner transmitted throughout the methodological steps.
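
      One simple form such an inferential step could take is sketched below under a white-noise null hypothesis. It is not part of CHO, and the function names, seed, and signal parameters are hypothetical. Under that null, sample auto-correlations at non-zero lags are approximately normal with standard deviation 1/sqrt(N), so lags whose coefficient exceeds 1.96/sqrt(N) in magnitude can be flagged at roughly the 5% level.

```python
import math
import random


def sample_acf(x, max_lag):
    """Biased sample auto-correlation for lags 0..max_lag, scaled so r[0] == 1."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    var = sum(v * v for v in centred)
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


def white_noise_bound(n, z=1.96):
    """Approximate two-sided 95% bound on the sample ACF of white noise of length n."""
    return z / math.sqrt(n)


def significant_lags(x, max_lag):
    """Non-zero lags whose sample ACF magnitude exceeds the white-noise bound."""
    r = sample_acf(x, max_lag)
    bound = white_noise_bound(len(x))
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
fs = 1000  # sampling rate in Hz (assumed)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 * fs)]
# A 10 Hz sinusoid buried in the same unit-variance noise.
noisy_osc = [math.sin(2 * math.pi * 10 * i / fs) + noise[i] for i in range(2 * fs)]

print("bound:", white_noise_bound(len(noise)))
print("flagged lags, noise only:", len(significant_lags(noise, 50)))
print("one-period lag flagged:", 100 in significant_lags(noisy_osc, 150))
```

      Electrophysiological recordings have a 1/f background rather than a flat spectrum, so a white-noise null is too lenient for real data; a surrogate-based or Bartlett-type bound would be a more appropriate choice there.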

    3. Reviewer #2 (Public Review):

      Summary:

      A new toolbox is presented that builds on previous toolboxes to distinguish between real and spurious oscillatory activity, which can be induced by non-sinusoidal waveshapes. Whilst there are many toolboxes that help to distinguish between 1/f noise and oscillations, not many tools are available that help to distinguish true oscillatory activity from spurious oscillatory activity induced in harmonics of the fundamental frequency by non-sinusoidal waveshapes. The authors present a new algorithm which is based on autocorrelation to separate real from spurious oscillatory activity. The algorithm is extensively validated using synthetic (simulated) data, and various empirical datasets from EEG, and intracranial EEG in various locations and domains (i.e. auditory cortex, hippocampus, etc.).

      Strengths:

      Distinguishing real from spurious oscillatory activity due to non-sinusoidal waveshapes is an issue that has plagued the field for quite a long time. The presented toolbox addresses this fundamental problem which will be of great use for the community. The paper is written in a very accessible and clear way so that readers less familiar with the intricacies of Fourier transform and signal processing will also be able to follow it. A particular strength is the broad validation of the toolbox, using synthetic, scalp EEG, ECoG, and stereotactic EEG in various locations and paradigms.

      Weaknesses:

      A weakness is that the algorithm seems to be quite conservative in identifying oscillatory activity which may render it only useful for analyzing very strong oscillatory signals (i.e. alpha), but less suitable for weaker oscillatory signals (i.e. gamma).

    1. Reviewer #2 (Public Review):

      Summary:

      The authors set out to non-invasively track neuronal development in rat neonates, which they achieved with notable success. However, the direct relationship between the results and broader conclusions regarding developmental biology and potential human implications is somewhat overstretched without further validation.

      Strengths:

      If adequately revised and validated, this work could have a significant impact on the field, providing a non-invasive tool for longitudinal studies of brain development and neurodevelopmental disorders in preclinical settings.

      Weaknesses:

      (1) Consistency and Logical Flow:

      - The manuscript suffers from a lack of strategic flow in some sections. Specifically, transitions between major findings and methodological discussions need refinement to ensure a logical progression of ideas. For example, the jump from the introduction of developmental trajectories and the technicalities of MRS (Magnetic Resonance Spectroscopy) processing on page 3 could benefit from a bridging paragraph that explicitly states the study's hypotheses based on existing literature gaps.

      (2) Scientific Rigour:

      - While the novel application of diffusion-weighted MRS is commendable, there's a notable gap in the rigorous validation of this approach against gold-standard histological or molecular techniques. Particularly, the assertions regarding the sphere fraction and morphological changes inferred from biophysical modelling mandate direct validation to solidify the claims made. A study comparing these in vivo findings with ex vivo confirmation in at least a subset of samples would significantly enhance the reliability of these conclusions.

      (3) Clarity and Novelty:

      - The manuscript often delves deeply into technical specifics at the expense of accessibility to readers not deeply familiar with MRS technology. The introduction and discussions would benefit from a clearer elucidation of why these specific metabolite markers were chosen and their known relevance to neuronal and glial cells, placing this in the context of what is novel compared to existing literature.
      - The novelty aspect could be reinforced by a more structured discussion on how this method could change the current understanding or practices within neurodevelopmental research, compared to the current state of the art.

      (4) Completeness:

      - The Discussion section requires expansion to offer a more comprehensive interpretation of how these findings impact the broader field of neurodevelopment and psychiatric disorders. Specifically, the implications for human studies or clinical translation are touched upon but not fully explored.
      - Further, while supplementary material provides necessary detail on methodology, key findings from these analyses should be summarized and discussed in the main text to ensure the manuscript stands complete on its own.

      (5) Grammar, Style, Orthography:

      - There are sporadic grammatical and typographical errors throughout the text which, while minor, detract from the overall readability. For example, inconsistencies in metabolite abbreviations (e.g., tCr vs Cr+PCr) should be standardized.

      (6) References and Additional Context:

      - The current reference list is extensive but lacks integration into the narrative. Direct comparisons with existing studies, especially those with conflicting or supportive findings, are scant. More dedicated effort to contextualize this work within the existing body of knowledge would be beneficial.

    2. eLife assessment

      This study presents valuable findings regarding the microstructural basis of brain development in the cerebellum and thalamus of rat neonates using diffusion-weighted MRS. The authors present solid evidence of differential development trajectories in the thalamus and the cerebellum through analytical and morphometric biophysical modelling of the diffusion-weighted MRS data, though some aspects such as the validation of the findings against gold-standard techniques and a detailed discussion of methodological choices require further elaboration. The work will be of interest to developmental biologists and neuroscientists seeking noninvasive approaches to probe in vivo neuronal and glial development in the brain.

    3. Reviewer #1 (Public Review):

      In this work, Ligneul and coauthors implemented diffusion-weighted MRS in young rats to follow longitudinally and in vivo the microstructural changes occurring during brain development. Diffusion-weighted MRS is here instrumental in assessing microstructure in a cell-specific manner, as opposed to the claimed gold-standard (manganese-enhanced MRI) that can only probe changes in brain volume. Differential microstructure and complexification of the cerebellum and the thalamus during rat brain development were observed non-invasively. In particular, lower metabolite ADC with increasing age was measured in both brain regions, reflecting increasing cellular restriction with brain maturation. Higher sphere (representing cell bodies) fraction for neuronal metabolites (total NAA, glutamate) and total creatine and taurine in the cerebellum compared to the thalamus was estimated, reflecting the unique structure of the cerebellar granular layer with a high density of cell bodies. Decreasing sphere fraction with age was observed in the cerebellum, reflecting the development of the dendritic tree of Purkinje cells and Bergmann glia. From morphometric analyses, the authors could probe non-monotonic branching evolution in the cerebellum, matching 3D representations of Purkinje cell expansion and complexification with age. Finally, the authors highlighted taurine as a potential new marker of cerebellar development.

      From a technical standpoint, this work clearly demonstrates the potential of diffusion-weighted MRS for probing microstructure changes of the developing brain non-invasively, paving the way for its application in pathological cases. Ligneul and coauthors also show that diffusion-weighted MRS acquisitions in neonates are feasible, despite the known technical challenges of such measurements, even in adult rats. They also provide all necessary resources to reproduce and build upon their work, which is highly valuable for the community.

      From a biological standpoint, claims are well supported by the microstructure parameters derived from advanced biophysical modelling of the diffusion MRS data. The assumption of metabolite compartmentation, forming the basis of cell-specific microstructure interpretation of dMRS data, remains debated and should be considered with care (Rae, Neurochem Res, 2014, https://doi.org/10.1007/s11064-013-1199-5). External cross-validation of some of the authors' claims, in particular taurine in the thalamus switching from neurons to astrocytes during brain development, would be a highly valuable addition to this study.

      Specific strengths:

      (1) The interpretation of dMRS data in terms of cell-specific microstructure through advanced biophysical modelling (e.g. the sphere fraction, modelling the fraction of cell bodies versus neuronal or astrocytic processes) is a strong asset of the study, going beyond the more commonly used signal representation metrics such as the apparent diffusion coefficient, which lacks specificity to biological phenomena.
      (2) The fairly good data quality despite the complexity of the experimental framework should be praised: diffusion-weighted MRS was acquired in two brain regions (although not in the same animals) and longitudinally, in neonates, including data at high b-values and multiple diffusion times, which altogether constitutes a large-scale dataset of high value for the diffusion-weighted MRS community.
      (3) The authors have publicly shared the data and code used for processing and fitting, which will allow one to reproduce or extend the scope of this work to disease populations, and which is in line with the current effort of the MR(S) community for data sharing.

      Specific weaknesses:

      (1) This work lacks an introduction and a discussion about diffusion MRI, which is already a validated technique to assess brain development non-invasively. Although water lacks cell-specificity compared to metabolites, several studies have reported a decrease in water ADC and increased fractional anisotropy with brain maturation, associated with the myelination process and decreased water content (overview in Hüppi, Chapt. 30 of "Diffusion MRI: Theory, Methods, and Applications", Oxford University Press, 2010). Interestingly, the same observations are found in this work (decreased ADC with age for most metabolites in both brain regions), which should have been commented on. Moreover, the authors could have reported water diffusion properties in addition to metabolites', as I believe the water signal, used for coil combination and/or eddy-current correction, is usually naturally acquired during diffusion-weighted MRS scans.
      (2) It is unclear why the authors have normalized metabolite concentrations (measured from low b-values diffusion-weighted MRS spectra) to the macromolecule concentrations. First, it is not specified whether in vivo macromolecules were acquired at each age or just at one time point. Second, such ratios are not standard practice in the MRS community so this choice should have been explained. Third, the macromolecule content was reported to change with age (Tkac et al., Magn Reson Med, 2003), therefore a change in metabolite to macromolecule ratio with age cannot be interpreted unequivocally.
      (3) Some discussion is missing about the choice of the analytical biophysical model (although a few are compared in Supplementary Materials), in particular: is a model of macroscopic anisotropy relevant in cerebellum, made of a large fraction of oriented white matter tracts, and does the model remain valid at different ages given white matter maturation and the ongoing myelination process?

    1. eLife assessment

      This valuable study marks a significant advancement in brain aging research by centering on Asian populations (Chinese, Malay, and Indian Singaporeans), a group frequently underrepresented in such studies. It unveils solid evidence for anatomical differences in brain aging predictors between the young and old age groups. Overall, this study broadens our understanding of brain aging across diverse ethnicities.

    2. Joint Public Review:

      Summary:

      The authors of the study investigated the generalization capabilities of a deep learning brain age model across different age groups within the Singaporean population, encompassing both elderly individuals aged 55 to 88 years and children aged 4 to 11 years. The model, originally trained on a dataset primarily consisting of Caucasian adults, demonstrated a varying degree of adaptability across these age groups. For the elderly, the authors observed that the model could be applied with minimal modifications, whereas for children, significant fine-tuning was necessary to achieve accurate predictions. Through their analysis, the authors established a correlation between changes in the brain age gap and future executive function performance across both demographics. Additionally, they identified distinct neuroanatomical predictors for brain age in each group: lateral ventricles and frontal areas were key in elderly participants, while white matter and posterior brain regions played a crucial role in children. These findings underscore the authors' conclusion that brain age models hold the potential for generalization across diverse populations, further emphasizing the significance of brain age progression as an indicator of cognitive development and aging processes.

      Strengths:

      (1) The study tackles a crucial research gap by exploring the adaptability of a brain age model across Asian demographics (Chinese, Malay, and Indian Singaporeans), enriching our knowledge of brain aging beyond Western populations.
      (2) It uncovers distinct anatomical predictors of brain aging between elderly and younger individuals, highlighting a significant finding in the understanding of age-related changes and ethnic differences.

      Weaknesses:

      (1) Clarity in describing the fine-tuning process is essential for improved comprehension.
      (2) The analysis often limits its findings to p-values, omitting the effect sizes crucial for understanding the relationship with cognition.
      (3) Employing a predictive framework for cognition using brain age could offer more insight than mere statistical correlations.
      (4) Expanding the study's scope to evaluate the model's generalisability to unseen Caucasian samples is vital for establishing a comparative baseline.

      In summary, this paper underscores the critical need to include diverse ethnicities in model testing and estimation.

    1. eLife assessment

      The authors use a synthetic approach to introduce synaptic ribbon proteins into HEK cells and analyze the ability of the resulting assemblies to cluster calcium channels at the active zone. The use of this ground-up approach is valuable as it establishes a system to study molecular interactions at the active zone. The work relies on a solid combination of super-resolution microscopy and electrophysiology, but would benefit from: (i) additional ultrastructural analysis to establish ribbon formation (in the absence of which the claim of these being synthetic ribbons might not be supported); (ii) data quantification (to confirm colocalization of different proteins); (iii) stronger validation of impact on Ca2+ function; (iv) in-depth discussion of problems derived from the use of an over-expression approach.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors attempt to reconstitute some active zone properties by introducing synaptic ribbon proteins into HEK cells. This "ground-up" approach can be valuable for assessing the necessity of specific proteins in synaptic function. Here, the authors co-transfect a membrane-targeted bassoon, RBP2, calcium channel subunits and Ribeye to generate what they call "synthetic ribbons". The resultant structures show an ability to cluster calcium channels (Figure 4B) and a modest ability to concentrate calcium entry locations (Figure 7J). At the light level, the ribeye aggregates look spherical and localize to the membrane through their interaction with the membrane-targeted bassoon. It is a nice proof-of-principle in establishing a useful experimental system for studying calcium channel localization. However, the impact of the study is modest. No new biology is discovered and to call these structures "synthetic ribbons" is an overstatement in the absence of an ultrastructural analysis.

      Strengths:

      (1) The authors establish a new experimental system for the study of calcium channel localization to active zones.
      (2) The clustering of calcium channels to bassoon via RBP2 is a nice confirmation of a previously described interaction between bassoon and calcium channels in a cell-based system.
      (3) The "ground-up" approach is an attractive one and theoretically allows one to learn a lot about the essential interactions for building a ribbon structure.

      Weaknesses:

      (1) Are these truly "synthetic ribbons"? The ribbon synapse is traditionally defined by its morphology at the EM level. To what extent these structures recapitulate ribbons is not shown. It has been previously shown that Ribeye forms aggregates on its own. Do these structures look any more ribbon-like than ribeye aggregates in the absence of its binding partners?
      (2) No new biology is discovered here. The clustering of channels is accomplished by taking advantage of previously described interactions between RBP2, Ca channels and bassoon. The localization of Ribeye to bassoon takes advantage of a previously described interaction between the two. Even the membrane localization of the complexes required the introduction of a membrane-anchoring motif.
      (3) The only thing ribbon-specific about these "syn-ribbons" is the expression of ribeye and ribeye does not seem to participate in the localization of other proteins in these complexes. Bsn, Cav1.3 and RBP2 can be found in other neurons.
      (4) As the authors point out, RBP2 is not necessary for some Ca channel clustering in hair cells, yet seems to be essential for clustering to bassoon here.
      (5) The difference in Ca imaging between SyRibbons and other locations is extremely subtle.
      (6) The effect of the expression of palm-Bsn, RBP2 and the combination of the two on Ca-current is ambiguous. It appears that while the combination is larger than the control, it probably isn't significantly different from either of the other two alone (Fig 5). Moreover, expression of Ribeye + the other two showed no effect on Ca current (Figure 7). Also, why is the IV curve right shifted in Figure 7 vs Figure 5?
      (7) While some of the IHC is quantified, some of it is simply shown as single images: EV2, EV3 and Figure 4a in particular (4b looks convincing enough on its own, but could also benefit from a larger sample size and quantification).

    3. Reviewer #2 (Public Review):

      Summary:

      The authors show that co-expression of bassoon, RIBEYE, Cav1.3-alpha1, Cav-beta3, Cav-alpha2delta1, and RBP2 in a heterologous system (HEK293 cells) is sufficient to generate a protein complex resembling a presynaptic ribbon-type active zone both in morphology and in function (in clustering voltage-gated Ca channels and creating sites for localized Ca2+ entry). If the 3 separate Cav gene products are taken as a single protein (i.e. a Ca channel), the conclusion is that the core of a ribbon synapse comprises 4 proteins: bassoon holds the RIBEYE-containing ribbon to the plasma membrane, and RBP2 binds to bassoon and Ca channels, tethering the Ca channels to the presynaptic active zone.

      Strengths:

      Good use of a heterologous system with generally appropriate controls provides convincing evidence that a presynaptic ribbon-type active zone (without the ability to support exocytosis), with the ability to support localized Ca2+ entry (a key feature of ribbon-type pre-synapses) can be assembled from a few proteins.

      Weaknesses:

      (1) Relies on over-expression, which almost certainly diminishes the experimentally-measured parameters (e.g. pre-synapse clustering, localization of Ca2+ entry).
      (2) Are HEK cells the best model? HEK cells secrete substances and have a well-studied endocytic pathway, but they do not create neurosecretory vesicles. Why didn't the authors try to reconstitute a ribbon synapse in a cell that makes neurosecretory vesicles like a PC12 cell?
      (3) Related to 1 and 2: the Ca channel localization observed is significant but not so striking given the presence of Cav protein and measurements of Ca2+ influx distributed across the membrane. Presumably, this is the result of overexpression and an absence of pathways for pre-synaptic targeting of Ca channels. But, still, it was surprising that Ca channel localization was so diffuse. I suppose that the authors tried to reduce the effect of over-expression by using an inducible Cav1.3? Even so, the accessory subunits were constitutively over-expressed.

    4. Reviewer #3 (Public Review):

      Summary:

      Ribbon synapses are complex molecular assemblies responsible for synaptic vesicle trafficking in sensory cells of the eye and the inner ear. The Ca2+-dependent exocytosis occurs at the active zone (AZ), however, the molecular mechanisms orchestrating the structure and function of the AZs of ribbon synapses are not well understood. To advance the understanding of those mechanisms, the authors present a novel and interesting experimental strategy pursuing the reconstitution of a minimal active zone of a ribbon synapse within a synapse-naïve cell line: HEK293 cells. The authors have used stably transfected HEK293 cells that express voltage-gated Ca2+ channel subunits (constitutive CaV beta3 and CaV alpha2 delta1, and inducible CaV1.3 alpha1). They have expressed in those cells several proteins of the ribbon synapse active zone: (1) RIBEYE, (2) a modified version of Bassoon that binds to the plasma membrane through artificial palmitoylation (Palm-Bassoon) and (3) RIM-binding protein 2 (RBP2) to induce the formation of a minimal active zone that they called SyRibbons. The formation of such structures is convincing, however, the evidence of such structures having an impact enhancing Ca2+-currents, as the authors claim, is rather weak in the present version of the study.

      Strengths of the study:

      (1) The study is carefully carried out using a remarkable combination of (1) superresolution microscopy, to analyze the formation and subcellular distribution of molecular assemblies and (2) functional assessment of voltage-gated Ca2+ channels using patch-clamp recording of Ca2+-currents and fluorometry to correlate Ca2+ influx with the molecular assemblies formed by AZ proteins. The results are of high quality and are in general accompanied by the required control experiments.
      (2) The method opens new opportunities to further investigate the minimal and basic properties of AZ proteins that are difficult to study using in vivo systems. The cells that operate through ribbon synapses (e.g. photoreceptors and hair cells) are particularly difficult to manipulate, so setting up and validating the use of a heterologous system more suitable for molecular manipulations is highly valuable.
      (3) The structures formed by RIBEYE and Palm-Bassoon in HEK293 cells identified by STED nanoscopy are strikingly similar to the AZs of ribbon synapses found in rat inner hair cells (Figure 2).

      Weaknesses of the study:

      (1) The results obtained in a heterologous system (HEK293 cells) need to be interpreted with caution. They will importantly speed the generation of models and hypotheses that will, however, require in vivo validation.

      (2) The authors analyzed the distribution of RIBEYE clusters in different membrane compartments and correctly conclude that RIBEYE clusters are not trapped in any of those compartments, but are soluble instead. The authors, however, did not carry out a similar analysis for Palm-Bassoon. It is therefore unknown if Palm-Bassoon binds to other membrane compartments besides the plasma membrane. That could occur because in non-neuronal cells GAP43 has been described to be in internal membrane compartments. This should be investigated to document the existence of ectopic internal SyRibbons beyond the plasma membrane because it might have implications for interpreting functional data in case Ca2+-channels become part of those internal SyRibbons.

      (3) The co-expression of RBP2 and Palm-Bassoon induces a rather minor but significant increase in Ca2+-currents (Figure 5). Such an increase does not occur upon expression of (1) Palm-Bassoon alone, (2) RBP2 alone or (3) RIBEYE alone (Figure 5). Intriguingly, the concomitant expression of Palm-Bassoon, RBP2 and RIBEYE does not translate into an increase of Ca2+-currents either (Figure 7).

      (4) The authors claim that Ca2+-imaging reveals increased Ca2+-signal intensity at synthetic ribbon-type AZs. That claim is a subject of concern because the increase is rather small and it does not correlate with an increase in Ca2+-currents.

    1. eLife assessment

      The authors provide an important step forward in understanding how brain-derived hormones modulate behavior, using medaka fish as a model system. Knockout lines present convincing evidence from multiple mutant lines, showing that estrogens play a significant role in male social behavior, and that lacking aromatase changes brain gene expression. The conclusions for females are less substantiated, and the conclusions regarding sexual differentiation should be considered carefully.

    1. eLife assessment

      This important study reports that glutamate signaling in LepRb PMv neurons is necessary for leptin-dependent fertility. The data supporting the conclusion is solid. This work will be of interest to researchers in the fields of both reproductive and metabolic biology.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Protein conformational changes are often critical to protein function, but obtaining structural information about conformational ensembles is a challenge. Over a number of years, the authors of the current manuscript have developed and improved an algorithm, qFit protein, that models multiple conformations into high resolution electron density maps in an automated way. The current manuscript describes the latest improvements to the program, and analyzes the performance of qFit protein in a number of test cases, including classical statistical metrics of data fit like Rfree and the gap between Rwork and Rfree, model geometry, and global and case-by-case assessment of qFit performance at different data resolution cutoffs. The authors have also updated qFit to handle cryo-EM datasets, although the analysis of its performance is more limited due to a limited number of high-resolution test cases and less standardization of deposited/processed data.

      Strengths:

      The strengths of the manuscript are the careful and extensive analysis of qFit's performance over a variety of metrics and a diversity of test cases, as well as the careful discussion of the limitations of qFit. This manuscript also serves as a very useful guide for users in evaluating if and when qFit should be applied during structural refinement.

      Reviewer #2 (Public Review):

      Summary

      The manuscript by Wankowicz et al. describes updates to qFit, an algorithm for the characterization of conformational heterogeneity of protein molecules based on X-ray diffraction or cryo-EM data. The work provides a clear description of the algorithm used by qFit. The authors then proceed to validate the performance of qFit by comparing it to deposited X-ray entries in the PDB in the 1.2-1.5 Å resolution range as quantified by Rfree, Rwork-Rfree, detailed examination of the conformations introduced by qFit, and performance on stereochemical measures (MolProbity scores). To examine the effect of experimental resolution of X-ray diffraction data, they start from an ultra high-resolution structure (SARS-CoV2 Nsp3 macrodomain) to determine how the loss of resolution (introduced artificially) degrades the ability of qFit to correctly infer the nature and presence of alternate conformations. The authors observe a gradual loss of ability to correctly infer alternate conformations as resolution degrades past 2 Å. The authors repeat this analysis for a larger set of entries in a more automated fashion and again observe that qFit works well for structures with resolutions better than 2 Å, with a rapid loss of accuracy at lower resolution. Finally, the authors examine the performance of qFit on cryo-EM data. Despite a few prominent examples, the authors find only a handful (8) of datasets for which they can confirm a resolution better than 2.0 Å. The performance of qFit on these maps is encouraging and will be of much interest because cryo-EM maps will, presumably, continue to improve and because of the rapid increase in the availability of such data for many supramolecular biological assemblies. As the authors note, practices in cryo-EM analysis are far from uniform, hampering the development and assessment of tools like qFit.

      Strengths

      qFit improves the quality of refined structures at resolutions better than 2.0 Å, in terms of reflecting true conformational heterogeneity and geometry. The algorithm is well designed and does not introduce spurious or unnecessary conformational heterogeneity. I was able to install and run the program without a problem within a computing cluster environment. The paper is well written and the validation thorough.

      I found the section on cryo-EM particularly enlightening, both because it demonstrates the potential for discovery of conformational heterogeneity from such data by qFit, and because it clearly explains the hurdles towards this becoming common practice, including lack of uniformity in reporting resolution, and differences in map and solvent treatment.

      Weaknesses

      The authors begin the results section by claiming that they made "substantial improvement" relative to the previous iteration of qFit, "both algorithmically (e.g., scoring is improved by BIC, sampling of B factors is now included) and computationally (improving the efficiency and reliability of the code)" (bottom of page 3). However, the paper does not provide a comparison to previous iterations of the software or quantitation of the effects of these specific improvements, such as whether scoring is improved by the BIC, how the application of BIC has changed since the previous paper, whether sampling of B factors helps, and whether the code is faster. It would help the reader to understand what, if any, the significance of each of these improvements was.

      Indeed, it is difficult (embarrassingly) to benchmark against our past work due to the dependencies on different python packages and the lack of software engineering. With the infrastructure we’ve laid down with this paper, made possible by an EOSS grant from CZI, that will not be a problem going forward. Not only is the code more reliable and standardized, but we have developed several scientific test sets that can be used as a basis for broad comparisons to judge whether improvements are substantial. We’ve also changed “substantial improvement” to “several modifications” to indicate the lack of comparison to past versions.

      The exclusion of structures containing ligands and multichain protein models in the validation of qFit was puzzling since both are very common in the PDB. This may convey the impression that qFit cannot handle such use cases. (Although it seems that qFit has an algorithm dedicated to modeling ligand heterogeneity and seems to be able to handle multiple chains). The paper would be more effective if it explained how a user of the software would handle scenarios with ligands and multiple chains, and why these would be excluded from analysis here.

      qFit can indeed handle both. We left out multiple chains for simplicity in constructing a dataset enriched for small proteins while still covering diversity to speed the ability to rapidly iterate and test our approaches. Improvements to qFit ligand handling will be discussed in a forthcoming work as we face similar technical debt to what we saw in proteins and are undergoing a process of introducing “several modifications” that we hope will lead to “substantial improvement” - but at the very least will accelerate further development.

      It would be helpful to add some guidance on how/whether qFit models can be further refined afterwards in Coot, Phenix, ..., or whether these models are strictly intended as the terminal step in refinement.

      We added to the abstract:

      “Importantly, unlike ensemble models, the multiconformer models produced by qFit can be manually modified in most major model building software (e.g. Coot) and fit can be further improved by refinement using standard pipelines (e.g. Phenix, Refmac, Buster).”

      and introduction:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot12 unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      and results:

      “This model can then be examined and edited in Coot12 or other visualization software, and further refined using software such as phenix.refine, refmac, or buster as the modeler sees fit.”

      and discussion:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Appraisal & Discussion

      Overall, the authors convincingly demonstrate that qFit provides a reliable means to detect and model conformational heterogeneity within high-resolution X-ray diffraction datasets and (based on a smaller sample) in cryo-EM density maps. This represents the state of the art in the field and will be of interest to any structural biologist or biochemist seeking to attain an understanding of the structural basis of the function of their system of interest, including potential allosteric mechanisms, an area where there are still few good solutions. That is, I expect qFit to find widespread use.

      Reviewer #3 (Public Review):

      Summary:

      The authors address a very important issue of going beyond a single-copy model obtained by the two principal experimental methods of structural biology, macromolecular crystallography and cryo electron microscopy (cryo-EM). Such a multiconformer model is based on the fact that experimental data from both these methods represent a space- and time-average of a huge number of the molecules in a sample, or even in several samples, and that the respective distributions can be multimodal. Unlike structure prediction methods, this approach is strongly based on high-resolution experimental information and requires validated single-copy high-quality models as input. Overall, the results support the authors' conclusions.

      In fact, the method addresses two problems which could be considered separately:

      - An automation of construction of multiple conformations when they can be identified visually;

      - A determination of multiple conformations when their visual identification is difficult or impossible.

      We often think about this problem similarly to the reviewer. However, in building qFit, we do not want to separate these problems - but rather use the first category (obvious visual identification) to build an approach that can accomplish part of the second category (difficult to visualize) without building “impossible”/nonexistent conformations - with a consistent approach/bias.

      The first one is a known problem, when missing alternative conformations may cost a few percent in R-factors. While these conformations are relatively easy to detect and build manually, the current procedure may save significant time being quite efficient, as the test results show.

      We agree with the reviewers' assessment here. The “floor” in terms of impact is automating a tedious part of high resolution model building and improving model quality.

      The second problem is important from the physical point of view and was first addressed by Burling & Brunger (1994; https://doi.org/10.1002/ijch.199400022). The new procedure deals with a second-order variation in the R-factors, of about 1% or less, comparable to the effect of placing riding hydrogen atoms, modeling density deformation, or varying the bulk solvent. In such situations, it is hard to justify model improvement. Rfree values that are maintained or marginally decreased can be considered a sign that the model does not overfit the data, but hardly a strong argument in favor of the model.

      We agree with the overall sentiment of this comment. What is a significant variation in R-free is an important question that we have looked at previously (http://dx.doi.org/10.1101/448795) and others have suggested an R-sleep for further cross validation (https://pubmed.ncbi.nlm.nih.gov/17704561/). For these reasons it is important to get at the significance of the changes to model types from large and diverse test sets, as we have here and in other works, and from careful examination of the biological significance of alternative conformations with experiments designed to test their importance in mechanism.

      In general, overall targets are less appropriate for this kind of problem and local characteristics may be better indicators. Improvement of the model geometry is a good choice. Indeed, Cruickshank (1956; https://doi.org/10.1107/S0365110X56002059) already showed that averaged density images may lead to a shortening of covalent bonds when such maps are interpreted by a single model. However, a total absence of geometric outliers is not necessarily required for the structures solved at a high resolution where diffraction data should have more freedom to place the atoms where the experiments "see" them.

      Again, we agree—geometric outliers should not be completely absent, but it is comforting when they and model/experiment agreement both improve.

      The key local characteristic for multiconformer models is the closeness of the model map to the experimental one. Indeed, the procedure uses one such measure, the Bayesian information criterion (BIC). Unfortunately, there is no information about how sharply it identifies the best model or how much it changes between the initial and final models; overall, the reader gets no feeling for its values. The Q-score (page 17) can be a tool for the first problem, where the multiple conformations are clearly separated, but not for the second problem, where the contributions from neighboring conformations are merged. In addition to BIC, or to more conventional target functions such as LS or local map correlation, the extreme and mean values of the local difference maps may help to validate the models.

      We agree with the reviewer that the problem of “best” model determination is poorly posed here. We have been thinking a lot about this in the context of Bayesian methods (see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278553/); however, a major stumbling block is in how variable representations of alternative conformations (and compositions) are handled. The answers are more (but by no means simply) straightforward for ensemble representations where the entire system is constantly represented but with multiple copies.
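
      To give a feel for how a BIC-type score trades fit against model size, the following is a minimal sketch of the textbook BIC for a least-squares fit with Gaussian errors. It is not taken from the qFit code; the function name `bic_gaussian` and the numbers in the usage note are hypothetical.

```python
import math


def bic_gaussian(rss: float, n: int, k: int) -> float:
    """Textbook BIC for a least-squares fit with Gaussian errors.

    rss: residual sum of squares between observed and modeled values
    n:   number of observations (e.g. density grid points)
    k:   number of free parameters in the model
    Additive constants that do not depend on the model are dropped.
    Lower values indicate a better fit/complexity trade-off.
    """
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    return n * math.log(rss / n) + k * math.log(n)
```

      With these illustrative inputs, a model that lowers the residual only slightly (10.0 to 9.9 over 100 observations) while doubling its parameters (2 to 4) receives a higher, that is worse, BIC, which is the behavior that discourages unjustified extra conformers.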

      This method and its results are a strong argument for the need for experimental data and the information they contain, in contrast to pure structure prediction. At the same time, the absence of strong density-based proofs may limit its impact.

      We agree - indeed we think it will be difficult to further improve structure prediction methods without much more interaction with the experimental data.

      Strengths:

      Addressing an important problem and automatization of model construction for alternative conformations using high-resolution experimental data.

      Weaknesses:

      An insufficient validation of the models when no discrete alternative conformations are visible and essentially missing local real-space validation indicators.

      While not perfect real space indicators, local real-space validation is implicit in the MIQP selection step and explicit when we do employ Q-score metrics.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A point of clarification: I don't understand why waters seem to be handled differently for cryo-EM and crystallography datasets. I am interested in the statement on page 19 that the Molprobity Clashscore gets worse for cryo-EM datasets, primarily due to clashes with waters. But the qFit algorithm includes a round of refinement to optimize placement of ordered waters, and the clashscore improves for the qFit refinement in crystallography test cases. Why/how is this different for cryo-EM?

      We agree that this was not an appropriate point. We believe that the high clash score is coming from side chains being incorrectly modeled. We have updated this in the manuscript and it will be a focus of future improvements.

      Reviewer #2 (Recommendations For The Authors):

      - It would be instructive to the reader to explain how qFit handles the chromophore in the PYP (1OTA) example. To this end, it would be helpful to include deposition of the multiconformer model of PYP. This might also be a suitable occasion for discussion of potential hurdles in the deposition of multiconformer models in the PDB (if any!). Such concerns may be real concerns causing hesitation among potential users.

      Thank you for this comment. qFit does not alter the position or connectivity of any HETATM records (like the chromophore in this structure). Handling covalent modifications like this is an area of future development.

      Regarding deposition, we have noted above that the discussion now includes:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Finally, we have placed all PDBs in a Zenodo deposition (XXX) and have included that language in the manuscript. It is currently under a separate data availability section (page XXX). We will defer to the editor as to the best header for it to go under.

      - It may be advisable to take the description of true/false pos/negatives out of the caption of Figure 4, and include it in a box or so, since these terms are important in the main text too, and the caption becomes very cluttered.

      We think adding the description of true/false pos/negatives to the Figure panel would make it very cluttered and wordy. We would like to retain this description within the caption. We have also briefly described each in the main text.

      - page 21, line 4: some issue with citation formatting.

      We have updated these citations.

      - page 25, second paragraph: cardinality is the number of members of a set. Perhaps "minimal occupancy" is more appropriate.

      Thank you for pointing this out. This was a mistake and should have been called the occupancy threshold.

      - page 26: it's - its

      Thank you, we have made this change. 

      - Font sizes in Supplementary Figures 5-7 are too small to be readable.

      We agree and will make this change. 

      Reviewer #3 (Recommendations For The Authors):

      General remarks

      (1) As I understand, the procedure starts from shifting residues one by one (page 4; A.1). Then, geometry reconstruction (e.g., B1) may be difficult in some cases joining back the shifted residues. It seems that such backbone perturbation can be done more efficiently by shifting groups of residues ("potential coupled motions") as mentioned at the bottom of page 9. Did I miss its description?

      We would describe the algorithm as sampling (which includes minimal shifts) in the backbone residues to ensure we can link neighboring residues. We agree that future iterations of qFit should include more effective backbone sampling by exploring motion along the Cβ-Cα, C-N, and (Cβ-Cα × C-N) bonds and exploring correlated backbone movements.

      (2) While the paper is well split into clear parts, some of them seem not to be in their optimal place and would be better moved to "Methods" (the detailed "Overview of the qFit protein algorithm" as a whole) or to a "Data" section, which is currently missing (the first two paragraphs of "qFit improves overall fit...", page 8; "Generating the qFit test set", page 22; and "Generating synthetic data ..." at page 26; description of the test data set). To my personal taste, the description of tests with simulated data (page 15) would be better placed before that of tests with real data.

      Thank you for this comment, but we stand by our original decision to keep the general flow of the paper as it was submitted.

      (3) I wonder if the term "quadratic programming" (e.g., A3, page 5) is appropriate. It supposes optimization of a quadratic function of the independent parameters and not of "some" parameters. This is like the crystallographic LS which is not a quadratic function of atomic coordinates, and I think this is a similar case here. Whatever the answer to this remark is, an example of the function and its parameters is certainly missing.

      We think that the term quadratic programming is appropriate. We fit a function with a loss function (observed density - calculated density), while satisfying the independent parameters. We fit the coefficients minimizing a quadratic loss. We agree that the quadratic function is missing from the paper, and we have now included it in the Methods section.
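
      For orientation, a quadratic program of the kind discussed in this exchange can be written in the generic form below. This is a sketch under stated assumptions, not the expression the authors added to their Methods: $\rho^{o}$ denotes the observed density, $\rho^{c}_{i}$ the calculated density of the $i$-th pre-generated candidate conformer, and $w_i$ its occupancy; the symbols and the exact constraint set are illustrative.

```latex
\begin{aligned}
\min_{w}\quad & \Bigl\| \rho^{o} - \sum_{i=1}^{n} w_i\,\rho^{c}_{i} \Bigr\|^{2}
  \;=\; w^{\mathsf{T}} Q\, w \;-\; 2\,c^{\mathsf{T}} w \;+\; \|\rho^{o}\|^{2},\\
& Q_{ij} = \langle \rho^{c}_{i}, \rho^{c}_{j} \rangle, \qquad
  c_{i} = \langle \rho^{c}_{i}, \rho^{o} \rangle,\\
\text{subject to}\quad & w_i \ge 0 \;\; (i = 1,\dots,n), \qquad \sum_{i=1}^{n} w_i \le 1 .
\end{aligned}
```

      Under these assumptions the candidate conformers, and hence the densities $\rho^{c}_{i}$, are fixed before the optimization, so the objective is quadratic in the occupancies $w$ even though it would not be quadratic in the atomic coordinates.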

      Technical remarks to be answered by the authors :

      (1) Page 1, Abstract, line 3. The ensemble modeling is not the only existing frontier, and saying "one of the frontiers" may be better. Also, this phrase gives a confusing impression that the authors aim to predict the ensemble models while they do it with experimental data.

      We agree with this statement and have re-worded the abstract to reflect this.

      (2) Page 2. Burling & Brunger (1994) should be cited as predecessors. On the contrary, an excellent paper by Pearce & Gros (2021) is not relevant here.

      While we agree that we should mention the Burling & Brunger paper, the Pearce & Gros (2021) paper should not be removed, as it is not discussing the method of ensemble refinement.

      (3) Page 2, bottom. "Further, when compared to ..." The preference to such approach sounds too much affirmative.

      We have amended this sentence to state:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot(Emsley et al. 2010) unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      The point we were trying to make in this sentence was that ensemble-based models are much harder to manually manipulate in Coot or other similar software compared to multiconformer models. We think that the new version of this sentence states this point more clearly.

      (4) Page 2, last paragraph. I do not see an obvious relation of references 15-17 to the phrase they are associated with.

      We disagree with this statement, and think that these references are appropriate.

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot12 unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      (5) Page 3, paragraph 2. Cryo-EM maps should be also "high-resolution"; it does not read like this from the phrase.

      We agree that high-resolution should be added, and the sentence now states:

      “However, many factors make manually creating multiconformer models difficult and time-consuming. Interpreting weak density is complicated by noise arising from many sources, including crystal imperfections, radiation damage, and poor modeling in X-ray crystallography, and errors in particle alignment and classification, poor modeling of beam induced motion, and imperfect Detector Quantum Efficiency (DQE) in high-resolution cryo-EM.”

      (6) Page 3, last paragraph before "results". The words "... in both individual cases and large structural bioinformatic projects" do not have much meaning, except introducing a self-reference. Also, repeating "better than 2 A" looks not necessary.

      We agree that this was unnecessary and have simplified the last sentence to state:

      “With the improvements in model quality outlined here, qFit can now be increasingly used for finalizing high-resolution models to derive ensemble-function insights.”

      (7) Page 3. "Results". Could "experimental" be replaced by a synonym, like "trial", to avoid confusing with the meaning "using experimental data"?

      We have replaced experimental with exploratory to describe the use of qFit on CryoEM data. The statement now reads:

      “For cryo-EM modeling applications, equivalent metrics of map and model quality are still developing, rendering the use of qFit for cryo-EM more exploratory.”

      (8) Page 4, A.1. Should it be "steps +/- 0.1" and "coordinate" be "coordinate axis"? One can modify coordinates and not shift them. I do not understand how, with the given steps, the authors calculated the number of combinations ("from 9 to 81"). Could a long "Alternatively, ...absent" be reduced simply to "Otherwise"?

      We have simplified and clarified the sentence on the sampling of backbone coordinates to state:

      “If anisotropic B-factors are absent, the translation of coordinates occurs in the X, Y, and Z directions. Each translation takes place in steps of 0.1 along each coordinate axis, extending to 0.3 Å, resulting in 9 (if isotropic) or to 81 (if anisotropic) distinct backbone conformations for further analysis.”
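
      One reading of the isotropic case that is consistent with the stated count of 9 is three coordinate axes combined with three step magnitudes (0.1, 0.2, and 0.3 Å). The sketch below enumerates those shifts. It is an illustration under that assumption only: the function name `isotropic_translations` is hypothetical, the authors' actual enumeration may differ, and no attempt is made to reproduce the anisotropic count of 81.

```python
from itertools import product


def isotropic_translations(step=0.1, max_shift=0.3):
    """Enumerate trial backbone shifts (in Å) along X, Y, and Z.

    Assumption for illustration: one shift per (axis, magnitude) pair,
    with magnitudes step, 2*step, ... up to max_shift.
    """
    n_steps = round(max_shift / step)
    # Round away floating-point noise such as 0.30000000000000004.
    magnitudes = [round(step * (i + 1), 10) for i in range(n_steps)]
    axes = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    return [
        tuple(magnitude * component for component in axis)
        for axis, magnitude in product(axes, magnitudes)
    ]
```

      Calling `isotropic_translations()` with the defaults yields nine distinct displacement vectors, three per axis.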

      (9) Page 6, B.1, line 2. Word "linearly" is meaningless here.

      We have modified this to read:

      “Moving from N- to C- terminus along the protein,”

      (10) Page 9, line 2. It should be explained which data set is considered as the test set to calculate Rfree.

      We think this is clear and would be repetitive if we duplicated it.

      (11) Page 9, line 7. It should be "a valuable metric" and not "an"

      We agree and have updated the sentence to read:

      “Rfree is a valuable metric for monitoring overfitting, which is an important concern when increasing model parameters as is done in multiconformer modeling.”
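
      As a reminder of what is being compared, the sketch below computes the standard crystallographic R-factor separately for the working and free reflections. It assumes the calculated amplitudes are already on the same scale as the observed ones; the function names and the toy amplitudes in the usage note are hypothetical and not taken from the manuscript.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Return (Rwork, Rfree) given a per-reflection flag marking the free set."""
    work = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if not is_free]
    free = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if is_free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free
```

      For example, `r_work_and_free([10.0, 20.0, 30.0, 40.0], [11.0, 19.0, 33.0, 40.0], [False, False, True, False])` gives an Rfree larger than Rwork; a gap that widens as parameters are added is the overfitting signal discussed above.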

      (12) Page 10, paragraph 3. "... as a string (Methods)". I did not find any other mention of this term "string", including in "Methods" where it supposed to be explained. Either this should be explained (and an example is given?), or be avoided.

      We agree that string is not necessary (discussing the programmatic datatype). We have removed this from the sentence. It now reads:

      “To quantify how often qFit models new rotameric states, we analyzed the qFit models with phenix.rotalyze, which outputs the rotamer state for each conformer (Methods).”

      (13) Page 10, lines 3-4 from bottom. Are these two alternative conformations justified?

      We are unsure what this is referring to.

      (14) Page 12, Fig. 2A. In comparison with Supplement Fig 2C, the direction of axes is changed. Could they be similar in both Figures?

      We have updated Supplementary Figure 2C to have the same direction of axes as Figure 2A.

      (15) Page 15, section's title. Choose a single verb in "demonstrate indicate".

      We have amended the title of this section to be:

      “Simulated data demonstrate qFit is appropriate for high-resolution data.”

      (16) Page 15, paragraph 2. "Structure factors from 0.8 to 3.0 A resolution" does not mean what the authors apparently wanted to say: "(complete?) data sets with the high-resolution limit which varied from 0.8 to 3.0 A ...". Also, a phrase of "random noise increasing" is not illustrated by Figs.5 as it is referred to.

      We have edited this sentence to now read:

      “To create the dataset for resolution dependence, we used the ground truth 7KR0 model, including all alternative conformations, and generated artificial structure factors with a high resolution limit ranging from  0.8 to 3.0 Å resolution (in increments of 0.1 Å).”
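
      The series of high-resolution limits described in this sentence can be enumerated as below. This is only a sketch of the stated range and increment; the function name `resolution_limits` is hypothetical, and generating the structure factors themselves is outside its scope.

```python
def resolution_limits(start=0.8, stop=3.0, step=0.1):
    """List high-resolution limits (Å) from start to stop inclusive.

    Values are rounded to one decimal place to avoid floating-point
    drift, which is adequate for the 0.1 Å increment assumed here.
    """
    n_points = round((stop - start) / step) + 1
    return [round(start + i * step, 1) for i in range(n_points)]
```

      With the defaults this produces 23 limits, one synthetic dataset per limit.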

      (17) Page 15, last paragraph is written in a rather formal and confusing way while a clearer description is given in the figure legend and repeated once more in Methods. I would suggest to remove this paragraph.

      We agree that this is confusing. Instead of creating a true positive/false positive/true negative/false negative matrix, we have just called things as they are, multiconformer or single conformer and match or no match. We have edited the language in the manuscript and figure legends to reflect these changes.

      (18) Page 16. Last two paragraphs start talking about a new story and it would help to separate them somehow from the previous ones (sub-title?).

      We agree that this could use a subtitle. We have included the following subtitle above this section:

      “Simulated multiconformer data illustrate the convergence of qFit.”

      (19) Page 20. "or static" and "we determined that" seem to be not necessary.

      We have removed static and only used single conformer models. However, as one of the main conclusions of this paper is determining that qFit can pick up on alternative conformers that were modeled manually, we have decided to keep the “we determined that”.

      (20) Page 21, first paragraph. "Data" are plural; it should be "show" and "require"

      We have made these edits. The sentence now reads:

      “However, our data here show that not only does qFit need a high-resolution map to be able to detect signal from noise, it also requires a very well-modeled structure as input.”

      (21) Page 21, References should be indicated as [41-45], [35,46-48], [55-57]. A similar remark to [58-63] at page 22.

      We have fixed the reference layout to reflect this change.

      (22) Page 21, last paragraph. "Further reduce R-factors" (moreover repeated twice) is not correct, either as "further", since the reduction here is rather marginal, or as a goal; the variations of R-factors are not very significant. A more general statement like "improving fit to experimental data" (keeping in mind density maps) may be safer.

      We agree with the duplicative nature of these statements. We have amended the sentence to now read:

      “Automated detection and refinement of partial-occupancy waters should help improve fit to experimental data, further reduce Rfree15, and provide additional insights into hydrogen-bond patterns and the influence of solvent on alternative conformations.”

      (23) Page 22. Sub-sections of "Methods" are given in a little bit random order; "Parallelization of large maps" in the middle of the text is an example. Put them in a better order may help.

      We have moved some sections of the Methods around and made better headings by using an underscore to highlight the subsections (Generating and running the qFit test set, qFit improved features, Analysis metrics, Generating synthetic data for resolution dependence).

      (24) Page 24. Non-convex solution is a strange term. There exist non-convex problems and functions and not solutions.

      We agree and we have changed the language to reflect that we present the algorithm with non-convex problems which it cannot solve.

      (25) Page 26, "Metrics". It is worthy to describe explicitly the metrics and not (only) the references to the scripts.

      For each metric, we provide a sentence or two describing what it measures. As these metrics are well known in the structural biology field, we do not feel that we need to elaborate on them more.

      (26) Page 26. Multiplying B by occupancy does not have much sense. A better option would be to refer to the density value in the atomic center as occ*(4*pi/B)^1.5 which gives a relation between these two entities.

      We agree and have updated the B-factor figures and metrics to reflect this.
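      As a rough illustration of the reviewer's suggestion, the sketch below evaluates occ*(4*pi/B)^1.5 for two alternative conformers of one atom. It assumes an isotropic Gaussian atom with unit scattering power, so the result is a relative peak height and not an absolute electron density. The function name and the occupancy and B-factor values are hypothetical; they are not taken from the manuscript or from qFit.

```python
import math


def peak_density(occupancy, b_factor):
    """Relative density at the atomic center, occ * (4*pi/B)**1.5.

    Assumes an isotropic Gaussian atom with unit scattering power
    (an illustrative simplification, not qFit's implementation).
    """
    if b_factor <= 0:
        raise ValueError("B-factor must be positive")
    return occupancy * (4.0 * math.pi / b_factor) ** 1.5


# Hypothetical altloc A and altloc B of the same atom (values invented).
conformer_a = peak_density(occupancy=0.7, b_factor=12.0)
conformer_b = peak_density(occupancy=0.3, b_factor=25.0)
print(f"altloc A: {conformer_a:.3f}, altloc B: {conformer_b:.3f}")
```

      Unlike the product of B and occupancy, this quantity scales linearly with occupancy and falls as B rises, which is the relation between the two that the reviewer points to.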

      (27) Page 40, suppl. Fig. 5. Due to the color choice, it is difficult to distinguish the green and blue curves in the diagram.

      We have amended this by switching the colors of the curves.

      (28) Page 42, Suppl. Fig. 7. (A) How the width of shaded regions is defined? (B) What the blue regions stand for? Input Rfree range goes up to 0.26 and not to 0.25; there is a point at the right bound. (C) Bounds for the "orange" occupancy are inversed in the legend.

      (A) The width of the shaded region denotes the standard deviation of the values at each resolution. We have made this clearer in the caption.

      (B) The blue region denotes the confidence interval for the regression estimate. The confidence level was set to 95%. We have made this clearer in the caption.

      (C) This has now been fixed.

      The maximum R-free value is 0.2543, which we rounded down to 0.25.

      (29) Page 43. Letters E-H in the legend are erroneously substituted by B-E.

      We apologize for this mistake. It is now corrected.

    2. eLife assessment

      This work describes important updates to qFit, the state-of-the-art tool for modeling alternative conformations of protein molecules based on high-resolution X-ray diffraction or Cryo-EM data. The authors provide some convincing analyses of qFit's performance in selected test cases. This manuscript will be of interest to structural biologists and protein biochemists, since the adoption of qFit in structural refinement may lead to new mechanistic insights into protein function.

    3. Reviewer #1 (Public Review):

      Summary:

      Protein conformational changes are often critical to protein function, but obtaining structural information about conformational ensembles is a challenge. Over a number of years, the authors of the current manuscript have developed and improved an algorithm, qFit protein, that models multiple conformations into high resolution electron density maps in an automated way. The current manuscript describes the latest improvements to the program, and analyzes the performance of qFit protein in a number of test cases, including classical statistical metrics of data fit like Rfree and the gap between Rwork and Rfree, model geometry, and global and case-by-case assessment of qFit performance at different data resolution cutoffs. The authors have also updated qFit to handle cryo-EM datasets, although the analysis of its performance is more limited due to a limited number of high-resolution test cases and less standardization of deposited/processed data.

      Strengths:

      The strengths of the manuscript are the careful and extensive analysis of qFit's performance over a variety of metrics and a diversity of test cases, as well as careful discussion of the limitations of qFit. This manuscript also serves as a very useful guide for users in evaluating if and when qFit should be applied during structural refinement.

    4. Reviewer #2 (Public Review):

      Summary

      The manuscript "Uncovering Protein Ensembles: Automated Multiconformer Model building for X-ray Crystallography and Cryo-EM" by Wankowicz et al. describes updates to qFit, an algorithm for the characterization of conformational heterogeneity of protein molecules based on X-ray diffraction or Cryo-EM data. The work provides a clear description of the algorithm used by qFit. The authors then proceed to validate the performance of qFit by comparing to deposited X-ray entries in the PDB in the 1.2-1.5 Å resolution range as quantified by Rfree, Rwork-Rfree, detailed examination of the conformations introduced by qFit, and performance on stereochemical measures (MolProbity scores). To examine the effect of experimental resolution of X-ray diffraction data, they start from an ultra high-resolution structure (SARS-CoV2 Nsp3 macrodomain) to determine how the loss of resolution (introduced artificially) degrades the ability of qFit to correctly infer the nature and presence of alternate conformations. The authors observe a gradual loss of ability to correctly infer alternate conformations as resolution degrades past 2 Å. The authors repeat this analysis for a larger set of entries in a more automated fashion and again observe that qFit works well for structures with resolutions better than 2 Å, with a rapid loss of accuracy at lower resolution. Finally, the authors examine the performance of qFit on cryo-EM data. Despite a few prominent examples, the authors find only a handful (8) of datasets for which they can confirm a resolution better than 2.0 Å. The performance of qFit on these maps is encouraging and will be of much interest because cryo-EM maps will, presumably, continue to improve and because of the rapid increase in the availability of such data for many supramolecular biological assemblies. As the authors note, practices in cryo-EM analysis are far from uniform, hampering the development and assessment of tools like qFit.

      Strengths

      qFit improves the quality of refined structures at resolutions better than 2.0 Å, in terms of reflecting true conformational heterogeneity and geometry. The algorithm is well-designed and does not introduce spurious or unnecessary conformational heterogeneity. I was able to install and run the program without a problem within a computing cluster environment. The paper is well-written and the validation thorough.

      I found the section on cryo-EM particularly enlightening, both because it demonstrates the potential for discovery of conformational heterogeneity from such data by qFit, and because it clearly explains the hurdles towards this becoming common practice, including lack of uniformity in reporting resolution, and differences in map and solvent treatment.

      Weaknesses

      Due to limitations of past software engineering, the paper lacks a careful comparison to past versions of qFit. In light of the extensive assessment of the current version of qFit, this is a minor concern.

      Although qFit can handle supramolecular assemblies and bound organic molecules, analysis in the manuscript is limited to single-chain X-ray structures. I look forward to demonstration of its utility in such cases in future work.

      Appraisal & Discussion

      Overall, the authors convincingly demonstrate that qFit provides a reliable means to detect and model conformational heterogeneity within high-resolution X-ray diffraction datasets and (based on a smaller sample) in cryo-EM density maps. This represents the state of the art in the field and will be of interest to any structural biologist or biochemist seeking to attain an understanding of the structural basis of the function of their system of interest, including potential allosteric mechanisms, an area where there are still few good solutions. That is, I expect qFit to find widespread use.

    5. Reviewer #3 (Public Review):

      Summary:

      The authors address a very important issue of going beyond a single-copy model obtained by the two principal experimental methods of structural biology, macromolecular crystallography and cryo electron microscopy (cryo-EM). Such a multiconformer model is based on the fact that experimental data from both these methods represent a space- and time-average of a huge number of the molecules in a sample, or even in several samples, and that the respective distributions can be multimodal. Unlike structure prediction methods, this approach is strongly based on accurate high-resolution experimental information and requires validated single-copy high-quality models as input. Overall, the results support the authors' conclusions.

      In fact, the method addresses two problems which could be considered separately:

      - an automation of construction of multiple conformations when they can be identified visually;
      - a determination of multiple conformations when their visual identification is difficult or impossible.

      The former is a known problem, in which missing alternative conformations may cost a few percent in R-factors. While these conformations are relatively easy to detect and build manually, the current procedure may save significant time and is quite efficient, as the test results show. It is an indisputably useful tool for such a goal. The second problem is important from the physical point of view and was first considered thirty years ago by Burling & Brünger. The manuscript does not specify clearly how much the current tool addresses the second case. To model such maps, the authors introduced errors in structure factors; however, when the errors are independent, as in this work, even quite high errors may leave the maps reasonably well interpretable. Obviously, it is impossible to model all kinds of errors, and this modeling of noise is appreciated, but it would be helpful for understanding if the manuscript showed, for example, the worst map for which the procedure was successful.

      The new procedure deals with a second-order variation in the R-factors, of about 1% or less, like placing riding hydrogen atoms, modeling density deformation or variation of the bulk solvent. In such situations, it is hard to justify model improvement. Maintaining Rfree values, or decreasing them marginally, can be considered a sign that the model does not overfit the data, but hardly a strong argument in favor of the model.

      In general, global targets are less appropriate for this kind of problem and local characteristics may be better indicators. Improvement of the model geometry is a good choice. Indeed, Cruickshank (1956) already showed that averaged density images may lead to a shortening of covalent bonds when such maps are interpreted by a single model. However, a total absence of geometric outliers is not necessarily required for structures solved at high resolution, where diffraction data should have more freedom to place the atoms where the experiments "see" them.

      The key local characteristic for multiconformer models is the closeness of the model map to the experimental one. In fact, the procedure uses one such measure, the Bayesian information criterion (BIC). Unfortunately, the manuscript does not describe how sharply it identifies the best model or how much it changes between the initial and final models; in general, the reader gets no feeling for its values. The Q-score (page 17) can be an appropriate tool for the first problem, where the multiple conformations and individual atomic images are clearly separated, but not for the second problem, where the contributions from neighboring conformations and atoms are merged. In addition to BIC, or to even more conventional global target functions such as LS or map correlation, the extreme values of the local difference maps may help to validate, or not, the model.
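      To give a feeling for how a BIC comparison behaves, the following sketch scores a single-conformer and a two-conformer fit of the same density region. It uses the textbook Gaussian-residual form, BIC = n ln(RSS/n) + k ln(n); whether qFit uses exactly this form is an assumption, and all numbers (voxel count, residual sums of squares, parameter counts) are invented for illustration.

```python
import math


def bic(rss, n_points, n_params):
    """BIC for a least-squares fit with Gaussian residuals (up to a constant)."""
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)


n_voxels = 500  # hypothetical number of map voxels around one residue

# Hypothetical fits: residual sum of squares and number of free parameters.
single_conformer = bic(rss=12.0, n_points=n_voxels, n_params=10)
two_conformers = bic(rss=9.0, n_points=n_voxels, n_params=21)

# Lower BIC is better; the difference shows how decisively a model is preferred.
delta = single_conformer - two_conformers
print(f"BIC single: {single_conformer:.1f}, two: {two_conformers:.1f}, delta: {delta:.1f}")
```

      Reporting a difference of this kind between the initial and final models would indicate how sharply the criterion separates them.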

      The described method, with the results presented, is a strong argument for the need for experimental data and the information they contain, as opposed to pure structure prediction. This tool is important for producing user-unbiased multiconformer models rapidly and automatically. At the same time, the absence of strong density-based validation components may limit its impact.

      Strengths:

      Addressing an important problem and automatisation of model construction for alternative conformations using high-resolution experimental data.

      Weaknesses:

      An insufficient validation of the models when no discrete alternative conformations are visible, and insufficiency of local real-space validation indicators.

    1. eLife assessment

      Urofacial syndrome is a rare early-onset lower urinary tract disorder characterized by variants in HPSE2, the gene encoding heparanase-2. This study provides a useful proof-of-principle demonstration that AAV9-based gene therapy for urofacial syndrome is feasible and safe at least over the time frame evaluated, with restoration of HPSE2 expression leading to re-establishment of evoked contraction and relaxation of bladder and outflow tract tissue, respectively, in organ bath studies. The evidence is, however, still incomplete. The work would benefit from evaluation of additional replicates for several endpoints, quantitative assessment of HPSE2 expression, inclusion of in vivo analyses such as void spot assays or cystometry, single-cell analysis of the urinary tract in mutants versus controls, and addressing concerns regarding the discrepancy in HPSE2 expression between bladder tissue and liver in humans and mice.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors try to use a gene therapy approach to cure urofacial symptoms in an HPSE2 mutant mouse model.

      Strengths:

      The authors have convincingly shown the expression of AAV9/HPSE2 in pelvic ganglion and liver tissues. They have also shown the defects in urethra relaxation and bladder muscle contraction in response to EFS in mutant mice, which were reversed in treated mice.

      Weaknesses:

      It is easy to understand that high expression levels of HPSE2 in the bladder tissue lead to bladder dysfunction in human patients; however, the undetectable level of HPSE2 in the bladders of AAV9-transfected mice raises a major question about the functional correction in those HPSE2 mutated mice.

    3. Reviewer #2 (Public Review):

      In this study, Lopes and colleagues provide evidence to support the potential for gene therapy to restore expression of heparanase-2 (Hpse2) in mice mutant for this gene, as occurs in urofacial syndrome. Building on prior studies describing the nature of urinary tract dysfunction in Hpse2 mutant mice, the authors applied a gene therapy approach to determine whether gene replacement could be achieved, and if so, whether restoration of HPSE2 expression could mitigate the urinary tract dysfunction. Using a viral vector-based strategy, shown to be successful for gene replacement in humans, the authors demonstrated dose-dependent viral transduction of pelvic ganglia and liver in wild type mice. No impact on body weight or liver health was noted suggesting the approach was safe. Administration of AAV9/HPSE2 to Hpse2 mutant mice was associated with similar transduction of pelvic ganglia and a corresponding increase in heparanase-2 protein expression in this site. Analysis of bladder outflow tract and bladder body physiology using organ bath studies showed that re-expression of heparanase-2 in Hpse2 mutant mice was associated with restored neurogenic relaxation of the outflow tract and nerve-evoked contraction of the bladder body, albeit with notable variability in the response at lower frequencies across replicates. Differences were noted in the evoked response to carbachol with bladders from Hpse2 mutant male mice showing increased sensitivity upon HPSE2 replacement compared to wild type, but bladders from female mice showing no difference. Based on these findings the authors concluded that AAV9-based HPSE2 replacement is feasible and safe, mitigates some physiological deficits in outflow tract and bladder tissue from Hpse2 mutant mice and provides proof-of-principle for gene replacement approaches for other genes implicated in lower urinary tract disorders. 
Strengths include a solid experimental design and data in support of some of the conclusions, and discussion of limitations of the approach. Weaknesses include the variability, albeit acknowledged, in some of the functional assessments, and the limited investigation of bladder tissue morphology in Hpse2 mutant mice.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Some important and interesting data are missing. For example, can the gene therapy extend the life span of these mutants? The overall in vivo voiding function is missing. AAV9/HPSE2 expression in the bladder wall is not shown.

      Our study was not designed to determine whether gene therapy can improve life span of the Hpse2 mutant mice. We know that the mutant mice usually become ill after the first month of life and can die. However, we wanted to study the mice when they were generally well so that there would be no confounding effects on the bladder physiology caused by general ill health. Indeed, a recent study of Hpse2 inducible deletion in adult mice has shown evidence of exocrine pancreatic insufficiency (Kayal et al., PMID 37491420). We are currently exploring the status of the pancreas in our non-conditional juvenile Hpse2 mice, and whether gene transfer into the pancreas is possible.

      We strongly agree that in vivo voiding studies will be important in the future. We suggest that in vivo cystometry is the gold standard for this, but it is currently beyond the remit of this study.

      It is correct that in this paper we focussed on gene transduction into the pelvic ganglia, because the evidence is mounting that this is a neurogenic disease, with our ex vivo physiological studies showing predominantly neurogenic defects that are corrected by the gene therapy. To further understand the biodistribution of the vector we have now sought evidence of viral transduction into the bladder itself (the new Figure 5). In contrast to the neurons of the pelvic ganglia, we observed very limited transduction: “The vector genome sequence WPRE3, and HPSE2 transcripts, were not detected in the urothelium or lamina propria, the loose tissue directly underneath the urothelium. Within the detrusor muscle layer itself, the large smooth muscle cells were not transduced. However, there were rare small foci of BaseScope™ signal that may represent nerves coursing through the detrusor.”

      Reviewer 2:

      Weaknesses include a lack of discussion of the basis for differences in carbachol sensitivity in Hpse2 mutant mice, limited discussion of bladder tissue morphology in Hpse2 mutant mice, some questions over the variability of the functional data, and a need for clarification on the presentation of statistical significance of functional data.

      Yes, it is interesting that untreated male mutant mice have an increased bladder body contraction to carbachol compared with WT males. In a previous paper (Manak et al., 2020) we performed quantitative western blots for the M2 and M3 receptors and found levels were similar in mutants to the WTs, thus the increased sensitivity probably lies post-receptor.

      A detailed study of the bladder body is an interesting idea, in terms of possible transgene expression and detailed histology, and is something we will pursue in future studies.

      We have reported in our physiology graphs what we find. We do find some variability, particularly at lower frequencies, but our conclusions depend on analyses of the whole curve, which depend on multiple frequencies and show the expected overall pattern of frequency-dependent relaxation.

      Thank you, the stats for Figure 8 (now Figure 9) have been corrected.

      Reviewer 3:

      Single-cell analysis of mutants versus control bladder, urethra including sphincter. This would be great also for the community.

      Yes, in future we are very interested in using a single cell sequencing approach to look at the mutant, WT and rescued pelvic ganglia. In the manuscript we have provided further discussion on the aetiology of urofacial syndrome, and what we still have to learn. We highlight a recent paper in eLife that uses single cell sequencing of mouse pelvic ganglia (Sivori et al., 2024), demonstrating the feasibility of this molecular approach in the pelvic ganglia, and propose this technique could be applied to the study of the UFS mice to provide important insights into the molecular pathobiology of the condition.

      Detailed tables showing data from each mouse examined.

      In theory, it would be very interesting to correlate the strength of human gene transduction into the pelvic ganglia with, for example, the effect on a physiological parameter. However, in general we used different sets of mice for these techniques, so at present we don’t have this information.

      Use of measurements that are done in vivo (spot assay for example). This sounds relatively simple.

      We strongly agree that in vivo voiding studies will be important in the future. We suggest that in vivo cystometry is the gold standard for this, but it is currently beyond the remit of this study.

      Assessment of viral integration in tissues besides the liver (could be done by QPCR).

      This is an important point, and we suggest that the pancreas is a particularly interesting target for future studies. In the manuscript, we have highlighted a recent study of Hpse2 inducible deletion in young adult mice that has shown evidence of exocrine pancreatic insufficiency (Kayal et al., PMID 37491420), associated with fatty degeneration of pancreatic acinar cells. The Hpse2 mutant animals are smaller than wildtype littermates, the reason for which has not been identified but could be due to defects in processing milk and food. We are currently exploring the status of the pancreas in our non-conditional juvenile Hpse2 mice, and whether gene transfer into the pancreas is possible.

      Discuss subtypes of neurons that are present and targeted in the context of mutants and controls.

      The make-up of the pelvic ganglia in Hpse2 mutant mice is a fascinating question. Future analysis using scRNA-Seq may be the most effective way to answer this question and is a molecular approach we are looking to pursue in the future.

    1. eLife assessment

      This important study develops a machine learning method to reveal hidden unknown functions and behavior in gene regulatory networks by searching parameter space in an efficient way. The evidence for some parts of the paper is still incomplete, needing systematic comparison to other methods and to the ground truth, but the work will nevertheless be of broad interest to anyone working in biology of all stripes, since the ideas put forward by the authors extend beyond gene regulatory networks to revealing hidden functions in any complex system with many interacting parts.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper proposes applying intrinsically-motivated exploration to the discovery of robust goal states in gene regulatory networks.

      Strengths:

      The paper is well written. The biological motivation and the need for such methods are formulated extraordinarily well. The battery of experimental models is impressive.

      Weaknesses:

      (1) The proposed method is compared to random search. That says little about its performance with regard to the true steady-state goal sets. The latter could be calculated at least for a few simple ODEs (e.g., BIOMD0000000454, 'Metabolic Control Analysis: Rereading Reder'). The experiment with 'oscillator circuits' may not be directly interpolated to the other models.

      The lack of comparison to the ground truth goal set (attractors of the ODEs) from arbitrary initial conditions makes it hard to evaluate the true performance/contribution of the method. Some of the models used can be analyzed numerically using JAX, while others can be analyzed analytically.

      "...The true versatility of the GRN is unknown and can only be inferred through empirical exploration and proxy metrics....": one could perform a sensitivity analysis of the ODEs, identifying stable equilibria. That could provide a proxy for the ground truth 'versatility'.

      (2) The proposed method is based on 'Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning', which assumes state-action trajectories [s_{t_0:t}, a_{t_0:t}] ('2.1 Notations and Assumptions' in the IMGEP paper). However, the models used in the current work do not include external control actions; only the initial conditions can be set. It is not clear from the methods whether IMGEP was adapted to this setting, and how the exploration policy was designed without actual time-dependent actions. What does "...generates candidate intervention parameters to achieve the current goal...." mean, considering that interventions 'Sets the initial state...' as explained in Table 2?

      (3) Fig 2 shows the phase space for (ERK, RKIPP_RP) without mentioning the typical full scale of ERK, RKIPP_RP. It is unclear whether the path from (0, 0) to (~0.575, ~3.75) at t=1000 is significant on the typical scale of this phase space.

      (4) Table 2:

      (a) Where is 'effective intervention' used in the method?

      (b) In my opinion 'controllability', 'trainability', and 'versatility' are different terms. If their correspondence is important, I would suggest extending/enhancing the column "Proposed Isomorphism"; otherwise, it may be confusing. I don't see how this table generalizes "concepts from dynamical complex systems and behavioral sciences under a common navigation task perspective".

    3. Reviewer #2 (Public Review):

      Summary:

      Etcheverry et al. present two computational frameworks for exploring the functional capabilities of gene regulatory networks (GRNs). The first is a framework based on intrinsically motivated exploration, here used to reveal the set of steady states achievable by a given gene regulatory network as a function of initial conditions. The second is a behaviorist framework, here used to assess the robustness of steady states to dynamical perturbations experienced along typical trajectories to those steady states. In Figs. 1-5, the authors convincingly show how these frameworks can explore and quantify the diversity of behaviors that can be displayed by GRNs. In Figs. 6-9, the authors present applications of their framework to the analysis and control of GRNs, but the support presented for their case studies is often incomplete.

      Following revision, my overall perspective of the paper remains unchanged. The first half of the paper provides solid evidence to support an important conceptual framework. The evidence presented for the use cases in the latter half is incomplete; as the authors note, they are preliminary and meant to be built on in future work. I have included my first round comments below.

      Strengths:

      Overall, the paper presents an important development for exploring and understanding GRNs/dynamical systems broadly, with solid evidence supporting the first half of their paper in a narratively clear way.

      The behaviorist point of view for robustness is potentially of interest to a broad community, and to my knowledge introduces novel considerations for defining robustness in the GRN context.

      Some specific weaknesses, mostly concerning incomplete analyses in the second half of the paper:

      (1) The analysis presented in Fig. 6 is exciting but preliminary. Are there other appropriate methods for constructing energy landscapes from dynamical trajectories in gene regulatory networks? How do the results in this particular case study compare to other GRNs studied in the paper?

      Additionally, it is unclear whether the analysis presented in Fig. 6C is appropriate. In particular, if the pseudopotential landscapes are constructed from statistics of visited states along trajectories to the steady state, then the trajectories derived from dynamical perturbations do not only reflect the underlying pseudo-landscape of the GRN. Instead, they also include contributions from the perturbations themselves.

      (2) In Fig. 7, I'm not sure how much it is possible to take away from the results as given here, as they depend sensitively on the cohort of 432 (GRN, Z) pairs used. The comparison against random networks is well-motivated. However, as the authors note, comparison between organismal categories is more difficult due to low sample size; for instance, the "plant" and "slime mold" categories each have only one associated GRN. Additionally, the "n/a" category is difficult to interpret.

      (3) In Fig. 8, it is unclear whether the behavioral catalog generated is important to the intervention design problem of moving a system in one attractor basin to another. The authors note that evolutionary searches or SGD could also be used to solve the problem. Is the analysis somehow enabled by the behavioral catalog in a way that is complementary to those methods? If not, comparison against those methods (or others e.g. optimal control) would strengthen the paper.

      (4) The analysis presented in Fig. 9 also is preliminary. The authors note that there exist many algorithms for choosing/identifying the parameter values of a dynamical system that give rise to a desired time series. It would be a stronger result to compare their approach to more sophisticated methods, as opposed to random search and SGD. Other options from the recent literature include Bayesian techniques, sparse nonlinear regression techniques (e.g. SINDy), and evolutionary searches. The authors note that some methods require fine-tuning in order to be successful, but even so, it would be good to know the degree of fine-tuning which is necessary compared to their method. [second round: the authors have included a comparison against CMA-ES, an evolutionary algorithm]

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study develops a machine learning method to reveal hidden unknown functions and behavior in gene regulatory networks by searching parameter space in an efficient way. The evidence for some parts of the paper is still incomplete and needs systematic comparison to other methods and to the ground truth, but the work will be of broad interest to anyone working in biology of all stripes since the ideas reach beyond gene regulatory networks to revealing hidden functions in any complex system with many interacting parts.

      We thank the editors and reviewers for their positive assessment and constructive suggestions. In our response, we acknowledge the importance of systematic comparison to other methods and to the ground truth, when available. However, we also emphasize the challenges associated with evaluating such methods in the context of uncovering hidden behaviors in complex biological networks, as the ground truth is often unknown. We hope that our explanations will clarify the potential of our approach in advancing the exploration of these systems.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper proposes applying intrinsically-motivated exploration to the discovery of robust goal states in gene regulatory networks.

      Strengths:

      The paper is well written. The biological motivation and the need for such methods are formulated extraordinarily well. The battery of experimental models is impressive.

      We thank the reviewer for sharing interest in the research problem and for recognizing the strengths of our work.

      Weaknesses:

      (1) The proposed method is compared to the random search. That says little about the performance with regard to the true steady-state goal sets. The latter could be calculated at least for a few simple ODEs (e.g., BIOMD0000000454, 'Metabolic Control Analysis: Rereading Reder'). The experiment with 'oscillator circuits' may not be directly interpolated to the other models.

      The lack of comparison to the ground truth goal set (attractors of ODE) from arbitrary initial conditions makes it hard to evaluate the true performance/contribution of the method. A part of the used models can be analyzed numerically using JAX, while there are models that can be analyzed analytically.

      "...The true versatility of the GRN is unknown and can only be inferred through empirical exploration and proxy metrics....": one could perform a sensitivity analysis of the ODEs, identifying stable equilibria. That could provide a proxy for the ground truth 'versatility'.

      We agree with the reviewer that one primary concern is to properly evaluate the effectiveness of the proposed method. However, as we move toward complex pathways, the “true” steady-state goal sets are often unknown, which is where machine learning methods such as the one we propose are particularly interesting (but challenging to evaluate).

      For simple models whose true steady-state distribution can be derived numerically and/or analytically, it is very likely that their exploration will be much simpler and this is not where a lot of improvement over random search may be found, which explains our focus on more complex models. While we agree that it is still interesting to evaluate exploration methods on these simple models for checking their behavior, it is not clear how to scale this analysis to the targeted more complex systems.

      For systems whose true steady state distribution cannot be derived analytically or numerically, we believe that random search is a pertinent baseline as it is commonly used in the literature to discover the attractors/trajectories of a biological network. For instance, Venkatachalapathy et al. [1] initialize stochastic simulations at multiple randomly sampled starting conditions (which is called a kinetic Monte Carlo-based method) to capture the steady states of a biological system. Similarly, Donzé et al. [29] use a Monte Carlo approach to compute the reachable set of a biological network «when the number of parameters  is large and their uncertain range  is not negligible». For the considered models, the true steady-state goal set is unknown, which is why we chose comparison with random search. We added a “Statistics” subsection in the Methods section providing additional details about the statistical analyses we perform between our method and the random search baseline.
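
      The random-search baseline described above can be illustrated with a minimal sketch. The toy bistable ODE dx/dt = x - x^3, the function names, and all numerical settings below are assumptions chosen for illustration; they are not taken from the paper or from the cited Monte Carlo studies.

```python
import random


def simulate(x0, dt=0.01, steps=5000):
    """Forward-Euler integration of the toy bistable ODE dx/dt = x - x**3.

    This ODE is a hypothetical stand-in for a biological network; it has
    stable steady states at -1 and +1 and an unstable one at 0.
    """
    x = x0
    for _ in range(steps):
        x += dt * (x - x ** 3)
    return x


def random_search_steady_states(n_samples=200, low=-2.0, high=2.0,
                                seed=0, decimals=3):
    """Sample initial conditions uniformly and collect the distinct endpoints."""
    rng = random.Random(seed)
    endpoints = set()
    for _ in range(n_samples):
        x0 = rng.uniform(low, high)
        endpoints.add(round(simulate(x0), decimals))
    return sorted(endpoints)


steady_states = random_search_steady_states()
print(steady_states)
```

      In this toy case both attractors are easy to hit from random initial conditions, which is consistent with the remark that simple models leave little room for improvement over random search.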

      (2) The proposed method is based on 'Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning', which assumes state action trajectories [s_{t_0:t}, a_{t_0:t}] ('2.1 Notations and Assumptions' in the IMGEP paper). However, the models used in the current work do not include external control actions, but rather only the initial conditions can be set. It is not clear from the methods whether IMGEP was adapted to this setting, and how the exploration policy was designed w/o actual time-dependent actions. What does "...generates candidate intervention parameters to achieve the current goal...." mean, considering that interventions 'Sets the initial state...' as explained in Table 2?

      We thank the reviewer for asking for clarification, as indeed the IMGEP methodology originates from developmental robotics scenarios which generally focus on the problem of robotic sequential decision-making, therefore assuming state action trajectories as presented in Forestier et al. [65]. However, in both cases, note that the IMGEP is responsible for sampling parameters which then govern the exploration of the dynamical system. In Forestier et al. [65], the IMGEP also only sets one vector at the start (denoted ) which was specifying parameters of a movement (like the initial state of the GRN), which was then actually produced with dynamic motion primitives which are dynamical system equations similar to GRN ODEs, so the two systems are mathematically equivalent. More generally, while in our case the “intervention” of the IMGEP (denoted ) only controls the initial state of the GRN, future work could consider more advanced sequential interventions simply by setting parameters of an action policy  at the start which could be called during the GRN’s trajectory to sample control actions  where  would be the state of the GRN. In practice this would also require setting only one vector at the start, so it would remain the same exploration algorithm and only the space of parameters would change, which illustrates the generality of the approach.
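
      The point that the exploration algorithm only sets one parameter vector at the start of each rollout can be made concrete with a small sketch. This is a generic goal-exploration loop (uniform goal sampling, nearest-neighbor retrieval, Gaussian mutation) written under our own assumptions; the toy `system` function, the scalar intervention space, and the hyperparameters are hypothetical and do not reproduce the authors' implementation.

```python
import random


def system(i):
    """Toy stand-in for one GRN rollout.

    Maps an intervention (here a scalar initial condition in [0, 1]) to an
    observed outcome z in [0, 1]. The strong non-linearity means most
    interventions land near z = 0.
    """
    return i ** 8


def imgep(budget=200, n_bootstrap=20, sigma=0.05, seed=0):
    """Minimal goal-exploration loop; returns (intervention, outcome) pairs."""
    rng = random.Random(seed)
    history = []
    for step in range(budget):
        if step < n_bootstrap:
            # Bootstrap phase: random interventions, as in random search.
            i = rng.uniform(0.0, 1.0)
        else:
            # 1) Sample a goal uniformly in the outcome space Z.
            goal = rng.uniform(0.0, 1.0)
            # 2) Retrieve the past intervention whose outcome is closest to it.
            i_best, _ = min(history, key=lambda pair: abs(pair[1] - goal))
            # 3) Mutate that intervention and clip it to the valid range.
            i = min(1.0, max(0.0, i_best + rng.gauss(0.0, sigma)))
        # 4) Run the system once; the intervention is set only at the start.
        history.append((i, system(i)))
    return history


history = imgep()
print(len(history))
```

      Replacing the scalar intervention with the parameters of an action policy would leave the loop unchanged, which is the sense in which only the parameter space differs between the two settings.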

      (3) Fig 2 shows the phase space for (ERK, RKIPP_RP) without mentioning the typical full scale of ERK, RKIPP_RP. It is unclear whether the path from (0, 0) to (~0.575, ~3.75) at t=1000 is significant on the typical scale of this phase space.

      The purpose of Figure 2 is to illustrate an example of GRN trajectory in transcriptional space, and to illustrate what “interventions” and “perturbations” can be in that context. To that end we have used the fixed initial conditions provided in the BIOMD0000000647, replicating Figure 5 of Cho et al. [56].

      While we are not sure what the reviewer means by the “typical” scale of this phase space, we would like to point the reviewer toward Figure 8, which shows examples of paths that indeed reach more distant points in the same phase space (up to ~10 in RKIPP_RP levels and ~300 in ERK levels). However, while the paths displayed in Figure 8 are possible (and were discovered with the IMGEP), note that they may be “rarer” to occur naturally, in the sense that a large portion of the initial conditions tested with random search tend to converge toward smaller (ERK, RKIPP_RP) steady-state values similar to the ones displayed in Figure 2.

      (4) Table 2:

      a. Where is 'effective intervention' used in the method?

      b. In my opinion 'controllability', 'trainability', and 'versatility' are different terms. If their correspondence is important I would suggest extending/enhancing the column "Proposed Isomorphism". Otherwise, it may be confusing.

      a) We thank the reviewer for pointing out that “effective intervention” is not explicitly used in the method. The idea here is that as we are exploring a complex dynamical system (here the GRN), some of the sampled interventions will be particularly effective at revealing novel unseen outcomes whereas others will fail to produce a qualitative change to the distribution of discovered outcomes. What we show in this paper, for instance in Figure 3a and Figure 4, is that the IMGEP method is particularly sample-efficient in finding those “effective interventions”, at least more so than random exploration. However, we agree that the term “effective intervention” is ambiguous (it does not say effective at what) and we have replaced it with “salient intervention” in the revised version.

      b) We thank the reviewer for highlighting some confusing terms in our chosen vocabulary, and we have clarified those terms in the revised version. We agree that controllability/trainability and versatility are not exactly equivalent concepts, as controllability/trainability typically refers to the amount to which a system is externally controllable/trainable whereas versatility typically refers to the inherent adaptability or diversity of behaviors that a system can exhibit in response to inputs or conditions. However, they are both measuring the extent of states that can be reached by the system under a distribution of stimuli/conditions, whether natural conditions or engineered ones, which is why we believe that their correspondence is relevant.

      I don't see how this table generalizes "concepts from dynamical complex systems and behavioral sciences under a common navigation task perspective".

      We have replaced the verb “generalize” with “investigate” in the revised version.

      Reviewer #2 (Public Review):

      Summary:

      Etcheverry et al. present two computational frameworks for exploring the functional capabilities of gene regulatory networks (GRNs). The first is a framework based on intrinsically-motivated exploration, here used to reveal the set of steady states achievable by a given gene regulatory network as a function of initial conditions. The second is a behaviorist framework, here used to assess the robustness of steady states to dynamical perturbations experienced along typical trajectories to those steady states. In Figs. 1-5, the authors convincingly show how these frameworks can explore and quantify the diversity of behaviors that can be displayed by GRNs. In Figs. 6-9, the authors present applications of their framework to the analysis and control of GRNs, but the support presented for their case studies is often incomplete.

      Strengths:

      Overall, the paper presents an important development for exploring and understanding GRNs/dynamical systems broadly, with solid evidence supporting the first half of their paper in a narratively clear way.

      The behaviorist point of view for robustness is potentially of interest to a broad community, and to my knowledge introduces novel considerations for defining robustness in the GRN context.

      We thank the reviewer for recognizing the strengths and novelty of the proposed experimental framework for exploring and understanding GRNs, and complex dynamical systems more generally. We agree that the results presented in the section “Possible Reuses of the Behavioral Catalog and Framework” (Figs. 6-9) can be seen as incomplete in certain respects; we tried to make this as explicit as possible throughout the paper, which is why we explicitly describe them as “preliminary experiments”. Despite the discussed limitations, we believe that these experiments are still very useful to illustrate the variety of potential use cases in which the community could benefit from such computational methods and experimental framework, and on which future work could build.

      Some specific weaknesses, mostly concerning incomplete analyses in the second half of the paper:

      (1) The analysis presented in Fig. 6 is exciting but preliminary. Are there other appropriate methods for constructing energy landscapes from dynamical trajectories in gene regulatory networks? How do the results in this particular case study compare to other GRNs studied in the paper?

      We are not aware of methods other than the one proposed by Venkatachalapathy et al. [1] for constructing an energy landscape from an input set of recorded dynamical trajectories, although others may well exist. We want to emphasize that any such method would in any case depend on the input set of trajectories, and should therefore benefit from a set that is more representative of the diversity of behaviors that can be achieved by the GRN, which is why we believe the results presented in Figure 6 are interesting. As the IMGEP was able to find a higher diversity of reachable goal states (and corresponding trajectories) for many of the studied GRNs, we believe that similar effects should be observable when constructing the energy landscapes for these GRN models, with the discovery of additional or wider “valleys” of reachable steady states.

      Additionally, it is unclear whether the analysis presented in Fig. 6C is appropriate. In particular, if the pseudopotential landscapes are constructed from statistics of visited states along trajectories to the steady state, then the trajectories derived from dynamical perturbations do not only reflect the underlying pseudo-landscape of the GRN. Instead, they also include contributions from the perturbations themselves.

      We agree that the landscape displayed in Fig. 6C integrates contributions from the perturbations on the GRN’s behavior, and that these can shape the landscape in various ways, for instance by affecting the paths that are accessible, the shape/depth of certain valleys, etc. But we believe that qualitatively or quantitatively analyzing the effect of these perturbations on the landscape is precisely what is interesting here: it might help 1) understand how a system responds to a range of perturbations and visualize which behaviors are robust to those perturbations, and 2) design better strategies for manipulating those systems to produce certain behaviors.
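
      To make the discussion of trajectory-derived landscapes concrete, here is a minimal sketch of one common way to build a pseudopotential from visited-state statistics, U = -ln P. The formula, the one-dimensional binning, and the example trajectories are assumptions for illustration; the exact construction used by Venkatachalapathy et al. [1] and in Figure 6 is not specified in this exchange.

```python
import math
from collections import Counter


def pseudopotential(trajectories, bin_width):
    """Estimate U(bin) = -ln P(bin) from states visited along 1D trajectories.

    P(bin) is the fraction of all visited states that fall in the bin, so
    frequently visited regions appear as low-energy valleys.
    """
    counts = Counter()
    for trajectory in trajectories:
        for x in trajectory:
            counts[math.floor(x / bin_width)] += 1
    total = sum(counts.values())
    return {b: -math.log(c / total) for b, c in counts.items()}


# Two hypothetical trajectories that both settle near x = 1.0.
trajectories = [
    [0.1, 0.2, 0.9, 1.0, 1.0, 1.0],
    [1.8, 1.4, 1.1, 1.0, 1.0],
]
landscape = pseudopotential(trajectories, bin_width=0.5)
deepest_bin = min(landscape, key=landscape.get)
print(deepest_bin, landscape[deepest_bin])
```

      Because every visited state contributes to the counts, states reached only because of a perturbation also shape the estimated valleys, which is the effect discussed in this exchange.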

      (2) In Fig. 7, I'm not sure how much is possible to take away from the results as given here, as they depend sensitively on the cohort of 432 (GRN, Z) pairs used. The comparison against random networks is well-motivated. However, as the authors note, comparison between organismal categories is more difficult due to low sample size; for instance, the "plant" and "slime mold" categories each only have 1 associated GRN. Additionally, the "n/a" category is difficult to interpret.

      We acknowledge that this part is speculative as stated in the paper: “the surveyed database is relatively small with respect to the wealth of available models and biological pathways, so we can hardly claim that these results represent the true distribution of competencies across these organism categories”. However, when further data is available, the same methodology can be reused and we believe that the resulting statistical analyses could be very informative to compare organismal (or other) categories.

      (3) In Fig. 8, it is unclear whether the behavioral catalog generated is important to the intervention design problem of moving a system from one attractor basin to another. The authors note that evolutionary searches or SGD could also be used to solve the problem. Is the analysis somehow enabled by the behavioral catalog in a way that is complementary to those methods? If not, comparison against those methods (or others e.g. optimal control) would strengthen the paper.

      We thank the reviewer for asking us to clarify this point, which might not be clearly explained in the paper. Here the behavioral catalog is indeed used in a complementary way to the optimization method, by identifying a representative set of reachable attractors which are then used to define the optimization problem. For instance here, thanks to the catalog, we 1) were able to identify a “disease” region and several possible reachable states in that region and 2) used several of these states as starting points of our optimization problem, where we want to find a single intervention that can successfully and robustly reset all those points, as illustrated in Figure 8. Please note that given this problem formulation, a simple random search was used as the optimization strategy. When we mention more advanced techniques such as EA or SGD, it is to say that they might be more efficient optimizers than random search. However, we agree that in many cases optimizing directly will not work if starting from a random or poor initial guess, even with EA or SGD. In that case the discovered behavioral catalog can be useful to better initialize this local search and make it more efficient/useful, akin to what is done in Figure 9.

      (4) The analysis presented in Fig. 9 also is preliminary. The authors note that there exist many algorithms for choosing/identifying the parameter values of a dynamical system that give rise to a desired time-series. It would be a stronger result to compare their approach to more sophisticated methods, as opposed to random search and SGD. Other options from the recent literature include Bayesian techniques, sparse nonlinear regression techniques (e.g. SINDy), and evolutionary searches. The authors note that some methods require fine-tuning in order to be successful, but even so, it would be good to know the degree of fine-tuning which is necessary compared to their method.

      We agree that the analysis presented in Figure 9 is preliminary, and thank the reviewer for the suggestion. We would first like to refer to other papers from the ML literature that have analyzed this issue more thoroughly, such as Colas et al. [74] and Pugh et al. [34], and have shown diversity-driven strategies to be promising alternatives. Additionally, as suggested by the reviewer, we added a comparison to the CMA-ES algorithm in the revised version to complete our analysis. CMA-ES is an evolutionary algorithm that self-adapts its optimization steps and is known to be better suited than SGD to escaping local minima when the number of parameters is not too high (here we only have 15 parameters). However, our results showed that while CMA-ES explores the solution space more than SGD does at the beginning of optimization, it also ultimately converges to a local minimum, similarly to SGD. The best solution converges toward a constant signal (of the target b) but fails to maintain the target oscillations, similar to the solutions discovered by gradient descent. We tried this for a few hyperparameters (initial mean and standard deviation) but always found similar results. We have updated the Figure 9 image and caption, as well as the descriptive text, to include these new results in the revised version. We also added a reference to the CMA-ES paper in the citations.

      Reviewer #1 (Recommendations For The Authors):

      I would suggest conducting a more rigorous analysis of the performance by estimating/approximating the ground truth robust goal sets in important GRNs.

      Also, the use of terminology from different disciplines can be improved. Please see my comments above. Specifically, the connection between controllability in dynamical control systems and versatility used in this paper is unclear.

      We hope to have addressed the reviewer's concerns in our previous answers.

      Reviewer #2 (Recommendations For The Authors):

      Fig 4b: I'm not sure if DBSCAN is the appropriate method to use here, as the visual focus on the core elements of the clusters downplays the full convex hull of the points that random sampling achieves in Z space. An analysis based on convex hulls or the ball-coverage from Fig. 3b would presumably generate plots that were more similar between random sampling and curiosity search. If the goal is to highlight redundancy/non-linearity in the mapping between Z and I, another approach might be to simply bin Z-space in a grid, or to use a clustering algorithm that is less stringent about core/noise distinctions.

      We thank the reviewer for the suggestion. This plot is intended to give the reader an understanding of why a method that uniformly samples goals in Z (what the IMGEP is doing) is more efficient than a method that uniformly samples parameters in I (what the random search is doing), in systems for which there is high redundancy/non-linearity in the mapping between I and Z. We agree that binning the Z-space in a grid and counting the number of achieved bins is a way to quantitatively measure this, which is incidentally very close to what we do in Figure 3 for measuring the achieved diversity. We believe, however, that the clustering and coloring provide additional intuition on why this is the case: they illustrate that large regions of the intervention space map to small regions in the outcome space and vice versa.
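
      The grid-binning measure mentioned above can be sketched in a few lines. The function name `occupied_bins`, the unit-square bounds, and the example point sets are hypothetical; this illustrates the general idea of counting reached cells in Z rather than the exact diversity metric of Figure 3.

```python
def occupied_bins(points, low, high, n_bins):
    """Count the grid cells of [low, high]^d that contain at least one point."""
    cells = set()
    for point in points:
        cell = []
        for value in point:
            # Map the coordinate to a bin index; clamp out-of-range values
            # (including the upper edge) into the first or last bin.
            index = int((value - low) / (high - low) * n_bins)
            cell.append(min(max(index, 0), n_bins - 1))
        cells.add(tuple(cell))
    return len(cells)


# Outcomes that pile up in one corner of Z versus outcomes that spread out.
clustered = [(0.05, 0.05), (0.06, 0.04), (0.07, 0.08)]
spread = [(0.05, 0.05), (0.55, 0.45), (0.95, 0.95)]

print(occupied_bins(clustered, 0.0, 1.0, 10))
print(occupied_bins(spread, 0.0, 1.0, 10))
```

      With the same number of samples, the clustered set occupies a single cell while the spread set occupies three, which is the kind of difference a redundant mapping from I to Z produces between parameter sampling and goal sampling.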

      Additional changes in the revised version:

      We added a sentence in the Methods section, as well as in the caption of Table S1, providing additional details about the way we simulate the biological models from the BioModels website.

      We fixed a wrong reference to Figure 4 in the Methods “Sensitivity measure” subsection with reference to Figure 5.

    1. eLife assessment

      Despite the importance of long-lived plasma cells (LLPCs), particularly for the infection and vaccination field, it is still unclear how they acquire their longevity. With a solid genetic approach, the authors demonstrate quite convincingly a requirement for chemokine/chemokine receptor-mediated interaction in LLPC longevity. The data are very valuable for the development of new types of vaccines.

    2. Reviewer #1 (Public Review):

      The mechanisms underlying the generation and maintenance of LLPCs have been one of the unresolved issues. In the last few years, several groups have independently generated new genetic tools or models and addressed how LLPCs are generated or maintained in homeostatic conditions or upon immunization or infection. Here, Jing et al. have also established a new PC time stamping system and tried to address the issues above. The authors have found that LLPCs accumulated in the BM PC pool, along with aging, and that LLPCs had a unique surfaceome, transcriptome, and BCR clonality. These observations have already been made by other groups (Xu et al. 2020, Robinson et al. 2022, Liu et al. 2022, Koike et al. 2023, Robinson et al. 2023, plus Tellier et al., 2024), therefore it is hard to find significant conceptual advances there. In my opinion, however, genetic analysis of the role of CXCR4 on PC localization or survival in BM (Figure 4 and 5) provided new aspects which have not been addressed in previous studies. Importantly, CXCR4 was required for the maintenance of plasma cells in bone marrow survival niches, conditional loss of which led to rapid mobilization from the bone marrow, reduced plasma cell survival, and reduced antibody titer. Thus, these data suggest that the CXCR4-CXCL12 axis is not only important for plasma cell recruitment to the bone marrow but also essential for their lodging in the niches. I think the study is of high quality and the findings should be widely shared in the field.

    3. Reviewer #2 (Public Review):

      In this study by Jing, Fooksman, and colleagues, a Blimp1-CreERT2-based genetic tracing study is employed to label plasma cells. Over the course of several months post-tamoxifen treatment, the only remaining labeled cells are long-lived plasma cells. This system provides a way to sort live long-lived plasma cells and compare them to unlabeled plasma cells, which contain a range of short-to-long-lived cells. From this analysis, several observations are made: 1) the turnover rate of plasma cells is greater in the spleen than in the bone marrow; 2) the turnover rate is highest early in life; 3) subtle transcriptional and cell surface marker differences distinguish long- from shorter-lived plasma cells; 4) long-lived plasma cells in the bone marrow are sessile and localize in clusters with each other; 5) CXCR4 is required for plasma cell retention in these clusters and in the bone marrow; 6) Repertoire analysis hints that the selection of long-lived plasma cells is not random for any cell that lands in the bone marrow.

      Strengths:

      (1) The genetic timestamping approach is a clever and functional way to separate plasma cells of differing longevities.

      (2) This approach led to the identification of several markers that could help prospective separation of long-lived plasma cells from others.

      (3) Functional labeling of long-lived plasma cells allowed for a higher resolution analysis of transcriptomes and motility than was previously possible.

      (4) The genetic system allowed for a revisitation of the importance of CXCR4 in plasma cell retention and survival.

      Weaknesses:

      (1) Most of the labeling studies, likely for practical reasons, were done on polyclonal rather than antigen-specific plasma cells. The triggers of these responses could vary based on age at the time of exposure, anatomical sites, etc. How these differences might influence markers and transcriptomes, independently of longevity, is not completely known.

      (2) The fraction of long-lived plasma cells in the unlabeled fraction varies with age, potentially diluting differences between long- and short-lived plasma cells.

      (3) The authors suggest their data favors a model by which plasma cells compete for niche space. Yet there is no evidence presented here that these niches are limiting. While a finite number of plasma cells may occupy a single niche (Figure 2), it may be that these niches overall are abundant in the bone marrow and do not restrict LLPC numbers. Robinson...Tarlinton and colleagues (Immunity, 2023) in fact provide experimental evidence against an extrinsic limit.

      (4) The functional importance of the observed transcriptome differences between long- and shorter-lived plasma cells is unknown. An assessment as to whether these differences are conserved in human long- and short-lived bone marrow plasma cells might provide circumstantial supporting evidence that these changes are important for longevity.

    4. Reviewer #3 (Public Review):

      Summary:

      Long-lived PCs are maintained in a CXCR4-dependent manner.

      Strengths:

      The reporter mice for fate-mapping can clearly distinguish long-lived PCs from total PCs and greatly contribute to the identification of long-lived PCs.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      Despite the importance of long-lived plasma cells (LLPCs), particularly in the vaccination field, their natures are still unclear. In this valuable manuscript, as a first step towards clarifying these natures, the authors used a solid genetic approach (time-stamping one) and successfully labelled only functional LLPCs. Although four groups have already published data by the same genetic approach, the authors' manuscript includes additional significant findings in the LLPC field.

      Public Reviews:

      Reviewer #1 (Public Review):

      The mechanisms underlying the generation and maintenance of LLPCs have been one of the unresolved issues. Recently, four groups have independently generated new genetic tools that allow fate tracing of murine plasma cells and have addressed how LLPCs are generated or maintained in homeostatic conditions or upon antigen immunization or viral infection. Here, Jing et al. have established another, but essentially the same, PC time stamping system, and tried to address the issues above. The question is whether the findings reported here provide significant conceptual advances from what has already been published.

      (1) Some of the observations in this manuscript have already been made by other studies (Xu et al. 2020, Robinson et al. 2022, Liu et al. 2022, Koike et al. 2023, Robinson et al. 2023). In my opinion, however, genetic analysis of the role of CXCR4 on PC localization or survival in BM (Figure 5) was well performed and provided some new aspects which have not been addressed in previous reports. The motility of CXCR4 cKO plasma cells in BM is not shown, but it could further support the idea that reduced mobility or increased clustering is required for longevity.

      (2) The combination of the several surface markers shown in Figure 3&4 doesn't seem to be practically applicable to identify or gate on LLPCs, because differential expression of CD81, CXCR4, CD326, CD44, or CD48 on LLPCs vs bulk PCs was very modest. EpCAMhi/CXCR3-, Ly6Ahi/Tigit- (Liu et al. 2022), B220lo/MHC-IIlo (Koike et al. 2023), or SLAMF6lo/MHC-IIlo (Robinson et al. 2023) has been reported as markers for LLPC population. It is unclear that the combination of surface markers presented here is superior to published markers. In addition, it is unclear why the authors did not use their own gene expression data (Fig.6), instead of using public datasets, for picking up candidate markers.

      In terms of the utility of these markers, we agree they are not sufficient to distinguish bona fide LLPCs, but they did enrich for LLPCs by 6-fold (Figure 3). In the other studies cited, LLPCs are enriched in those gates but not exclusively found in them, suggesting some plasticity. In terms of how they were chosen, we conducted the flow surface studies in parallel with, and before completing, the gene expression studies; thus, the gene expression data were not available in time to be useful for the longitudinal studies. As these were not the major findings of the paper, we have reduced emphasis on this section and moved some of the data to Figure S2.

      Reviewer #2 (Public Review):

      In this study by Jing, Fooksman, and colleagues, a Blimp1-CreERT2-based genetic tracing study is employed to label plasma cells. Over the course of several months post-tamoxifen treatment, the only remaining labeled cells are long-lived plasma cells. This system provides a way to sort live long-lived plasma cells and compare them to unlabeled plasma cells, which contain a range of short-to-long-lived cells. From this analysis, several observations are made: 1) the turnover rate of plasma cells is greater in the spleen than in the bone marrow; 2) the turnover rate is highest early in life; 3) subtle transcriptional and cell surface marker differences distinguish long- from shorter-lived plasma cells; 4) long-lived plasma cells in the bone marrow are sessile and localize in clusters with each other; 5) CXCR4 is required for plasma cell retention in these clusters and in the bone marrow; 6) Repertoire analysis hints that the selection of long-lived plasma cells is not random for any cell that lands in the bone marrow.

      Strengths:

      (1) The genetic timestamping approach is a clever and functional way to separate plasma cells of differing longevities.

      (2) This approach led to the identification of several markers that could help prospective separation of long-lived plasma cells from others.

      (3) Functional labeling of long-lived plasma cells allowed for a higher resolution analysis of transcriptomes and motility than was previously possible.

      (4) The genetic system allowed for a revisitation of the importance of CXCR4 in plasma cell retention and survival.

      Weaknesses:

      (1) Most of the labeling studies, likely for practical reasons, were done on polyclonal rather than antigen-specific plasma cells. The triggers of these responses could vary based on age at the time of exposure, anatomical sites, etc. How these differences might influence markers and transcriptomes, independently of longevity, is not completely known.

      (2) The fraction of long-lived plasma cells in the unlabeled fraction varies with age, potentially diluting differences between long- and short-lived plasma cells.

      (3) The authors suggest their data favors a model by which plasma cells compete for niche space. Yet there is no evidence presented here that these niches are limiting.

      In Figure 2, we provide important evidence that LLPCs are enriched in PC clusters and are less motile, suggesting they occupy a unique niche compared to bulk PCs in the bone marrow. However, we agree that this does not clarify whether that niche is limiting.

      (4) The functional importance of the observed transcriptome differences between long- and shorter-lived plasma cells is unknown. An assessment as to whether these differences are conserved in human long- and short-lived bone marrow plasma cells might provide circumstantial supporting evidence that these changes are important for longevity.

      Reviewer #3 (Public Review):

      This valuable work shows some unique characteristics of long-lived PCs in comparison with bulk PCs. In particular, the authors clearly demonstrated the dependency of PC longevity on CXCR4 and provided a substantial resource of PC transcriptomes. Since CD93 is known as a marker for long-lived PCs, the authors could provide more data related to CD93.

      Summary:

      Long-lived PCs are maintained with low motility and in a CXCR4-dependent manner. 

      Strengths:

      The reporter mice for fate-mapping can clearly distinguish long-lived PCs from total PCs and greatly contribute to the identification of long-lived PCs.

      Weaknesses:

      The authors are unable to find a unique marker for long-lived PCs.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Given the authors' expertise, I suggest investigating the motility of CXCR4 cKO plasma cells in BM.

      Thank you for the suggestion. This work would certainly fit in with the theme of the paper. We tried to measure this using the BEC Rosa-LSL-YFP Cxcr4f/f system after tamoxifen treatment, but unfortunately these PCs leave the BM at the same time as they turn on YFP expression from the Rosa26 locus, making it impossible to capture the change in motility. This is also evident in our data in updated Figure 5, which shows that intratibial injection of 4-OH-tamoxifen causes rapid mobilization of CXCR4KO PCs from the tibia within 1 day. We tried to breed other models that would allow us to visualize these early events, but these attempts were unsuccessful and were also responsible for the long delay in resubmission.

      (2) Expression of CD81, CXCR4, CD326, CD44, or CD48 was not different enough to distinguish LLPCs from bulk PCs (Figure 3B). The caveat is that bulk PCs also contained a significant frequency of LLPCs, which would make the difference in expression levels smaller. I suggest looking at the expression of these molecules on newly generated PCs, soon after protein immunization, for example.

      When newly generated PCs begin to express the LLPC phenotype is a separate question, and one definitely worth addressing in future studies.

      Reviewer #2 (Recommendations For The Authors):

      (1) Related to the above public comment #4, I would recommend looking at Halliley et al., Immunity, 2015 to see if some of the same LLPC transcriptional and marker differences can be observed between CD19+ and CD19- plasma cells in the human marrow.

      Thank you for the suggestion to do a human correlation.  It is unclear what conclusions we can draw from overlapping or non-overlapping patterns, on their own.

      (2) For CD93, since it is bimodal, it may be better to express this as % positive rather than fold changes in MFI as in Figure 3.

      We have updated Figure 3C to include % positive as suggested. Fold changes were moved to Figure S2.

      Reviewer #3 (Recommendations For The Authors):

      This valuable work shows some unique characteristics of long-lived PCs in comparison with bulk PCs. In particular, the authors clearly demonstrate the dependence of PC longevity on CXCR4 and provide a substantial resource of PC transcriptomes. Although CD93 is known as a marker for long-lived PCs, the authors could provide more data related to CD93.

      Major points:

      The authors show data that some bulk PCs express lower levels of CD93. Are CD93low bulk PCs more motile in the BM than CD93high bulk PCs? Are CD93low PCs highly mutated in the Ig genes? Do CD93high bulk PCs have a transcriptome similar to that of long-lived PCs for some representative genes?

      Although we do not have data here, the difference between CD93high cells and CD93low cells is likely to be small, since labeled PCs were observed to express higher CD93 surface levels as early as day 5 in BM and SP, as shown in updated Figure 3C. Thus, while CD93 is strongly enriched in LLPCs, it cannot be used as a single marker to sufficiently isolate LLPCs, which would make it very difficult to detect changes in motility, Ig gene mutation, and gene expression.

      Minor points:

      (1) In the title, the authors describe that surface receptor expression support PC-intrinsic longevity. The surface receptor is only CXCR4. The ambiguous description confuses the readers. 

      While CXCR4 was shown functionally to be involved, we found multiple surface receptors are differentially expressed in LLPCs.

      (2) The abbreviations of 'bone marrow' and 'BM' should be unified.

      (3) In Fig. 7C, the bars for comparison are unclear. What dots are compared? 

      Bars compare day 90 middle-aged mice to day 5 controls, as there were only n=2 for some day 90 young mouse samples for all internally paired comparisons.

      (4) The explanation of Fig. 7I cannot be understood. How are the conclusions drawn from the panel?

      Fig. 7I shows that, for the most common public clones (those found in the most samples or mice) across all 42 LLPC and bulk PC samples, most of the hits came from LLPC samples (colored bars) whereas few came from bulk PC samples (white bars), suggesting that the shared repertoire is uniquely LLPC-like. These are observations only; no statistical analysis was conducted here.

    1. eLife assessment

      This study presents an important finding on the molecular mechanism for transduction of environmentally induced polyphenism. The evidence supporting the claims of the authors is incomplete due to limited sample sizes and inadequate analysis. This paper would be of interest to those studying aphid wing dimorphism.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, a high-quality chromosome-level genome of the rose-grain aphid M. dirhodum was assembled, and A-to-I RNA-editing sites were systematically identified. The authors then demonstrated that: 1) wing dimorphism induced by crowding in M. dirhodum is regulated by 20E (ecdysone signaling pathway); 2) an A-to-I RNA editing event prevents the binding of miR-3036-5p to CYP18A1 (the enzyme required for 20E degradation), thus elevating CYP18A1 expression, decreasing 20E titer, and finally regulating the wing dimorphism of offspring.

      Strengths:

      The authors present both genome and A-to-I RNA editing data. An interesting finding is that an A-to-I RNA editing site in CYP18A1 disrupts the binding site of miR-3036-5p, and that loss of miR-3036-5p regulation leads to less 20E and to winged offspring.

      Weaknesses:

      How crowding represses miR-3036-5p is still unclear.

    3. Reviewer #2 (Public Review):

      Summary:

      Environmental influences on development are ubiquitous, affecting many phenotypes in organisms. However, the molecular genetic and cellular mechanisms that transduce environmental signals are still only barely understood. This study examines part of one such intracellular mechanism in a polyphenic (or dimorphic) aphid.

      Strengths:

      While other published reports have linked phenotypic plasticity to RNA editing before, this study reports such an interaction in insects. The study uses a wide array of molecular tools to identify connections upstream and downstream of the RNA editing to elucidate the regulatory mechanism, which is illuminating.

      Weaknesses:

      While this system is intriguing, this report does not foster confidence in its conclusions. Many of the analyses seem to be based on very small sample sizes. It is itself problematic that sample sizes are not obvious in most figures, although based on the Methods section covering RNAseq, they seem to be either 3, 6 or 9, depending on whether stages were pooled, but that point is not made clear. With such small sample sizes, statistical tests of any kind are unreliable. Besides the ambiguity about sample sizes, it is unclear what error bars or whiskers show in plots throughout this study. When sample sizes are small, estimates of variance are not reliable. Student's t-test is not appropriate for comparisons with such small sample sizes. Presently, it is not possible to replicate the tests shown in Figures 3, 4 and 6. (Besides the HT-seq reads, other data should also be made publicly available, following the journal's recommendations.) Regardless, effect sizes in some comparisons (Fig 3J, 4A-C, 6E,H) are clearly not large, making confidence in conclusions low. The authors should be cautious about over-interpreting these data.

    1. eLife assessment

      This important study presents a new quantitative imaging pipeline that describes with high temporal precision and throughput the movements of late-stage Drosophila embryos, a critical moment when motion first appears. A new approach is used to explore the role of miRNAs in motion onset and presents solid evidence that shows a role for miR-2b-1 and its target Motor in embryonic motion. The data are well supported even if the mechanistic insight into the emergence of movement remains to be explored.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study makes a valuable empirical contribution to our understanding of visual processing in primates and deep neural networks, with a specific focus on the concept of factorization. The analyses provide solid evidence that high factorization scores are correlated with neural predictivity, yet more evidence would be needed to show that neural responses show factorization. Consequently, while several aspects require further clarification, in its current form this work is interesting to systems neuroscientists studying vision and could inspire further research that ultimately may lead to better models of or a better understanding of the brain.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper investigates visual processing in primates and deep neural networks (DNNs), focusing on factorization in the encoding of scene parameters. It challenges the conventional view that object classification is the primary function of the ventral visual stream, suggesting instead that the visual system employs a nuanced strategy involving both factorization and invariance. The study also presents empirical findings suggesting a correlation between high factorization scores and good neural predictivity.

      Strengths:

      (1) Novel Perspective: The paper introduces a fresh viewpoint on visual processing by emphasizing the factorization of non-class information.

      (2) Methodology: The use of diverse datasets from primates and humans, alongside various computational models, strengthens the validity of the findings.

      (3) Detailed Analysis: The paper suggests metrics for factorization and invariance, contributing to future understanding and measurement of these concepts.

      Weaknesses:

      (1) Vagueness (Perceptual or Neural Invariance?): The paper uses the term 'invariance', typically referring to perceptual stability despite stimulus variability [1], as the complete discarding of nuisance information in neural activity. This oversimplification overlooks the nuanced distinction between perceptual invariance (e.g., invariant object recognition) and neural invariance (e.g., no change in neural activity). It seems that by 'invariance' the authors mean 'neural' invariance (rather than 'perceptual' invariance) in this paper, which is vague. The paper could benefit from changing what is called 'invariance' in the paper to 'neural invariance' and distinguish it from 'perceptual invariance,' to avoid potential confusion for future readers. The assignment of 'compact' representation to 'invariance' in Figure 1A is misleading (although it can be addressed by the clarification on the term invariance). [1] DiCarlo JJ, Cox DD. Untangling invariant object recognition. Trends in cognitive sciences. 2007 Aug 1;11(8):333-41.

      Thanks for pointing out this ambiguity. In our Introduction we now explicitly clarify that we use “invariance” to refer to neural, rather than perceptual invariance, and we point out that both factorization and (neural) invariance may be useful for obtaining behavioral/perceptual invariance.

      (2) Details on Metrics: The paper's explanation of factorization as encoding variance independently or uncorrelatedly needs more justification and elaboration. The definition of 'factorization' in Figure 1B seems to be potentially misleading, as the metric for factorization in the paper seems to be defined regardless of class information (can be defined within a single class). Does the factorization metric as defined in the paper (orthogonality of different sources of variation) warrant that responses for different object classes are aligned/parallel like in 1B (middle)? More clarification around this point could make the paper much richer and more interesting.

      Our factorization metric measures the degree to which two sets of scene variables are factorized from one another. In the example of Fig. 1B, we apply this definition to the case of factorization of class vs. non-class information. Elsewhere in the paper we measure factorization of several other quantities unrelated to class, specifically camera viewpoint, lighting conditions, background content, and object pose. In our revised manuscript we have clarified the exposition surrounding Fig. 1B to make it clear that factorization, as we define it, can be applied to other quantities as well and that responses do not need to be aligned/parallel but simply live in a different set of dimensions whether linearly or nonlinearly arranged. Thanks for raising the need to clarify this point.

      (3) Factorization vs. Invariance: Is it fair to present invariance vs. factorization as mutually exclusive options in representational hypothesis space? Perhaps a more fair comparison would be factorization vs. object recognition, as it is possible to have different levels of neural variability (or neural invariance) underlying both factorization and object recognition tasks.

      We do not mean to imply that factorization and invariance are mutually exclusive, or that they fully characterize the space of possible representations. However, they are qualitatively distinct strategies for achieving behavioral capabilities like object recognition. In the revised manuscript we also include a comparison to object classification performance (Figures 5C & S4, black x’s) as a predictor of brain-like representations, alongside the results for factorization and invariance.

      In our revised Introduction and beginning of the Results section, we make it more clear that factorization and invariance are not mutually exclusive – indeed, our results show that both factorization and invariance for some scene variables like lighting and background identity are signatures of brain-like representations. Our study focuses on factorization because we believe its importance has not been studied or highlighted to the degree that invariance to “nuisance” parameters has, in concert with selectivity to object identity in individual neuron tuning functions. Moreover, the loss functions used for supervised training of neural networks for image classification would seem to encourage invariance as a representational strategy. Thus, the finding that factorization of scene parameters is an equally good if not better predictor of brain-like representations may motivate new objective functions for neural network training.

      (4) Potential Confounding Factors in Empirical Findings: The correlation observed in Figure 3 between factorization and neural predictivity might be influenced by data dimensionality, rather than factorization per se [2]. Incorporating discussions around this recent finding could strengthen the paper.

      [2] Elmoznino E, Bonner MF. High-performing neural network models of the visual cortex benefit from high latent dimensionality. bioRxiv. 2022 Jul 13:2022-07.

      We thank the Reviewer for pointing out this important potential confound and the need for a direct quantification. We have now included an analysis computing how well dimensionality (measured using the participation ratio metric for natural images, as was done in [2] Elmoznino & Bonner, bioRxiv, 2022) can account for model goodness-of-fit (additional pink bars in Figure 6). Factorization of scene parameters appears to add more predictive power than dimensionality on average (Figure 6, light shaded bars), and critically, factorization+classification jointly predict goodness-of-fit significantly better than dimensionality+classification for V4 and IT/HVC brain areas (Figure 6, dark shaded bars). Indeed, dimensionality+classification is only slightly more predictive than classification alone for V4 and IT/HVC, indicating some redundancy in those measures with respect to neural predictivity of models (Figure 6, compare dark shaded pink bar to dashed line).

      That said, high-dimensional representations can, in principle, better support factorization, and thus we do not regard these two representational strategies necessarily in competition. Rather, our results suggest (consistent with [2]) that dimensionality is predictive of brain-like representation to some degree, such that some (but not all) of factorization’s predictive power may indeed owe to a partial correlation with dimensionality. We elaborate in the Discussion where this point comes up and now refer to the updated Figure 6 that shows the control for dimensionality.
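      For readers unfamiliar with the participation ratio mentioned in this response, the following is a minimal sketch of how it is commonly computed from a stimulus-by-unit response matrix, namely PR = (sum of covariance eigenvalues)^2 / (sum of squared eigenvalues). The function name and the input layout are assumptions made for illustration; this is not the authors' analysis code.

```python
import numpy as np


def participation_ratio(responses):
    """Participation ratio of a (n_stimuli, n_units) response matrix.

    Hypothetical helper for illustration only.
    PR = (sum_i lambda_i)^2 / sum_i lambda_i^2, where lambda_i are the
    eigenvalues of the unit-by-unit covariance matrix.
    """
    responses = np.asarray(responses, dtype=float)
    centered = responses - responses.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (responses.shape[0] - 1)
    # trace(C) equals the sum of eigenvalues, and the squared Frobenius
    # norm of the symmetric matrix C equals the sum of squared eigenvalues,
    # so no explicit eigendecomposition is needed.
    return np.trace(cov) ** 2 / np.sum(cov ** 2)
```

      Under this definition, responses confined to a single axis give a value of 1, and responses with equal variance along d orthogonal axes give a value of d.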

      Conclusion:

      The paper offers insightful empirical research with useful implications for understanding visual processing in primates and DNNs. The paper would benefit from a more nuanced discussion of perceptual and neural invariance, as well as a deeper discussion of the coexistence of factorization, recognition, and invariance in neural representation geometry. Additionally, addressing the potential confounding factors in the empirical findings on the correlation between factorization and neural predictivity would strengthen the paper's conclusions.

      Taken together, we hope that the changes described above address the distinction between neural and perceptual invariance, provide a more balanced understanding of the contributions of factorization, invariance, and local representational geometry, and argue against dimensionality for natural images as the main explanation for the benefits of factorizing scene parameters.

      Reviewer #2 (Public Review):

      Summary:

      The dominant paradigm in the past decade for modeling the ventral visual stream's response to images has been to train deep neural networks on object classification tasks and regress neural responses from units of these networks. While object classification performance is correlated to the variance explained in the neural data, this approach has recently hit a plateau of variance explained, beyond which increases in classification performance do not yield improvements in neural predictivity. This suggests that classification performance may not be a sufficient objective for building better models of the ventral stream. Lindsey & Issa study the role of factorization in predicting neural responses to images, where factorization is the degree to which variables such as object pose and lighting are represented independently in orthogonal subspaces. They propose factorization as a candidate objective for breaking through the plateau suffered by models trained only on object classification.

      They claim that (i) maintaining these non-class variables in a factorized manner yields better neural predictivity than ignoring non-class information entirely, and (ii) factorization may be a representational strategy used by the brain.

      The first of these claims is supported by their data. The second claim does not seem well-supported, and the usefulness of their observations is not entirely clear.

      Strengths:

      This paper challenges the dominant approach to modeling neural responses in the ventral stream, which itself is valuable for diversifying the space of ideas.

      This paper uses a wide variety of datasets, spanning multiple brain areas and species. The results are consistent across the datasets, which is a great sign of robustness.

      The paper uses a large set of models from many prior works. This is impressively thorough and rigorous.

      The authors are very transparent, particularly in the supplementary material, showing results on all datasets. This is excellent practice.

      Weaknesses:

      (1) The primary weakness of this paper is a lack of clarity about what exactly the contribution is. I see two main interpretations: (1-A) as introducing a heuristic for predicting neural responses that improves over classification accuracy, and (1-B) as a model of the brain's representational strategy. These two interpretations are distinct goals, each of which is valuable. However, I don't think the paper in its current form supports either of them very well:

      (1-A) Heuristic for neural predictivity. The claim here is that by optimizing for factorization, we could improve models' neural predictivity to break through the current predictivity plateau. To frame the paper in this way, the key contribution should be a new heuristic that correlates with neural predictivity better than classification accuracy. The paper currently does not do this. The main piece of evidence that factorization may yield a more useful heuristic than classification accuracy alone comes from Figure 5. However, in Figure 5 it seems that factorization along some factors is more useful than others, and different linear combinations of factorization and classification may be best for different data. There is no single heuristic presented and defended. If the authors want to frame this paper as a new heuristic for neural predictivity, I recommend the authors present and defend a specific heuristic that others can use, e.g. [K * factorization_of_pose + classification] for some constant K, and show that (i) this correlates with neural predictivity better than classification alone, and (ii) this can be used to build models with higher neural predictivity. For (ii), they could fine-tune a state-of-the-art model to improve this heuristic and show that doing so achieves a new state-of-the-art neural predictivity. That would be convincing evidence that their contribution is useful.

      Our paper does not make any strong claim regarding the Reviewer’s point 1-A (on heuristics for neural predictivity). In the Discussion, last paragraph, we better specify that our work is merely suggestive of claim 1-A about heuristics for more neurally predictive, more brainlike models. We believe that our paper supports the Reviewer’s point 1-B (on brain representation) as we discuss below.

      We leave it to future work to determine if factorization could help optimize models to be more brainlike. This treatment may require exploration of novel model architectures and loss functions, and potentially also more thorough neural datasets that systematically vary many different forms of visual information for validating any new models.

      (1-B) Model of representation in the brain. The claim here is that factorization is a general principle of representation in the brain. However, neural predictivity is not a suitable metric for this, because (i) neural predictivity allows arbitrary linear decoders, hence is invariant to the orthogonality requirement of factorization, and (ii) neural predictivity does not match the network representation to the brain representation. A better metric is representational dissimilarity matrices. However, the RDM results in Figure S4 actually seem to show that factorization does not do a very good job of predicting neural similarity (though the comparison to classification accuracy is not shown), which suggests that factorization may not be a general principle of the brain. If the authors want to frame the paper in terms of discovering a general principle of the brain, I suggest they use a metric (or suite of metrics) of brain similarity that is sensitive to the desiderata of factorization, e.g. doesn't apply arbitrary linear transformations, and compare to classification accuracy in addition to invariance.

      We agree with the Reviewer about the shortcomings of neural predictivity for comparing representational geometries, and in our revised manuscript we have provided a more comprehensive set of results that includes RDM predictivity in new Figures 6 & 7, alongside the results for neural fit predictivity. In addition, as suggested, we added classification accuracy predictivity in Figures 5C & S4 (black x’s) for visual comparison to factorization/invariance. In Figure S4 on RDMs, it is apparent that factorization is at least as good a predictor as classification on all V4 & IT datasets from both monkeys and humans (compare x’s to filled circles in Figure S4; note that some of the points from the original Figure S4 changed, as we discovered a bug in the code that specifically affected the RDM analysis for a few of the datasets).

      We find that the newly included RDM analyses in Figures 6 & 7 are consistent with the conclusions of the neural fit regression analyses: the correlations of factorization metrics with RDM matches are strong, comparable in magnitude to that of classification accuracy (Figure 6, 3rd & 4th columns, compare black dashed line to faded colored bars), and are not fully accounted for by the model’s classification accuracy alone (Figure 6, 3rd & 4th columns, higher unfaded bars for classification combined with factorization, and see corresponding example scatters in Figure 7 middle/bottom rows).

      It is encouraging that the added benefit of factorization for RDM predictivity, after accounting for classification performance, is at least as good as the improvement seen for neural fit predictivity (Figure 6, 1st & 2nd columns for encoding fits versus 3rd & 4th columns for RDM correlations).
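      To make the RDM comparison discussed above concrete, here is a small illustrative sketch. It assumes correlation distance between stimulus response patterns and a Pearson correlation between the upper triangles of two RDMs; the response does not state which distance or comparison statistic was used, so both choices, and the function names, are assumptions rather than a description of the authors' pipeline.

```python
import numpy as np


def rdm(responses):
    """Correlation-distance RDM for a (n_stimuli, n_units) matrix.

    Entry (i, j) is 1 minus the Pearson correlation between the response
    patterns evoked by stimuli i and j (assumed distance measure).
    """
    return 1.0 - np.corrcoef(np.asarray(responses, dtype=float))


def rdm_similarity(rdm_a, rdm_b):
    """Pearson correlation between the off-diagonal upper triangles."""
    upper = np.triu_indices_from(rdm_a, k=1)
    return np.corrcoef(rdm_a[upper], rdm_b[upper])[0, 1]
```

      Unlike an encoding-model fit, this comparison involves no fitted linear readout from model units to neurons: the two representations are compared only through their pairwise stimulus dissimilarities.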

      (2) I think the comparison to invariance, which is pervasive throughout the paper, is not very informative. First, it is not surprising that invariance is more weakly correlated with neural predictivity than factorization, because invariant representations lose information compared to factorized representations. Second, there has long been extensive evidence that responses throughout the ventral stream are not invariant to the factors the authors consider, so we already knew that invariance is not a good characterization of ventral stream data.

      While we appreciate the Reviewer’s intuition that highly invariant representations are not strongly supported in the high-level visual cortex, we nevertheless thought it was valuable to put this intuition to a quantitative, detailed test. As a result, we uncovered effects that were not obvious a priori, at least to us – for example, that invariance for some scene parameters (camera view, object pose) is negatively correlated with neural predictions while invariance to others (background, lighting) is positively correlated. Thus, our work exercises the details of invariance for different types of information.

      (3) The formalization of the factorization metric is not particularly elegant, because it relies on computing top K principal components for the other-parameter space, where K is arbitrarily chosen as 10. While the authors do show that in their datasets the results are not very sensitive to K (Figure S5), that is not guaranteed to be the case in general. I suggest the authors try to come up with a formalization that doesn't have arbitrary constants. For example, one possibility that comes to mind is E[delta_a x delta_b], where 'x' is the normalized cross product, delta_a, and delta_b are deltas in representation space induced by perturbations of factors a and b, and the expectation is taken over all base points and deltas. This is just the first thing that comes to mind, and I'm sure the authors can come up with something better. The literature on disentangling metrics in machine learning may be useful for ideas on measuring factorization.

      Thanks to the Reviewer for raising this point. First, we wish to clarify a potential misunderstanding of the factorization metric: the number K of principal components we choose is not an arbitrary constant, but rather calibrated to capture a certain fraction of variance, set to 90% by default in our analyses. While this variance threshold is indeed an arbitrary hyperparameter, it has a more intuitive interpretation than the number of principal components.
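      The PCA-based metric described in this exchange can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the score is one minus the fraction of parameter-of-interest variance that falls inside the subspace spanned by the leading principal components of the other-parameter responses, with the number of components set by a variance threshold (0.9 by default, matching the 90% figure above). The function and argument names are hypothetical.

```python
import numpy as np


def pca_factorization(resp_param, resp_other, var_threshold=0.9):
    """Sketch of a PCA-based factorization score (assumed form).

    resp_param: (n, n_units) responses as only the parameter of interest varies.
    resp_other: (m, n_units) responses as the other parameters vary.
    """
    a = np.asarray(resp_param, dtype=float)
    b = np.asarray(resp_other, dtype=float)
    a = a - a.mean(axis=0, keepdims=True)
    b = b - b.mean(axis=0, keepdims=True)

    # Principal axes of the other-parameter variance (rows of vt).
    _, s, vt = np.linalg.svd(b, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest number of components whose cumulative variance reaches
    # the threshold.
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
    basis = vt[:k]

    var_total = np.sum(a ** 2)
    var_in_subspace = np.sum((a @ basis.T) ** 2)
    return 1.0 - var_in_subspace / var_total
```

      With this form, a score near 1 means the parameter of interest drives responses along directions largely orthogonal to those driven by the other parameters, and a score near 0 means the two share the same subspace.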

      Nonetheless, the Reviewer’s comment did inspire us to consider another metric for factorization that does not depend on any arbitrary parameters. In the revised version, we now include a covariance matrix based metric which simply measures the elementwise correlation of the covariance matrices induced by varying the scene parameter of interest and the covariance matrix induced by varying the other parameters (and then subtracts this quantity from 1).
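      A rough sketch of the covariance-based metric described in the preceding paragraph is given below. It interprets "elementwise correlation" as the Pearson correlation between the flattened entries of the two covariance matrices; that interpretation, along with the function and argument names, is an assumption, and the formula in the revised Methods should be treated as authoritative.

```python
import numpy as np


def covariance_factorization(resp_param, resp_other):
    """Sketch of a covariance-based factorization score (assumed form).

    resp_param: (n, n_units) responses as only the parameter of interest varies.
    resp_other: (m, n_units) responses as the other parameters vary.
    """
    cov_param = np.cov(np.asarray(resp_param, dtype=float), rowvar=False)
    cov_other = np.cov(np.asarray(resp_other, dtype=float), rowvar=False)
    # Correlate the two covariance matrices entry by entry, then
    # subtract from 1 so that dissimilar covariance structure scores high.
    r = np.corrcoef(cov_param.ravel(), cov_other.ravel())[0, 1]
    return 1.0 - r
```

      Because no subspace dimensionality or variance threshold has to be chosen, this variant has no free hyperparameter, which is the property the response emphasizes.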

      Correspondingly, we now present results for both the new covariance based measure and the original PCA based one in Figures 5C, 6, and 7. The main findings remain largely the same when using the covariance based metric (Figure 5C, compare light shaded to dark shaded filled circles; Figure 6, compare top row to bottom row; Figure 7, compare middle rows to bottom rows).

      Ultimately, we believe these two metrics are complementary and somewhat analogous to two metrics commonly used for measuring dimensionality (the number of components needed to explain a certain fraction of the variance, analogous to our original PCA based definition; the participation ratio, analogous to our covariance based definition). We have added the formula for the covariance based factorization metric along with a brief description to the Methods.

      (4) The authors defined the term "factorization" according to their metric. I think introducing this new term is not necessary and can be confusing because the term "factorization" is vague and used by different researchers in different ways. Perhaps a better term is "orthogonality", because that is clear and seems to be what the authors' metric is measuring.

      We agree with the Reviewer that factorization has become an overloaded term. At the same time, we think that in this context, the connotation of the term factorization effectively conveys the notion of separating out different latent sources of variance (factors) such that they can be encoded in orthogonal subspaces.

      To aid clarity, we now mention in the Introduction that factorization defined here is meant to measure orthogonalization of scene factors. Additionally, in the Discussion section, we now go into more detail comparing our metric to others previously used in the literature, including orthogonality, to help put it in context.

      (5) One general weakness of the factorization paradigm is the reliance on a choice of factors. This is a subjective choice and becomes an issue as you scale to more complex images where the choice of factors is not obvious. While this choice of factors cannot be avoided, I suggest the authors add two things: First, an analysis of how sensitive the results are to the choice of factors (e.g. transform the basis set of factors and re-run the metric); second, include some discussion about how factors may be chosen in general (e.g. based on temporal statistics of the world, independent components analysis, or something else).

      The Reviewer raises a very reasonable point about the limitation of this work. While we limited our analysis to generative scene factors that we know about and that could be manipulated, there are many potential factors to consider. It is not clear to us exactly how to implement the Reviewer’s suggestion of transforming the basis set of factors, as the factors we consider are highly nonlinear in the input space. Ultimately, we believe that finding unsupervised methods to characterize the “true” set of factors that is most useful for understanding visual representations is an important subject for future work, but outside the scope of this particular study. We have added a comment to this effect in the Discussion.

      Reviewer #3 (Public Review):

      Summary:

      Object classification serves as a vital normative principle in both the study of the primate ventral visual stream and deep learning. Different models exhibit varying classification performances and organize information differently. Consequently, a thriving research area in computational neuroscience involves identifying meaningful properties of neural representations that act as bridges connecting performance and neural implementation. In the work of Lindsey and Issa, the concept of factorization is explored, which has strong connections with emerging concepts like disentanglement [1,2,3] and abstraction [4,5]. Their primary contributions encompass two facets: (1) The proposition of a straightforward method for quantifying the degree of factorization in visual representations. (2) A comprehensive examination of this quantification through correlation analysis across deep learning models.

      To elaborate, their methodology, inspired by prior studies [6], employs visual inputs featuring a foreground object superimposed onto natural backgrounds. Four types of scene variables, such as object pose, are manipulated to induce variations. To assess the level of factorization within a model, they systematically alter one of the scene variables of interest and estimate the proportion of encoding variances attributable to the parameter under consideration.
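
      As a rough illustration of the variance-based measure described above, the following sketch scores how much of the variance induced by one scene parameter falls outside the principal subspace of the variance induced by the other parameters. This is a minimal sketch under stated assumptions, not necessarily the exact metric used in the manuscript: the function name `factorization_score`, the `var_threshold` cutoff, and the use of an SVD to define the subspace are all illustrative choices.

```python
import numpy as np


def factorization_score(resp_param, resp_other, var_threshold=0.9):
    """Fraction of parameter-induced variance outside the 'other' subspace.

    resp_param: (n_samples, n_units) responses when only the parameter of
        interest varies (hypothetical input layout).
    resp_other: (n_samples, n_units) responses when only the remaining
        parameters vary.
    var_threshold: assumed cutoff on cumulative variance used to pick the
        dimensionality of the 'other' subspace.
    """
    # Center each set so that we measure variance around the mean response.
    xp = resp_param - resp_param.mean(axis=0)
    xo = resp_other - resp_other.mean(axis=0)

    # Principal axes of the variance induced by the other parameters.
    _, s, vt = np.linalg.svd(xo, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
    basis = vt[:k]  # (k, n_units), orthonormal rows

    # Share of the parameter-induced variance that leaks into that subspace.
    total_var = np.sum(xp**2)
    var_in_other = np.sum((xp @ basis.T) ** 2)
    return 1.0 - var_in_other / total_var
```

      Under these assumptions, a score near 1 means the parameter of interest is encoded almost entirely outside the subspace used by the other parameters, and a score near 0 means the two share the same axes.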

      The central assertion of this research is that factorization represents a normative principle governing biological visual representation. The authors substantiate this claim by demonstrating an increase in factorization from macaque V4 to IT, supported by evidence from correlation analyses revealing a positive correlation between factorization and decoding performance. Furthermore, they advocate for the inclusion of factorization as part of the objective function for training artificial neural networks. To validate this proposal, the authors systematically conduct correlation analyses across a wide spectrum of deep neural networks and datasets sourced from human and monkey subjects. Specifically, their findings indicate that the degree of factorization in a deep model positively correlates with its predictability concerning neural data (i.e., goodness of fit).

      Strengths:

      The primary strength of this paper is the authors' efforts in systematically conducting analysis across different organisms and recording methods. Also, the definition of factorization is simple and intuitive to understand.

      Weaknesses:

      This work exhibits two primary weaknesses that warrant attention: (i) the definition of factorization and its comparison to previous, relevant definitions, and (ii) the chosen analysis method.

      Firstly, the definition of factorization presented in this paper is founded upon the variances of representations under different stimuli variations. However, this definition can be seen as a structural assumption rather than capturing the effective geometric properties pertinent to computation. More precisely, the definition here is primarily statistical in nature, whereas previous methodologies incorporate computational aspects such as deviation from ideal regressors [1], symmetry transformations [3], generalization [5], among others. It would greatly enhance the paper's depth and clarity if the authors devoted a section to comparing their approach with previous methodologies [1,2,3,4,5], elucidating any novel insights and advantages stemming from this new definition.

      [1] Eastwood, Cian, and Christopher KI Williams. "A framework for the quantitative evaluation of disentangled representations." International conference on learning representations. 2018.

      [2] Kim, Hyunjik, and Andriy Mnih. "Disentangling by factorising." International Conference on Machine Learning. PMLR, 2018.

      [3] Higgins, Irina, et al. "Towards a definition of disentangled representations." arXiv preprint arXiv:1812.02230 (2018).

      [4] Bernardi, Silvia, et al. "The geometry of abstraction in the hippocampus and prefrontal cortex." Cell 183.4 (2020): 954-967.

      [5] Johnston, W. Jeffrey, and Stefano Fusi. "Abstract representations emerge naturally in neural networks trained to perform multiple tasks." Nature Communications 14.1 (2023): 1040.

      Thanks to the Reviewer for this suggestion. We agree that our initial submission did not sufficiently contextualize our definition of factorization with respect to other related notions in the literature. We have added additional discussion of these points to the Discussion section in the revised manuscript and have included therein the citations provided by the Reviewer (please see the third paragraph of Discussion).

      Secondly, in order to establish a meaningful connection between factorization and computation, the authors rely on a straightforward synthetic model (Figure 1c) and employ multiple correlation analyses to investigate relationships between the degree of factorization, decoding performance, and goodness of fit. Nevertheless, the results derived from the synthetic model are limited to the low training-sample regime. It remains unclear whether the biological datasets under consideration fall within this low training-sample regime or not.

      We agree that our model in Figure 1C is very simple and does not fully capture the complex interactions between task performance and features of representational geometry, like factorization. We intend it only as a proof of concept to illustrate how factorized representations can be beneficial for some downstream task use cases. While the benefits of factorized representations disappear for large numbers of samples in this simulation, we believe this is primarily a consequence of the simplicity and low dimensionality of the simulation. Real-world visual information is complex and high-dimensional, and as such the relevant sample size regime in which factorization offers task benefits may be much greater. As a first step toward this real-world setting, Figure 2 shows how decreasing the amount of factorization in neural population data in macaque V4/IT can have an effect on object identity decoding.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      Missing citations: The paper could benefit from discussions & references to related papers, such as:

      Higgins I, Chang L, Langston V, Hassabis D, Summerfield C, Tsao D, Botvinick M. Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons. Nature communications. 2021 Nov 9;12(1):6456.

      We have added additional discussion of related work, including the suggested reference and others on disentanglement, to the Discussion section in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Here are several small recommendations for the authors, all much more minor than those in the public review:

      I suggest more use of equations in methods sections about Figure 1C and macaque neural data analysis.

      Thanks for this suggestion. We have added new Equation 1 for the method transforming neural data to reduce factorization of a variable while preserving other firing rate statistics.

      In Figure 1-C, the methods indicate that Gaussian noise was added. This is a very important detail, and complexifies the interpretation of the figure because it adds an assumption about the structure of noise. In other words, if I understand correctly, the correct interpretation of Figure 1C is "assuming i.i.d. noise, decoding accuracy improves with factorization." The i.i.d. noise is a big assumption, and it is debated how well the brain satisfies this assumption. I suggest you either omit noise for this figure or clearly state in the main text (e.g. caption) that the figure must be interpreted under an i.i.d. noise assumption.

      We have added an explicit statement of the i.i.d. noise assumption to the Figure 1C legend.

      For Figure 2B, I suggest labeling the x-axis clearly below the axis on both panels. Currently, it is difficult to read, particularly in print.

      We have made the x-axis labels clearer and included them on both panels.

      Figure 3A is difficult to read because of the very small text. I suggest avoiding such small fonts.

      We agree that Figure 3A is difficult to read. We have broken out Figure 3 into two new Figures 3 & 4 to increase clarity and sizing of text in Figure 3A.

      Reviewer #3 (Recommendations For The Authors):

      To strengthen this work, it is advisable to incorporate more comprehensive comparisons with previous research, particularly within the machine learning (ML) community. For instance, it would be beneficial to explore and reference works focusing on disentanglement [1,2,3]. This would provide valuable context and facilitate a more robust understanding of the contributions and novel insights presented in the current study.

      We have added additional discussion of related work and other notions similar to factorization to the Discussion section in the revised manuscript.

      Additionally, improving the quality of the figures is crucial to enhance the clarity of the findings:

      • Figure 2: The caption of subfigure B could be revised for greater clarity.

      Thank you, we have substantially clarified this figure caption.

      • Figure 3: Consider a more equitable approach for computing the correlation coefficient, such as calculating it separately for different types of models. In the case of supervised models, it appears that the correlation between invariance and goodness of fit may not be negligible across various scene parameters.

      We appreciate the suggestion, but we are not confident in our ability to conclude much from analyses restricted to particular model classes, given the relatively small N and the fact that the different model classes themselves are an important source of variance in our data.

      • Figure 4: To enhance the interpretability of subfigures A and B, it may be beneficial to include p-values (indicating confidence levels).

      As we supply bootstrapped confidence intervals for our results, which provide at least as much information as p-values, and most of the effects of interest are fairly stark when comparing invariance to factorization, p-values were not needed to support our points. We added a sentence to the legend of new Figure 5 (previously Figure 4) indicating that error bars reflect standard deviations over bootstrap resampling of the models.
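
      For readers unfamiliar with this kind of error bar, the following is a minimal sketch of how a standard deviation over bootstrap resampling of models could be computed for a correlation. It is an illustration under stated assumptions rather than the analysis code used in the manuscript: `x` and `y` stand for hypothetical per-model values (for example, a factorization score and a neural fit score), and the helper names are invented here.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def bootstrap_corr_sd(x, y, n_boot=1000, seed=0):
    """Standard deviation of the correlation over resampled sets of models."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        # Resample models with replacement, keeping each (x, y) pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        # Skip degenerate resamples in which one variable is constant.
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue
        stats.append(pearson(xs, ys))
    return statistics.stdev(stats)
```

      Resampling whole models, rather than individual data points within a model, matches the statement that the error bars reflect variability over the set of models.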

      • Figure 5: For subfigure B, it could be advantageous to plot the results solely for factorization, allowing for a clear assessment of whether the high correlation observed in Classification+Factorization arises from the combined effects of both factors or predominantly from factorization alone.

      First, we clarify/note that the scatters solely for factorization that the Reviewer seeks are already presented earlier in the manuscript across all conditions in Figures 4A,B and Figure S2.

      While we could also include these in new Figure 7 (previously Figure 5B) as the Reviewer suggests, we believe it would distract from the message of that figure at the end of the manuscript – which is that factorization is useful as a supplement to classification in predictive matches to neural data. Nonetheless, new Figure 6 (old Figure 5A) provides a summary quantification of the information that the reviewer requests (Fig. 6, faded colored bars reflect the contribution of factorization alone).

    2. eLife assessment

      The study makes a valuable empirical contribution to our understanding of visual processing in primates and deep neural networks, with a specific focus on the concept of factorization. The analyses provide convincing evidence that high factorization scores are correlated with neural predictivity. This work will be of interest to systems neuroscientists studying vision and could inspire further research that ultimately may lead to better models of or a better understanding of the brain.

    3. Reviewer #2 (Public Review):

      Summary:

      The dominant paradigm in the past decade for modeling the ventral visual stream's response to images has been to train deep neural networks on object classification tasks and regress neural responses from units of these networks. While object classification performance is correlated to variance explained in the neural data, this approach has recently hit a plateau of variance explained, beyond which increases in classification performance do not yield improvements in neural predictivity. This suggests that classification performance may not be a sufficient objective for building better models of the ventral stream. Lindsey & Issa study the role of factorization in predicting neural responses to images, where factorization is the degree to which variables such as object pose and lighting are represented independently in orthogonal subspaces. They propose factorization as a candidate objective for breaking through the plateau suffered by models trained only on object classification. They show the degree of factorization in a model captures aspects of neural variance that classification accuracy alone does not capture, hence factorization may be an objective that could lead to better models of ventral stream. I think the most important figure for a reader to see is Fig. 6.

      Strengths:

      This paper challenges the dominant approach to modeling neural responses in the ventral stream, which itself is valuable for diversifying the space of ideas.

      This paper uses a wide variety of datasets, spanning multiple brain areas and species. The results are consistent across the datasets, which is a great sign of robustness.

      The paper uses a large set of models from many prior works. This is impressively thorough and rigorous.

      The authors are very transparent, particularly in the supplementary material, showing results on all datasets. This is excellent practice.

      Weaknesses:

      The authors have addressed many of the weaknesses in the original review. The weaknesses that remain are limitations of the work that cannot be easily addressed. In addition to the limitations stated at the end of the discussion, I'll add two:

      (1) This work shows that factorization is correlated with neural similarity, and notably explains some variance in neural similarity that classification accuracy does not explain. This suggests that factorization could be used as an objective (along with classification accuracy) to build better models of the brain. However, this paper does not do that - using factorization to build better models of the brain is left to future work.

    4. Reviewer #3 (Public Review):

      Summary:

      Object classification serves as a vital normative principle in both the study of the primate ventral visual stream and deep learning. Different models exhibit varying classification performances and organize information differently. Consequently, a thriving research area in computational neuroscience involves identifying meaningful properties of neural representations that act as bridges connecting performance and neural implementation. In the work of Lindsey and Issa, the concept of factorization is explored, which has strong connections with emerging concepts like disentanglement [1,2,3] and abstraction [4,5]. Their primary contributions encompass two facets: (1) The proposition of a straightforward method for quantifying the degree of factorization in visual representations. (2) A comprehensive examination of this quantification through correlation analysis across deep learning models.

      To elaborate, their methodology, inspired by prior studies [6], employs visual inputs featuring a foreground object superimposed onto natural backgrounds. Four types of scene variables, such as object pose, are manipulated to induce variations. To assess the level of factorization within a model, they systematically alter one of the scene variables of interest and estimate the proportion of encoding variances attributable to the parameter under consideration.

      The central assertion of this research is that factorization represents a normative principle governing biological visual representation. The authors substantiate this claim by demonstrating an increase in factorization from macaque V4 to IT, supported by evidence from correlation analyses revealing a positive correlation between factorization and decoding performance. Furthermore, they advocate for the inclusion of factorization as part of the objective function for training artificial neural networks. To validate this proposal, the authors systematically conduct correlation analyses across a wide spectrum of deep neural networks and datasets sourced from human and monkey subjects. Specifically, their findings indicate that the degree of factorization in a deep model positively correlates with its predictability concerning neural data (i.e., goodness of fit).

      Strengths:

      The primary strength of this paper is the authors' efforts in systematically conducting analysis across different organisms and recording methods. Also, the definition of factorization is simple and intuitive to understand.

      Weaknesses:

      Comments on revised version:

      I thank the authors for addressing the weaknesses I brought up regarding the manuscript.

    1. Author response:

      Reply to Reviewer #1 (Public Review):

      The post-processing increases the number of putative neoantigens. As shown in Author response image 1, this is done through data augmentation, or “mutation”, of individual amino acids in a sequence: each is replaced by its most similar amino acid in the BLOSUM62 embedding. If most of the mutations result in a positive prediction (which we binarize with a score threshold of >0.5), the sequence changes its prediction.

      Author response image 1.

      Post-processing pipeline to increase the number of putative neoantigens. Sequences can either be predicted using the forward method, for which a raw score is produced, or they can be passed to a majority vote over the ensemble of predictions for similar protein sequences.
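
      The majority-vote step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the NAP-CNB implementation: the `NEAREST` table is a small toy lookup written by hand (it is not computed from BLOSUM62 here), `score_fn` is a hypothetical stand-in for the trained predictor, and the fallback to the direct score when no residue can be substituted is an assumption.

```python
# Toy "most similar residue" table (illustrative only; a real pipeline would
# derive this from a substitution matrix or embedding such as BLOSUM62).
NEAREST = {
    "I": "V", "V": "I", "L": "M", "M": "L",
    "K": "R", "R": "K", "D": "E", "E": "D",
}


def single_substitutions(seq, nearest=NEAREST):
    """Variants of seq with exactly one residue swapped for its neighbour."""
    variants = []
    for i, aa in enumerate(seq):
        if aa in nearest:
            variants.append(seq[:i] + nearest[aa] + seq[i + 1:])
    return variants


def majority_vote(seq, score_fn, threshold=0.5):
    """Label seq positive if most single-substitution variants score above threshold."""
    variants = single_substitutions(seq)
    if not variants:
        # Assumed fallback: use the direct prediction when nothing can be mutated.
        return score_fn(seq) > threshold
    positives = sum(score_fn(v) > threshold for v in variants)
    return positives > len(variants) / 2
```

      In this sketch a sequence whose direct score falls below the threshold can still be labelled positive if more than half of its single-substitution variants score above it, which is how post-processing can increase the number of candidates.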

      In this article, we obtain the following candidates after post-processing.

      Author response table 1.

      As mentioned, the prediction column shows a binary label. The full list, which contained 402 sequences, did not include any other sequences that met the majority-vote criteria.

      As noted by the reviewer, Table 3 of our original paper includes the scores of the direct prediction, which has four sequences in common with the post-processing criteria (*Pnp, *Adar, *Lrrc28 and *Nr1h2). * indicates the mutated form of the peptide, i.e., the neoantigen.

      We selected the top 4 predicted antigens, which were present both by direct prediction and after post-processing (*Pnp, *Adar, *Lrrc28 and *Nr1h2) (Wert-Carvajal et al. 2021), but we encountered difficulty in synthesizing *Nr1h2 (mutated Nr1h2), and thus it could not be included in the study.

      We also decided to evaluate the immunogenicity of *Wiz, which was identified as a potential TNA only after post-processing. *Wiz exhibited lower levels of immunogenicity compared to *Pnp, *Adar, and *Lrrc28. However, unlike these, *Wiz is highly expressed in the tumor, and vaccination with *Wiz provided the strongest protection levels. These findings led us to incorporate post-processing into the NAP-CNB platform.

      We chose *Herc6 as a mutated antigen predicted not to be a TNA over other candidates because its expression in the tumor was similar to that of *Wiz.

      Depending on the experiment, we used 4 or 5 animals per group (this will be clarified in the revised version).

      The software used for statistical analysis was GraphPad Prism.

      Reply to Reviewer #2 (Public Review):

      This is true: binding affinity does not always predict immune responses, but in most cases high-affinity peptides are immunogenic. There are of course other parameters that drive the effective priming of tumor-reactive CD8+ T cells through antigen cross-presentation, but the mechanisms of antigen presentation are not yet completely understood. High-affinity peptides are desirable as good candidates in neoantigen-based vaccines.

    2. eLife assessment

      This important study assesses a novel in silico neoantigen prediction algorithm combined with in vivo validation to determine important parameters of neoantigen immunogenicity and tumor control. The strength of evidence is compelling. This study contributes to the field and will aid in the development of improved personalized cancer vaccines.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors of the study are trying to show that RNAseq can be used for neoantigen prediction and that the machine learning approach to the prediction can reveal very useful information for the selection of neoantigens for personalized antitumor vaccination.

      Strengths:

      The authors demonstrated that RNA expression of a neoantigen is a very important factor in the selection of peptides for the creation of personalized vaccines. They proved in vivo that in silico-predicted neoantigens can trigger an antitumor response in mice.

      Weaknesses:

      The selection of the peptides for vaccination is not clear. Some peptides were selected before and some after processing. What processing is also not clear. The authors didn't provide the full list of peptides before and after processing, please add those. And it wasn't clear that these peptides were previously published. Looking at the previously published table with peptide from B16 F10 (https://www.nature.com/articles/s41598-021-89927-5/tables/3), there are other genes with high expression, e.g. Tab2, Tm9sf3 that have higher expression than Herc6, please clarify the choice.

      It's not clear how many mice were used for each group in each experiment, please add this information to the text and figures. It would be good to add this, to aid the understanding of a broader audience.

      Please provide information about what software was used for statistical analysis.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors develop a new neoantigen prediction tool (NAP-CNB) which primarily predicts neoantigens based on expression (RNAseq) and ranks mutations using binding affinity. The validated predicted neoantigens in mice demonstrate that neoantigens with higher expression (but not necessarily the highest immunogenicity) lead to the greatest tumor control.

      Strengths:

      There is in vivo validation of the neoantigens.

      Demonstrates comparability to other prediction algorithms that are commonly used.

      Demonstrates that expression holds a higher value than T-cell responses in actual tumor control.

      Weaknesses:

      Binding affinity does not always predict immune responses or tumor control in vivo which is used as part of the selection criteria.

    1. eLife assessment

      This study presents a valuable finding on sperm flagellum and HTCA stabilization. The evidence supporting the authors' claims is incomplete. The work will be of broad interest to cell and reproductive biologists working on cilium and sperm biology.

    2. Reviewer #1 (Public Review):

      In this paper, Wu et al. investigated the physiological roles of CCDC113 in sperm flagellum and HTCA stabilization by using CRISPR/Cas knockout mouse models, co-IP, and single sperm imaging. They find that CCDC113 localizes in the linker region among radial spokes (RS), the nexin-dynein regulatory complex (N-DRC), and doublet microtubules (DMTs) and interacts with axoneme-associated proteins CFAP57 and CFAP91, acting as an adaptor protein that facilitates the linkage between RS, N-DRC, and DMTs within the sperm axoneme. They show that the disruption of CCDC113 produced spermatozoa with disorganized sperm flagella and that CFAP91 and DRC2 could not colocalize with DMTs in Ccdc113-/- spermatozoa. Interestingly, the data also indicate that CCDC113 could localize on the HTCA region, and interact with HTCA-associated proteins. The knockout of Ccdc113 could also produce acephalic spermatozoa. By using Sun5 and Centlein knockout mouse models, the authors further find SUN5 and CENTLEIN are indispensable for the docking of CCDC113 to the implantation site on the sperm head. Overall, the experiments were designed properly and performed well to support the authors' observation in each part. Furthermore, the study's findings offer valuable insights into the physiological and developmental roles of CCDC113 in the male germ line, which can provide insight into impaired sperm development and male infertility. The conclusions of this paper are mostly well supported by data, but some points need to be clarified and discussed.

      (1) In Figure 1, a sperm flagellum protein, which is far away from CCDC113, should be selected as a negative control to exclude artificial effects in co-IP experiments.

      (2) Is the detachment of sperm head and tail in Ccdc113-/- mice a secondary effect of the sperm flagellum defects? The authors should discuss this point.

      (3) Given that some cytoplasmic material could be observed in Ccdc113-/- spermatozoa (Fig. 5A), is CCDC113 also essential for cytoplasmic removal?

      (4) Although CCDC113 could not bind to PMFBP1, the localization of CCDC113 in Pmfbp1-/- spermatozoa should also be examined to clarify the relationship between CCDC113 and SUN5-CENTLEIN-PMFBP1.

    3. Reviewer #2 (Public Review):

      Summary:

      In the present study, the authors selected the coiled-coil protein CCDC113 and revealed its expression in the stages of spermatogenesis in the testis as well as in the different steps of spermiogenesis with expression also mapped in the different parts of the epididymis. Gene deletion led to male infertility in CRISPR-Cas9 KO mice and PAS staining showed defects mapped in the different stages of the seminiferous cycle and through the different steps of spermiogenesis. EM and IF with several markers of testis germ cells and spermatozoa in the epididymis indicated defects in flagella and head-to-tail coupling for flagella as well as acephaly. The authors' co-IP experiments of expressed CCDC113 in HEK293T cells indicated an association with CFAP91 and DRC2 as well as SUN5 and CENTLEIN.

      The authors propose that CCDC113 connects CFAP91 and DRC2 to doublet microtubules of the axoneme and that CCDC113's association with SUN5 and CENTLEIN stabilizes the sperm flagellum head-to-tail coupling apparatus. Extensive experiments mapping CCDC113 during postnatal development are reported as well as negative co-IP experiments and studies with SUN5 KO mice as well as CENTLEIN KO mice.

      Strengths:

      The authors provide compelling observations to indicate the relevance of CCDC113 to flagellum formation with potential protein partners. The data are relevant to sperm flagella formation and its coupling to the sperm head.

      Weaknesses:

      The authors' observations are consistent with the model proposed but the authors' conclusions for the mechanism may require direct demonstration in sperm flagella. The Walton et al paper shows human CCDC96/113 in cilia of human respiratory epithelia. An application of such methodology to the proteins indicated by Wu et al for the sperm axoneme and head-tail coupling apparatus is eagerly awaited as a follow-up study.

    4. Author response:

      eLife assessment

      This study presents a valuable finding on sperm flagellum and HTCA stabilization. The evidence supporting the authors' claims is incomplete. The work will be of broad interest to cell and reproductive biologists working on cilium and sperm biology.

      We thank the Editor and the two referees for their time in carefully reviewing our work, and we are grateful for the helpful guidance about how to improve our study. We will supplement the experiments and provide quantitative data guided by the referees’ comments in the revised manuscript. Additionally, we will polish the manuscript and add further context to help readers understand the significance of this work.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper, Wu et al. investigated the physiological roles of CCDC113 in sperm flagellum and HTCA stabilization by using CRISPR/Cas knockout mouse models, co-IP, and single sperm imaging. They find that CCDC113 localizes in the linker region among radial spokes (RS), the nexin-dynein regulatory complex (N-DRC), and doublet microtubules (DMTs) and interacts with axoneme-associated proteins CFAP57 and CFAP91, acting as an adaptor protein that facilitates the linkage between RS, N-DRC, and DMTs within the sperm axoneme. They show that the disruption of CCDC113 produced spermatozoa with disorganized sperm flagella and that CFAP91 and DRC2 could not colocalize with DMTs in Ccdc113-/- spermatozoa. Interestingly, the data also indicate that CCDC113 could localize on the HTCA region, and interact with HTCA-associated proteins. The knockout of Ccdc113 could also produce acephalic spermatozoa. By using Sun5 and Centlein knockout mouse models, the authors further find SUN5 and CENTLEIN are indispensable for the docking of CCDC113 to the implantation site on the sperm head. Overall, the experiments were designed properly and performed well to support the authors' observation in each part. Furthermore, the study's findings offer valuable insights into the physiological and developmental roles of CCDC113 in the male germ line, which can provide insight into impaired sperm development and male infertility. The conclusions of this paper are mostly well supported by data, but some points need to be clarified and discussed.

      We thank Reviewer #1 for his or her critical reading and the positive assessment.

      (1) In Figure 1, a sperm flagellum protein, which is far away from CCDC113, should be selected as a negative control to exclude artificial effects in co-IP experiments.

      We greatly appreciate Reviewer #1’s insightful suggestion. We will include a negative control in the co-IP experiment to eliminate potential artificial effects.

      (2) Is the detachment of sperm head and tail in Ccdc113-/- mice a secondary effect of the sperm flagellum defects? The authors should discuss this point.

      Good question. Given that CCDC113 could localize to the sperm neck region and interact with SUN5 and CENTLEIN, CCDC113 may function directly in the connection between the sperm head and tail. Indeed, PAS staining revealed that Ccdc113–/– sperm heads had abnormal orientation in stages V–VIII seminiferous epithelia (Fig. 6C), and transmission electron microscopy (TEM) analysis further revealed that the disruption of CCDC113 caused the detachment of the destroyed coupling apparatus from the sperm head in step 9–11 spermatids (Fig. 6D). All these results suggest that the detachment of sperm head and tail in Ccdc113–/– mice may not be a secondary effect of the sperm flagellum defects. We have discussed this point as follows:

      CCDC113 could interact with SUN5 and CENTLEIN, but not PMFBP1 (Fig. 7A-C), and CCDC113 was in the cytoplasm in Sun5–/– and Centlein–/– spermatozoa (Fig. 7L, K). In addition, CCDC113 colocalizes with SUN5 in the HTCA region, and the immunofluorescence staining in spermatozoa shows that SUN5 is closer to the sperm nucleus than CCDC113 (Fig. 7G, H). Therefore, SUN5 and CENTLEIN may be closer to the sperm nucleus than CCDC113. PAS staining revealed that Ccdc113–/– sperm heads had abnormal orientation in stages V–VIII seminiferous epithelia (Fig. 6C), and transmission electron microscopy (TEM) analysis further revealed that the disruption of CCDC113 caused the detachment of the destroyed coupling apparatus from the sperm head in step 9–11 spermatids (Fig. 6D). All these results suggest that the detachment of sperm head and tail in Ccdc113–/– mice may not be a secondary effect of the sperm flagellum defects.

      (3) Given that some cytoplasmic material could be observed in Ccdc113-/- spermatozoa (Fig. 5A), is CCDC113 also essential for cytoplasmic removal?

      Good question. Unremoved cytoplasm could be detected in spermatozoa by using transmission electron microscopy (TEM) analysis, including disrupted mitochondria, damaged axonemes, and large vacuoles, indicating cytoplasmic removal defects in Ccdc113–/– mice. We have discussed this point as follows:

      “Unremoved cytoplasm could be detected in spermatozoa by using transmission electron microscopy (TEM) analysis, including disrupted mitochondria, damaged axonemes, and large vacuoles, indicating cytoplasmic removal defects in Ccdc113–/– mice (Fig. 5A).”

      (4) Although CCDC113 could not bind to PMFBP1, the localization of CCDC113 in Pmfbp1-/- spermatozoa should also be examined to clarify the relationship between CCDC113 and SUN5-CENTLEIN-PMFBP1.

      We are thankful to Reviewer #1 for this suggestion. We will analyze the localization of CCDC113 in Pmfbp1-/- spermatozoa to clarify the relationship between CCDC113 and SUN5-CENTLEIN-PMFBP1.

      Reviewer #2 (Public Review):

      Summary:

      In the present study, the authors selected the coiled-coil protein CCDC113 and revealed its expression in the stages of spermatogenesis in the testis as well as in the different steps of spermiogenesis with expression also mapped in the different parts of the epididymis. Gene deletion led to male infertility in CRISPR-Cas9 KO mice and PAS staining showed defects mapped in the different stages of the seminiferous cycle and through the different steps of spermiogenesis. EM and IF with several markers of testis germ cells and spermatozoa in the epididymis indicated defects in flagella and head-to-tail coupling for flagella as well as acephaly. The authors' co-IP experiments of expressed CCDC113 in HEK293T cells indicated an association with CFAP91 and DRC2 as well as SUN5 and CENTLEIN.

      The authors propose that CCDC113 connects CFAP91 and DRC2 to doublet microtubules of the axoneme and that CCDC113's association with SUN5 and CENTLEIN stabilizes the sperm flagellum head-to-tail coupling apparatus. Extensive experiments mapping CCDC113 during postnatal development are reported as well as negative co-IP experiments and studies with SUN5 KO mice as well as CENTLEIN KO mice.

      Strengths:

      The authors provide compelling observations to indicate the relevance of CCDC113 to flagellum formation with potential protein partners. The data are relevant to sperm flagella formation and its coupling to the sperm head.

      We are grateful to Reviewer #2 for his or her recognition of the strength of this study.

      Weaknesses:

      The authors' observations are consistent with the model proposed but the authors' conclusions for the mechanism may require direct demonstration in sperm flagella. The Walton et al paper shows human CCDC96/113 in cilia of human respiratory epithelia. An application of such methodology to the proteins indicated by Wu et al for the sperm axoneme and head-tail coupling apparatus is eagerly awaited as a follow-up study.

      We thank Reviewer #2 for his or her kind help in improving the manuscript. We understand that directly determining the precise localization of CCDC113 in the sperm axoneme and head-tail coupling apparatus (HTCA) using cryo-electron microscopy (cryo-EM) could substantially strengthen our model. Recent advances in cryo-EM have facilitated the analysis of axonemal structures and determined the structures of native axonemal DMTs from mouse, bovine, and human sperm (Leung et al., 2023; Zhou et al., 2023). However, some high-resolution structures of the sperm axoneme and HTCA regions, including those involving CCDC113, remain to be determined. Thus, we would like to discuss this point and regard it as an important follow-up study.

      References:

      Bazan, R., Schröfel, A., Joachimiak, E., Poprzeczko, M., Pigino, G., & Wloga, D. (2021). Ccdc113/Ccdc96 complex, a novel regulator of ciliary beating that connects radial spoke 3 to dynein g and the nexin link. PLoS Genet, 17(3), e1009388.

      Ghanaeian, A., Majhi, S., McCafferty, C. L., Nami, B., Black, C. S., Yang, S. K., Legal, T., Papoulas, O., Janowska, M., Valente-Paterno, M., Marcotte, E. M., Wloga, D., & Bui, K. H. (2023). Integrated modeling of the Nexin-dynein regulatory complex reveals its regulatory mechanism. Nat Commun, 14(1), 5741.

      Leung, M. R., Zeng, J., Wang, X., Roelofs, M. C., Huang, W., Zenezini Chiozzi, R., Hevler, J. F., Heck, A. J. R., Dutcher, S. K., Brown, A., Zhang, R., & Zeev-Ben-Mordehai, T. (2023). Structural specializations of the sperm tail. Cell, 186(13), 2880-2896.e2817.

      Walton, T., Gui, M., Velkova, S., Fassad, M. R., Hirst, R. A., Haarman, E., O'Callaghan, C., Bottier, M., Burgoyne, T., Mitchison, H. M., & Brown, A. (2023). Axonemal structures reveal mechanoregulatory and disease mechanisms. Nature, 618(7965), 625-633.

      Zhou, L., Liu, H., Liu, S., Yang, X., Dong, Y., Pan, Y., Xiao, Z., Zheng, B., Sun, Y., Huang, P., Zhang, X., Hu, J., Sun, R., Feng, S., Zhu, Y., Liu, M., Gui, M., & Wu, J. (2023). Structures of sperm flagellar doublet microtubules expand the genetic spectrum of male infertility. Cell, 186(13), 2897-2910.e2819.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript highlights single-stranded DNA exo- and endo-nuclease activities of ExoIII as a potential caveat and an underestimated source of decreased efficiency in its use in biosensor assays. The data present convincing evidence for the ssDNA nuclease activity of ExoIII and identify residues that contribute to it. The findings are useful, but the study remains incomplete as the effect on biosensor assays was not established.

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors show compelling data indicating that ExoIII has significant ssDNA nuclease activity that is posited to interfere with biosensor assays. This does not come as a surprise as other published works have indeed shown the same, but in this work, the authors provide a deeper analysis of this underestimated activity.

      Response: Thank you so much for reviewing and summarizing our work.

      Strengths:

      The authors used a variety of assays to examine the ssDNA nuclease activity of ExoIII and its origin. Fluorescence-based assays and native gel electrophoresis, combined with MS analysis, clearly indicate that both commercial and laboratory-purified ExoIII contain ssDNA nuclease activity. Mutational analysis identifies the residues responsible for this activity. Of note is the observation in this submitted work that the sites of ssDNA and dsDNA exonuclease activity overlap, suggesting that it may be difficult to identify mutations that affect one activity but not the other. In this regard, the authors' observation that the ssDNA nuclease activity depends on the sequence composition of the ssDNA is of interest, and this dependence may be used as a strategy to suppress this activity when necessary. For example, the authors point out that a 3′ A4-protruding ssDNA could be employed in ExoIII-based assays due to its resistance to digestion. However, this remains an interesting suggestion that the authors do not test; testing it would have strengthened their conclusion.

      Response: Thank you so much for the positive evaluation and insightful comments on our manuscript. In the revised version, we have modified the manuscript to address the reviewer’s concerns and provide point-by-point responses to all the comments.

      Weaknesses:

      The authors provide a wealth of experimental data showing that E. coli ExoIII has ssDNA nuclease activities, both exo- and endo-, however this work falls short in showing that indeed this activity practically interferes with ExoIII-driven biosensor assays, as suggested by the authors. Furthermore, it is not clear what new information is gained compared to the one already gathered in previously published works (e.g. references 20 and 21). Also, the authors show that ssDNA nuclease activity has sequence dependence, but in the context of the observation that this activity is driven by the same site as dsDNA Exo, how does this differ from similar sequence effects observed for the dsDNA Exo? (e.g. see Linxweiler, W. and Horz, W. (1982). Nucl. Acids Res. 10, 4845-4859).

      Response: We agree with the reviewer regarding the limitations in showing the practical influence of the ssDNase activity in the commercial detection kit. Unlike the biosensor in reference 20, our results showed a potential impact of ExoⅢ on another frequently used detection system: the primer and probe required for the detection kit could be digested by ExoⅢ, leading to a lower detection efficiency. Since the activities of ExoⅢ on ssDNA and dsDNA share the same active center, we reason that the difference in sequence specificity of ExoⅢ on these two types of substrates might arise from two factors. On the nuclease side, there might be unidentified residues of ExoⅢ that play an auxiliary role in digesting ssDNA but not dsDNA, which would contribute to the difference we observed. On the substrate side, without the base-pairing of a complementary sequence, the structure of ssDNA is more flexible (changeable with environmental factors such as ions and temperature) than that of dsDNA. The two factors may collectively result in the difference in sequence specificity of ExoⅢ on ssDNA and dsDNA. We believe that cryo-electron microscopy-based structural analysis of the ExoⅢ-ssDNA complex would provide more comprehensive and direct evidence.

      Because of the claim that the underestimated ssDNA nuclease activity can interfere with commercially available assays, it would have been appropriate to test this. The authors only show that ssDNA activity can be identified in commercial ExoIII-based kits, but they do not assess how this affects the efficiency of a full reaction of the kit. This could have been achieved by exploiting the observed ssDNA sequence dependence of the nuclease activity. In this regard, the work cited in Ref. 20 showed that indeed ExoIII has ssDNA nuclease activity at concentrations as low as 50-fold less than what is tested in this work. Ref. 20 also tested the effect of the ssDNA nuclease activity in Targeted Recycle Assays, rather than just testing for its presence in a kit.

      Response: Thanks so much for your comments. Logically, to evaluate the practical influence, we need to compare the current and improved detection kits. Our results suggested that raising the temperature or using the mutant may minimize the ssDNase activity of ExoⅢ. However, the RAA or RPA-ExoⅢ detection kit is a multi-component system consisting of recombinase T4 UvsX, loading factor T4 UvsY, ssDNA binding protein T4 gp32, polymerase Bsu, and ExoⅢ (Analyst. 2018 Dec 17;144(1):31-67. doi: 10.1039/c8an01621f), which collectively determine the performance of the kit. Increasing the temperature would also affect the activities or functions of the other proteins contained in the detection kit, so the resultant change in detection efficiency would not reflect the real practical influence of the ssDNase activity of ExoⅢ. Replacing the wild type with the mutant would require the other four proteins to be prepared and combined at an optimized ratio to rebuild the detection system, which is challenging. The targeted recycle assay in Ref. 20 is a simple system composed of ExoⅢ and corresponding nucleic acid adapters, which researchers could easily reproduce for evaluation. Being a much more complex system, the RAA or RPA-ExoⅢ detection kit is difficult to manipulate to display the practical influence. Thank you again for your insightful suggestions; we may conduct a systematic investigation to improve the detection kit in future studies.

      Because of the implication that the presence of ssDNA exonuclease activity may have in reactions that are supposed to only use ExoIII dsDNA exonuclease, it is surprising that in this submitted work no direct comparison of these two activities is done. Please provide an experimental determination of how different the specific activities for ssDNA and dsDNA are.

      Response: Following your suggestion, we compared the digestion rates of the two activities by using an equal amount of the commercial ExoⅢ (10 U/µL) on the two types of substrates (10 µM). The results below revealed that ExoⅢ required 10 minutes to digest the 30-nt single-stranded DNA (ssDNA) (A), whereas it could digest the same sequence on double-stranded DNA (dsDNA) within 1 minute (B) (in a newly produced Supplementary Figure S1). This indicated that ExoⅢ digested the dsDNA at least ten times faster than the ssDNA. In conjunction with these results, a recent study has shown that the ssDNase activity of ExoⅢ surpasses that of the conventional ssDNA-specific nuclease ExoI (Biosensors (Basel), 2023, May 26; 13(6):581, doi: 10.3390/bios13060581), suggesting a potential biological significance of ExoⅢ in bacteria related to ssDNA, even though the digestion rate is not as rapid as on dsDNA. The corresponding text has been added to the Results (Lines 200-207).

      Author response image 1.

      Reviewer #2 (Public Review):

      Summary:

      This paper describes some experiments addressing 3' exonuclease and 3' trimming activity of bacterial exonuclease III. The quantitative activity is in fact very low, despite claims to the contrary. The work is of low interest with regard to biology, but possibly of use for methods development. Thus the paper seems better suited to a methods forum.

      Response: We thank you for your time and effort in improving our work. We have revised the manuscript and provide point-by-point responses to your comments below.

      Strengths:

      Technical approaches.

      Response: Thanks for your evaluation.

      Weaknesses:

      The purity of the recombinant proteins is critical, but no information on that is provided. The minimum would be silver-stained SDS-PAGE gels, with some samples overloaded in order to detect contaminants.

      Response: As suggested, we have performed the silver-stained SDS-PAGE on the purified proteins. The result below indicated that no significant contaminant was found, except for a minor contaminant in S217A (in a newly produced Supplementary Figure S4).

      Author response image 2.

      Lines 74-76: What is the evidence that BER in E. coli generates multinucleotide repair patches in vivo? In principle, there is no need for the nick to be widened to a gap, as DNA Pol I acts efficiently from a nick. And what would control the extent of the 3' excision?

      Response: Thank you for the insightful questions. Gwangrog Lee's lab has found that ExoⅢ is capable of creating a single-stranded DNA (ssDNA) gap on dsDNA during base excision repair, followed by repair by DNA polymerase I. The gap size is determined by the rigidity of the generated ssDNA loop and the duplex stability of the dsDNA (Sci Adv. 2021 Jul 14;7(29):eabg0076. doi: 10.1126/sciadv.abg0076).

      Figure 1: The substrates all report only the first phosphodiester cleavage near the 3' end, which is quite a limitation. Do the reported values reflect only the single phosphodiester cleavage? Including the several other nucleotides likely inflates that activity value. And how much is a unit of activity in terms of actual protein concentration? Without that, it's hard to compare the observed activities to the many published studies. As best I know, Exo III was already known to remove a single-nucleotide 3'-overhang, albeit more slowly than the digestion of a duplex, but not zero! We need to be able to calculate an actual specific activity: pmol/min per µg of protein.

      Response: Yes, once even one nucleotide or phosphodiester is digested off the FQ reporter, fluorescence is generated, and the value reflects the minimum number of phosphodiesters cleaved during the period, from which the digestion rate or efficiency of the nuclease on ssDNA can be calculated. Figures 2 and 3 show that ExoⅢ can digest the ssDNA from the 3’ end, not just a single nucleotide. Since the “unit” has been widely used in numerous studies (Nature. 2015 Sep 10;525(7568):274-7; Cell. 2021 Aug 19;184(17):4392-4400.e4; Nat Nanotechnol. 2018 Jan;13(1):34-40.), its inclusion here facilitates comparison and evaluation of the activity across these studies. The actual activity of ExoⅢ was calculated in Figure 4D.

      Figures 2 & 3: These address the possible issue of 1-nt excision noted above. However, the question of efficiency is still not addressed in the absence of a more quantitative approach, not just "units" from the supplier's label. Moreover, it is quite common that commercial enzyme preparations contain a lot of inactive material.

      Response: Thanks for your comments. In fact, numerous studies have used the commercial ExoⅢ (Nature. 2015 Sep 10;525(7568):274-7; Cell. 2021 Aug 19;184(17):4392-4400.e4; Nat Nanotechnol. 2018 Jan;13(1):34-40.). Using this universal label of “units” helps researchers easily compare or evaluate the activity and its influence. The commercial ExoⅢ is developed by New England Biolabs Co., Ltd., and its quality has been widely examined in a wide range of scientific investigations.

      Figure 4D: This gets to the quantitative point. In this panel, we see that around 0.5 pmol/min of product is produced by 0.025 µmol = 25,000 pmol of the enzyme. That is certainly not very efficient, compared to the digestion of dsDNA or cleavage of an abasic site. It's hard to see that as significant.

      Response: Thanks for your comments; the confusion may have arisen from the arrangement of the figure. Please note that based on Figure 4D, the digestion rate of 0.025 µM ExoⅢ on the substrate is approximately 5 pmol/min (as shown on the right vertical axis), rather than 0.5 pmol/min. Given that the reaction contained ExoⅢ at a concentration of 0.025 µM in a total volume of 10 µL, the quantity of ExoⅢ was 0.25 pmol (0.025 µmol/L × 10 µL, rather than 25,000 pmol), which produced the digestion rate of 5 pmol/min. This suggests that each molecule of ExoⅢ could digest one nucleotide in 3 seconds (5 pmol nucleotides / 0.25 pmol ExoⅢ / 60 seconds = 0.33 nucleotides/molecule/second). While this may not be as rapid as the digestion of dsDNA by ExoⅢ, a recent study has shown that the ssDNase activity of ExoⅢ surpasses that of the conventional ssDNA-specific nuclease ExoI (Biosensors (Basel), 2023, May 26; 13(6):581, doi: 10.3390/bios13060581), suggesting a potential biological significance of ExoⅢ in bacteria related to ssDNA.
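
      The unit conversion in the response above can be checked with a short calculation. The following is only an illustrative sketch: the function names are hypothetical, and the input values (0.025 µM enzyme, 10 µL reaction volume, 5 pmol/min digestion rate) are the ones quoted in the response, not independently measured quantities.

```python
def enzyme_amount_pmol(conc_uM: float, volume_uL: float) -> float:
    """Amount of enzyme in pmol.

    1 µM = 1e-6 mol/L and 1 µL = 1e-6 L, so µM × µL = 1e-12 mol = pmol.
    """
    return conc_uM * volume_uL


def turnover_per_second(rate_pmol_per_min: float, enzyme_pmol: float) -> float:
    """Nucleotides released per enzyme molecule per second."""
    return rate_pmol_per_min / enzyme_pmol / 60.0


# Values quoted in the author response (assumed here, not re-measured).
enzyme = enzyme_amount_pmol(conc_uM=0.025, volume_uL=10.0)
turnover = turnover_per_second(rate_pmol_per_min=5.0, enzyme_pmol=enzyme)

print(f"ExoIII in reaction: {enzyme:.2f} pmol")
print(f"Turnover: {turnover:.2f} nt per molecule per second")
print(f"Seconds per nucleotide: {1.0 / turnover:.1f}")
```

      Under these assumptions the reaction holds 0.25 pmol of enzyme, and the turnover is about one nucleotide every 3 seconds, which matches the figures stated in the response.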

      Line 459 and elsewhere: as noted above, the activity is not "highly efficient". I would say that it is not efficient at all.

      Response: We respectfully disagree with this point. Supported by the outcomes from fluorescence monitoring of FQ reporters, gel analysis of the ssDNA probe, and mass spectrometry findings, the conclusion is convincing, and more importantly, our findings align with a recent study (Biosensors 2023, 13(6), 581; https://doi.org/10.3390/bios13060581).

      Reviewer #3 (Public Review):

      Overall:

      ExoIII has been described and commercialized as a dsDNA-specific nuclease. Several lines of evidence, albeit incomplete, have indicated this may not be entirely true. Therefore, Wang et al comprehensively characterize the endonuclease and exonuclease enzymatic activities of ExoIII on ssDNA. A strength of the manuscript is the testing of popular kits that utilize ExoIII and coming up with and testing practical solutions (e.g. the addition of SSB proteins, ExoIII variants such as K121A, and varied assay conditions).

      Response: We really appreciate the reviewer for pointing out the significance and strength of our work. Additionally, we have responded point-by-point to the comments and suggestions.

      Comments:

      (1) The footprint of ExoIII on DNA is expected to be quite a bit larger than 5-nt, see structure in manuscript reference #5. Therefore, the substrate design in Figure 1A seems inappropriate for studying the enzymatic activity and it seems likely that ExoIII would be interacting with the FAM and/or BHQ1 ends as well as the DNA. Could this cause quenching? Would this represent real ssDNA activity? Is this figure/data necessary for the manuscript?

      Response: Thanks so much for your questions. The footprint of ExoⅢ on dsDNA appears to exceed 5 nucleotides based on the structural analysis in reference #5. However, the footprint may differ when ExoⅢ targets ssDNA. Mass spectrometry analysis in our study demonstrated that ExoⅢ degraded a ~20-nucleotide single-stranded DNA substrate to mononucleotides (Figure 3), suggesting that it can also digest a 5-nt single-stranded DNA into mononucleotides. Otherwise, the remaining reaction product would be a 5-nt ssDNA fragment. Thus, the 5-nt FQ reporter is also a substrate for ExoⅢ. ExoⅢ could conceivably interact with BHQ1 and affect its quenching efficiency on FAM to trigger the fluorescence release shown in Figure 1A, but this possibility has already been ruled out by the development of the RPA-ExoⅢ detection kit. As pointed out in the Introduction, the kit requires a probe labeled with a fluorophore and a quencher. If ExoⅢ could affect the fluorophore and quencher and cause fluorescence release, the detection kit would yield a false-positive result regardless of the presence of the target, rendering the detection system ineffective. Thus, ExoⅢ does not interfere with the fluorophore and quencher. The digestion of the ssDNA within the FQ reporter by ExoⅢ was the sole cause of fluorescence release, and the emitted fluorescence represented the ssDNase activity. The result suggested that the FQ reporter might offer an effective approach to sensitively detect or quantitatively study the ssDNase activity of a protein that has not been characterized.

      (2) Based on the descriptions in the text, it seems there is activity with some of the other nucleases in 1C, 1F, and 1I other than ExoIII and Cas12a. Can this be plotted on a scale that allows the reader to see them relative to one another?

      Response: Thanks so much for your suggestions. We attempted to adjust the figure, but due to most of the values being less than or around 0.005, it was challenging to re-arrange for presentation.

      (3) The sequence alignment in Figure 2N and the corresponding text indicates a region of ExoIII lacking in APE1 that may be responsible for their differences in substrate specificity in regards to ssDNA. Does the mutational analysis support this hypothesis?

      Response: Our results indicated that mutation of R170, located in this region (αM helix), resulted in lower digestion efficiency on ssDNA than the wild type. This shows that R170 is an important residue for the ssDNase activity and partially supports the hypothesis. Further investigation is needed to determine whether the structure of the αM helix accounts for the distinctions observed between ExoⅢ and APE1. Future research may require mutating more residues in this region for validation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      • A significant fraction of amplitude is missing in the presented fluorescence time courses reporting on ssDNA nuclease activity (Figs 1 B, E, and H). Please indicate the dead time of mixing in these experiments, and if necessary include additional points in this time scale. It is unacceptable for the authors to simply connect the zero-time point and the first experimental point with a dashed line.

      Response: We thank the reviewer for pointing out this critical detail. We agree that simply connecting the points with a dashed line is an inappropriate way to indicate the real fluorescence generated in the initial stage. The fluorescence monitoring instrument needs about two minutes to start recording from the moment we place the reaction tube into it. However, ExoⅢ can induce significant fluorescence immediately, reaching the peak within ~40 seconds, as shown in the video data. Therefore, it is difficult to record the real-time fluorescence generated in the initial stage. To avoid misleading readers, we have added a description in the legend as follows: “The dashed line used in the figure does not indicate the real-time fluorescence generated in the reaction but only represents a trend in the period for the monitor machine to initiate (~2 minutes).” The text was added in Lines 836-838.

      • The authors chose to utilize a 6% agarose electrophoresis to analyze digestion products. However, while this approach clearly shows that the substrates are being digested, it does not allow us to clearly estimate the extent. It would be appropriate to include control denaturing PAGE assays to test the extent of reaction, especially for dsDNA that contains a ssDNA extension, as in Figure 8, or for selected mutants to test whether exo activity may be limited to just a few nts, that may not be resolved with the lower resolution agarose gels.

      Response: We agree with the reviewer that denaturing PAGE assays are usually the method of choice for high-resolution analysis. We performed this experiment on the short ssDNA, but observed that the bands of digestion products frequently shifted to varying degrees in the gel. Of note, another independent study also showed a similar phenomenon (Nucleic Acids Res. 2007;35(9):3118-27. doi: 10.1093/nar/gkm168). Even slight band shifting would significantly interfere with our analysis of the results, especially on the short ssDNA utilized in the study. After numerous attempts, we found that 6% agarose gel electrophoresis could detect the digested ssDNA bands, with lower resolution than PAGE but with less shifting. Considering all the factors, the 6% agarose gel was finally selected to analyze the digestion process.

      Reviewer #2 (Recommendations For The Authors):

      Line 158: tipycal should be typical

      Response: Thanks so much. As the reviewer pointed out, we have corrected the typo.

      Lines 299-300: "ssD-NA" should not be hyphenated, i.e., it should be ssDNA.

      Response: Thank you for pointing this out. We have rectified the error and thoroughly reviewed the entire paper for any necessary corrections.

      Reviewer #3 (Recommendations For The Authors):

      Figure 2A should indicate the length of the substrate. The legend says omitted nucleotides - I assume they were present in the substrate and just not in the figure? The authors should be very clear about this. Moreover, the text and figure do not describe well the design differences between the three probes. Are they the same except just 23, 21, and 20 nt in length? Are the sequences selected at random?

      Response: Thank you for your questions. The lengths of probes were described in the figure (23, 21, and 20 nt). The legend has been reworded in Line 843 as “The squiggle line represents the ~20 nucleotides of the ssDNA oligo.” And the sequences of three ssDNA substrates were randomly selected, and all the detailed information was provided in Supplementary Table S4.

    2. Reviewer #3 (Public Review):

      Overall:

      ExoIII has been described and commercialized as a dsDNA-specific nuclease. Several lines of evidence, albeit incomplete, have indicated this may not be entirely true. Therefore, Wang et al comprehensively characterize the endonuclease and exonuclease enzymatic activities of ExoIII on ssDNA. A strength of the manuscript is the testing of popular kits that utilize ExoIII and coming up with and testing practical solutions (e.g., addition of SSB proteins, ExoIII variants such as K121A, and varied assay conditions).

      Comments:

      (1) The footprint of ExoIII on DNA is expected to be quite a bit larger than 5-nt, see structure in manuscript reference #5. Therefore, the substrate design in Figure 1A seems inappropriate for studying the enzymatic activity and it seems likely that ExoIII would be interacting with the FAM and/or BHQ1 ends as well as the DNA. Could this cause quenching? Would this represent real ssDNA activity? Is this figure/data necessary for the manuscript?

      (2) Based on the descriptions in the text, it seems there is activity with some of the other nucleases in 1C, 1F, and 1I other than ExoIII and Cas12a. Can this be plotted on a scale that allows the reader to see these relative to one another?

      (3) The sequence alignment in Figure 2N and corresponding text indicate a region of ExoIII lacking in APE1 that may be responsible for their differences in substrate specificity in regards to ssDNA. Does the mutational analysis support this hypothesis?

    3. eLife assessment

      This manuscript highlights single-stranded DNA exo- and endo-nuclease activities of ExoIII as a potential caveat and an underestimated source of decreased efficiency in its use in biosensor assays. The data present solid evidence for the ssDNA nuclease activity of ExoIII and identify residues that contribute to it. The findings are useful, but some aspects of the study remain incomplete.

    4. Reviewer #2 (Public Review):

      Summary:

      This paper describes some experiments addressing 3' exonuclease and 3' trimming activity of bacterial exonuclease III. The quantitative activity is in fact very low, despite claims to the contrary. The work is of low interest with regard to biology, but possibly of use for methods development. Thus the paper seems better suited to a methods forum.

      Strengths:

      Technical approaches.

      Comments on revised version:

      All concerns have been addressed.

    1. eLife assessment

      The study offers a compelling molecular model for the organization of rootlets, a critical organelle that links cilia to the basal body, ensuring proper anchoring. While previous research has explored rootlet structure and organization, this study delivers an unprecedented level of resolution, valuable to the centrosome and cilia field. This research marks a significant step forward in our understanding of rootlets' molecular organization.

    2. Reviewer #1 (Public Review):

      Summary:

      The ciliary rootlet is a structure associated with the ciliary basal body (centriole) with beautiful striation observed by electron microscopy. It has been known for more than a century, but its function and protein arrangement are still unknown. This work reconstructed the near-atomic resolution 3D structure of the rootlet using cryo-electron tomography, discovered a number of interesting filamentous structures inside, and built a molecular model of the rootlet.

      Strengths:

      The authors exploited the current capabilities of cryo-ET and used it appropriately to describe the 3D structure of the rootlet. They carefully conducted subtomogram averaging and classification, which enabled an unprecedented detailed view of this structure. The dual use of (nearly) intact rootlets from cilia and extracted (demembraned) rootlets enabled them to describe with confidence how D1/D2/A bands form periodic structures and cross with longitudinal filaments, which are likely coiled-coil.

      Weaknesses:

      Some more clarifications in the method and indications in figures were needed in the original version. The authors addressed them in the revision.

    3. Reviewer #3 (Public Review):

      Summary:

      The study offers a compelling molecular model for the organization of rootlets, a critical organelle that links cilia to the basal body. Striations have been observed in rootlets, but their assembly, composition, and function remain unknown. While previous research has explored rootlet structure and organization, this study delivers an unprecedented level of resolution, valuable to the centrosome and cilia field. The authors isolated rootlets from mouse eyes. They applied EM to partially purified rootlets (first negative stain, then cryoET). From these micrographs, they observed striations along the membranes along the rootlet but no regular spacing was observed.

      The thickness of the sample and membranes prevented good contrast in the tomograms. Thus they further purified the rootlets using detergent, which allowed them to obtain cryoET micrographs of the rootlets with greater details. The tomograms were segmented and further processed to improve the features of the rootlet structures. From their analysis, they described 3 regular cross-striations and amorphous densities, which are connected perpendicularly to filaments along the length of the rootlets. They propose that various proteins provide the striations and rootletin (mouse homolog of human c-nap1) forms parallel coiled coils that run along the rootlet. Overall their data provide a detailed model for the molecular organization of the rootlet.

      The major strength is that this high-quality study uses state-of-the-art cryo-electron tomography, sub-tomogram averaging, and image analysis to provide a model of the molecular organization of rootlets. The micrographs are exceptional, with excellent contrast and details, which also implies the sample preparation was well optimized to provide excellent samples for cryo-ET. The manuscript is also clear and accessible.

      This research marks a significant step forward in our understanding of rootlets' molecular organization.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public reviews):

      Summary:

      The ciliary rootlet is a structure associated with the ciliary basal body (centriole) with beautiful striation observed by electron microscopy. It has been known for more than a century, but its function and protein arrangement are still unknown. This work reconstructed the near-atomic resolution 3D structure of the rootlet using cryo-electron tomography, discovered a number of interesting filamentous structures inside, and built a molecular model of the rootlet.

      Strengths:

      The authors exploited the current capabilities of cryo-ET and used them appropriately to describe the 3D structure of the rootlet. They carefully conducted subtomogram averaging and classification, which enabled an unprecedented detailed view of this structure. The dual use of (nearly) intact rootlets from cilia and extracted (demembraned) rootlets enabled them to describe with confidence how D1/D2/A bands form periodic structures and cross with longitudinal filaments, which are likely coiled-coil.

      Weaknesses:

      Some more clarifications are needed. This reviewer believes that the authors can address them.

      Reviewer #1 (Recommendations for the authors):

      Recommendation 1: According to Fig.1B, the rootlet was mechanically pulled out from the visual cell for a long distance by vortexing. Is there no artifact? Can the authors comment on it by referring to old literature, for example, with EM of resin-embedded and sectioned basal bodies?

      Response: A previous study (Gilliam et al., 2012) compared cryoET of purified rootlets with resin-embedded ultrathin sections of mouse eyecups. They reported no changes in striation repeat or rootlet morphology, suggesting there is no artifact of purification. Our rootlet data are consistent with those of Gilliam et al., suggesting the tomograms we report are representative of rootlets prior to purification.

      We have clarified this in the text: pg 2: “As previously described (Gilliam et al., 2012), rootlet striation-repeat and morphology appear unaltered by the purification method. Moreover, …” 

      Recommendation 2: Fig.1F: It is not clear how to distinguish striation-membrane joints indicated by grey and white arrows. It seems relatively straight striation is indicated by a white arrow, while in the case of the bulky feature it is shown by a grey arrow (and the bulk is colored in blue). But there is no clear border between these features. How were they distinguished? Are they based on classification?

      Response: The membrane-associated densities (colored in blue) were assigned according to the TomoSeg neural network. It was trained on a small set of globular densities closely associated with a membrane. This training set included examples both close to and far away from the rootlet. We trained a separate network on recognizing rootlet striations. Both networks competed on assigning pixels in the tomogram as either striations or membrane-associated proteins. The different membrane connections were therefore defined by the probability within the TomoSeg network rather than classification.

      We clarified this in the main text: pg 3: “All the striations partially or fully spanned the width of the rootlet and extended beyond the outermost longitudinal filaments. These rootlet-protruding striation-densities frequently contacted the membrane (Fig 1E). Close examination suggested some make a direct contact, whereas others contact a subset of globular membrane-associated densities that are a striking feature of the tomograms. These densities are ~7 nm in diameter and cover almost every membrane surface. Where two membranes come into proximity, the intervening space is filled with two layers of these membrane-associated proteins, one layer associated with each membrane (Fig 1C, S1A, blue arrowheads). We trained a TomoSeg neural network to assign these densities and let this network compete with one that assigned striations. This resulted in a final segmentation with membrane-associated densities indicated in blue and striations in yellow (Fig 1E, F and S1D–F).”  

      We also clarified this in the methods:

      pg 12/13: “The tomograms were then preprocessed in EMAN2.2 for training of the TomoSeg CNN (Chen et al., 2017). Here, the features (filaments, D-bands, A-bands, gold fiducials, actin, membranes, membrane-associated densities and ice contaminations) were individually trained. Segmented maps were allowed to compete for the assignment of pixels in the tomograms, cleaned up in Amira (Thermo Fisher Scientific), and converted to object files. The object files and corresponding tomograms were displayed in ChimeraX (Pettersen et al., 2021). Assignment of direct and indirect striation-membrane connections was done manually by assessing whether TomoSeg-segmented striations and membranes were connected directly or via membrane-associated densities. The automated segmentation of amorphous striations picked up mostly dense amorphous features. The fainter densities that we observed to laterally connect the amorphous features were manually drawn by dotted lines.” 

      Recommendation 3: p.3 "All the striations partially or fully spanned the width of the rootlet before protruding from its surface." This reviewer would read the last part of this sentence as "before protruding from the surface of the rootlet membrane toward inside". Is this correct?

      Response: This was not what we had intended to imply. 

      We have changed this sentence in the text to avoid confusion:  pg 3: “All the striations partially or fully spanned the width of the rootlet and extended beyond the outermost longitudinal filaments. These rootlet-protruding striation-densities frequently contacted the membrane (Fig 1E).”

      Recommendation 4: Same for p.4 "The protrusions from the rootlets were flexible". This means the protrusions from the membrane if this reviewer understands correctly.

      We also clarified this sentence in the text:  pg 4: “The proteinaceous protrusions that extended from the rootlets were flexible and did not induce a regular spacing in the membrane-associated proteins they contacted (Fig 1F, S1D–F).”

      Recommendation 5: p.4 "Due to the thickness of the sample and the presence of membranes": How thick is the typical sample?

      Response: We typically collected data on samples thicker than 300 nm. We initially tried making thinner samples, for better contrast, but observed this led to sample disruption. We changed “sample” to “ice” to clarify that we refer to the prepared sample and not the biological object.

      Changes in text:

      pg 4: “Due to the ice-thickness and the presence of membranes, the tomograms had limited contrast.”

      Recommendation 6: p.4 "We were also able to see these bands with cryo-ET." It would be nice if the comparison between tomograms of the native and purified rootlets was done. This reviewer could not get where the D1/D2/A bands are in Fig.1E.

      Response: Due to the noise in the native tomograms it is difficult to see the regular striation pattern in Fig 1E. However, we see it better when we project the native rootlet onto a single image. We added the projection image, the corresponding Fourier transform, and repeat measurements to the supplement (Fig S1B, C). We updated all figure references in the text.

      We updated the text accordingly:

      pg 4: “We were also able to see these bands with cryo-ET. The striations in the purified rootlets appeared more ordered and clearer than in the cellular tomograms due to the improved contrast. In the cellular rootlets, we identified the bands in a tomogram projection (Fig S1B), with an average distance of 79.52 ± 0.26 nm between each repeat (Fig S1C). The repeat distance for the purified rootlets is 80.1 ± 0.03 nm based on a sine fit to A and D-bands of 10 Fourier-filtered tomogram projections (Fig 2D, Fig S2E–I).”

      We updated the figure legend of Fig S1:

      pg 18: “(B) Projection image of a 53 nm thick slice through the tomogram and the corresponding Fast Fourier Transform (FFT). Measured frequencies are indicated with red lines. (C) Quantification of the distance measured between pairs of discrete striations. (D–F) …”
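
      As an illustration of how a striation repeat can be read off a projection, the sketch below estimates the dominant period of a 1D intensity profile from the peak of its power spectrum. It is not the authors' procedure (the response describes pairwise distance measurements and a sine fit to Fourier-filtered projections); FFT peak picking is used here as a simpler stand-in. The function name, the 1 nm pixel size, and the synthetic 80 nm profile are all assumptions made for the example.

```python
import numpy as np

def estimate_repeat_nm(profile, pixel_size_nm):
    """Return the dominant spatial period (in nm) of a 1D intensity profile."""
    profile = np.asarray(profile, dtype=float)
    profile = profile - profile.mean()                       # remove the DC component
    power = np.abs(np.fft.rfft(profile)) ** 2                # power spectrum
    freqs = np.fft.rfftfreq(profile.size, d=pixel_size_nm)   # cycles per nm
    peak = np.argmax(power[1:]) + 1                          # skip the zero-frequency bin
    return 1.0 / freqs[peak]

# Synthetic profile (hypothetical data): an 80 nm repeat sampled at 1 nm per pixel,
# 20 full repeats long, with Gaussian noise added.
rng = np.random.default_rng(0)
x = np.arange(1600)
synthetic = np.sin(2 * np.pi * x / 80.0) + 0.5 * rng.standard_normal(x.size)
repeat = estimate_repeat_nm(synthetic, pixel_size_nm=1.0)
```

      The precision of such an estimate is limited by the frequency bin width, which is one reason a sine fit over several projections, as used in the manuscript, can report a tighter value.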

      Recommendation 7: Fig.2E-I: Could the authors explain how these bands were tracked? It is very difficult for this reviewer to trace, for example, the A-band in Fig.2g.

      Response: We trained the neural network of TomoSeg to pick up discrete and amorphous striations. The TomoSeg segmentation of the amorphous striations often only picked up dense features marked in green. However, we could see densities by eye in the tomograms that connect these dense features.

      These connecting densities were manually drawn with a dotted line.

      We clarified this in the methods:

      pg 13: “The automated segmentation of amorphous striations picked up mostly dense amorphous features. The fainter densities that we observed to laterally connect the amorphous features were manually drawn by dotted lines.”

      We also changed the figure legend of Fig2: 

      pg 5: “(F,G,I) fainter features not picked up by the automated segmentation were drawn with dotted lines.”

      Recommendation 8: Fig.2: The caption of Fig.2I is missing.

      We have edited the legend of Fig 2 to include this caption: pg 5: “(I) Segmentation that shows amorphous features occur as two bands and connect to the rootlet surface densities.”

      Recommendation 9: p.6 "Additionally, the surface densities show evidence of connecting to the A-bands (Fig 2I and S3I)." Does the author mean Fig.2J and S3I?

      Response: This is most clearly visible in figure 2I and S3I (S3J after revisions), but it is also visible in 2J. 

      We therefore edited this figure reference:

      pg 6: (Fig 2I, J and S3J)

      Recommendation 10:  p.8 "The metazoan rootlet is a cilium-associated fiber that is characterized by regular cross-striations." In this reviewer's memory, Tetrahymena also has a rootlet. Are they different in structure?

      Response: Tetrahymena and other protists have striated rootlets (known as kinetodesmal fibres or System-I fibres) that are classified as being different from mammalian rootlets (Andersen et al., 1991). Tetrahymena rootlets have a 32 nm repeat (Munn, 1970), which is less than half of the 80 nm repeat observed for mammalian rootlets. While the protein composition of Tetrahymena rootlets is unknown, a 250 kDa protein was proposed to be their main component (Williams et al., 1979). Tetrahymena rootlet proteins were proposed to span a minimum of 4-5 striation repeats, based on early thin-sectioning EM (Munn, 1970), while we show that rootletin predictions span at most ~3.3 repeats in mammalian rootlets. Since the early proposal of Tetrahymena rootlet protein organisation, more components have been identified: DisAp (Galati et al., 2014) with a predicted length of ~37 nm (0.15 nm/residue), and proteins of 170 kDa that cross-react with the Naegleria gruberi major rootlet component (Dingle & Larson, 1981). Thus, the available data suggest that Tetrahymena rootlets are different in structure from mammalian ones.
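
      The length comparisons in this response rest on simple arithmetic, which can be made explicit. The sketch below uses only the figures quoted above (0.15 nm per residue, an 80 nm mammalian repeat, a 32 nm Tetrahymena repeat); the function names are hypothetical, and the residue count it derives is just what the quoted conversion implies for a ~37 nm rod, not a reported property of DisAp.

```python
RISE_NM_PER_RESIDUE = 0.15   # conversion quoted in the response
MAMMALIAN_REPEAT_NM = 80.0   # approximate mammalian rootlet repeat
TETRAHYMENA_REPEAT_NM = 32.0 # Tetrahymena repeat cited from Munn, 1970

def helix_length_nm(n_residues):
    """Length of an extended helical rod with the given number of residues."""
    return n_residues * RISE_NM_PER_RESIDUE

def repeats_spanned(length_nm, repeat_nm):
    """Number of striation repeats covered by a rod of the given length."""
    return length_nm / repeat_nm

# Residues implied by a ~37 nm rod under the quoted conversion (about 247).
residues_for_37_nm = 37.0 / RISE_NM_PER_RESIDUE

# Physical length corresponding to ~3.3 mammalian repeats (264 nm).
length_for_3_3_repeats_nm = 3.3 * MAMMALIAN_REPEAT_NM

# Ratio of the two repeat distances (0.4, i.e. less than half).
repeat_ratio = TETRAHYMENA_REPEAT_NM / MAMMALIAN_REPEAT_NM
```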

      Reviewer #2 (Public reviews):

      Summary:

      This work performs structural analysis on isolated or purified rootlets.

      Strengths:

      To date, most studies of this cellular assembly have been from fluorescence microscopy, conventional TEM methods, or through biochemical analysis of constituents. It is clearly a challenging target for structural analysis due to its complexity and heterogeneity. The authors combine observations from cryo-electron tomograms, automated segmentations, subtomogram averaging, and previous data from the literature to present an overall model of how the rootlet is organised.

      Their model will serve as a jumping-off point for future studies, and as such it is something of considerable value and interest.

      Weaknesses:

      It is speculative but is presented as such, and is well-reasoned, plausible, and thorough.

      Reviewer #2 (Recommendations for the authors):

      Recommendation 1: My suggestions to improve the manuscript lie in some of the technical details:

      The subtomogram averaging methods are overly brief - I am not convinced that someone could replicate the process from the text in the methods (and results sections).

      We have now extended our description of the subtomogram averaging methods: 

      pg 13: “For particle picking, the tomograms were deconvolved using the TOM package (Tegunov & Cramer, 2019). Dynamo was used for particle extraction using the Dynamo surface model (Castaño-Díez et al., 2012, 2017): Each D2 band was traced in multiple slices per rootlet to define dynamo surfaces. Surface triangulation was set to result in extraction coordinates approximately 4 times the number of expected filaments. The coordinates were extracted as a Dynamo table that was subsequently converted to the motl-format using subTOM scripts, available at https://github.com/DustinMorado/subTOM/ (Leneva et al., 2021). Particles were extracted from tomograms reconstructed using novaCTF (Turoňová et al., 2017).

      An initial reference was obtained by in-plane randomizing and averaging all particles prior to alignments. Initial alignments were performed to centre filaments, using a 10 nm wide cylindrical mask and limited to 4 nm shifts in X and Y with respect to the reference orientation. A spherical mask with a large diameter was used to align the D-bands; these alignments were restricted to the reference Z direction. Cluster- and careful per-tomogram cross-correlation cleaning were applied to remove particle duplicates, particles with no filaments, and particles with disordered D-bands. This resulted in a cleaned particle dataset.

      Prior to classification in subTOM, alignments with limited X/Y/Z shifts and increasingly finer in-plane rotations were performed. 20 eigenvolumes were generated by K-means classification over 20 eigenvectors. The eigenvolumes and particles clustered per eigenvector were assessed to identify which vectors described the missing wedge or structural features (Leneva et al., 2021). The structural eigenvectors were used to cluster particles into the final class averages that described particle heterogeneity. 

      For the final subtomogram class-average that contained the twist, the cleaned particle dataset motl was converted to a STAR file compatible with RELION 4.0 alpha (Zivanov et al., 2022). Gold beads were removed from the preprocessed tomogram frames by converting the aligned tomogram gold coordinates initially obtained by Etomo bead-finder during preprocessing steps (Kremer et al., 1996). Particles were then extracted in RELION 4.0 alpha. The initial reference was an in-plane randomized average of the cleaned particle dataset. Instead of refinement, which resulted in anisotropic structures due to a lack of features for the alignment, we used simultaneous alignment and classification. We restricted the alignments to full in-plane rotations with respect to the reference Z-axis.”

      Recommendation 2: I find it difficult to assess the quality of the final subtomogram averages as presented in the manuscript. One potential worry is the fact that the authors state that nothing is visible outside the mask, which can be a sign of overfitting (though, as the authors state, can just be a sign of heterogeneity). I would suggest that the authors include FSC curves, as well as 2D slices through the unmasked subtomogram averages - it is easier to judge the impact of the mask when viewing it this way and not at the isosurface.

      Response: We understand the reviewer’s concern for overfitting and masking. To clarify our approach, the class averages we show in Fig3G and FigS5C are the result of simultaneous classification with alignment and not a gold-standard refined average. The classification does not produce an FSC since it does not work with half sets. We initially tried a refinement approach, but the filaments did not have enough features to align and resulted in anisotropic structures. The FSC of such a refinement is shown below. However, because of the anisotropy, we did not include these structures or FSCs in the manuscript and we make no claims about the resolution. 

      Author response image 1.

      Instead, we presented the data from simultaneous classification with alignment, which revealed the twist in the filament. Like the reviewer, we were initially concerned that the filament twist could be an artefact of the narrow masks and reference we used. However, we only used rotationally symmetric references and masks that do not contain any features. We therefore realized that this asymmetric twist feature could not have arisen from imposed alignment regimens, reference biases or overfitting.

      To make our approach clearer, we have updated the main text:

      pg 8: “To ensure unbiased alignment of any coiled-coil features we generated a smooth reference by randomizing the in-plane rotational orientation of the particles (Fig S5B). Initial refinement of the data resulted in an anisotropic structure since the filaments did not have enough features to align to. Therefore, we performed classification with alignment in RELION 4.0 alpha (Zivanov et al., 2022), and used a narrow 3.3 nm-wide mask with a smooth edge up to 7.7 nm (Fig S5B). This was the narrowest mask that still resulted in an isotropic structure and revealed features that were absent in the smooth reference. The resulting class averages contained a twist along the filament length in classes 2, 3 and 4 but most prominently in class 5 (Fig S5C). Class 5 contained a filament of 2 nm thick by 5 nm wide with a groove along its length (Fig 3G).”

      We also clarified this in the methods:

      pg 13: “The initial reference was an in-plane randomized average of the cleaned particle dataset. Instead of refinement, which resulted in anisotropic structures due to a lack of features for the alignment, we used simultaneous alignment and classification. We restricted the alignments to full in-plane rotations with respect to the reference Z-axis.”

      Recommendation 3: The authors should include the version of Alphafold that they used to perform the structural predictions. Predictions, especially for multimers, have improved in the newest version, and it could be expected that further improvements will occur in the future. Including the version used here will act as a timestamp.

      We have now updated the methods to include the version:

      pg 14: “AlphaFold predictions of 300 AA long dimer fragments with 50 AA overlap were generated using ColabFold 4 that uses a modified version of AlphaFold2. To run the large number of sequences we used a customized script called alphascreen (version 1.15) available at https://github.com/samichaaban/alphascreen.”
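
      The fragmenting scheme described here (300-residue windows sharing 50 residues with their neighbours) can be sketched as follows. This is a minimal illustration under stated assumptions, not the alphascreen implementation: the function name `tile_sequence`, the 1-based inclusive coordinates, and the choice to let the last fragment run short rather than padding or shifting it are all hypothetical.

```python
def tile_sequence(sequence, window=300, overlap=50):
    """Split a sequence into windows of `window` residues that share `overlap` residues.

    Returns a list of (start, end, fragment) tuples with 1-based inclusive coordinates.
    The final fragment may be shorter than `window`.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    if not sequence:
        return []
    step = window - overlap
    fragments = []
    start = 0
    while True:
        end = min(start + window, len(sequence))
        fragments.append((start + 1, end, sequence[start:end]))
        if end == len(sequence):
            break
        start += step
    return fragments

# Toy 700-residue sequence (hypothetical): yields residues 1-300, 251-550 and 501-700.
toy_fragments = tile_sequence("A" * 700)
toy_ranges = [(start, end) for start, end, _ in toy_fragments]
```

      Overlapping windows of this kind mean that any stretch of up to 50 residues is fully contained in at least one fragment, so short interaction interfaces are not split across a fragment boundary.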

      Recommendation 4: Figure 2G is not so clear in depicting two offset D bands. The authors could include a more zoomed-out image to make it clearer.

      Response: We have now included a more zoomed out image in the supplement (Fig S3A).

      We updated the figure legend of Fig 2G and Fig S3A: pg 5: “(G) Example where D1 aligns with D2 of a neighboring sub-fiber. Larger view in Fig S3A.”

      pg 20: “(A) Tomogram slice and segmentation where D1 aligns with D2 of a neighboring sub-fiber. The dotted square marks the location of Fig 2G. (B)”

      Recommendation 5: Did the authors attempt to predict the structure of rootletin oligomers? i.e. folding four rootletin fragments at once instead of two? This could be interesting.

      Response: We attempted to predict interactions between all combinations of rootletin fragments. We did this for two-fragment (e.g. CC1+CC1 or CC1+CC2) and four-fragment (e.g. CC1+CC1+CC1+CC1 or CC1+CC1+CC2+CC2) combinations.

      Homodimer combinations (e.g. CC1+CC1) were predicted with most confidence. We did not identify any higher oligomerization. AlphaFold did not identify interactions that were previously proposed in the literature–for example between two CC3 dimers (Ko et al., 2020) or weak interactions between CC2 and CC3 (Yang et al., 2002). These interactions were either not properly predicted or may require additional proteins other than the ones we tested (CCDC102B, CEP68, beta-catenin, ARL2, centlein). 

      We have updated our methods to include our AlphaFold attempts:

      pg 14: “This setup was used to predict interactions for dimeric and oligomeric combinations of rootletin fragments (e.g. CC2+CC2, CC3+CC4, CC1+CC1+CC1+CC1, CC3+CC3+CC4+CC4 etc). Homodimeric and oligomeric combinations were tested with other proteins identified as putative rootletin-binding: CCDC102B, CEP68, beta-catenin, ARL2, centlein. In our hands, only homodimeric rootletin fragment combinations resulted in confident predictions.”

      Reviewer #3 (Public reviews):

      Summary:

      The study offers a compelling molecular model for the organization of rootlets, a critical organelle that links cilia to the basal body. Striations have been observed in rootlets, but their assembly, composition, and function remain unknown. While previous research has explored rootlet structure and organization, this study delivers an unprecedented level of resolution, valuable to the centrosome and cilia field. The authors isolated rootlets from mouse eyes and applied EM to partially purified rootlets (first negative stain, then cryoET). In these micrographs, they observed striations that reached the membranes along the rootlet, but no regular spacing was observed.

      The thickness of the sample and membranes prevented good contrast in the tomograms. Thus they further purified the rootlets using detergent, which allowed them to obtain cryoET micrographs of the rootlets with greater details. The tomograms were segmented and further processed to improve the features of the rootlet structures. From their analysis, they described 3 regular cross-striations and amorphous densities, which are connected perpendicularly to filaments along the length of the rootlets. They propose that various proteins provide the striations and rootletin (mouse homolog of human c-nap1) forms parallel coiled coils that run along the rootlet. Overall their data provide a detailed model for the molecular organization of the rootlet.

      The major strength is that this high-quality study uses state-of-the-art cryo-electron tomography, subtomogram averaging, and image analysis to provide a model of the molecular organization of rootlets. The micrographs are exceptional, with excellent contrast and details, which also implies the sample preparation was well optimized to provide excellent samples for cryo-ET. The manuscript is also clear and accessible.

      To further validate their model, it would have been useful to identify some components in the EM maps through complementary approaches (mass spectrometry, mutants disrupting certain features, CLEM). Some potential candidates are mentioned in the discussion.

      This research marks a significant step forward in our understanding of rootlets' molecular organization.

      Response: We agree with the reviewer that it would be ideal to identify rootlet components in the EM densities using complementary approaches. Prior to submitting the manuscript, we attempted several approaches, the details of which are described below:

      We performed mass spectrometry on our purified rootlets. This identified the rootlet components rootletin and CCDC102B and various axonemal components, due to the association between the rootlet and axoneme. However, due to the limitations in quantifying components using mass spectrometry, we were unable to confidently identify novel rootlet constituents present in quantities comparable to rootletin.

      We further attempted cross-linking mass spectrometry on the rootlets to gain deeper insights to the interactions between rootletin molecules. Unfortunately, this effort resulted in a completely insoluble sample despite extended digestion times, leading to issues with mass spectrometry column clogging and rendering our results inconclusive.

      We attempted to express rootlet components recombinantly and were able to purify fibres, but they did not contain the characteristic repeat pattern seen in native rootlets. We also considered purifying native rootlets from cultured cells, but we were unable to obtain sufficient sample for cryoET imaging.

      We therefore regret that other approaches to validate our model are outside the scope of this current work.

      Reviewer #3 (Recommendations for the authors):

      Recommendation 1: There are some problems with spaces in references in the methods.

      Response: We have thoroughly checked the methods and manuscript for double spaces and corrected this.

      Recommendation 2: Figure 1A, the figure would benefit from more labelling, to show the reader the basal body and nucleus.

      Response: We have now added the labels "basal bodies" and "Nucleus" to the cartoon in Fig 1A.

    1. eLife assessment

      Hepatocellular carcinoma (HCC) is a particularly aggressive form of cancer, with an increasing number of treatment options approved for use in patients over the past decade. However, the biology of HCC and identifiable therapeutic targets have not been as clear, even in the era of molecular oncology. Likewise, the cellular biology of HCC, including the role of intercellular communication, has not been well elucidated. In this compelling study, Dantzer et al. provide fundamental insight into the role of beta-catenin on intercellular communication occurring via extracellular vesicles, with implications for immune evasion in a cancer increasingly being treated using immuno-oncologic agents.

    2. Reviewer #1 (Public Review):

      Summary:

      This study shows a connection between cancer-associated beta-catenin mutations and extracellular vesicle secretion, together with a link between the beta-catenin mutation and the expression of trafficking and exocytosis machinery. The authors used a multidisciplinary approach to explore expression levels of relevant proteins, and single-particle imaging to directly explore the release of extracellular vesicles. These results suggest a role for extracellular vesicles in immune evasion in liver cancer, a role that needs to be further explored in other forms of cancer. I find this work to be compelling and of strong significance.

      Strengths:

      This paper uses multidisciplinary methods to demonstrate a compelling role of beta-catenin mutations in suppressing EV secretion in tumors. The results and imaging are extremely convincing and compelling.

    3. Reviewer #2 (Public Review):

      Summary:

      Dantzer and colleagues are investigating the pivotal role of ß-catenin, a gene that undergoes mutation in various cancer cells, and its influence on promoting the evasion of immune cells. In their initial experiments, the authors developed a HepG2 mutated ß-catenin KD model, conducting transcriptional and proteomic analyses. The results revealed that the silencing of mutated ß-catenin in HepG2 cells led to an up-regulation in the expression of exosome biogenesis genes.

      Furthermore, the researchers verified that these KD cells exhibited an increased production of exosomes, with the mutant form of ß-catenin concurrently decreasing the expression of SDC4 and Rab27a. Intriguingly, applying a GSK inhibitor to the cells resulted in reduced expression of SDC4 and Rab27a. Subsequent findings indicated that mutated ß-catenin actively facilitates immune escape through exosomes, and silencing exosome biogenesis correlates with a decrease in immune cell infiltration. In a crucial clinical correlation, the study demonstrated that patients with ß-catenin mutations exhibited low levels of exosome biogenesis.

      Strengths:

      Overall, the data robustly supports the outlined conclusions, and the study is commendably designed and executed. However, there are a few suggestions for manuscript improvement.

      Weaknesses: No weakness

    4. Reviewer #3 (Public Review):

      Summary:

      In this very important study by Dantzer et al., 'Emerging role of oncogenic b-catenin in exosome biogenesis as a driver of immune escape in hepatocellular carcinoma', the authors define a role for oncogenic b-catenin in exosome biology and explore the link between reduced exosome secretion and tumor immune cell evasion. Using transcriptional and proteomic analysis of hepatocellular carcinoma cells with either oncogenic or wildtype b-catenin, the authors find that oncogenic b-catenin negatively regulates exosome biogenesis.

      The authors provide compelling evidence that oncogenic b-catenin in different hepatocellular carcinoma cells negatively regulates exosome biogenesis and secretion, by downregulation of, amongst others, SDC4 and RAB27A, two proteins involved in exosome biogenesis. The authors corroborate these results by inducing b-catenin activation using CHIR99021 in a hepatocarcinoma cell line with non-oncogenic b-catenin (Huh7 cells). The authors further demonstrate convincingly that reduction in exosome release by hepatocarcinoma spheroids leads to a reduction in immune cell infiltration into the tumor spheroid.

      Strengths:

      This is a very important and well-conceived study that appeals to a readership beyond the field of hepatocarcinoma. The authors demonstrate a compelling link between oncogenic b-catenin and exosome biogenesis. Their results are convincing and supported by well-designed control experiments. The authors included various complementary lines of investigation to verify their findings.

      Weaknesses:

      One limitation of this study is that the mechanistic relationship between exosome release and its effect on immune cells remains to be elucidated. In this context, the authors' conclusions rest on the assumption that hepatocarcinoma immune evasion is based exclusively on the reduced number of exosomes. However, the authors do not compare the composition of exosomes from wildtype and oncogenic backgrounds, which could differ.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      While the role of Rab27 was strongly examined, the hits of the VAMP proteins were not explored in detail. I was wondering if the decrease in the presence of VAMPS directly suggests the final step of membrane fusion in the exocytosis of EVs is what is being impaired. Or if it is other trafficking steps along the EV secretion pathway.

      We appreciate the relevance of this comment and we agree that the decrease of VAMP gene expression in the β-catenin-mutated HepG2 cells could suggest an impairment of the final membrane fusion step in exocytosis of EVs. We have therefore expanded this important point in the discussion (page 10). Indeed, we identified an upregulation of VAMP2, VAMP5 and VAMP8 expressions after mutated β-catenin depletion in the transcriptomic analysis of HepG2 cells. However, these proteins were not detected in the mass spectrometry analysis. Only VAMP3 and VAMP7 proteins were detected in the proteomic analysis without any variation. This is why we didn't focus on this trafficking step, but it could be interesting to explore it further in the future. 

      Reviewer 2:

      (1) In Figure 1F, it is essential to investigate why mass spectrometry analysis indicated no significant changes in SDC4 levels.

      We agree with the reviewer: although we observed a significant alteration of syndecan-4 expression at the mRNA level, we did not observe significant changes in syndecan-4 levels by mass spectrometry. One possible explanation is that heparan sulfate proteoglycans like syndecan-4 exhibit a high degree of structural heterogeneity due to the biosynthetic process that produces linear polysaccharides. This characteristic can reduce the robustness of mass spectrometry analyses, leading to greater variability.

      (2) Figure 2G lacks clarity in explaining how the quantification of MVBs (multivesicular bodies) was conducted.

      We apologize for the lack of clarity in explaining how the quantification of MVBs was conducted in Figure 2G. The Materials and methods section (part 'Electron microscopy-cells', page 23) has been modified to clarify this point.

      (3) In Supplementary Figure 1F, there is a suggestion to highlight exosomes using arrowheads for enhanced clarity.

      According to the reviewer’s suggestions, we added arrowheads on supplementary figure 1F in order to highlight the exosomes (page 16). This indeed improves clarity.

      (4) Figure 3C prompts a question about the peculiar appearance of Actin staining in KD cells, requiring further investigation.

      The peculiar appearance of this intense phalloidin staining between hepatocytes corresponds to bile canaliculi (BC), features of more differentiated HepG2 cells. As phalloidin-stained BC are very bright, this may diminish the visibility of other, thinner actin structures. We decided to change the image of KD cells for a more relevant one (new Figure 3C).

      (5) An intriguing avenue for exploration is suggested in testing how the treatment of a GSK inhibitor on HepG2 cells might impact Rab27a and SDC4 expression.

      We appreciate the relevance of the suggestion to test how treatment of HepG2 cells with a GSK inhibitor might impact Rab27a and SDC4 expression. Following the reviewer's suggestion, experiments were carried out and the data are presented in Author response image 1 below. In HepG2 cells, the GSK inhibitor stabilized the wild-type β-catenin protein but, surprisingly, slightly decreased the mutated form of β-catenin (Author response image 1A). A small increase in the mRNA expression levels of both Rab27a and SDC4 was observed (Author response image 1B). Rab27a protein was also increased upon treatment of HepG2 cells with the GSK inhibitor (Author response image 1C). This increase in expression could be due to the decrease of the mutated form of β-catenin in HepG2 cells, which would be consistent with Rab27a and SDC4 being repressed by the mutated β-catenin.

      Author response image 1.

      Impact of a GSK inhibitor (CHIR99021) on Rab27a and syndecan-4 (SDC4) expression in HepG2 cells. HepG2 cells were treated with 3 µM CHIR99021 or DMSO as control for 48 h. A) Western blot (upper panel) and quantification (lower panel) of wild-type (WT) and mutated (MUT) β-catenin proteins in HepG2 cells treated with DMSO (control) or with CHIR99021. B) qRT-PCR analysis of Rab27a and SDC4 expression in HepG2 cells treated with DMSO (control) or with CHIR99021. C) Western blot (left panel) and quantification (right panel) of Rab27a protein in HepG2 cells treated with DMSO (control) or with CHIR99021. *P<0.05

      Reviewer 3:

      (1) One limitation of this study is that the mechanistic relationship of exosome release and how they affect immune cells remains to be elucidated. In this context, the authors' conclusions rest on the assumption that hepatocarcinoma immune evasion is based exclusively on the reduced number of exosomes. However, the authors do not analyze exosome composition between exosomes of wild-type and oncogenic background, which could be different.

      We agree that the mechanistic relationship of exosome release and how they affect immune cells remains to be elucidated. In the discussion we mentioned that the content of β-catenin-regulated EVs remains to be explored to fully understand their function in the immunomodulation of the tumor microenvironment. Along this line, we have ongoing experiments to analyse the exosomal content in terms of proteins and microRNAs. According to our preliminary results, the exosome composition in HepG2 cells with knock-down of mutated β-catenin seems to differ from that in control HepG2 cells, suggesting that not only the number of exosomes but also their content is involved in the immunomodulation.

      (2) The manuscript would benefit from minor language editing and the introduction from restructuring to enhance clarity.

      The manuscript has now undergone language editing by Professor William A. Thomas (Colby-Sawyer College, New Hampshire). The Acknowledgments have been modified (page 12) to thank Professor William A. Thomas for proofreading the manuscript. The introduction has also been restructured and modified according to the reviewer's suggestions to enhance clarity (page 3).

      (3) I believe that within the abstract, the authors mean 'defect' not 'default' in the sentence: Then, we demonstrated in 3D spheroid models that activation of β-catenin promotes a decrease of immune cell infiltration through a default in exosome secretion.

      We apologize for the mistake between 'default' and 'defect' in the abstract. The abstract has been modified accordingly.

      (4) Within the 'Introduction' part of the manuscript, the authors might consider reviewing and reorganizing the first paragraph for more clarity - I suggest leading with the first three sentences of the second paragraph (HCC is the most...) and then introducing b-catenin and the effects and implications of oncogenic ß-catenin in HCC.

      If the authors prefer the current structure of the 'Introduction', I would like to propose exchanging some of the wording:

      -In line 4: 'despite' instead of 'in front of'? Sentence: Thus, in front of the therapeutic revolution for cancers, with the emergence of immunotherapy and more particularly immune checkpoint inhibitors (anti-PD1, anti-PD-L1)

      -Additionally in line 7: In these tumors, the oncogenic β-catenin is able to set up a microenvironment that favors tumor progression notably by promoting immune escape. Here, 'establish' might be a better choice instead of 'set up'

      -In line 9 I suggest rephrasing the sentence: Few studies have reported that the defect of intercellular communication between cancer cells and immune cells is partly mediated by a decrease of chemokines production leading to a reduction of immune infiltrates.... and maybe adding a reference here.

      The introduction has been altered accordingly. Thanks for these suggestions that helped us to improve our manuscript.

    1. eLife assessment

      The study elucidates a detailed molecular mechanism of the initial stages of transport in the medically relevant Na+-coupled GABA neurotransmitter transporter GAT1 and thus generates useful new insights into this protein family. In particular, it presents convincing evidence for the presence of a "staging binding site" that locally concentrates Na+ ions to increase transport activity, whilst the evidence for how Na+ binding influences larger-scale dynamics is solid.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript authored by Stockner and colleagues delves into molecular simulations of the Na+ binding pathway and the ionic interactions at the two known sodium binding sites, site 1 and site 2. They further identify a patch of two acidic residues in TM6 where Na+ ions seemingly accumulate prior to entry into the vestibule. These results highlight the importance of studying the ion-entry pathways through computational approaches, and the authors also validate some of their findings through experimental work. They observe that sodium binding at site 1 is stabilized by the presence of the substrate in the S1 site; this is particularly vital because the GABA carboxylate is involved in coordinating the Na+ ion, unlike in the monoamine transporters. Binding of sodium to the Na2 site stabilizes the conformation of GAT1 by reducing flexibility among the helical bundles involved in alternating access.

      Strengths:

      The study displays results that are generally consistent with available information from experiments on SLC6 transporters particularly GAT1 and puts forth the importance of this added patch of residues in the extracellular vestibule that could be of importance to the ion permeation in SLC6 transporters. This is a nicely performed study and could be improved if the authors could comment on and fix the following queries.

      Comments on revised version:

      The authors have satisfactorily addressed my comments and this has significantly improved the clarity of the manuscript.

      The only point that I would like to inquire about is the role of EL4 in modulating Na+ entry. In the simulations, do the authors see no role for EL4 in controlling Na+ entry? This is particularly intriguing as some recent studies showed that charged mutations in EL4 of dDAT, SERT and GAT1 are detrimental to substrate entry/uptake. It would therefore be nice to add a short discussion of whether EL4 plays any role in Na+ entry.

    3. Reviewer #2 (Public Review):

      Summary

      Starting from an AlphaFold2 model of the outward-facing conformation of the GAT1 transporter, the authors primarily use state-of-the-art MD simulations to dissect the role of the two Na+ ions that are known to be co-transported with the substrate, GABA (and a co-transported Cl- ion). The simulations indicated that Na+ binding to OF GAT depends on the electrostatic environment. The authors identify an extracellular recruiting site including residues D281 and E283 which they hypothesized to increase transport by locally increasing the available Na+ concentration and thus increasing binding of Na+ to the canonical binding sites NA1 and NA2. The charge-neutralizing double mutant D281A-E283A showed decreased binding in simulations. The authors performed GABA uptake experiments and whole-cell patch clamp experiments that taken together validated the hypothesis that the Na+ staging site is important for transport due to its role in pulling in Na+.

      Detailed analysis of the MD simulations indicated that Na+ binding to NA2 has multiple structural effects: The binding site becomes more compact (reminiscent of induced fit binding) and there is some evidence that it stabilizes the outward-facing conformation.

      Binding to NA1 appears to require the presence of the substrate, GABA, whose carboxylate moiety participates in Na+ binding; thus the simulations predict cooperativity between binding of GABA and Na+ binding to NA1.

      Strengths

      - MD simulations were used to propose a hypothesis (the existence of the staging Na+ site) and then tested with a mutant in simulations AND in experiments. This is an excellent use of simulations in combination with experiments.

      - A large number of repeat MD simulations are generally able to provide a consistent picture of Na+ binding. Simulations are performed according to current best practices and different analyses illuminate the details of the molecular process from different angles.

      - The role of GABA in cooperatively stabilizing Na+ binding to the NA1 site looks convincing and intriguing.

      Weaknesses

      - Assessing the effects of Na+ binding on the large-scale motions of the transporter is more speculative because the PCA does not clearly cover all of the conformational space and the use of an AlphaFold2 model may have introduced structural inconsistencies. For example, it is not clear if movements of the inner gate are due to an AF2 model that's not well packed or really a feature of the open outward conformation.

      - Quantitative analyses are difficult with the existing data; for example, the tICA "free energy" landscape is probably not converged because unbinding events haven't been observed.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study elucidates a detailed molecular mechanism of the initial stages of transport in a medically relevant GABA neurotransmitter transporter GAT1 and thus generates useful new insights for this protein family. In particular, it presents convincing evidence for the presence of a "staging binding site" that locally concentrates Na+ ions to increase transport activity, whilst the evidence for how Na+ binding affects the larger-scale dynamics is solid.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript authored by Stockner and colleagues delves into molecular simulations of the Na+ binding pathway and the ionic interactions at the two known sodium binding sites, site 1 and site 2. They further identify a patch of two acidic residues in TM6 where Na+ ions seemingly accumulate prior to entry into the vestibule. These results highlight the importance of studying the ion-entry pathways through computational approaches, and the authors also validate some of their findings through experimental work. They observe that sodium binding at site 1 is stabilized by the presence of the substrate in the S1 site; this is particularly vital because the GABA carboxylate is involved in coordinating the Na+ ion, unlike in the monoamine transporters. Binding of sodium to the Na2 site stabilizes the conformation of GAT1 by reducing flexibility among the helical bundles involved in alternating access.

      Strengths:

      The study displays results that are generally consistent with available information from experiments on SLC6 transporters particularly GAT1 and puts forth the importance of this added patch of residues in the extracellular vestibule that could be of importance to the ion permeation in SLC6 transporters. This is a nicely performed study and could be improved if the authors could comment on and fix the following queries.

      We thank our reviewer for the overall positive evaluation.

      Weaknesses:

      (1) How conserved is the residue pair D281-E283 in other SLC6 transporters? The authors commented on the presence of these residues in SERT, but it would be nice to know how widespread these residues are in other SLC6 transporters like NET, GlyT, and DAT.

      We have created a sequence alignment of the entire human SLC6 family (Supplementary Figure 1) and found that E283 is polar or charged in all SLC6 transporters. D281 shows a higher level of conservation across the family compared to E283. D281 is negatively charged in approximately 50% of the SLC6 family members, an aspartate in all GABA transporters and a glutamate in all monoamine transporters.

      (2) Further, one would like to see the effect of individual mutations D281A and E283A on transport, surface expression, and EC50 of Na+ to gauge the effect on transport.

      We have carried out experiments to investigate the effects of the individual mutations. The results revealed intermediate effects between WT and the double mutant (D281A-E283A) and showed that the effects mostly align with the degree of conservation, as neutralisation of D281 by alanine has a stronger effect than the E283A mutation. Both single mutants had minimal effects on the sodium dependence of uptake. D281A had a stronger effect on expression, Km and Vmax than E283A. Only D281A reduced surface expression, while E283A is expressed at a level similar to wild-type GAT1.

      (3) A clear figure of the S1 site where Na+ tends to stay prior to Na1 site interactions needs to be provided. Further, it is not entirely clear how access to S1 is altered if the transporter is in an outward-occluded conformation if F294 is blocking solvent access. Please comment.

      We have modified the structural images in Figure 1, 5, 6 and 7 to improve their comprehensibility. We have also added a comment on the role of F294 as part of the outer hydrophobic gate to the discussion. In short, F294 does not occlude the passage to the S1 as long as GAT1 is outward open, and we find that GAT1 is outward open in all sodium binding simulations.

      (4) The p-value of the EC50 differences between GAT1 WT and the GAT1 double mutant needs to be mentioned. The difference in sodium dependence EC50 seems less than twofold, and it would be useful to mention how critical the role of the recruitment site is. Since the transport is not affected, the site could play a transient role in attracting ions.

      We have added p-values or standard deviation to our data.

      (5) It would be very nice to know how K+ ions are attracted by this recruitment site. This could further act as a control simulation to test the preference for Na+ ions among SLC6 members.

      We think that attraction of potassium to the recruitment site is not of relevance, as the residues are at the extracellular side and exposed to bulk, where the concentration of sodium is high (typically 130-150 mM), while the concentration of potassium is very low (3-5 mM). Exploring sodium binding by simulations for all SLC6 members could be interesting, but is clearly outside the scope of this manuscript.

      (6) Some of the important figures are not very clear. For instance, there should be a zoomed-in view of the recruitment site. The current one in Fig. 1b and 1c could be made clearer. Similarly as mentioned earlier the Na residence at the S1 site away from the Na1 and Na2 sites needs to be shown with greater clarity by putting side chain information in Fig. 6d.

      We have modified the structural images in Figure 1, 5, 6 and 7 to improve their comprehensibility.

      (7) The structural features that comprise the two principal components PC1 and PC2 should be described in greater detail.

      We have modified Figure 6 and added images that show the motions along PC1 and PC2. In addition, these are now better explained in the text.

      Reviewer #2 (Public Review):

      Summary:

      Starting from an AlphaFold2 model of the outward-facing conformation of the GAT1 transporter, the authors primarily use state-of-the-art MD simulations to dissect the role of the two Na+ ions that are known to be cotransported with the substrate, GABA (and a co-transported Cl- ion). The simulations indicated that Na+ binding to OF GAT depends on the electrostatic environment. The authors identify an extracellular recruiting site including residues D281 and E283 which they hypothesized to increase transport by locally increasing the available Na+ concentration and thus increasing binding of Na+ to the canonical binding sites NA1 and NA2. The charge-neutralizing double mutant D281A-E283A showed decreased binding in simulations. The authors performed GABA uptake experiments and whole-cell patch clamp experiments that taken together validated the hypothesis that the Na+ staging site is important for transport due to its role in pulling in Na+.

      Detailed analysis of the MD simulations indicated that Na+ binding to NA2 has multiple structural effects: The binding site becomes more compact (reminiscent of induced fit binding) and there is some evidence that it stabilizes the outward-facing conformation.

      Binding to NA1 appears to require the presence of the substrate, GABA, whose carboxylate moiety participates in Na+ binding; thus the simulations predict cooperativity between binding of GABA and Na+ binding to NA1.

      Strengths:

      -  MD simulations were used to propose a hypothesis (the existence of the staging Na+ site) and then tested with a mutant in simulations AND in experiments. This is an excellent use of simulations in combination with experiments.

      -  A large number of repeat MD simulations are generally able to provide a consistent picture of Na+ binding. Simulations are performed according to current best practices and different analyses illuminate the details of the molecular process from different angles.

      -  The role of GABA in cooperatively stabilizing Na+ binding to the NA1 site looks convincing and intriguing.

      We thank the reviewer for the very supportive assessment.

      Weaknesses:

      -  Assessing the effects of Na+ binding on the large-scale motions of the transporter is more speculative because the PCA does not clearly cover all of the conformational space and the use of an AlphaFold2 model may have introduced structural inconsistencies. For example, it is not clear if movements of the inner gate are due to an AF2 model that's not well packed or really a feature of the open outward conformation.

      Based on our data, the long-range effect of sodium binding to GAT1 and the destabilisation of the inner gate are causally linked. PCA separates conformational motions into degrees of freedom and sorts them by the size of the motions. Motions of TM5a were among the two largest motions, which suggests that these are relevant motions. To directly quantify their behaviour, we measured informative distances at the inner gate of GAT1, as shown in Figure 6i,j,k, and separated the data according to the presence of sodium in NA2.

      For the following reasons we exclude that the results are a consequence of structural inconsistencies introduced by AlphaFold2 and therefore not reflecting functionally relevant effects:

      (1) If depending on the model instead of sodium binding, the effects should not be correlated with the presence of sodium in the NA2 binding site.

      (2)  We carried out new simulations starting from the occluded GAT1 structure (Figure 6j,k). The data show that in the occluded state the distance across the inner vestibule and the length of TM5a differ, consistent with our interpretation of the data. As sodium binding fixes GAT1 in the outward-facing state, as it also does in other SLC6 family members (Szöllősi and Stockner, 2022), the distances of the outward-open GAT1 are at the short extreme of the scale, distances of the inward-open state of the cryo-EM structure(s) are at the other extreme, while the occluded conformation of GAT1 shows intermediate values.

      (3)  We have observed the same property in SERT, for which we used experimental structures as starting structure (Gradisch et al., 2024), suggesting that this could be a general mechanism.

      (4)  All available structures from the entire SLC6 family are consistent with structural effects of TM5a in response to bundle domain motions and therefore to binding of sodium to NA2, as it stabilizes the outward-open state, as well as to the transition to the inward-facing conformation.

      - Quantitative analyses are difficult with the existing data; for example, the tICA "free energy" landscape is probably not converged because unbinding events haven't been observed.

      Simulations can always be too short and therefore not fully describe the complete underlying conformational ensemble. We added a statement in the discussion indicating this shortcoming. With respect to the tICA analysis in our manuscript, the tICA approach does not, by design, need long simulations that capture the full binding and unbinding in multiple instances to construct a correct free energy landscape. Instead, the tICA method builds on Markov chain dependencies and relies only on the convergence of transitions between hundreds of conformational microstates and the fluxes between them. The free energy profile derived for the S1, including NA1, TMP and NA2 and up to the salt bridge of the outer gate, is well converged, and we observed many transitions. In contrast, the entry from the recruitment site to the S1 most likely has too low a density of microstates and too small a number of transitions to be considered converged with respect to quantifying the free energy of binding from bulk. We now explain this shortcoming.
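
      The reasoning above, that a free energy landscape can be built from transitions between microstates rather than from complete binding and unbinding events, can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a trajectory that has already been discretized into microstates (for example after tICA and clustering), uses a simple count-based, non-reversible estimate of the transition matrix, and works in units of kT. All function names and the toy trajectory are hypothetical.

```python
import math

def count_transitions(dtraj, n_states, lag=1):
    """Count observed transitions i -> j at the given lag time (in frames)."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1.0
    return counts

def row_normalize(counts):
    """Turn transition counts into a row-stochastic transition matrix."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a microstate has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix

def stationary_distribution(T, n_iter=10000, tol=1e-12):
    """Power iteration for the left eigenvector of T with eigenvalue 1."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

def free_energies(pi, kT=1.0):
    """Free energy per microstate, F_i = -kT ln(pi_i), shifted so min(F) = 0."""
    F = [-kT * math.log(p) for p in pi]
    lowest = min(F)
    return [f - lowest for f in F]

# Hypothetical discretized trajectory over three microstates (toy data only).
toy_dtraj = [0, 0, 1, 0, 1, 1, 2, 1, 0, 0, 1, 2, 2, 1, 0]
T = row_normalize(count_transitions(toy_dtraj, n_states=3))
pi = stationary_distribution(T)
F = free_energies(pi)
```

      In this sketch a sparsely visited region shows up as rows of the count matrix with very few entries, which is the situation the response describes for the path from the recruitment site to the S1: the estimate there depends on a handful of transitions and should not be read quantitatively.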

      Recommendations for the authors:

      Reviewer #1 (Recommendations for The Authors):

      Authors should furnish p-values in the figure legends for experimental results.

      We have added the p-values to text and figure legends.

      Reviewer #2 (Recommendations For The Authors):

      -  Deposit simulation data in a public repository (input files, trajectories (possibly subsampled)).

      We deposited the data in Zenodo (DOI: 10.5281/zenodo.10686813). As we were unable to upload the trajectories to Zenodo, we deposited the starting and end structures of the simulations.

      -  Please include a short discussion of the reliability of using an AF2 model instead of experimental structures. What is expected to be correct/which parts of the structure are potentially incorrect? What makes you think that the AF2 model is a good model of the OF conformation of GAT1?

      Unfortunately, an outward-facing structure of GAT1 is not available. We initially worked with an outward-open homology model of GAT1 based on SERT (built with MODELLER), but the structural differences between SERT and GAT1 are sufficiently large that these models did not behave well in simulations and too frequently could not maintain a sealed inner gate, also forming a channel. In contrast to the SERT-based GAT1 model, the AlphaFold2 model of GAT1 behaved as expected and consistent with the behaviour of SERT in simulations and with general knowledge of protein dynamics from the literature. Based on structural analysis of our simulations and on the comparison to SERT, we could not identify a region of GAT1 that would potentially behave incorrectly or unexpectedly. We added a statement to the discussion on this potential limitation of the use of homology models.

      -  Fig 1a: Na+ densities are not very clear (both due to small size and the transparency). I have a hard time seeing where bulk, 2*bulk regions are --- are you showing "onion shells" of density? Perhaps investigate presenting as cuts through the full density?

      I like the labelling in terms of absolute density and multiples of bulk.

      We have created new images to improve the visualisation of data. The data are shown as onion shells (isosurfaces), with the shells at the indicated densities. This is now clearly stated. Transparency is needed; otherwise, for example, the inner onion shells would not be visible. The cut-through is intuitive, but we could not find a useful plane, as the densities are too extensively distributed in 3D and not on a single plane.

      -  Fig 1h-k: would be clearer if "recruitment site" (TMP?) was indicated in the figure.

      We have created a new image for the recruiting site (Figure 1b,c) and temporary site (Figure 1g) and indicated these two sites as appropriate.

      -  Show time series of Na+ binding with a suitable order parameter (z or distances to NA1 and NA2?) to show how ions bind spontaneously. Mark the different sites. Mark pre- and post-binding parts of trajectories.

      We have added time series for every simulation that shows sodium binding to the NA1 or NA2 to the supplementary information Figure 2a,b,c. These quantify the distances to the recruiting site, the temporary site and the respective sodium binding site.

      -  PCA - how much of the total variance was captured by PC1 and PC2?

      The variance captured by the PCs is shown as eigenvalues in supplementary information Figure 4. PC1 captures about 19% of the variance, PC2 8%.
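
      As a reminder of how eigenvalues translate into the percentages quoted above, the fraction of variance captured by a principal component is its eigenvalue divided by the sum of all eigenvalues of the covariance matrix. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the eigenvalues are toy numbers, not the values from the manuscript.

```python
def explained_variance_fractions(eigenvalues):
    """Fraction of the total variance carried by each principal component.

    Eigenvalues of a covariance matrix are non-negative; components are
    returned sorted from largest to smallest, as is conventional in PCA.
    """
    if any(v < 0 for v in eigenvalues):
        raise ValueError("covariance eigenvalues must be non-negative")
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("total variance must be positive")
    ordered = sorted(eigenvalues, reverse=True)
    return [v / total for v in ordered]

# Hypothetical eigenvalues in arbitrary units (toy data only).
toy_eigenvalues = [4.0, 2.0, 1.0, 1.0]
fractions = explained_variance_fractions(toy_eigenvalues)
```

      The sum must run over all eigenvalues, not only the leading ones; normalising by the first few components alone would overstate how much of the motion PC1 and PC2 describe.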

      -  "We found that the inner hydrophobic gate is dynamic in the absence of Na2" -- is this instability due to the AF2 model or likely realistic? E.g. was similar behaviour ever observed in simulations of the occluded state?

      In simulations of the occluded state we do not see the instabilities that we observed in the outward-open state in the absence of sodium (Figure 6). As these larger-scale fluctuations are not randomly distributed across all simulations starting from the AlphaFold2 models, but are confined to the systems without sodium, they are unlikely to be an effect of the AlphaFold2 model.

      Please note, we have seen comparable behaviour in simulations of SERT starting from experimental structures (Gradisch et al., 2024), therefore suggesting a more general mechanism.

      -  Cooperativity between GABA-binding and Na+ binding to NA1: How would this lead to an experimentally measurable signature, i.e., which experiments could validate this interesting prediction?

      Direct detection of cooperativity is difficult to separate from other effects in experiments, as sodium binding and transport involves NA1 and NA2, NA2 has a higher affinity according to our data, while mutations will not only affect cooperativity, but will also have other effects.

      Conformational changes can also complicate experimental detection, as NA2 stabilises the outward-open conformation, while NA1+GABA binding triggers the transition to the inward-open state. To quantify cooperativity, it would be important to isolate the cooperative effect from all other effects, which is a challenge. Support for cooperativity has been found by (Zhou, Zomot and Kanner, 2006; Meinild and Forster, 2012) using this route. In the first paper the authors make use of lithium, which binds only to the NA2, even though lithium is not merely an NA2-selective ligand that is otherwise identical to sodium. By comparing two GABA concentrations, the authors showed that the sodium dependence of GABA transport is left-shifted at higher GABA concentrations, which is not the case in the absence of lithium. These data are indirect, but consistent with cooperativity between GABA and NA1-bound sodium, as GABA transport mainly reflects binding of sodium to NA1. Similar approaches could be further explored, for example by varying the GABA concentration instead of sodium. Another option could be to create an outward-facing and conformationally locked GAT1 and to measure the cooperativity of sodium and GABA binding using, for example, the scintillation proximity assay. Most likely the assay would also need to be made independent of NA2 binding. We are not aware of such a GABA transporter system.

      -  There are some instances of [SI Figure] or [citation needed] that should be cleaned up.

      We have corrected these instances.

      References

      Gradisch, R. et al. (2024) ‘Ligand coupling mechanism of the human serotonin transporter differentiates substrates from inhibitors’, Nature Communications, 15(1), p. 417. Available at: https://doi.org/10.1038/s41467-023-44637-6.

      Meinild, A.-K. and Forster, I.C. (2012) ‘Using lithium to probe sequential cation interactions with GAT1’, American Journal of Physiology. Cell Physiology, 302(11), pp. C1661-1675. Available at: https://doi.org/10.1152/ajpcell.00446.2011.

      Szöllősi, D. and Stockner, T. (2022) ‘Sodium Binding Stabilizes the Outward-Open State of SERT by Limiting Bundle Domain Motions’, Cells, 11(2), p. 255. Available at: https://doi.org/10.3390/cells11020255.

      Zhou, Y., Zomot, E. and Kanner, B.I. (2006) ‘Identification of a lithium interaction site in the gamma-aminobutyric acid (GABA) transporter GAT-1’, The Journal of Biological Chemistry, 281(31), pp. 22092–22099. Available at: https://doi.org/10.1074/jbc.M602319200.

    1. eLife assessment

      In this potentially important study, the authors report results of QM/MM simulations and kinetic measurements for the phosphoryl-transfer step in adenylate kinase. The results point to the mechanistic proposal that the transition state ensemble is broader in the most efficient form of the enzyme (i.e., in the presence of Mg2+ in the active site) and thus a different activation entropy. With a broad set of computations and experimental analyses, the level of evidence is considered solid by some reviewers. On the other hand, there remain limitations in the computational analyses, especially regarding free energy profiles using different methodologies and the activation entropy, leading some reviewers to the evaluation that the level of evidence is incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      This study investigated the phosphoryl transfer mechanism of the enzyme adenylate kinase, using SCC-DFTB quantum mechanical/molecular mechanical (QM/MM) simulations, along with kinetic studies exploring the temperature and pH dependence of the enzyme's activity, as well as the effects of various active site mutants. Based on a broad free energy landscape near the transition state, the authors proposed the existence of wide transition states (TS), characterized by the transferring phosphoryl group adopting a meta-phosphate-like geometry with asymmetric bond distances to the nucleophilic and leaving oxygens. In support of this finding, kinetic experiments were conducted with Ca2+ ions at different temperatures and pH, which revealed a reduced entropy of activation and unique pH-dependence of the catalyzed reaction.

      Strengths:

      A combined application of simulation and experiments is a strength.

      Weaknesses:

      The conclusion that the enzyme-catalyzed reaction involves a wide transition state is not sufficiently clarified with some concerns about the determined free energy profiles compared to the experimental estimate. (See Recommendations for the authors.)

    3. Reviewer #2 (Public Review):

      Summary:

      The authors report results of QM/MM simulations and kinetic measurements for the phosphoryl-transfer step in adenylate kinase. The main assertion of the paper is that a wide transition state ensemble is a key concept in enzyme catalysis as a strategy to circumvent entropic barriers. This assertion is based on observation of a "structurally wide" set of energetically equivalent configurations that lie along the reaction coordinate in QM/MM simulations, together with kinetic measurements that suggest a decrease of the entropy of activation.

      Strengths:

      The study combines theoretical calculations and supporting experiments.

      Weaknesses:

      The current paper hypothesizes a "wide" transition state ensemble as a catalytic strategy and key concept in enzyme catalysis. Overall, it is not clear the degree to which this hypothesis is fully supported by the data. The reasons are as follows:

      (1) Enzyme catalysis reflects a rate enhancement with respect to a baseline reaction in solution. In order to assert that something is part of a catalytic strategy of an enzyme, it would be necessary to demonstrate from simulations that the activation entropy for the baseline reaction is indeed greater and the transition state ensemble less "wide". Alternatively stated, when indicating there is a "wide transition state ensemble" for the enzyme system - one needs to indicate that is with respect to the non-enzymatic reaction. However, these simulations were not performed and the comparisons not demonstrated. The authors state "This chemical step would take about 7000 years without the enzyme" making it impossible to measure; nonetheless, the simulations of the nonenzymatic reaction would be fairly straight forward to perform in order to demonstrate this key concept that is central to the paper. Rather, the authors examine the reaction in the absence of a catalytically important Mg ion.

      (2) The observation of a "wide conformational ensemble" is not a quantitative measure of entropy. In order to make a meaningful computational prediction of the entropic contribution to the activation free energy, one would need to perform free energy simulations over a range of temperatures (for the enzymatic and non-enzymatic systems). Such simulations were not performed, and the entropy of activation was thus not quantified by the computational predictions. The authors instead use a wider TS ensemble as a proxy for larger entropy, and miss an opportunity to compare directly to the experimental measurements.

    4. Reviewer #3 (Public Review):

      Summary:

      By conducting QM/MM free energy simulations, the authors aimed to characterize the mechanism and transition state for the phosphoryl transfer in adenylate kinase. The qualitative reliability of the QM/MM results has been supported by several interesting experimental kinetic studies. However, the interpretation of the QM/MM results is not well supported by the current calculations.

      Strengths:

      The QM/MM free energy simulations have been carefully conducted. The accuracy of the semi-empirical QM/MM results was further supported by DFT/MM calculations, as well as qualitatively by several experimental studies.

      Weaknesses:

      (1) One key issue is the definition of the transition state ensemble. The authors appear to define this by simply considering structures that lie within a given free energy range from the barrier. However, this is not the rigorous definition of transition state ensemble, which should be defined in terms of committor distribution. This is not simply an issue of semantics, since only a rigorous definition allows a fair comparison between different cases - such as the transition state in an enzyme vs in solution, or with and without the metal ion. For a chemical reaction in a complex environment, it is also possible that many other variables (in addition to the breaking and forming P-O bonds) should be considered when one measures the diversity in the conformational ensemble.

      In the revised ms, the authors included committor analysis. However, the discussion of the result is very brief. In particular, if we use the common definition of the transition state ensemble (TSE) as those featuring the committor around 0.5, the reaction coordinate of the TSE would span a much narrower range than those listed in Table 1. This point should be carefully addressed.

      (2) While the experimental observation that the activation entropy differs significantly with and without the Ca2+ ion is interesting, it is difficult to connect this result with the "wide" transition state ensemble observed in the QM/MM simulations so far. Even without considering the definition of the transition state ensemble mentioned above, it is unlikely that a broader range of P-O distances would explain the substantial difference in the activation entropy measured in the experiment. Since the difference is sufficiently large, it should be possible to compute the value by repeating the free energy simulations at different temperatures, which would lead to a much more direct evaluation of the QM/MM model/result and the interpretation.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is a potentially important study that integrates QM/MM free energy simulations and experimental kinetic analyses to probe the nature of phosphoryl transfer transition state in adenylate kinase. The idea that the transition state ensemble encompasses conformations with substantially different structural features (including the breaking/forming bonds) is interesting and potentially applicable to many other enzyme systems. In the current form, however, the study is considered incomplete since the connection between the putative transition state ensemble from the computations and key experimental observables, such as the activation entropy, is not well established.

      Thank you so much for your great professional work as the senior editor. We thank you and the reviewers for carefully reading our manuscript and for very valuable suggestions. In response, we have performed the recommended additional calculations and modified the manuscript as suggested, in order to improve the connection between the transition state ensemble obtained from simulations and experimental observables. Importantly, the new simulations fully corroborate our original findings, and thanks to your work the revised manuscript is stronger.

      Below are our point-to-point responses:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study investigated the phosphoryl transfer mechanism of the enzyme adenylate kinase, using SCC-DFTB quantum mechanical/molecular mechanical (QM/MM) simulations, along with kinetic studies exploring the temperature and pH dependence of the enzyme's activity, as well as the effects of various active site mutants. Based on a broad free energy landscape near the transition state, the authors proposed the existence of wide transition states (TS), characterized by the transferring phosphoryl group adopting a meta-phosphate-like geometry with asymmetric bond distances to the nucleophilic and leaving oxygens. In support of this finding, kinetic experiments were conducted with Ca2+ ions (instead of Mg2+) at different temperatures, which revealed a negative entropy of activation. Overall, in its present form, the manuscript has more weaknesses in terms of interpretation of the simulation results than strengths, which need to be addressed by the authors.

      We thank the reviewer for carefully reviewing our manuscript and the great suggestions for the revisions. Thanks to these points raised we are able to submit a revised manuscript addressing all questions.

      There are several major concerns:

      First, the authors' claim that the catalytic mechanism of adenylate kinase (Adk) has not been previously studied by QM/MM free energy simulations is somewhat inaccurate. In fact, two different groups have previously investigated the catalytic mechanism of Adk. The first study, cited by the authors themselves, used the string method to determine the minimum free energy profile, but resulted in an unexpected intermediate; note that they obtained a minimum free energy profile, not a minimum energy profile. The second study (Ojeda-May et al., Biochemistry 2021 and Dulko-Smith et al., J Chem Inf Model 2023) overlaps substantially with the present study, but its main conclusions differ from those of the present study. Therefore, a thorough discussion comparing the results of these studies is needed.

      We thank the reviewer for pointing out two additional articles to the one we had discussed. Accordingly, we have changed the claim that the Adk mechanism was not previously studied using QM/MM, and added a discussion of the latter two citations. Notably, although the general outcome is consistent with our results, the conclusions and details of findings differ. The two additional papers agree with our findings of a concerted TS, and not the metastable intermediate as observed in the QM/MM simulation of Shibanuma et al., 2020.

      The difference of the two papers by Nam/Wolf-Watz and our manuscript pointed out by the reviewer is mainly in the interpretation. Importantly, the authors do not primarily focus on the nature of the Transition State for the P-transfer reaction, but on the connection between the chemical and conformational steps. We have extensively reported on the fact that the conformational changes of lid opening and closing are obviously unrelated to the chemical step, see also our free energy landscape in Fig. 1a. Consequently, there cannot be a coupling. We note that our group had extensively studied the lid opening step both experimentally and computationally before. In contrast, we discover here a fundamental concept for rate enhancement by an optimal enzyme: the reduction in the activation entropy by a wide TSE. New experiments were triggered by this finding, that then delivered experimental validation of this concept.

      In the revised version of the manuscript, and according to the reviewer’s suggestion we expanded our discussion to these two additional papers.

      Second, the interpretation of the TS ensemble needs deeper scrutiny. In general, the TS is defined as the hypersurface separating the reactant and product states. Consequently, if a correct reaction coordinate is defined, trajectories initiated at the TS should have equal probabilities of reaching either the reactant or product state; if an approximate reaction coordinate, such as the distance difference used in this study, is used, recrossing may be introduced as a correction into the probabilities. Thus, in order to establish the presence of a wide TS region, it is necessary to characterize the TS ensemble through a commitment analysis across the TS region.

      We thank the reviewer for suggesting to add a commitment analysis to our calculations. The newly performed commitment analysis is shown in Fig. 4b. The corresponding analysis further strengthens our original findings of the wide TS in the fully active enzyme.
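      As a rough illustration of what a commitment (committor) analysis measures, the sketch below uses a toy model and not the QM/MM system of the manuscript. It assumes a hypothetical one-dimensional double-well potential, overdamped Langevin dynamics with unit mobility, and arbitrarily chosen basin boundaries. Many short trajectories are launched from a given value of the reaction coordinate, and the committor is the fraction that reaches the product basin before the reactant basin. Points with a committor near 0.5 belong to the transition state ensemble.

```python
import math
import random


def force(x, height=2.0):
    """Force -dV/dx for the toy double well V(x) = height * (x**2 - 1)**2."""
    return -4.0 * height * x * (x * x - 1.0)


def committor(x0, n_shots=200, kT=1.0, dt=1e-3, basin=0.8,
              max_steps=100_000, seed=0):
    """Estimate the committor at x0 for the toy potential.

    Returns the fraction of committed trajectories that reach the product
    basin (x >= basin) before the reactant basin (x <= -basin).
    All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    noise = math.sqrt(2.0 * kT * dt)  # overdamped Langevin, unit mobility
    to_product = 0
    committed = 0
    for _shot in range(n_shots):
        x = x0
        for _step in range(max_steps):
            x += force(x) * dt + noise * rng.gauss(0.0, 1.0)
            if x >= basin:
                to_product += 1
                committed += 1
                break
            if x <= -basin:
                committed += 1
                break
    return to_product / committed if committed else float("nan")


if __name__ == "__main__":
    # Scan the toy reaction coordinate across the barrier region.
    for x0 in (-0.5, -0.25, 0.0, 0.25, 0.5):
        print(f"x0 = {x0:+.2f}  committor ~ {committor(x0):.2f}")
```

      In this picture, the width of the transition state ensemble is the range of reaction coordinate values for which the committor stays close to 0.5, which is the quantity that allows a fair comparison between systems with and without the metal ion.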

      The relatively flat free energy surface observed near TS in Figures 1c and 2a, may be attributed to the cleavage and formation of P-O bonds relative to the marginally stable phosphorane intermediate, as described in Zhou et al.'s work (Chem Rev 1998, 98:991). This scenario is clearly different from a wide TS ensemble concept. In addition, given the inherent similarity in reactivity of the two oxygens towards the phosphoryl atom, it is reasonable to expect a single TS as shown in Figure 1 - supplement 9, rather than two TSs with a marginally stable intermediate as shown in Figure 1c. Consequently, it remains uncertain whether the elongated P-O bonds observed near the TS and their asymmetry are realistic or potentially an artifact of the pulling/non-equilibrium MD simulations. Further validation in this regard is required.

      The reviewer raises the key issue of how realistic the observation of the wide TSE is, and the possibility of it being a potential artifact of the simulation strategy, and suggests that further validation is required in this regard. According to his/her suggestion, in the revised version we have further validated this key observation by two additional simulations. First, we performed a commitment analysis (see above), and second, we also performed Umbrella Sampling, see Fig. 4a. We consistently observe one wide TSE in the presence of Mg2+, but not in the absence of Mg2+. The fact that this wide TSE is observed with the three strategies (i.e., pulling/nonequilibrium MD, commitment analysis, and umbrella sampling) most likely rules out the possibility of an artifact related to the simulation strategy.

      Third, there are several inconsistencies in the free energy results and their discussion. First, the data from Kerns et al. (Kerns, NSMB, 2015, 22:124) indicate that the ATP/AMP -> ADP/ADP reaction proceeds at a faster rate than the ADP/ADP -> ATP/AMP reaction, suggesting that the ADP/ADP state has a lower free energy (approximately -1.0 kcal/mol) compared to the ATP/ATP state. This contrasts with Figure 1c, which shows a higher free energy of 6.0 kcal/mol for the ATP/ADP state. This discrepancy needs to be discussed.

      The reviewer correctly found our experimental result on the equilibrium of about -1 kcal/mol for ADP/ADP relative to ATP/AMP with Mg. Importantly, that was measured at a pH of 7. With a pKa of about 7.2 for ADP, under these experimental conditions more than 50% is in the monoprotonated state. As we found in our QM/MM simulations, for the monoprotonated state the ADP/ADP is much more stable than ATP/AMP (see Figure 1 – supplement 4, about 8 kcal/mol). In contrast, as shown in Fig. 1c and highlighted by the reviewer, for the nonprotonated state the equilibrium is flipped. Consequently, our QM/MM simulations roughly recapitulate the ensemble equilibrium of substrates/products measured at pH 7.

      We should have described these facts better in the manuscript, and we thank the reviewer for noting this point, as it prompted us to better explain the agreement between experiment and computation for the on-enzyme equilibrium between the substrate and product states (see page 11 in the revised manuscript).
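      The statement that more than 50% of ADP is monoprotonated at pH 7 follows from the Henderson-Hasselbalch relation. A minimal sketch, assuming a single titratable group with the pKa of about 7.2 quoted above (the function name is illustrative):

```python
def fraction_protonated(pH, pKa):
    """Fraction of a single titratable group in its protonated form
    according to the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


if __name__ == "__main__":
    # pH and pKa values taken from the response above.
    print(f"{fraction_protonated(7.0, 7.2):.2f}")
```

      With these inputs the protonated fraction is roughly 0.6, consistent with the claim that the monoprotonated state dominates under the experimental conditions.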

      Furthermore, the barrier for ATP/AMP -> ADP/ADP, calculated to be 20 kcal/mol for the fully charged state, exceeds the corresponding barrier for the monoprotonated state. This cautions against the conclusion that the fully charged state is the reactive state. In addition, the difference in the barrier for the no-Mg2+ system compared to the barriers with Mg2+ is substantially too large (21 kcal/mol from the calculation versus 7 kcal/mol from the experimental values). These inconsistencies raise questions as to their origins, whether they result from the use of the pulling/non-equilibrium MD simulation approach, which may yield unrealistic TS geometries, or from potential issues related to the convergence of the determined free energy values. To address this issue, a comparison of results obtained by umbrella sampling and similar methodologies is necessary.

      We agree that these points need to be clarified. For the resubmission, we performed an umbrella sampling for the fully charged nucleotide with Mg2+, and for the noMg2+ systems, and added these new figures to the manuscript (new Fig. 4). We agree with the reviewer that the obtained free energy profiles from the umbrella sampling are more reliable; the original simulations for the monoprotonated state have larger errors, see Fig. 1, supplement 4. Importantly, we experimentally measured the pH dependence of the reaction in the direction ADP/ADP to AMP and ATP, and hence compare the corresponding barriers in this direction.

      With respect to the comparison of the simulated (9.5 kcal/mol) and the experimental barriers with and without Mg, the experimental barrier difference is 7 kcal/mol for Ca2+ versus no metal, but larger for Mg2+ versus no metal, for which the simulations were performed. The P-transfer with Mg2+ is faster than 500 s-1, meaning the experimental barrier difference between no Mg and Mg is ≥ 11 kcal/mol, which is in quite good agreement with our umbrella sampling barrier differences (Fig. 4a). In response to this reviewer’s question, we added these points to the revised manuscript.
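      To make the link between rates and barriers explicit, the following sketch applies the Eyring equation with a transmission coefficient of 1. The temperature of 298.15 K is an assumption made only for illustration, since the measurement temperature is not stated in this response, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3         # gas constant in kcal/(mol K)
KB_OVER_H = 2.083661912e10   # Boltzmann constant / Planck constant in 1/(s K)


def eyring_barrier(rate, temperature=298.15):
    """Activation free energy (kcal/mol) for a rate constant in 1/s,
    assuming a transmission coefficient of 1."""
    return R_KCAL * temperature * math.log(KB_OVER_H * temperature / rate)


def rate_ratio(barrier_difference, temperature=298.15):
    """Fold change in rate that corresponds to a barrier difference (kcal/mol)."""
    return math.exp(barrier_difference / (R_KCAL * temperature))


if __name__ == "__main__":
    print(f"barrier for 500 1/s: {eyring_barrier(500.0):.1f} kcal/mol")
    print(f"7 kcal/mol  -> rate ratio {rate_ratio(7.0):.2e}")
    print(f"11 kcal/mol -> rate ratio {rate_ratio(11.0):.2e}")
```

      Under these assumptions a rate of 500 s-1 corresponds to a barrier just under 14 kcal/mol, and barrier differences of 7 and 11 kcal/mol correspond to rate ratios of roughly 10^5 and 10^8, respectively.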

      Reviewer #2 (Public Review):

      Summary:

      The authors report the results of QM/MM simulations and kinetic measurements for the phosphoryl-transfer step in adenylate kinase. The main assertion of the paper is that a wide transition state ensemble is a key concept in enzyme catalysis as a strategy to circumvent entropic barriers. This assertion is based on the observation of a "structurally wide" set of energetically equivalent configurations that lie along the reaction coordinate in QM/MM simulations, together with kinetic measurements that suggest a decrease in the entropy of activation.

      We thank the reviewer for the endorsement and the very useful suggestions for improving the manuscript. In response to the questions, we have edited our manuscript accordingly. All suggested additional simulations and analyses further support our original findings.

      Strengths:

      The study combines theoretical calculations and supporting experiments.

      Weaknesses:

      The role(s) of entropy in enzyme catalysis has been discussed extensively in the literature, from the Circe effect proposed by Jencks and many other works. The current paper hypothesizes a "wide" transition state ensemble as a catalytic strategy and key concept in enzyme catalysis. Overall, it is not clear the degree to which this hypothesis is supported by the data. The reasons are as follows:

      (1) Enzyme catalysis reflects a rate enhancement with respect to a baseline reaction in solution. In order to assert that something is part of a catalytic strategy of an enzyme, it would be necessary to demonstrate from simulations that the activation entropy for the baseline reaction is indeed greater and the transition state ensemble less "wide". Alternatively stated, when indicating there is a "wide transition state ensemble" for the enzyme system - one needs to indicate that is with respect to the non-enzymatic reaction. However, these simulations were not performed and the comparisons were not demonstrated.

      We agree with the reviewer that the ideal comparison to address enzyme catalytic power is with the baseline reaction in solution. However, as is the case for many biologically relevant reactions, in solution the reactions are too slow (i.e., have too high barriers) and thus cannot be measured (this reaction would take about 7000 years without the enzyme). Moreover, in many cases, the reaction mechanism in solution is too different from that observed in the enzyme.

      To overcome this problem, another reference reaction is used instead of that in solution, such as a mutant enzyme or the enzyme lacking a key cofactor, hence a non-optimized enzyme. In the present case, this baseline reaction corresponds to the enzyme reaction in the absence of the Mg ion. Consistently, our results clearly show that the reaction without Mg, which displays a larger barrier, has a narrower TS. We also want to highlight the extensive and excellent literature on QM/MM calculations of ATP hydrolysis in solution, which shows narrow transition state ensembles. To mention just a few: Klähn, M., Rosta, E., & Warshel, A. (2006). On the mechanism of hydrolysis of phosphate monoesters dianions in solutions and proteins. Journal of the American Chemical Society, 128(47), 15310–15323. https://doi.org/10.1021/ja065470t; Wang, C., Huang, W., & Liao, J. lou. (2015). QM/MM investigation of ATP hydrolysis in aqueous solution. Journal of Physical Chemistry B, 119(9), 3720–3726. https://doi.org/10.1021/jp512960e.

      (2) The observation of a "wide conformational ensemble" is not a quantitative measure of entropy. In order to make a meaningful computational prediction of the entropic contribution to the activation free energy, one would need to perform free energy simulations over a range of temperatures (for the enzymatic and non-enzymatic systems). Such simulations were not performed, and the entropy of activation was thus not quantified by the computational predictions.

      In the present work we do not intend to quantify entropy from the simulations, since such calculations are known to have errors that are too large. However, even if not strictly quantified, a wider TS ensemble is a proxy for a larger entropy.
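      For context, an activation entropy is typically extracted from the temperature dependence of a rate constant via a linear Eyring plot, ln(k/T) = ln(kB/h) + ΔS‡/R - ΔH‡/(RT). The sketch below illustrates that standard analysis on synthetic rates generated from assumed values of ΔH‡ and ΔS‡. The numbers are not data from the manuscript, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3         # gas constant in kcal/(mol K)
KB_OVER_H = 2.083661912e10   # Boltzmann constant / Planck constant in 1/(s K)


def eyring_rate(temperature, dH, dS):
    """Rate constant (1/s) from activation enthalpy (kcal/mol) and
    activation entropy (kcal/(mol K)), transmission coefficient of 1."""
    return (KB_OVER_H * temperature
            * math.exp(dS / R_KCAL)
            * math.exp(-dH / (R_KCAL * temperature)))


def fit_eyring(temperatures, rates):
    """Least-squares line through ln(k/T) versus 1/T.
    Returns (dH, dS) in kcal/mol and kcal/(mol K)."""
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(k / t) for k, t in zip(rates, temperatures)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    dH = -slope * R_KCAL
    dS = (intercept - math.log(KB_OVER_H)) * R_KCAL
    return dH, dS


if __name__ == "__main__":
    temps = [283.15, 293.15, 303.15, 313.15]
    # Synthetic rates from assumed parameters (illustrative only).
    rates = [eyring_rate(t, dH=15.0, dS=-0.010) for t in temps]
    dH_fit, dS_fit = fit_eyring(temps, rates)
    print(f"dH = {dH_fit:.2f} kcal/mol, dS = {dS_fit * 1000:.1f} cal/(mol K)")
```

      The same fit applied to computed free energy barriers at several temperatures is the kind of calculation the reviewer refers to.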

      (3) The authors indicate that lid-opening, essential for product release, and not P-transfer is the rate-limiting step in the catalytic cycle and Mg2+ accelerates both steps. How is it certain that the kinetic measurements are reporting on the chemical steps of the reaction, and not other factors such as metal ion binding or conformational changes?

      These questions were indeed the absolutely critical ones we needed to answer early on in studying how adenylate kinase accelerates the reaction by more than 14 orders of magnitude. This was done by a combination of pre-steady-state and steady-state experiments together with NMR dynamics, published in (Kerns et al., 2015) and described at the beginning of this manuscript in Fig. 1a. We agree with the reviewer that for many other enzymes such an experimental examination of all microscopic steps of the enzymatic cycle has not been performed, leading to the risk of misinterpreting observed kinetic rates.
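      The figure of more than 14 orders of magnitude can be checked with simple arithmetic from two numbers given in this response: an uncatalysed reaction time of about 7000 years and a P-transfer rate above 500 s-1 with Mg2+. The sketch assumes that the 7000 years can be read as the inverse of the uncatalysed rate constant; reading it as a half-life is offered as an alternative. The function name and this interpretation are illustrative assumptions.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def orders_of_magnitude(k_enzyme, uncatalysed_years, as_half_life=False):
    """log10 of the rate enhancement k_enzyme / k_uncatalysed.

    The uncatalysed time is interpreted either as 1/k (default)
    or as a half-life, ln(2)/k.
    """
    t_seconds = uncatalysed_years * SECONDS_PER_YEAR
    k_uncat = (math.log(2.0) if as_half_life else 1.0) / t_seconds
    return math.log10(k_enzyme / k_uncat)


if __name__ == "__main__":
    print(f"{orders_of_magnitude(500.0, 7000.0):.1f} orders of magnitude")
```

      Because 500 s-1 is a lower bound on the enzymatic rate, the result is a lower bound on the rate enhancement.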

      (4) The authors explore different starting states for the chemical steps of the reaction (e.g., different metal ion binding and protonation states), and conclude that the most reactive enzyme configuration is the one with the more favorable reaction-free energy barrier. However, it is not clear what is the probability of observing the system in these different states as a function of pH and metal ion concentration without performing appropriate pKa and metal ion binding calculations. This was not done, and hence these results seem somewhat inconclusive.

      As noted by the reviewer, in the present work our aim was to compare the chemical step of the reaction in different metal ion and protonation states. Our computational results show that the most reactive enzyme configuration is the nonprotonated state with Mg2+ in our forward reaction.

      We actually know what the probability of the metal-bound states is for this enzyme. The experimental data were described in (Kerns et al., 2015): we directly determined the concentration needed to fully occupy the Mg site with Mg or Ca, so no metal binding calculations are needed, as the experiments are a direct measurement. From our x-ray structures we know the accurate binding site, and also see full occupancy. This is also true for the pH dependence of the chemical step, measured in this manuscript and shown in Fig. 5b. We note that the excellent agreement between our simulations and the experiments is one of the key features of the current manuscript. As stated in the manuscript, we analyzed the pH dependence of the P-transfer step and showed that the rate increases with higher pH in the presence of Ca2+, while without a metal the opposite trend is observed. These results further support the QM/MM results showing that the fully charged nucleotide state was the most reactive in the presence of the metal, whereas in the absence of the cation, only the monoprotonated nucleotides (low pH) were reactive.

      Reviewer #3 (Public Review):

      Summary:

      By conducting QM/MM free energy simulations, the authors aimed to characterize the mechanism and transition state for the phosphoryl transfer in adenylate kinase. The qualitative reliability of the QM/MM results has been supported by several interesting experimental kinetic studies. However, the interpretation of the QM/MM results is not well supported by the current calculations.

      Strengths:

      The QM/MM free energy simulations have been carefully conducted. The accuracy of the semiempirical QM/MM results was further supported by DFT/MM calculations, as well as qualitatively by several experimental studies.

      We thank the reviewer for the positive comments on the manuscript, particularly highlighting the support of the QM/MM results by additional DFT/MM calculations and several experiments.

      Weaknesses:

      (1) One key issue is the definition of the transition state ensemble. The authors appear to define this by simply considering structures that lie within a given free energy range from the barrier. However, this is not the rigorous definition of transition state ensemble, which should be defined in terms of committor distribution. This is not simply an issue of semantics, since only a rigorous definition allows a fair comparison between different cases - such as the transition state in an enzyme vs in solution, or with and without the metal ion. For a chemical reaction in a complex environment, it is also possible that many other variables (in addition to the breaking and forming P-O bonds) should be considered when one measures the diversity in the conformational ensemble.

      We thank the reviewer for noting this issue and for this great suggestion, as this led to a strengthening of the key findings in the revised manuscript version.  According to his/her suggestion, we performed a commitment analysis to properly define the TSE and compare the results between the enzyme in the presence/absence of Mg2+ (see new Fig. 4b).  The results further strengthen our previous finding and interpretation of a wider TSE for the reaction with Mg relative to without Mg.

      (2) While the experimental observation that the activation entropy differs significantly with and without the Ca2+ ion is interesting, it is difficult to connect this result with the "wide" transition state ensemble observed in the QM/MM simulations so far. Even without considering the definition of the transition state ensemble mentioned above, it is unlikely that a broader range of P-O distances would explain the substantial difference in the activation entropy measured in the experiment. Since the difference is sufficiently large, it should be possible to compute the value by repeating the free energy simulations at different temperatures, which would lead to a much more direct evaluation of the QM/MM model/result and the interpretation.

      In the present work we do not intend to quantify entropy from the simulations, since such calculations are known to have errors that are too large. However, even if not strictly quantified, a wider TS ensemble is a proxy for a larger entropy. We believe that the additional committor calculations and the umbrella sampling (new Fig. 4a) strongly support our original findings, and are better suited to that purpose than repeating the free energy simulations at different temperatures.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      Make sure consistent units are used, either kJ/mol or kcal/mol.

      Thanks, we made the changes.

      In the case of the mono-protonated simulation, where does the proton transfer between AD(T)P and AMP occur in both the forward and reverse reactions? It is worthwhile to note that the proton transfer may take place at different reaction coordinate values (between the two reactions), as it is not explicitly defined in the reaction coordinate. In this context, it is also necessary to discuss how to combine the results to generate a single free energy profile.

      We agree with the reviewer on this point. Accordingly, we have analyzed for the monoprotonated reaction when (or where in terms of RC) the proton transfer occurs in both forward and reverse reactions. The proton transfer occurs at -0.7 of the reaction coordinate (average value, figures 3-supplement 5 e and f).

      The methods section needs improvements:

      (1) Computational setup of the system: Were the systems neutralized? If so, what types of ions were used, and how many of them were included? If systems were not neutralized, discuss a potential artifact in the results. In addition, if the system for the reverse reaction (and no-Mg2+ systems) was prepared separately, provide details regarding their preparation.

      We thank the reviewer for noting this issue. Accordingly, we have provided the requested additional details of the computational setup in the revised version.

      (2) Simulation parameters: Clarify how non-bonded interactions were treated in both MM and QM/MM simulations. For the QM/MM simulation, specify the time step used, whether the Shake was applied; whether the NPT simulations were performed, and any other relevant parameters.

      We thank the reviewer for noting this issue. Accordingly, we have provided the requested additional details of the simulation parameters.

      (3) Free energy determination strategy: Describe how the two profiles (forward and reverse profiles) were combined and provide a theoretical justification for this approach. Additionally, include a comment on whether Jarzynski's equality is directly applicable to the NPT simulation.

      At the reviewer's request, in the revised version of the manuscript we have described how the two profiles were combined and provided a theoretical justification for this approach.
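
      For reference, the standard relations that a forward/reverse combination of this kind usually rests on can be written as follows. This is a generic sketch, not necessarily the procedure used in the manuscript; $W_F$ and $W_R$ denote the work accumulated along forward and reverse trajectories, $\beta = 1/k_B T$, and $\Delta F$ is the free energy difference between the end states (all symbols are introduced here for illustration).

$$
e^{-\beta \Delta F} = \left\langle e^{-\beta W_F} \right\rangle_F, \qquad
e^{+\beta \Delta F} = \left\langle e^{-\beta W_R} \right\rangle_R, \qquad
\frac{P_F(W)}{P_R(-W)} = e^{\beta (W - \Delta F)}
$$

      The first two are Jarzynski's equality applied in each direction; by Jensen's inequality they imply the bounds $-\langle W_R \rangle_R \le \Delta F \le \langle W_F \rangle_F$. The third is the Crooks fluctuation theorem, which relates the two work distributions and is one common basis for bidirectional estimators.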

      Reviewer #3 (Recommendations For The Authors):

      See recommendations in the Public Review regarding the analysis of transition state ensemble and activation entropy.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Response to reviewer #1:

      We thank the reviewer for the further recommendations for improving our presentation. We would like to carefully address the remaining concerns of the reviewer.

      (1) I realize now that I didn't make my point clear enough, which was that as far as I know there is no reason to believe that an oscillatory state cannot be induced with synaptic depression as with spike frequency adaptation when used in the context of the author's model. I'm fine with how the authors have distinguished their model from R&T 2015, but I think the more interesting question is whether there is any reason to believe that STD is not equally capable of doing all the things mentioned in this paper as SFA, and if not why not. I would like the authors to go out on a limb and address this, if only with a few sentences in the discussion. 

      Thank you for pointing this out again. In response to your query regarding the comparison between STD and SFA in generating bump sweeps, we have done simulations based on STD. The results showed that both STD and SFA are capable of inducing bi-directional sweeps. However, (based on our simulations) only SFA can produce uni-directional sweeps. The absence of uni-directional sweeps based on STD may be due to the subtle yet important differences between the two mechanisms. Specifically, STD modulates the neural activity by weakening the recurrent connections, which theoretically can only inhibit recurrent inputs, while SFA can attenuate all forms of excitatory inputs, including external inputs. However, since we did not exhaustively explore the entire parameter space, we cannot conclude that STD is incapable of producing uni-directional sweeps. Future simulations are required.

      Following the Reviewer’s suggestion, we added a few sentences discussing the distinctions between STD and SFA in generating theta sweeps in the CANN, in lines 432 to 440 of the Discussion section:

      “Based on our simulation, both STD and SFA show the ability to produce bi-directional sweeps within a CANN model, with SFA uniquely enabling uni-directional sweeps in the absence of external theta inputs. This difference might be due to the lack of exhaustive exploration of the entire parameter space. However, it might also be attributable to the subtle yet important theoretical distinctions between STD and SFA. Specifically, STD attenuates the neural activity through a reduction in recurrent connection strength, whereas SFA provides inhibitory input directly to the neurons, potentially impacting all excitatory inputs. These differences might explain the diverse dynamical behaviors observed in our simulations. Future experiments could clarify these distinctions by monitoring changes in synaptic strength and inhibitory channel activation during theta sweeps.”
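
      The distinction drawn here can be made concrete with the forms in which the two mechanisms are commonly written for a CANN. This is a sketch under assumed notation ($U$ synaptic input, $r$ firing rate, $J$ recurrent kernel, $\rho$ neural density, $p$ available synaptic resource, $V$ adaptation current, and $\beta$, $m$, $\tau_d$, $\tau_v$ mechanism parameters), not a reproduction of the manuscript's equations.

$$
\text{STD:}\quad \tau \frac{\partial U(x,t)}{\partial t} = -U(x,t) + \rho \int J(x,x')\, p(x',t)\, r(x',t)\, dx' + I^{ext}(x,t), \qquad
\tau_d \frac{\partial p(x,t)}{\partial t} = 1 - p(x,t) - \tau_d\, \beta\, p(x,t)\, r(x,t)
$$

$$
\text{SFA:}\quad \tau \frac{\partial U(x,t)}{\partial t} = -U(x,t) + \rho \int J(x,x')\, r(x',t)\, dx' - V(x,t) + I^{ext}(x,t), \qquad
\tau_v \frac{\partial V(x,t)}{\partial t} = -V(x,t) + m\, U(x,t)
$$

      In the first system $p$ multiplies only the recurrent term, so $I^{ext}$ reaches the neuron unattenuated; in the second, $V$ is subtracted from the summed drive, recurrent and external alike.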

      (2) I appreciate the inclusion of the experimental data in Fig 6a (though I don't find the left-most panel very useful). I also understand what the authors are trying to convey with plots in 6c and 6c. However, I don't find the text that was added above very helpful at all. I was hoping for a simpler demonstration of the effect, by plotting a series of sequential sweeps (cell index vs time, with color indicating firing rate, as in Fig 2d) in the case of both the slow speed and fast speed regimes. Here, vertical lines could mark the individual theta cycles and the firing of individual cells, showing the constancy of the former but change of the latter. 

      Thank you for your constructive feedback. It seems there might be a misunderstanding in our previous explanation, for which we apologize. The phenomenon we want to elucidate is not an increase in the theta frequency as detected in LFPs, but rather the dependence of the slope of phase precession on the animal's movement speed. Due to phase precession, the oscillation frequency of place cells as the animal traverses the field is higher than the theta frequency. A plot like Fig. 2d will not make this point clearer, since it shows the baseline theta frequency (i.e., theta sweeps as we claimed previously). A straightforward way of thinking about this point is, as we added previously: “…The faster the animal runs, the faster the extra half cycle can be accomplished. Consequently, the firing frequency will increase more (a steeper slope in Fig. 6c red dots) than the baseline frequency”. We hope this clarification addresses the concerns raised.
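
      The argument can be put as a short worked relation, under the simplifying assumptions (introduced here) that the animal crosses a place field of length $L$ at constant speed $v$ and that the cell's firing phase advances by half a theta cycle over the crossing. The crossing takes $T = L/v$, during which the LFP completes $f_\theta T$ cycles and the cell completes $f_\theta T + \tfrac{1}{2}$, so

$$
f_{\text{cell}} = \frac{f_\theta T + \tfrac{1}{2}}{T} = f_\theta + \frac{v}{2L}
$$

      The excess over the baseline theta frequency therefore grows linearly with running speed, while $f_\theta$ itself is unchanged.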

      (3) This is still confusing to me. I just don't understand how the *phase* of the oscillating activity bump has anything to do with the movement of the animal. I would like to see a plot of the sweeps (again, cell index vs time, with color indicating the firing rate) before and after inactivation for short and long duration inactivation. Perhaps I am not understanding or appreciating how the bump recovers after inactivation and how this is related to the motion of the animal. 

      Thank you for pointing this out again. Once the inactivation is removed, the activity bump naturally re-emerges at the input location (which has moved forward in the meantime) and then starts to sweep again as it did before the inactivation. Single-cell phase precession and population theta sweeps are two sides of the same coin (if all cells start at roughly the same phase in theta cycles). If the reviewer accepts this, then at the new location, the activity bump sweeps again (around the new location), and therefore phase precession resumes at a further advanced phase, since phase codes the position as the animal traverses the place field.

      (4) I am glad the authors are spending more time discussing this phenomenon, but I am unsure of their explanation: for a sweep moving at constant speed, neurons all along the path will be equally affected (inhibited), so where does the bias for suppressing the "end" neurons come from? 

      While it may appear that neurons along the path are equally inhibited as the bump sweeps over them, our model incorporates external inputs with Gaussian profiles. These inputs bias neurons closer to the input location, resulting in fewer activations in neurons further away from the input position.

      (5) Here I was hoping that the authors might comment on what they suspect happens when the animal starts (or stops) moving, and how the network shifts from tracking regime to oscillatory regime (or vice versa), as is typically seen in experimental data (see for example, Kay et al., 2020, fig 4b,c). My apologies for not making this point clearer. 

      Thank you for pointing this out. In our model, we observed that when the animal stops, the network continues to generate theta oscillations near the input location, albeit with reduced amplitude (so the network dynamics look like those of the tracking regime). However, we hypothesize that when the animal pauses its movement for long enough (immobile but awake states), sensory input into the hippocampus also decreases, which is similar to removing external inputs in our model. In this case, the activity bump spontaneously moves away, resembling the phenomenon of replay (see also Romani & Tsodyks 2015).

      Regarding the experimental data (Kay et al.), it indeed appears that theta sweeps decoded from neural activity become less pronounced when the mouse moves at slower speeds. This observation could potentially correspond to a decrease in the amplitude of bump oscillations when external inputs associated with movement are halted but not entirely removed in our model. However, in experiments, when the mouse's movement slows down, hippocampal activity no longer oscillates at theta frequency, making it challenging to decode theta sweeps.

      We appreciate your clarification on this point and recognize the importance of further investigating how our model can accurately replicate the transition between tracking and oscillatory regimes observed in experimental data.

    2. eLife assessment

      This study provides valuable new insights on how a prevailing model of hippocampal sequence formation can account for recent data, including forward and backward sweeps, as well as constant cycling of sweeps across different arms of a T-maze. The convincing evidence presented in support of this work relies on classical analytical and computational techniques about continuous attractor networks.

    3. Reviewer #1 (Public Review):

      Continuous attractor networks endowed with some sort of adaptation in the dynamics, whether that be through synaptic depression or firing rate adaptation, are fast becoming the leading candidate models to explain many aspects of hippocampal place cell dynamics, from hippocampal replay during immobility to theta sequences during run. Here, the authors show that a continuous attractor network endowed with spike frequency adaptation and subject to feedforward external inputs is able to account for several previously unaccounted aspects of theta sequences, including (1) sequences that move both forwards and backwards, (2) sequences that alternate between two arms of a T-maze, (3) speed modulation of place cell firing frequency, and (4) the persistence of phase information across hippocampal inactivations.

      I think the main results of the paper (findings (1) and (2)) are likely to be of interest to the hippocampal community, as well as to the wider community interested in mechanisms of neural sequences. In addition, the manuscript is generally well written and the analytics are impressive. However, several issues should be addressed, which I outline below.

      Major comments:

      In real data, population firing rate is strongly modulated by theta (i.e., cells collectively prefer a certain phase of theta - see review paper Buzsaki, 2002) and largely oscillates at theta frequency during run. With respect to this cyclical firing rate, theta sweeps resemble "Nike" check marks, with the sweep backwards preceding the sweep forwards within each cycle before the activity is quenched at the end of the cycle. I am concerned that (1) the summed population firing rate of the model does not oscillate at theta frequency, and (2) as the authors state, the oscillatory tracking state must begin with a forward sweep. With regards to (1), can the authors show theta phase spike preference plots for the population to see if they match data? With regards to (2), can the authors show what happens if the bump is made to sweep backwards first, as it appears to do within each cycle?

      I could not find the width of the external input mentioned anywhere in the text or in the table of parameters. The implication is that it is unclear to me whether, during the oscillatory tracking state, the external input is large compared to the size of the bump, so that the bump lives within a window circumscribed by the external input and so bounces off the interior walls of the input during the oscillatory tracking phase, or whether the bump is continuously pulled back and forth by the external input, in which case it could be comparable to the size of the bump. My guess based on Fig 2c is that it is the latter. Please clarify and comment.

      I would argue that the "constant cycling" of theta sweeps down the arms of a T-maze was roughly predicted by Romani & Tsodyks, 2015, Figure 7. While their cycling spans several theta cycles, it nonetheless alternates by a similar mechanism, in that adaptation (in this case synaptic depression) prevents the subsequent sweep of activity from taking the same arm as the previous sweep. I believe the authors should cite this model in this context and consider the fact that synaptic depression and spike frequency adaptation are both possible mechanisms for this phenomenon. But I certainly give the authors credit for showing how this constant cycling can occur across individual theta cycles.

      The authors make an unsubstantiated claim in the paragraph beginning with line 413 that the Tsodyks and Romani (2015) model could not account for forwards and backwards sweeps. Both the firing rate adaptation and synaptic depression are symmetry breaking models that should in theory be able to push sweeps of activity in both directions, so it is far from obvious to me that both forward and backward sweeps are not possible in the Tsodyks and Romani model. The authors should either prove that this is the case (with theory or simulation) or excise this statement from the manuscript.

      The section on the speed dependence of theta (starting with line 327) was very hard to understand. Can the authors show a more graphical explanation of the phenomenon? Perhaps a version of Fig 2f for slow and fast speeds, and point out that cells in the latter case fire with higher frequency than in the former?

      I had a hard time understanding how the Zugaro et al., (2005) hippocampal inactivation experiment was accounted for by the model. My intuition is that while the bump position is determined partially by the location of the external input, it is also determined by the immediate history of the bump dynamics as computed via the local dynamics within the hippocampus (recurrent dynamics and spike rate adaptation). So that if the hippocampus is inactivated for an arbitrary length of time, there is nothing to keep track of where the bump should be when the activity comes back on line. Can the authors please explain more how the model accounts for this?

      Can the authors comment on why the sweep lengths oscillate in the bottom panel of Fig 5b, starting at time 0.5 seconds before crossing the choice point of the T-maze? Is this oscillation in sweep length another prediction of the model? If so, it should definitely be remarked upon and included in the discussion section.

      Perhaps I missed this, but I'm curious whether the authors have considered what factors might modulate the adaptation strength. In particular, might rat speed modulate adaptation strength? If so, this would have interesting predictions for theta sequences at low vs high speeds.

      I think the paper has a number of predictions that would be especially interesting to experimentalists but are sort of scattered throughout the manuscript. It would be beneficial to have them listed more prominently in a separate section in the discussion. This should include (1) a prediction that the bump height in the forward direction should be higher than in the backward direction, (2) predictions about bimodal and unimodal cells starting with line 366, (3) prediction of another possible kind of theta cycling, this time in the form of sweep length (see comment above), etc.

    4. Reviewer #2 (Public Review):

      In this work, the authors elaborate on an analytically tractable, continuous-attractor model to study an idealized neural network with realistic spiking phase precession/procession. The key ingredient of this analysis is the inclusion of a mechanism for slow firing-rate adaptation in addition to the otherwise fast continuous-attractor dynamics. The latter continuous-attractor dynamics classically arises from a combination of translation invariance and nonlinear rate normalization.

      For strong adaptation/weak external input, the network naturally exhibits internally generated travelling-wave dynamics along the attractor with some characteristic speed. For small adaptation/strong external stimulus, the network recovers the classical externally driven continuous-attractor dynamics. Crucially, when both adaptation and external input are moderate, there is a competition between the internally generated and externally generated mechanisms, leading to an oscillatory tracking regime. In this tracking regime, the population firing profile oscillates around the neural field tracking the position of the stimulus. The authors demonstrate by a combination of analytical and computational arguments that oscillatory tracking corresponds to realistic phase precession/procession. In particular, the authors can account for the emergence of unimodal and bimodal cells, as well as some other experimental observations with respect to the dependence of phase precession/procession on the animal's locomotion.
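
      The interplay described above can be explored with a small numerical sketch. The code below is a minimal illustration, not the authors' implementation: it Euler-integrates a ring-shaped continuous attractor network with divisive normalization, an adaptation current of strength `m`, and a Gaussian external input of amplitude `alpha` moving at speed `v_ext`. The function name, the parameter values, and the Gaussian input profile are all assumptions chosen for illustration.

```python
import numpy as np


def circ_dist(u, v):
    """Signed shortest distance u - v on a ring of circumference 2*pi."""
    return (u - v + np.pi) % (2 * np.pi) - np.pi


def simulate_cann_sfa(m=0.0, alpha=0.2, v_ext=0.0, steps=2000, n=128):
    """Euler-integrate a ring CANN with firing-rate adaptation.

    Returns the preferred positions x, the final synaptic input U, and the
    position of the bump peak at every time step.
    """
    # Illustrative values (assumptions, not taken from the manuscript).
    a, k, j0 = 0.4, 1.0, 1.0          # tuning width, inhibition, coupling
    tau, tau_v, dt = 1.0, 50.0, 0.1   # fast and slow time constants, step

    x = np.linspace(-np.pi, np.pi, n, endpoint=False)
    # Translation-invariant Gaussian recurrent weights on the ring.
    d = circ_dist(x[:, None], x[None, :])
    J = j0 / (np.sqrt(2 * np.pi) * a) * np.exp(-d**2 / (2 * a**2))

    U = alpha * np.exp(-circ_dist(x, 0.0) ** 2 / (4 * a**2))  # start at input
    V = np.zeros(n)                                           # adaptation
    centers = np.empty(steps)
    for t in range(steps):
        z = circ_dist(v_ext * t * dt, 0.0)  # current input position
        I_ext = alpha * np.exp(-circ_dist(x, z) ** 2 / (4 * a**2))
        Up = np.maximum(U, 0.0)
        # Divisive normalization; sums over neurons replace rho * integral.
        r = Up**2 / (1.0 + k * np.sum(Up**2))
        dU = (-U + J @ r - V + I_ext) / tau
        dV = (-V + m * U) / tau_v
        U = U + dt * dU
        V = V + dt * dV
        centers[t] = x[np.argmax(U)]
    return x, U, centers
```

      Varying `m` against `alpha` while plotting `centers` over time is one way to look for the three behaviors the review distinguishes; which regime appears for which values depends on the assumed parameters and is not asserted here.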

      The strengths of this work are at least three-fold: 1) Given its simplicity, the proposed model has a surprisingly large explanatory power of the various experimental observations. 2) The mechanism responsible for the emergence of precession/procession can be understood as a simple yet rather illuminating competition between internally driven and externally driven dynamical trends. 3) Amazingly, and under some adequate simplifying assumptions, a great deal of analysis can be treated exactly, which allows for a detailed understanding of all parametric dependencies. This exact treatment culminates with a full characterization of the phase space of the network dynamics, as well as the computation of various quantities of interest, including characteristic speeds and oscillating frequencies.

      As mentioned by the authors themselves, the main limitation of this work is that it deals with a very idealized model, and it remains to be seen how the proposed dynamical behaviors would persist in more realistic models. For example, the model is based on a continuous attractor model that assumes perfect translation-invariance of the network connectivity pattern. Would the oscillating tracking behavior persist in the presence of connection heterogeneities? Another limitation is that the system needs to be tuned to exhibit oscillation within the theta range and that this tuning involves a priori variable parameters such as the external input strength. Is the oscillating-tracking behavior overly sensitive to input strength variations? The authors mentioned that an external pacemaker can serve to drive oscillation within the desired theta band but there is no evidence presented supporting this. A final and perhaps secondary limitation has to do with the choice of parameters, namely the time constant of neural firing, which is chosen around 3ms. This seems rather short given that the fast time scale of rate models (excluding synaptic processes) is usually given by the membrane time constant, which is typically about 15ms. I suspect this latter point can easily be addressed.

    1. eLife assessment

      This solid study assesses a novel mitochondrial inhibitor in combination with the BCL-2 inhibitor venetoclax, with the aim to increase its activity in acute myeloid leukemia. It provides valuable findings of combinatorial efficacy using preclinical models, confirming the overall importance of targeting oxidative phosphorylation to overcome venetoclax resistance in acute myeloid leukemia, and could be strengthened through mechanistic studies demonstrating drug specificity, pharmacodynamic efficacy studies in vivo to test clinical utility and extended statistical analyses of the results. The study is of interest to hematologists because it addresses a key biomedical issue in acute myeloid leukemia (venetoclax resistance) and provides data regarding the safety and activity of a novel inhibitor of the mitochondrial polymerase addressed in combination with venetoclax.

    2. Reviewer #1 (Public Review):

      This study exploits a novel agent (IMT) that inhibits mitochondrial activity in combination with venetoclax. While the concept is not novel, the agent is novel (an inhibitor of the mitochondrial RNA polymerase, described in Nature in other tumor models), and the quest for safe mitochondrial inhibitors is highly warranted. The strength is the in vivo activity data shown in CLDX and in one of the two AML PDX models tested, and the apparent safety of the combination. However, the impact on survival is impressive in CLDX but not in PDX, and it is unclear why the Ven-sensitive PDX is resistant to the combination (the opposite of what the cell line data show). The paper is lacking mechanistic data beyond Seahorse and standard apoptosis assays, and even the transcriptome analysis from PDX cells is poorly analyzed. There is no real evidence that this agent overcomes Ven resistance, which could be tested, for example, in primary AML cells. Finally, no on-target pharmacodynamic endpoints are measured in vivo to support the activity of the compound on mitochondrial activity at the doses used (which are safe). These multiple weaknesses significantly reduce my enthusiasm for this manuscript.

      The cell line data show additive/synergistic effects of IMT and Ven on cell viability in p53-WT cells. However, no mechanisms of synergy beyond OCR are shown, which is a missed opportunity.

      No data are shown in primary AML cells in vitro. This could address venetoclax-resistant AML cells with distinct genomic profiles.

      The in vivo CLDX model (MV4;11) data is quite impressive, showing reduction of tumor burden and meaningful extension of survival in the combination cohort. It is unclear why venetoclax, used at the highest dose normally used in vivo (100mg/kg), did not show any impact on survival in this Ven-sensitive model. It is disappointing that no biomarkers of mitochondrial activity (for example, simple pAMPK, or levels of mitochondrial subunits) are shown to support on-target pharmacodynamic activity. However, efficacy in human PDX is less impressive: for example, in Fig 6C the combination has extended survival from 96 to 112 days, possibly due to early stopping of treatment (around day 30), and no extension of survival is seen in another PDX in Fig 7. Still, this is indicative of combinatorial activity in TP53-mutant PDX. There is, however, an unexplained discrepancy: the in vitro studies show no impact of the combination in TP53-mutant cells and synergy in TP53-wt cells, whereas the in vivo findings are the opposite. Overall, the activity of the combination is modest. The safety is encouraging, but again, no pharmacodynamic measurements are shown to support that IMT at least partially inhibited mitochondrial activity in AML cells.

      In the Discussion, the statement that inhibition of POLRMT can overcome venetoclax resistance is not supported by the data, as no additive effects are seen in vitro in TP53-mutant cells, and no other resistant models (such as primary AML cells) are tested. In vivo, as stated above, there is some activity in TP53-mutant PDX, but this alone cannot be used to justify this strong statement. Also, the sentence that "...we were able to reduce the tumor burden in all (cell- and patient-derived) xenografted mice treated with a combination of IMT and venetoclax" is not supported by the data in Fig 7.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Arabanian and colleagues presents studies showing how inhibition of mitochondrial transcription and replication with a novel inhibitor of the mitochondrial polymerase, IMT, can promote AML cell death in combination with the Bcl2 inhibitor venetoclax. They further show that this combinatorial efficacy is evident in vivo in both the AML cell line MV411 and in a PDX model. Given the multiple studies showing the importance of Oxphos in maintaining AML cell survival, the current studies provide an additional strategy to inhibit Oxphos and thus improve the therapeutic management of AML.

      Strengths:

      A novel aspect of this work is that IMT is a new class of mitochondrial inhibitor that acts by inhibiting the mitochondrial polymerase. In addition, the demonstration of therapeutic efficacy both in vitro and in vivo (including with PDX), together with some data showing minimal toxicity, adds to the impact of this work. Their overall conclusion that IMT increases the potency of venetoclax in treating AMLs is supported.

      Weaknesses:

      There are several deficiencies that should be addressed to substantiate the rigor and impact of this study. Of most importance, they need to show that IMT actually inhibits the mitochondrial polymerase in AML cells, and there are additional concerns with their models that if addressed would improve the ability of IMT to be developed clinically.

    1. eLife assessment

      This valuable study aims to present a mathematical theory for why the periodicity of the hexagonal pattern of grid cell firing would be helpful for encoding 2D spatial trajectories. The idea is supported by solid evidence, but some of the comparisons of theory to the experimental data seem incomplete, and the reasoning supporting some of the assumptions made should be strengthened. The work would be of interest to neuroscientists studying neural mechanisms of spatial navigation.

    2. Reviewer #1 (Public Review):

      Rebecca R.G. et al. set out to determine the function of grid cells. They present an interesting case claiming that the spatial periodicity seen in the grid pattern provides a parsimonious solution to the task of coding 2D trajectories using sequential cell activation. Thus, this work defines a probable function grid cells may serve (here, the function is coding 2D trajectories), and proves that the grid pattern is a solution to that function. This approach is somewhat reminiscent in concept of previous works that defined a probable function of grid cells (e.g., path integration) and constructed normative models for that function that yield a grid pattern. However, the model presented here gives clear geometric reasoning to its case.

      Stemming from 4 axioms, the authors present a concise demonstration of the mathematical reasoning underlying their case. The argument is interesting and the reasoning is valid, and this work is a valuable addition to the ongoing body of work discussing the function of grid cells.

      However, the case uses several assumptions that need to be clearly stated as assumptions, clarified, and elaborated on: Most importantly, the choice of grid function is grounded in two assumptions:
      (1) that the grid function relies on the activation of cell sequences, and
      (2) that the grid function is related to the coding of trajectories. While these are interesting and valid suggestions, since they are used as the basis of the argument, the current justification could be strengthened (references 28-30 deal with the hippocampus, reference 31 is interesting but cannot hold the whole case).

      The work further leans on the assumption that sequences in the same direction should be similar regardless of their position in space; it is not clear why that should necessarily be the case, or how the position is extracted for similar sequences in different positions. The authors also strengthen their model with the requirement that grid cells should code for infinite space. However, the grid pattern anchors to borders and might be used to code navigated areas locally. Finally, referencing ref. 14, the authors claim that there is no existing theory for the emergence of grid cell firing that unifies the experimental observations on periodic firing patterns and their distortions under a single framework. However, that same reference presents exactly that - a mathematical model of pairwise interactions that unifies experimental observations. The authors should clarify this point.

    3. Reviewer #2 (Public Review):

      Summary:

      In this work, the authors consider why grid cells might exhibit hexagonal symmetry - i.e., for what behavioral function might this hexagonal pattern be uniquely suited? The authors propose that this function is the encoding of spatial trajectories in 2D space. To support their argument, the authors first introduce a set of definitions and axioms, which then lead to their conclusion that a hexagonal pattern is the most efficient or parsimonious pattern one could use to uniquely label different 2D trajectories using sequences of cells. The authors then go through a set of classic experimental results in the grid cell literature - e.g. that the grid modules exhibit a multiplicative scaling, that the grid pattern expands with novelty or is warped by reward, etc. - and describe how these results are either consistent with or predicted by their theory. Overall, this paper asks a very interesting question and provides an intriguing answer. However, the theory appears to be extremely flexible and very similar to ideas that have been previously proposed regarding grid cell function.

      Major strengths:

      The general idea behind the paper is very interesting - why *does* the grid pattern take the form of a hexagonal grid? This is a question that has been raised many times; finding a truly satisfying answer is difficult but of great interest to many in the field. The authors' main assertion that the answer to this question has to do with the ability of a hexagonal arrangement of neurons to uniquely encode 2D trajectories is an intriguing suggestion. It is also impressive that the authors considered such a wide range of experimental results in relation to their theory.

      Major weaknesses:

      One major weakness I perceive is that the paper overstates what it delivers, to an extent that I think it can be a bit confusing to determine what the contributions of the paper are. In the introduction, the authors claim to provide "mathematical proof that ... the nature of the problem being solved by grid cells is coding of trajectories in 2-D space using cell sequences. By doing so, we offer a specific answer to the question of why grid cell firing patterns are observed in the mammalian brain." This paper does not provide proof of what grid cells are doing to support behavior or provide the true answer as to why grid patterns are found in the brain. The authors offer some intriguing suggestions or proposals as to why this might be based on what hexagonal patterns could be good for, but I believe that the language should be clarified to be more in line with what the authors present and what the strength of their evidence is.

      Relatedly, the authors claim that they find a teleological reason for the existence of grid cells - that is, discover the function that they are used for. However, in the paper, they seem to instead assume a function based on what is known and generally predicted for grid cells (encode position), and then show that for this specific function, grid cells have several attractive properties.

      There is also some other work that seems very relevant, as it discusses specific computational advantages of a grid cell code but was not cited here: https://www.nature.com/articles/nn.2901.

      A second major weakness was that some of the claims in the section in which they compared their theory to data seemed either confusing or a bit weak. I am not a mathematician, so I was not able to follow all of the logic of the various axioms, remarks, or definitions to understand how the authors got to their final conclusion, so perhaps that is part of the problem. But below I list some specific examples where I could not follow why their theory predicted the experimental result, or how their theory ultimately operated any differently from the conventional understanding of grid cell coding. In some cases, it also seemed that the general idea was so flexible that it perhaps didn't hold much predictive power, as extra details seemed to be added as necessary to make the theory fit with the data.

      I don't quite follow how, for at least some of their model predictions, the 'sequence code of trajectories' theory differs from the general attractor network theory. It seems from the introduction that these theories are meant to serve different purposes, but the section of the paper in which the authors claim that various experimental results are predicted by their theory makes this comparison difficult for me to understand. For example, in the section describing the effect of environmental manipulations in a familiar environment, the authors state that the experimental results make sense if one assumes that sequences are anchored to landmarks. But this sounds just like the classic attractor-network interpretation of grid cell activity - that it's a spatial metric that becomes anchored to landmarks.

      It was not clear to me why their theory predicted the field size/spacing ratio or the orientation of the grid pattern to the wall.

      I don't understand how repeated advancement of one unit to the next, as shown in Figure 4E, would cause the change in grid spacing near a reward.

      I don't follow how this theory predicts the finding that the grid pattern expands with novelty. The authors propose that this occurs because the animals are not paying attention to fine spatial details, and thus only need a low-resolution spatial map that eventually turns into a higher-resolution one. But it's not clear to me why one needs to invoke the sequence coding hypothesis to make this point.

      The last section, which describes that the grid spacing of different modules is scaled by the square root of 2, says that this is predicted if the resolution is doubled or halved. I am not sure if this is specifically a prediction of the sequence coding theory the authors put forth though since it's unclear why the resolution should be doubled or halved across modules (as opposed to changed by another factor).
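
      The arithmetic behind the square-root-of-2 statement can be checked with a short sketch. It assumes that "resolution" means the number of firing fields per unit area of an ideal hexagonal lattice; this reading is an assumption, and the manuscript's definition may differ. The function names and the base spacing are hypothetical.

```python
import math

def hex_field_density(spacing):
    """Fields per unit area of an ideal hexagonal lattice.

    Each lattice point owns a rhombic unit cell of area spacing**2 * sqrt(3) / 2.
    """
    return 2.0 / (math.sqrt(3.0) * spacing ** 2)

def spacing_for_density(density):
    """Inverse of hex_field_density: the spacing that yields a given density."""
    return math.sqrt(2.0 / (math.sqrt(3.0) * density))

base_spacing = 1.0  # hypothetical spacing of the coarser module (arbitrary units)
finer_spacing = spacing_for_density(2.0 * hex_field_density(base_spacing))
ratio = base_spacing / finer_spacing  # spacing ratio between the two modules
```

      Under this assumption, doubling the field density shrinks the spacing by a factor of the square root of 2. The sketch does not explain why the density should change by a factor of exactly two between modules, which is the open point raised above.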

    4. Reviewer #3 (Public Review):

      The manuscript presents an intriguing explanation for why grid cell firing fields do not lie on a lattice whose axes are aligned to the walls of a square arena. This observation, by itself, merits the manuscript's dissemination to the journal's audience.

      The presentation is quirky (but keep the quirkiness!).

      But let me recast the problem presented by the authors as one of combinatorics. Given repeating, spatially separated firing fields across cells, one obtains temporal sequences of grid cells firing. Label these cells by integers from $[n]$. Any two cells firing in succession should uniquely identify one of six directions (from the hexagonal lattice) in which the agent is currently moving.

      Now, take the symmetric group $\Sigma$ of cyclic permutations on $n$ elements. We ask whether there are cyclic permutations of $[n]$ such that

      So, for instance, $(4,2,3,1)$ would not be counted as a valid permutation of $(1,2,3,4)$, as $(2,3)$ and $(1,4)$ are adjacent.

      Furthermore, given $[n]$, are there two distinct cyclic permutations such that no adjacencies are preserved when considering any pair of permutations (among the triple of the original ordered sequence and the two permutations)? In other words, if we consider the permutation required to take the first permutation into the second, that permutation should not preserve any adjacencies.

      Key question: is there any difference between the solution to the combinatorics problem sketched above and the result in the manuscript? Specifically, the text argues that for $n=7$ there is only one solution.

      Ideally, one would strive to obtain a closed-form solution for the number of such permutations as a function of $n$.
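
      For small $n$, the combinatorial question above can be explored by brute force. The sketch below assumes that the omitted condition is "no two elements that are cyclically adjacent in the original order $(1,\dots,n)$ are adjacent in the permutation", which is inferred from the $(4,2,3,1)$ example. Rotations and mirror images are counted once, and all function names are hypothetical.

```python
from itertools import combinations, permutations

def cyclic_edges(seq):
    """Unordered pairs of cyclically adjacent elements in seq."""
    n = len(seq)
    return {frozenset((seq[i], seq[(i + 1) % n])) for i in range(n)}

def adjacency_free_cycles(n):
    """Cyclic orders of 1..n sharing no adjacency with (1, 2, ..., n)."""
    base = cyclic_edges(tuple(range(1, n + 1)))
    found = []
    for rest in permutations(range(2, n + 1)):
        if rest[0] > rest[-1]:
            continue  # mirror image of a cycle that is visited elsewhere
        cycle = (1,) + rest  # fixing 1 in first position removes rotations
        if not (cyclic_edges(cycle) & base):
            found.append(cycle)
    return found

def disjoint_pairs(n):
    """Pairs of adjacency-free cycles that also share no adjacency with each other."""
    cycles = adjacency_free_cycles(n)
    return [(a, b) for a, b in combinations(cycles, 2)
            if not (cyclic_edges(a) & cyclic_edges(b))]
```

      Calling `adjacency_free_cycles(7)` and `disjoint_pairs(7)` would give counts to compare with the manuscript's claim for $n=7$, keeping in mind that the result depends on the assumed condition and on whether mirror images are identified.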

    1. eLife assessment

      Notch1 is expressed uniformly throughout the mouse endocardium during the initial stages of heart valve formation, yet it remains unclear how Notch signaling is activated in specific regions to induce valve formation. To answer this question, the authors used a combination of in vivo and ex vivo experiments in mice to demonstrate ligand-independent activation of Notch1 by circulation-induced mechanical stress and provide partially convincing evidence for stimulation of a novel mechanotransduction pathway involving post-translational modification of mTORC2 and Protein Kinase C (PKC) upstream of Notch1. While these findings represent an important advance in our understanding of Notch1-mediated valve formation, data supporting the main claims are incomplete.

    2. Joint Public Review:

      The overall goal of this manuscript is to understand how Notch signaling is activated in specific regions of the endocardium, including the OFT and AVC, that undergo EMT to form the endocardial cushions. Using dofetilide to transiently block circulation in E9.5 mice, the authors show that Notch receptor cleavage still occurs in the valve-forming regions due to mechanical shear stress as Notch ligand expression and oxygen levels are unaffected. The authors go on to show that changes in lipid membrane structure activate mTOR signaling, which causes phosphorylation of PKC and Notch receptor cleavage.

      The strengths of the manuscript include the dual pharmacological and genetic approaches to block blood flow in the mouse, the inclusion of many controls including those for hypoxia, the quality of the imaging, and the clarity of the text. However, several weaknesses were noted surrounding the main claims where the supporting data are incomplete.

      PKC - Notch1 activation:

      (1) Does deletion of Prkce and Prkch affect blood flow, and if so, might that be suppressing Notch1 activation indirectly?

      (2) It would be helpful to visualize the expression of prkce and prkch by in situ hybridization in E9.5 embryos.

      (3) PMA experiments: Line 223-224: A major concern is related to the conclusion that "blood flow activates Notch in the cushion endocardium via the mTORC2-PKC signaling pathway". To make that claim, the authors show that a pharmacological activation with a potent PKC activator, PMA, rescues NICD levels in the AVC in dofetilide-treated embryos. This claim would also need proof that a lack of blood flow alters the activity of mTORC2 to phosphorylate the targets of PKC phosphorylation. Also, this observation does not explain the link between PKC activity and Notch activation.

      (4) In addition, the authors hypothesise that shear stress lies upstream of PKC and Notch activation, and that because shear stress is highest at the valve-forming regions, PKC and Notch activity is localised to the valve-forming regions. Since PMA treatment affects the entire endocardium which expresses Notch1, NICD should be seen in areas outside of the AVC in the PMA+dofetilide condition. Please clarify.

      Lipid Membrane:

      (1) It is not clear how the authors think that the addition of cholesterol changes the lipid membrane structure or alters Cav-1 distribution. Can this be addressed? Does adding cholesterol make the membrane more stiff? Does increased stiffness result from higher shear stress?

      (2) The loss of blood flow apparently affects Cav1 membrane localization and causes a redistribution from the luminal compartment to lateral cell adhesion sites. Cholesterol treatment of dofetilide-treated hearts (lacking blood flow) rescued Cav1 localization to luminal membrane microdomains and rescued NICD expression. It remains unclear how the general addition of cholesterol would result in a rescue of regionalized membrane distribution within the AVC and in high-shear stress areas.

      (3) The authors do not show the entire heart in that rescue treatment condition (cholesterol in dofetilide-treated hearts). Also, there is no quantification of that rescue in Figure 4B. Currently, only overview images of the heart are shown but high-resolution images on a subcellular scale (such as electron microscopy) are needed to resolve and show membrane microdomains of caveolae with Cav1 distribution. This is important because Cav-1 could have functions independent of caveolae (e.g. Lolo et al., https://doi.org/10.1038/s41556-022-01034-3).

      Figure Legends, missing data, and clarity:

      (1) The number of embryos used in each experiment is not clear in the text or figure legends. In general, figure legends are incomplete (for instance in Figure 1).

      (2) Line 204: The authors refer to unpublished endocardial RNAseq data from E9.5 embryos. These data must be provided with this manuscript if it is referred to in any way in the text.

      (3) Figure 1 shows Dll4 transcript levels, which do not necessarily correlate with protein levels. It would be important to show quantifications of these patterns as Notch/Dll4 levels are cycling and may vary with time and between different hearts.

      (4) Line 212-214: The authors describe cardiac cushion defects due to the loss of blood flow and refer to some quantifications that are not completely shown in Figure 3. For instance, quantifications for cushion cellularity and cardiac defects at three hours (after the start of treatment?) are missing.

      (5) Related to Figure 5. The work would be strengthened by quantification of the effects of dofetilide and verapamil on heartbeat at the doses applied. Is the verapamil dosage used here similar to the dose used in the clinic?

      Overstated Claims:

      (1) The authors claim that the lipid microstructure/mTORC2/PKC/Notch pathway is responsive to shear stress, rather than other mechanical forces or myocardial function. Their conclusions seem to be extrapolated from various in vitro studies using non-endocardial cells. To solidify this claim, the authors would need additional biomechanical data, which could be obtained via theoretical modelling or using mouse heart valve explants. This issue could also be addressed by the authors simply softening their conclusions.

      (2) Line 263-264: In the discussion, the authors conclude that "Strong fluid shear stress in the AVC and OFT promotes the formation of caveolae on the luminal surface of the endocardial cells, which enhances PKCε phosphorylation by mTORC2." This link was shown rather indirectly, rather than by direct evidence, and therefore the conclusion should be softened. For example, the authors could state that their data are consistent with this model.

      (3) In the Discussion, it says: "Mammalian embryonic endocardium undergoes extensive EMT to form valve primordia while zebrafish valves are primarily the product of endocardial infolding (Duchemin et al., 2019)." In the paper cited, Duchemin and colleagues described the formation of the zebrafish outflow tract valve. The zebrafish atrioventricular valve primordium is formed via partial EMT through Dll-Notch signaling (Paolini et al. Cell Reports 2021) and the collective cell migration of endocardial cells into the cardiac jelly. Then, a small subset of cells that have migrated into the cardiac jelly give rise to the valve interstitial cells, while the remainder undergo mesenchymal-to-endothelial transition and become endothelial cells that line the sinus of the atrioventricular valve (Chow et al., doi: 10.1371/journal.pbio.3001505). The authors should modify this part of the Discussion and cite the relevant zebrafish literature.

    1. Reviewer #2 (Public Review):

      Summary:

      The authors demonstrated that maternal choline supplementation (MCS) improved spatial memory, reduced a marker of hyperexcitability/epilepsy (FosB expression), and reduced oxidative stress (as measured by restored NeuN expression) in an Alzheimer's disease mouse model. This multidisciplinary study spanned behavior, EEG, and histological measures and constituted a large amount of work. Overall, the results supported that MCS does have important effects on hippocampal function, which may substantially impact human AD.

      Strengths:

      The strength of the group was the ability to monitor the incidence of interictal spikes (IIS) over the course of 1.2-6 months in the Tg2576 Alzheimer's disease model, combined with meaningful behavioral and histological measures. The authors were able to demonstrate MCS had protective effects in Tg2576 mice, which was particularly convincing in the hippocampal novel object location task.

      Weaknesses:

      Although choline deficiency was associated with impaired learning and elevated FosB expression, consistent with increased hyperexcitability, IIS was reduced with both low and high choline diets. Although not necessarily a weakness, it complicates the interpretation and requires further evaluation.

    1. eLife assessment

      In this fundamental work, the authors demonstrated that maternal choline supplementation improved spatial memory, reduced hyperexcitability, and restored NeuN expression in a familial Alzheimer's disease mouse model. Interestingly, choline deficiency increased mortality while paradoxically reducing hyperexcitability. Through behavioral, electrophysiological, and histological measures, the authors present convincing evidence supporting the significant role of maternal choline supplementation in protecting hippocampal functions vulnerable to Alzheimer's disease.

    2. Joint Public Review:

      Chartampila et al. describe the effect of early-life choline supplementation on cognitive functions and epileptic activity in a mouse model of Alzheimer's disease. The cognitive abilities were assessed by the novel object recognition test and the novel object location test, performed in the same cohort of mice at 3 months and 6 months of age. Neuronal loss was tested using NeuN immunoreactivity, and neuronal hyperexcitability was examined using deltaFosB and video-EEG recordings, providing multi-level correlations between these different parameters.

      The study was designed as a 6-month follow-up, with repeated behavioral and EEG measurements through disease development and multilevel correlations providing valuable and interesting findings on AD progression and the effect of early-life choline supplementation. Moreover, the behavioral data that suggest an adverse effect of low choline in WT mice are interesting and important also beyond the context of AD, highlighting the dramatic effect of diet on the phenotypes of animals.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Weaknesses:

      The readability could be improved.

      We have gone through the paper again and tried to revise the text to improve readability.

      Reviewer #1 (Recommendations For The Authors):

      (1) Thank you for adding the discrimination ratio. However, as Fig 2 and 3 depict the same experimental data, consider harmonizing the presentation (symbols and colors) and consolidating the Figs for clarity.

      This is an excellent point but it is actually very hard to harmonize symbols and colors because the data are divided in different ways. Upon considering this further, we actually don’t want to make the symbols and colors the same because it would be misleading. For example, WT and Tg training and testing session data are divided into grey and white throughout Figure 2, but in Figure 3, training and testing session data are pooled. To color code them grey and white in Figure 3 might make it seem that in Figure 3 training and testing were separated.

      (2) Fig 5 is missing

      We are not sure why Figure 5 was absent since it was present in our copy of the submitted PDF. We have double-checked and in the revised manuscript we are sure Figure 5 is included.

      (3) Fig 6 add raw data for WT

      We have added raw WT data. Revised Figure 6 includes the raw data in part A4.

      (4) Fig 7 add raw data for WT

      We have added raw WT data. Revised Figure 7 includes the raw data in part A4.

    1. eLife assessment

      In this important work, a quantitative analysis method for three-dimensional morphogenetic processes during embryonic development is introduced. The proposed method is a pipeline combining several methods, allowing quantitative analysis of developmental processes without cell segmentation and tracking. Upon application of their method, the authors obtain convincing evidence that ascidian gastrulation is a two-step process. This work should be of interest to a broad range of developmental biologists who aim to obtain a quantitative understanding of morphogenesis.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors propose a new method to quantitatively assess morphogenetic processes during organismal development. They apply their method to ascidian morphogenesis and thus find that gastrulation is a two-step process.

      The method applies to morphogenetic changes of surfaces. It consists of the following steps: first, surface deformations are quantified based on microscopy images without requiring cellular segmentation and tracking. This is achieved by mapping, at each time point, a polygonal mesh initially defined on a sphere to the surface of the embryo. The mapped vertices of this polygonal mesh then serve as (Lagrangian) markers for the embryonic surface. From these, one can infer the deformation of the surface, which can be expressed in terms of the strain tensor at each point of the surface. Changes in the strain tensor give the strain rate, which captures the morphogenetic processes. Second, at each time point, the strain rate field is decomposed in terms of spherical harmonics. Finally, the evolution of the weights of the various spherical harmonics in the decomposition is analysed via wavelet analysis. The authors apply their workflow to ascidian development between 4 and 8.7 hpf. From their analysis, they find clear indications for gastrulation and neurulation and identify two sub-phases of gastrulation, namely, endoderm invagination and 'blastopore closure'.
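
      The spherical-harmonic step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a scalar field sampled on a regular angular grid, uses one common sign convention for real spherical harmonics, stops at degree 1, and integrates with a simple midpoint rule. All names are hypothetical.

```python
import math

C0 = 0.5 / math.sqrt(math.pi)             # normalisation of the degree-0 mode
C1 = math.sqrt(3.0 / (4.0 * math.pi))     # normalisation of the degree-1 modes

def real_sph_harm_low(theta, phi):
    """Real spherical harmonics up to degree 1 (theta: polar, phi: azimuth)."""
    return {
        (0, 0): C0,
        (1, -1): C1 * math.sin(theta) * math.sin(phi),
        (1, 0): C1 * math.cos(theta),
        (1, 1): C1 * math.sin(theta) * math.cos(phi),
    }

def project(field, n_theta=200, n_phi=200):
    """Weights of field(theta, phi) on each mode, by midpoint quadrature."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    coeffs = {key: 0.0 for key in real_sph_harm_low(0.0, 0.0)}
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            weight = math.sin(theta) * d_theta * d_phi  # surface element on the unit sphere
            value = field(theta, phi)
            for key, y in real_sph_harm_low(theta, phi).items():
                coeffs[key] += value * y * weight
    return coeffs

# Synthetic field with known weights 2 and 3 on modes (0, 0) and (1, 0)
coeffs = project(lambda theta, phi: 2.0 * C0 + 3.0 * C1 * math.cos(theta))
```

      Repeating such a projection at every time point gives one time series per mode, which is the kind of signal the wavelet step would then analyse.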

      Strengths:

      The combination of various tools allows the authors to obtain a quantitative description of the developing embryo without the necessity of identifying fiducial markers. Visual inspection shows that their method works well. Furthermore, this quantification then allows for an unbiased identification of different morphogenetic phases.

      Weaknesses:

      At times, the explanation of the method is hard to follow, unless the reader is already familiar with concepts like level-set methods or wavelet transforms. Furthermore, the software for performing the determination of Lagrangian markers or the subsequent spectral analysis does not seem to be available to the readers.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors proposed a method to quantitatively analyze 3D live imaging data of early developing embryos, using ascidian development as an example. For this purpose, the previously proposed level set method was used to computationally track the temporal evolution of reference points introduced on the embryo surface. Then, from the obtained three-dimensional trajectories, the velocity field was obtained, from which the strain rate field was computed according to the idea of continuum mechanics. The information in the strain rate field was reduced to a scalar field, determined by taking the square root of the sum of the squares of the eigenvalues. The scalar field is then further decomposed into a spectrum using spherical harmonics. In this paper, the authors focused on the modes with lower order with real coefficients. The time evolution of these modes was analyzed using wavelet transforms. The authors claimed that the results reflected the developmental stages of ascidian embryos.
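
      The reduction of the strain rate field to a scalar can be sketched as follows. The sketch assumes a generic 3x3 velocity gradient at a single point, whereas the manuscript works on the embryo surface, so it is an illustration rather than the authors' code. It relies on the fact that, for a symmetric tensor, the sum of squared eigenvalues equals the sum of squared entries, so no eigendecomposition is needed. Function names are hypothetical.

```python
import math

def strain_rate(grad_v):
    """Symmetric part of a 3x3 velocity gradient given as nested lists."""
    return [[0.5 * (grad_v[i][j] + grad_v[j][i]) for j in range(3)]
            for i in range(3)]

def strain_rate_magnitude(grad_v):
    """Square root of the sum of squared eigenvalues of the strain rate tensor.

    For a symmetric tensor this equals its Frobenius norm.
    """
    e = strain_rate(grad_v)
    return math.sqrt(sum(e[i][j] ** 2 for i in range(3) for j in range(3)))

simple_shear = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
rigid_rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
shear_value = strain_rate_magnitude(simple_shear)
rotation_value = strain_rate_magnitude(rigid_rotation)
```

      A rigid rotation gives zero, as expected for a measure of deformation. Because the scalar discards the sign and orientation of the strain rate, it cannot by itself distinguish contraction from extension.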

      Strengths:

      In this way, this manuscript proposes a pipeline of analyses combining various methods. The strength of this method lies in its ability to quantitatively analyze the deformation of the entire embryo without the requirement for cellular segmentation and tracking.

      Weaknesses:

      The limitations of the proposed analysis pipeline are not clearly indicated. Claims such as the identification of developmental stages need more quantitative validation. In addition, it is not clearly shown how the proposed method can distinguish between the superposition of individual cell behavior and the collective behavior of cells.

    1. eLife assessment

      This important study describes a neural circuit contributing to two behavioral processes affecting pathogen avoidance in the nematode C. elegans. The method used to identify specific contributing neurons is innovative and the experimental evidence supporting the major claims is solid. This study will be of interest to neuroscientists studying behavior, in particular in C. elegans.

    2. Reviewer #1 (Public Review):

      This study identifies two behavioral processes that underlie learned pathogen avoidance behavior in C. elegans: exiting and re-entry of pathogenic bacterial lawns. Long-term behavioral tracking indicates that animals increase the prevalence of both behaviors over long-term exposure to the pathogen Pseudomonas aeruginosa. Using an optogenetic silencing screen, the authors identify groups of neurons, whose activity regulates lawn occupancy. Surprisingly, they find that optogenetic inhibition of neurons during only the first two hours of pathogen exposure can establish subsequent long-term changes in pathogen aversion. By leveraging a compressed sensing approach, the authors define a set of neurons involved in either lawn exit or lawn re-entry behavior using a constrained set of transgenic lines that drive Arch-3 expression in overlapping groups of neurons. They then measure the calcium activity of the candidate neurons involved in lawn re-entry in freely moving animals using GCaMP, and observe a reduction in their neural activity after exposure to a pathogen. Optogenetic inhibition of AIY and SIA neurons during acute pathogen exposure in naïve animals delays lawn entry whereas activating these neurons in animals previously exposed to pathogen enhances lawn entry, albeit transiently.

      This work is missing several controls that are necessary to substantiate their claims. My most important concern is that the optogenetic screen for neurons that alter pathogenic lawn occupancy does not have an accompanying control on non-pathogenic OP50 bacteria. Hence, it remains unclear whether these neuronal inhibition experiments lead to pathogen-specific or generalized lawn-leaving alterations. For strains that show statistical differences between - and + ATR conditions, the authors should perform follow-up validation experiments on non-pathogenic OP50 lawns to ensure that the observed effect is PA14-specific. Similarly, neuronal inhibition experiments in Figures 5E and H are only performed with naïve animals on PA14 - we need to see the latency to re-entry on OP50 as well, to make general conclusions about these neurons' role in pathogen-specific avoidance.

      My second major concern is regarding the calcium imaging experiments of candidate neurons involved in lawn re-entry behavior. Although the data shows that AIY, AVK, and SIA/SIB neurons all show reduced activity following pathogen exposure, the authors do not relate these activity changes to changes in behavior. Given the well-established links between these cells and forward locomotion, it is essential to not only report differences in activity but also in the relationship between this activity and locomotory behavior. If animals are paused outside of the pathogen lawn, these neurons may show low activity simply because the animals are not moving forward. Other forward-modulated neurons may also show this pattern of reduced activity if the animals remain paused. Given that the authors have recorded neural activity before and after contact with pathogenic bacteria in freely moving animals, they should also provide an analysis of the relationship between proximity to the lawn and the activity of these neurons.

      This work is missing methodological descriptions that are necessary for the correct interpretation of the results shown here. Figure 2 suggests that the determination of statistical significance across the optogenetic inhibition screen will be found in the Methods, but this information is not to be found there. At various points in the text, authors refer to "exit rate", "rate constant", and "entry rate". These metrics seem derived from an averaged measurement across many individual animals in one lawn evacuation assay plate. However "latency to re-entry" is only defined on a per-animal basis in the lawn re-exposure assay. These differences should be clearly stated in the methods section to avoid confusion and to ensure that statistics are computed correctly.

      This work also contains mislabeled graphs and incorrect correspondence with the text, which make it difficult to follow the authors' claims. The text suggests that Pdop-2::Arch3 and Pmpz-1::Arch3 show increased exit rates, whereas Figure 2 shows that Pflp-4::Arch3 but not Pmpz-1::Arch3 has an increased exit rate. The authors should also make a greater effort to correctly and clearly label which type of behavioral experiment is used to generate each figure and describe the differences in experimental design in the main text, figure legends, and methods. Figure 2E depicts trajectories of animals leaving a lawn over a 2.5-minute interval but it is unclear when this time window occurs within the 18-hour lawn leaving assay. Likewise, Figure 2H depicts a 30-minute time window which has an unclear relationship to the overall time course of lawn leaving. This figure legend is also mislabeled as "Infected/Healthy", whereas it should be labeled "-/+ ATR".

      This work raises the interesting possibility that different sets of neurons control lawn exit and lawn re-entry behaviors following pathogen exposure. However, the authors never directly test this claim. To rigorously show this, the authors would need to show that lawn-exit-promoting neurons (CEPs, HSNs, RIAs, RIDs, SIAs) are dispensable for lawn re-entry behavior and that lawn re-entry promoting neurons (AVK, SIA, AIY, MI) are dispensable for lawn exit behavior in pathogen-exposed animals. The authors identify AVK neurons as important for modulating lawn re-entry behavior by brief inhibition at the start of pathogen exposure but fail to find that these neurons are required for increased latency to re-entry in naïve animals (Figure 5D). Recent work from Marquina-Solis et al (2024) shows that chronic silencing of these neurons delays pathogen lawn leaving, due to impaired release of flp-1 neuropeptide. Authors may wish to connect their work more closely with the existing literature by investigating the behavioral process by which AVK contributes to lawn evacuation.

      If the authors work through these criticisms, this work can become an important contribution to the field of pathogen learning in C. elegans. However, in its current form, this work remains incomplete.

    3. Reviewer #2 (Public Review):

      In this manuscript, Hallacy et al. used a compressed sensing-based optogenetic screening method to investigate the crucial neurons that regulate pathogenic avoidance behavior in C. elegans. They further substantiate their findings using complementary optogenetic activation and imaging techniques to confirm the roles of the key neurons identified through extensive screening efforts. Notably, they identified AIY and SIA as pivotal neurons in the dynamic process of pathogenic avoidance. Their significant discovery is the delayed or stalled reentry process, which drives avoidance behavior; to my knowledge, this dynamic has not been previously documented. Additionally, the successful integration of quantitative optogenetic tools and compressed sensing algorithms is noteworthy, demonstrating the potential for obtaining highly quantitative data from the C. elegans nervous system. This approach is quite rare in this field, yet it represents a promising direction for studying this simple nervous system.

      However, the paper's main weakness lies in its lack of a detailed mechanism explaining how the delayed reentry process directly influences the actual locomotor output that results in avoidance. The term 'delayed reentry' is used as a dynamic metric for quantifying the screening, yet the causal link between this metric and the mechanistic output remains unclear. Despite this, the study is well-structured, with comprehensive control experiments, and is very well constructed.

    4. Reviewer #3 (Public Review):

      Summary:

      Using a compressed sensing-based approach applied previously by the authors' group, the authors conducted an initial screen for neurons that, when optogenetically down-regulated, influenced learned pathogen avoidance consisting of two component behaviors, exit from the bacterial lawn and lawn re-entry. The authors found that four classes of neurons (AVK, SIA, AIY, and MI) were inferred over a wide range of sparsity parameters, indicating their importance for lawn re-entry. They found six classes of neurons required for lawn exit. The authors then went on to further analyze the neurons for the re-entry behavior, and conducted calcium imaging of those neurons in freely behaving animals. They found that the activities of AIY and SIA neurons decreased after the animals that had been exposed to the pathogenic bacteria tried to re-enter the bacterial lawn. They also found that when those neurons were down-regulated by optogenetics in animals that had not been exposed to pathogenic bacteria, the animals showed an increased latency to re-entry, which is a similar behavioral modification to that of the animals that had been exposed to the pathogen. Conversely, when those neurons were up-regulated by optogenetics in animals that had been exposed to pathogenic bacteria, the animals showed a shortened latency to re-entry, which is similar to the behavior observed in the animals not exposed to the pathogen.

      Strengths:

      This is overall a very nice piece of work. Most importantly, an initial screening of neurons was conducted by a compressed sensing-based approach previously applied by the same group. It is also worth emphasizing that this compressed analysis is applicable when the behavior of interest involves a small number of neurons, as the authors pointed out in the Introduction section. Therefore, the readers should keep in mind that the validation and significance of this work heavily depend on the justification of the sparsity parameters that the authors chose. Nevertheless, this work is well justified because neurons identified by the initial screening were thoroughly analyzed by various methods including calcium imaging and optogenetic manipulation of neuronal activities and behavioral analyses using an animal-tracking system.

      Weaknesses:

      My only concern is that the authors should be more careful about describing their "compressed sensing-based approach". The authors often cite their previous Nature Methods paper, but should explain more because this method is critical for this manuscript. Also, this analysis is based on the hypothesis that only a small number of neurons are responsible for a given behavior. The authors should explain more about how the sparsity parameters are determined, for example.
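
      The role of the sparsity assumption can be illustrated with a deliberately simplified, boolean analogue of the screen. This is not the authors' compressed sensing algorithm: it assumes that each transgenic line either shows an effect or not, and that a line shows an effect exactly when it expresses the silencer in at least one responsible neuron. The line names, neuron names, and outcomes below are entirely made up for illustration.

```python
from itertools import combinations

def sparsest_explanations(lines, effects, max_size=3):
    """Smallest neuron sets consistent with every line's observed outcome.

    lines:   dict mapping line name -> set of neurons targeted by that line
    effects: dict mapping line name -> True if silencing changed the behavior
    """
    neurons = sorted(set().union(*lines.values()))
    for size in range(1, max_size + 1):
        hits = [
            subset for subset in combinations(neurons, size)
            if all(bool(lines[name] & set(subset)) == effects[name]
                   for name in lines)
        ]
        if hits:
            return hits  # stop at the smallest size that explains the data
    return []

# Hypothetical screen: five overlapping lines, five candidate neurons
toy_lines = {
    "line_1": {"n1", "n3"},
    "line_2": {"n2", "n4"},
    "line_3": {"n3", "n4"},
    "line_4": {"n1", "n2", "n5"},
    "line_5": {"n5"},
}
toy_effects = {"line_1": True, "line_2": True, "line_3": False,
               "line_4": True, "line_5": False}
explanations = sparsest_explanations(toy_lines, toy_effects)
```

      Here `max_size` plays a role loosely comparable to a sparsity parameter: the answer is only meaningful if the true number of responsible neurons is small relative to the number of lines, which is why the choice and justification of that parameter matter.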

    1. eLife assessment

      This potentially useful study involves neuro-imaging and electrophysiology in a small cohort of congenital cataract patients after sight recovery and age-matched control participants with normal sight. It aims to characterize the effects of early visual deprivation on excitatory and inhibitory balance in the visual cortex. While the findings are taken to suggest the existence of persistent alterations in Glx/GABA ratio and aperiodic EEG signals, the evidence supporting these claims is incomplete. Specifically, small sample sizes, lack of a specific control cohort, and other methodological limitations will likely restrict the usefulness of the work, with relevance limited to scientists working in this particular subfield.

    2. Reviewer #1 (Public Review):

      Summary:

      In this human neuroimaging and electrophysiology study, the authors aimed to characterize the effects of a period of visual deprivation in the sensitive period on excitatory and inhibitory balance in the visual cortex. They attempted to do so by comparing neurochemistry conditions ('eyes open', 'eyes closed') and resting state, and visually evoked EEG activity between ten congenital cataract patients with recovered sight (CC), and ten age-matched control participants (SC) with normal sight.

      First, they used magnetic resonance spectroscopy to measure in vivo neurochemistry from two locations, the primary location of interest in the visual cortex, and a control location in the frontal cortex. Such voxels are used to provide a control for the spatial specificity of any effects because the single-voxel MRS method provides a single sampling location. Using MR-visible proxies of excitatory and inhibitory neurotransmission, Glx and GABA+ respectively, the authors report no group effects in GABA+ or Glx, no difference in the functional conditions 'eyes closed' and 'eyes open'. They found an effect of the group in the ratio of Glx/GABA+ and no similar effect in the control voxel location. They then performed multiple exploratory correlations between MRS measures and visual acuity, and reported a weak positive correlation between the 'eyes open' condition and visual acuity in CC participants.

      The same participants then took part in an EEG experiment. The authors selected only two electrodes placed in the visual cortex for analysis and reported a group difference in an EEG index of neural activity, the aperiodic intercept, as well as the aperiodic slope, considered a proxy for cortical inhibition. They report an exploratory correlation between the aperiodic intercept and Glx in one out of three EEG conditions.

      The authors report the difference in E/I ratio, and interpret the lower E/I ratio as representing an adaptation to visual deprivation, which would have initially caused a higher E/I ratio. Although intriguing, the strength of evidence in support of this view is not strong. Amongst the limitations are the low sample size, a critical control cohort that could provide evidence for a higher E/I ratio in CC patients without recovered sight for example, and lower data quality in the control voxel.

      Strengths of study:

      How sensitive period experience shapes the developing brain is an enduring and important question in neuroscience. This question has been particularly difficult to investigate in humans. The authors recruited a small number of sight-recovered participants with bilateral congenital cataracts to investigate the effect of sensitive period deprivation on the balance of excitation and inhibition in the visual brain using measures of brain chemistry and brain electrophysiology. The research is novel, and the paper was interesting and well-written.

      Limitations:

      - Low sample size. Ten for CC and ten for SC, and a further two SC participants were rejected due to a lack of frontal control voxel data. The sample size limits the statistical power of the dataset and increases the likelihood of effect inflation.

      - Lack of specific control cohort. The control cohort has normal vision. The control cohort is not specific enough to distinguish between people with sight loss due to different causes and patients with congenital cataracts with co-morbidities. Further data from more specific populations, such as patients whose cataracts have not been removed, with developmental cataracts, or congenitally blind participants, would greatly improve the interpretability of the main finding. The lack of a more specific control cohort is a major caveat that limits a conclusive interpretation of the results.

      - MRS data quality differences. Data quality in the control voxel appears worse than in the visual cortex voxel. The frontal cortex MRS spectrum shows far broader linewidth than the visual cortex (Supplementary Figures). Compared to the visual voxel, the frontal cortex voxel has less defined Glx and GABA+ peaks; lower GABA+ and Glx concentrations, lower NAA SNR values; lower NAA concentrations. If the data quality is a lot worse in the FC, then small effects may not be detectable.

      - Because of the direction of the difference in E/I, the authors interpret their findings as representing signatures of sight improvement after surgery without further evidence, either within the study or from the literature. However, the literature suggests that plasticity and visual deprivation drive the E/I index up rather than down. Decreasing GABA+ is thought to facilitate experience-dependent remodelling. What evidence is there that cortical inhibition increases in response to a visual cortex that is over-sensitised due to congenital cataracts? Without further experimental or literature support this interpretation remains very speculative.

      - Heterogeneity in the patient group. Congenital cataract (CC) patients experienced a variety of duration of visual impairment and were of different ages. They presented with co-morbidities (absorbed lens, strabismus, nystagmus). Strabismus has been associated with abnormalities in GABAergic inhibition in the visual cortex. The possible interactions with residual vision and confounds of co-morbidities are not experimentally controlled for in the correlations, and not discussed.

      - Multiple exploratory correlations were performed to relate MRS measures to visual acuity (shown in Supplementary Materials), and only specific ones were shown in the main document. The authors describe the analysis as exploratory in the 'Methods' section. Furthermore, the correlation between visual acuity and E/I metric is weak, and not corrected for multiple comparisons. The results should be presented as preliminary, as no strong conclusions can be made from them. They can provide a hypothesis to test in a future study.

      - P.16 Given the correlation of the aperiodic intercept with age ("Age negatively correlated with the aperiodic intercept across CC and SC individuals, that is, a flattening of the intercept was observed with age"), age needs to be controlled for in the correlation between neurochemistry and the aperiodic intercept. Glx has also been shown to negatively correlate with age.

      - Multiple exploratory correlations were performed to relate MRS to EEG measures (shown in Supplementary Materials), and only specific ones were shown in the main document. Given the multiple measures from the MRS, the correlations with the EEG measures were exploratory, as stated in the text, p.16, and in Figure 4. Yet the introduction said that there was a prior hypothesis "We further hypothesized that neurotransmitter changes would relate to changes in the slope and intercept of the EEG aperiodic activity in the same subjects." It would be great if the text could be revised for consistency and the analysis described as exploratory.

      - The analysis for the EEG needs to take more advantage of the available data. As far as I understand, only two electrodes were used, yet far more were available as seen in their previous study (Ossandon et al., 2023). The spatial specificity is not established. The authors could use the frontal cortex electrode (FP1, FP2) signals as a control for spatial specificity in the group effects, or even better, all available electrodes and correct for multiple comparisons. Furthermore, they could use the aperiodic intercept vs Glx in SC to evaluate the specificity of the correlation to CC.
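
      The request above to control for age when correlating neurochemistry with the aperiodic intercept can be made concrete with a first-order partial correlation. The following is a minimal sketch, not the authors' analysis: the function names `pearson_r` and `partial_r` and the four-point toy data are assumptions, chosen so that the arithmetic is easy to check by hand.

```python
import math


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy data for illustration only (NOT values from the study).
# Both measures equal the covariate plus noise that is uncorrelated
# with the covariate and with each other.
covariate = [1, 2, 3, 4]    # stands in for age
measure_a = [2, 1, 2, 5]    # stands in for a neurochemical measure
measure_b = [2, -1, 6, 3]   # stands in for an EEG measure

raw = pearson_r(measure_a, measure_b)                    # about 0.333
adjusted = partial_r(measure_a, measure_b, covariate)    # approximately 0
print(f"raw r = {raw:.3f}, partial r = {adjusted:.3f}")
```

      In this toy case the raw correlation of about 0.33 is produced entirely by the shared covariate and vanishes once it is partialled out. Whether the same happens with the reported Glx and intercept data can only be determined from the actual measurements.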

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript reports non-invasive measures of activity and neurochemical profiles of the visual cortex in congenitally blind patients who recovered vision through the surgical removal of bilateral dense cataracts. The declared aim of the study is to find out how restoring visual function after several months or years of complete blindness impacts the balance between excitation and inhibition in the visual cortex.

      Strengths:

      The findings are undoubtedly useful for the community, as they contribute towards characterising the many ways this special population differs from normally sighted individuals. The combination of MRS and EEG measures is a promising strategy to estimate a fundamental physiological parameter - the balance between excitation and inhibition in the visual cortex, which animal studies show to be heavily dependent upon early visual experience. Thus, the reported results pave the way for further studies, which may use a similar approach to evaluate more patients and control groups.

      Weaknesses:

      The main issue is the lack of an appropriate comparison group or condition to delineate the effect of sight recovery (as opposed to the effect of congenital blindness). A few previous studies suggested an increased excitation/inhibition ratio in the visual cortex of congenitally blind patients; the present study reports a decreased E/I ratio instead. The authors claim that this implies a change of E/I ratio following sight recovery. However, supporting this claim would require showing a shift of E/I after vs. before the sight-recovery surgery, or at least it would require comparing patients who did and did not undergo the sight-recovery surgery (as is common in the field).

      MR Spectroscopy shows a reduced GLX/GABA ratio in patients vs. sighted controls; however, this finding remains rather isolated, not corroborated by other observations. The difference between patients and controls only emerges for the GLX/GABA ratio, but there is no accompanying difference in either the GLX or the GABA concentrations. There is an attempt to relate the MRS data with acuity measurements and electrophysiological indices, but the explorative correlational analyses do not help to build a coherent picture. A bland correlation between GLX/GABA and visual impairment is reported, but this is specific to the patients' group (N=10) and would not hold across groups (the correlation is positive, predicting the lowest GLX/GABA ratio values for the sighted controls - the opposite of what is found). There is also a strong correlation between GLX concentrations and the EEG power at the lowest temporal frequencies. Although this relation is intriguing, it only holds for a very specific combination of parameters (of the many tested): only with eyes open, only in the patient group.

      For these reasons, the reported findings do not allow us to draw firm conclusions on the relation between EEG parameters and E/I ratio or on the impact of early (vs. late) visual experience on the excitation/inhibition ratio of the human visual cortex.

    4. Reviewer #3 (Public Review):

      This manuscript examines the impact of congenital visual deprivation on the excitatory/inhibitory (E/I) ratio in the visual cortex using Magnetic Resonance Spectroscopy (MRS) and electroencephalography (EEG) in individuals whose sight was restored. Ten individuals with reversed congenital cataracts were compared to age-matched, normally sighted controls, assessing the cortical E/I balance and its interrelationship to visual acuity. The study reveals that the Glx/GABA ratio in the visual cortex and the intercept and aperiodic signal are significantly altered in those with a history of early visual deprivation, suggesting persistent neurophysiological changes despite visual restoration.

      My expertise is in EEG (particularly in the decomposition of periodic and aperiodic activity) and statistical methods. I have several major concerns in terms of methodological and statistical approaches along with the (over)interpretation of the results. These major concerns are detailed below.

      (1) Variability in visual deprivation:

      - The document states a large variability in the duration of visual deprivation (probably also the age at restoration), with significant implications for the sensitivity period's impact on visual circuit development. The variability and its potential effects on the outcomes need thorough exploration and discussion.

      (2) Sample size:

      - The small sample size is a major concern as it may not provide sufficient power to detect subtle effects and/or may overestimate significant effects, which then tend not to generalize to new data. This is one of the biggest drivers of the replication crisis in neuroscience.

      - The main problem with the correlation analyses between MRS and EEG measures is that the sample size is simply too small to conduct such an analysis. Moreover, it is unclear from the methods section that this analysis was only conducted in the patient group (which the reviewer assumed from the plots), and not explained why this was done only in the patient group. I would highly recommend removing these correlation analyses.
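
      To illustrate why correlations from ten participants are imprecise, the sketch below computes an approximate 95% confidence interval for a Pearson correlation using the Fisher z-transform. It is a generic textbook approximation, not an analysis from the manuscript; the function name `fisher_ci` and the example correlation values are assumptions chosen for illustration.

```python
import math


def fisher_ci(r, n, z_crit=1.959963984540054):
    """Approximate 95% CI for a Pearson r via the Fisher z-transform.

    Assumes bivariate normality; z_crit is the two-sided 95% normal quantile.
    """
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical observed correlations, all with n = 10
for r in (0.3, 0.5, 0.7):
    lower, upper = fisher_ci(r, n=10)
    print(f"r = {r:.1f}, n = 10: 95% CI approx [{lower:.2f}, {upper:.2f}]")
```

      Under this approximation, an observed r of 0.5 with n = 10 is compatible with population values from roughly -0.19 to 0.86, which is the instability the reviewer is pointing to.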

      (3) Statistical concerns:

      - The statistical analyses, particularly the correlations drawn from a small sample, may not provide reliable estimates (see https://www.sciencedirect.com/science/article/pii/S0092656613000858, which clearly describes this problem).

      - Statistical analyses for the MRS: The authors should consider some additional permutation statistics, which are more suitable for small sample sizes. The current statistical model (a 2x2 design ANOVA) is not ideal for such small sample sizes. Moreover, it is unclear why the condition (EO & EC) was chosen as a predictor and not the brain region (visual & frontal) or neurochemicals. Furthermore, the authors did not provide any information on the alpha level or on correction for multiple comparisons (in the methods section). Finally, even if the groups are matched w.r.t. age, the time between surgery and measurement, the duration of visual deprivation, (and sex?), these should be included as covariates as it has been shown that these are highly related to the measurements of interest (especially for the EEG measurements) and the age range of the current study is large.

      - EEG statistical analyses: The same critique as for the MRS statistical analyses applies to the EEG analysis. In addition: was the 2x3 ANOVA conducted for EO and EC independently? This seems to be inconsistent with the approach in the MRS analyses, in which the authors chose EO & EC as predictors in their 2x2 ANOVA.

      - Figure 4: The authors report a p-value of >0.999 with a correlation coefficient of -0.42 with a sample size of 10 subjects. This can't be correct (it should be around: p = 0.22). All statistical analyses should be checked.

      - Figure 2c. Eyes closed condition: The highest score of the *Glx/GABA ratio seems to be ~3.6. In subplot 2a, there seem to be 3 subjects that show a Glx/GABA ratio score > 3.6. How can this be explained? There is also a discrepancy for the eyes-closed condition.
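
      The p-value check raised for Figure 4 above can be reproduced from the correlation coefficient and sample size alone. The sketch below is a minimal standard-library version, assuming an uncorrected two-tailed test of a Pearson correlation; the helper names `t_pdf` and `two_tailed_p` are hypothetical, and the t distribution is integrated numerically with Simpson's rule rather than taken from a statistics library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=10_000):
    """Return (|t|, two-tailed p) for a Pearson r from n observations."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the integral of the density from 0 to |t|
    # (steps must be even)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3      # P(-|t| < T < |t|)
    return t, 1 - central_mass


t_stat, p_value = two_tailed_p(r=-0.42, n=10)
print(f"t(8) = {t_stat:.3f}, two-tailed p = {p_value:.3f}")
```

      For r = -0.42 and n = 10 this gives a t statistic of about 1.31 on 8 degrees of freedom and an uncorrected two-tailed p of roughly 0.23, close to the reviewer's estimate and far from the reported >0.999.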

      (4) Interpretation of aperiodic signal:

      - Several recent papers demonstrated that the aperiodic signal measured in EEG or ECoG is related to various important aspects such as age, skull thickness, electrode impedance, as well as cognition. Thus, currently, very little is known about the underlying effects which influence the aperiodic intercept and slope. The entire interpretation of the aperiodic slope as a proxy for E/I is based on a computational model and simulation (as described in the Gao et al. paper).

      - Especially the aperiodic intercept is a very sensitive measure to many influences (e.g. skull thickness, electrode impedance...). As crucial results (correlation aperiodic intercept and MRS measures) are facing this problem, this needs to be reevaluated. It is safer to make statements on the aperiodic slope than intercept. In theory, some of the potentially confounding measures are available to the authors (e.g. skull thickness can be computed from T1w images; electrode impedances are usually acquired alongside the EEG data) and could be therefore controlled.

      - The authors wrote: "Higher frequencies (such as 20-40 Hz) have been predominantly associated with local circuit activity and feedforward signaling (Bastos et al., 2018; Van Kerkoerle et al., 2014); the increased 20-40 Hz slope may therefore signal increased spontaneous spiking activity in local networks. We speculate that the steeper slope of the aperiodic activity for the lower frequency range (1-20 Hz) in CC individuals reflects the concomitant increase in inhibition." The authors confuse the interpretation of periodic and aperiodic signals. This section refers to the interpretation of the periodic signal (higher frequencies). This interpretation can not simply be translated to the aperiodic signal (slope).

      - The authors further wrote: "We used the slope of the aperiodic (1/f) component of the EEG spectrum as an estimate of E/I ratio (Gao et al., 2017; Medel et al., 2020; Muthukumaraswamy & Liley, 2018)." This is a highly speculative interpretation with very little empirical evidence. These papers were conducted with ECoG data (mostly in animals) and mostly under anesthesia. Thus, these studies allow only an indirect interpretation of what actually influences the 1/f slope in EEG measurements.

      (5) Problems with EEG preprocessing and analysis:

      - It seems that the authors neither identified bad channels nor addressed the line noise issue (which remains a problem even if a low-pass filter below the line-noise frequency was applied).

      - What was the percentage of segments that needed to be rejected due to the 120μV criteria? This should be reported specifically for EO & EC and controls and patients.

      - The authors downsampled the data to 60 Hz "to match the stimulation rate". What is the intention of this? The subsequent spectral analyses are compromised by this choice (see Nyquist theorem).

      - "Subsequently, baseline removal was conducted by subtracting the mean activity across the length of an epoch from every data point." The actual baseline time segment should be specified.

      - "We excluded the alpha range (8-14 Hz) for this fit to avoid biasing the results due to documented differences in alpha activity between CC and SC individuals (Bottari et al., 2016; Ossandón et al., 2023; Pant et al., 2023)." This does not really make sense, as the FOOOF algorithm first fits the 1/f slope, for which the alpha activity is not relevant.

      - The model fits of the 1/f fitting for EO, EC, and both participant groups should be reported.
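
      The Nyquist concern raised above about the 60 Hz downsampling can be illustrated with a short standard-library sketch. It assumes plain resampling without an anti-aliasing filter, which may not match the authors' pipeline; the function name `dominant_frequency` and the 40 Hz test tone are assumptions for illustration.

```python
import math


def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest non-DC bin of a naive DFT."""
    n = len(signal)
    best_k, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return best_k * fs / n


fs = 60.0            # sampling rate after downsampling, in Hz
nyquist = fs / 2     # highest frequency that can be represented: 30 Hz
true_freq = 40.0     # hypothetical tone above the Nyquist frequency

# Two seconds of a 40 Hz sine sampled at 60 Hz
samples = [math.sin(2 * math.pi * true_freq * i / fs) for i in range(120)]
apparent = dominant_frequency(samples, fs)

print(f"Nyquist = {nyquist} Hz; a {true_freq} Hz tone appears at {apparent} Hz")
```

      A 40 Hz component sampled at 60 Hz shows up at 20 Hz. With a proper anti-aliasing filter it would be removed instead of folded back, but in either case activity between 30 and 40 Hz cannot be recovered from data sampled at 60 Hz.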

      (6) Validity of GABA measurements and results:

      - According to a newer study by the authors of the Gannet toolbox (https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/abs/10.1002/nbm.5076), the reliability and reproducibility of the gamma-aminobutyric acid (GABA) measurement can vary significantly depending on acquisition and modeling parameters. Thus, did the authors address these challenges? Furthermore, the authors wrote: "We confirmed the within-subject stability of metabolite quantification by testing a subset of the sighted controls (n=6) 2-4 weeks apart." Looking at Supplementary Figure 5 (which would be better plotted as ICC or Bland-Altman plots), the within-subject stability compared to between-subject variability does not seem to be great. Furthermore, I don't think such a small sample size qualifies for a rigorous assessment of stability.

      - "Why might an enhanced inhibitory drive, as indicated by the lower Glx/GABA ratio" Is this interpretation really warranted, as the group difference in the Glx/GABA ratio seems to be driven by a decreased Glx concentration in CC rather than an increased GABA concentration (see Figure 2)?

      - Glx concentration predicted the aperiodic intercept in CC individuals' visual cortices during ambient and flickering visual stimulation. Why specifically investigate the Glx concentration, when the paper is about E/I ratio?

      (7) Interpretation of the correlation between MRS measurements and EEG aperiodic signal:

      - The authors wrote: "The intercept of the aperiodic activity was highly correlated with the Glx concentration during rest with eyes open and during flickering stimulation (also see Supplementary Material S11). Based on the assumption that the aperiodic intercept reflects broadband firing (Manning et al., 2009; Winawer et al., 2013), this suggests that the Glx concentration might be related to broadband firing in CC individuals during active and passive visual stimulation." These results should not be interpreted (or only with great caution) for several reasons (see also the problems with influences on the aperiodic intercept and the small sample size). This is a result of the exploratory analyses of correlating every EEG parameter with every MRS parameter. This requires well-powered replication before any interpretation can be provided. Furthermore and importantly: why should this be specifically only in CC patients, but not in the SC control group?

      (8) Language and presentation:

      - The manuscript requires language improvements and correction of numerous typos. Over-simplifications and unclear statements are present, which could mislead or confuse readers (see also interpretation of aperiodic signal).

      - The authors state that "Together, the present results provide strong evidence for experience-dependent development of the E/I ratio in the human visual cortex, with consequences for behavior." The results of the study do not provide any strong evidence, because of the small sample size, the exploratory analysis approach, and the failure to account for possible confounding factors.

      - "Our results imply a change in neurotransmitter concentrations as a consequence of *restoring* vision following congenital blindness." This is a speculative statement that infers a causal relationship from cross-sectional data.

      - In the limitation section, the authors wrote: "The sample size of the present study is relatively high for the rare population, but undoubtedly, overall, rather small." This sentence should be rewritten, as the study is plainly underpowered. The further justification "We nevertheless think that our results are valid. Our findings neurochemically (Glx and GABA+ concentration), and anatomically (visual cortex) specific. The MRS parameters varied with parameters of the aperiodic EEG activity and visual acuity. The group differences for the EEG assessments corresponded to those of a larger sample of CC individuals (n=38) (Ossandón et al., 2023), and effects of chronological age were as expected from the literature." These statements do not provide any validation or justification of small samples. Furthermore, the current data set is a subset of an earlier published paper by the same authors "The EEG data sets reported here were part of data published earlier (Ossandón et al., 2023; Pant et al., 2023)." Thus, the statement "The group differences for the EEG assessments corresponded to those of a larger sample of CC individuals (n=38)" is a circular argument and should be avoided.

    5. Author response:

      eLife assessment

      This potentially useful study involves neuro-imaging and electrophysiology in a small cohort of congenital cataract patients after sight recovery and age-matched control participants with normal sight. It aims to characterize the effects of early visual deprivation on excitatory and inhibitory balance in the visual cortex. While the findings are taken to suggest the existence of persistent alterations in Glx/GABA ratio and aperiodic EEG signals, the evidence supporting these claims is incomplete. Specifically, small sample sizes, lack of a specific control cohort, and other methodological limitations will likely restrict the usefulness of the work, with relevance limited to scientists working in this particular subfield.

      As pointed out in the public reviews, there are only very few human models which allow for assessing the role of early experience on neural circuit development. While the prevalent research in permanent congenital blindness reveals the response and adaptation of the developing brain to an atypical situation (blindness), research in sight restoration addresses the question of whether and how atypical development can be remediated if typical experience (vision) is restored. The literature on the role of visual experience in the development of E/I balance in humans, assessed via Magnetic Resonance Spectroscopy (MRS), has been limited to a few studies on congenital permanent blindness. Thus, we assessed sight recovery individuals with a history of congenital blindness, as limited evidence from other researchers indicated that the visual cortex E/I ratio might differ compared to normally sighted controls.

      Individuals with total bilateral congenital cataracts who remained untreated until later in life are extremely rare, particularly if only carefully diagnosed patients are included in a study sample. A sample size of 10 patients is, at the very least, typical of past studies in this population, even for exclusively behavioral assessments. In the present study, in addition to behavioral assessment as an indirect measure of sensitive periods, we investigated participants with two neuroimaging methods (Magnetic Resonance Spectroscopy and electroencephalography) to directly assess the neural correlates of sensitive periods in humans. The electroencephalography data allowed us to link the results of our small sample to findings documented in large cohorts of both sight recovery individuals and permanently congenitally blind individuals. As pointed out in a recent editorial recommending an “exploration-then-estimation procedure” (“Consideration of Sample Size in Neuroscience Studies,” 2020), exploratory studies like ours provide crucial direction and specific hypotheses for future work.

      We included an age-matched sighted control group recruited from the same community, measured in the same scanner and laboratory, to assess whether early experience is necessary for a typical excitatory/inhibitory (E/I) ratio to emerge in adulthood. The present findings indicate that this is indeed the case. Based on these results, a possible question to answer in future work, with individuals who had developmental cataracts, is whether later visual deprivation causes similar effects. Note that even if visual deprivation at a later stage in life caused similar effects, the current results would not be invalidated; on the contrary, they are essential for understanding future work on late (permanent or transient) blindness.

      Thus, we think that the present manuscript has far reaching implications for our understanding of the conditions under which E/I balance, a crucial characteristic of brain functioning, emerges in humans.

      Finally, our manuscript is one of the first few studies that relate MRS neurotransmitter concentrations to parameters of EEG aperiodic activity. Since present research has been using aperiodic activity as a correlate of the E/I ratio, and partially of higher cognitive functions, we think that our manuscript additionally contributes to a better understanding of what might be measured with aperiodic neurophysiological activity.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this human neuroimaging and electrophysiology study, the authors aimed to characterize the effects of a period of visual deprivation in the sensitive period on excitatory and inhibitory balance in the visual cortex. They attempted to do so by comparing neurochemistry conditions ('eyes open', 'eyes closed') and resting state, and visually evoked EEG activity between ten congenital cataract patients with recovered sight (CC), and ten age-matched control participants (SC) with normal sight.

      First, they used magnetic resonance spectroscopy to measure in vivo neurochemistry from two locations, the primary location of interest in the visual cortex, and a control location in the frontal cortex. Such voxels are used to provide a control for the spatial specificity of any effects because the single-voxel MRS method provides a single sampling location. Using MR-visible proxies of excitatory and inhibitory neurotransmission, Glx and GABA+ respectively, the authors report no group effects in GABA+ or Glx, no difference in the functional conditions 'eyes closed' and 'eyes open'. They found an effect of the group in the ratio of Glx/GABA+ and no similar effect in the control voxel location. They then performed multiple exploratory correlations between MRS measures and visual acuity, and reported a weak positive correlation between the 'eyes open' condition and visual acuity in CC participants.

      The same participants then took part in an EEG experiment. The authors selected only two electrodes placed in the visual cortex for analysis and reported a group difference in an EEG index of neural activity, the aperiodic intercept, as well as the aperiodic slope, considered a proxy for cortical inhibition. They report an exploratory correlation between the aperiodic intercept and Glx in one out of three EEG conditions.

      The authors report the difference in E/I ratio, and interpret the lower E/I ratio as representing an adaptation to visual deprivation, which would have initially caused a higher E/I ratio. Although intriguing, the strength of evidence in support of this view is not strong. Amongst the limitations are the low sample size, a critical control cohort that could provide evidence for a higher E/I ratio in CC patients without recovered sight for example, and lower data quality in the control voxel.

      Strengths of study:

      How sensitive period experience shapes the developing brain is an enduring and important question in neuroscience. This question has been particularly difficult to investigate in humans. The authors recruited a small number of sight-recovered participants with bilateral congenital cataracts to investigate the effect of sensitive period deprivation on the balance of excitation and inhibition in the visual brain using measures of brain chemistry and brain electrophysiology. The research is novel, and the paper was interesting and well-written.

      Limitations:

      (1.1) Low sample size. Ten for CC and ten for SC, and a further two SC participants were rejected due to a lack of frontal control voxel data. The sample size limits the statistical power of the dataset and increases the likelihood of effect inflation.

      Applying strict criteria, we only included individuals who were born with no patterned vision in the CC group. The population of individuals who have remained untreated past infancy is small in India, despite a higher prevalence of childhood cataract than in Germany. Indeed, from the original 11 CC and 11 SC participants tested, one participant each from the CC and SC group had to be rejected, as their data had been corrupted, resulting in 10 participants in each group.

      It was a challenge to recruit participants from this rare group with no history of neurological diagnosis/intake of neuromodulatory medications, who were able and willing to undergo both MRS and EEG. For this study, data collection took more than 1.5 years.

      We took care of the validity of our results with two measures. First, we assessed not just MRS but also EEG measures of the E/I ratio. The latter allowed us to link results to a larger population of CC individuals; that is, we replicated the results of a larger group of 38 individuals (Ossandón et al., 2023) in our sub-group.

      Second, we included a control voxel. As predicted, all group effects were restricted to the occipital voxel.

      (1.2) Lack of specific control cohort. The control cohort has normal vision. The control cohort is not specific enough to distinguish between people with sight loss due to different causes and patients with congenital cataracts with co-morbidities. Further data from more specific populations, such as patients whose cataracts have not been removed, with developmental cataracts, or congenitally blind participants, would greatly improve the interpretability of the main finding. The lack of a more specific control cohort is a major caveat that limits a conclusive interpretation of the results.

      The existing work on visual deprivation and neurochemical changes, as assessed with MRS, has been limited to permanent congenital blindness. In fact, most of the studies on permanent blindness included only congenitally blind or early blind humans (Coullon et al., 2015; Weaver et al., 2013), or, in separate studies, only late-blind individuals (Bernabeu et al., 2009). Accordingly, we started with the most “extreme” visual deprivation model, sight recovery after congenital blindness. If we had not observed any group difference compared to normally sighted controls, investigating other groups might have been trivial. Based on our results, subsequent studies in late blind individuals, and then individuals with developmental cataracts, can be planned with clear hypotheses.

      (1.3) MRS data quality differences. Data quality in the control voxel appears worse than in the visual cortex voxel. The frontal cortex MRS spectrum shows far broader linewidth than the visual cortex (Supplementary Figures). Compared to the visual voxel, the frontal cortex voxel has less defined Glx and GABA+ peaks; lower GABA+ and Glx concentrations, lower NAA SNR values; lower NAA concentrations. If the data quality is a lot worse in the FC, then small effects may not be detectable.

      Worse data quality in the frontal than the visual cortex has been repeatedly observed in the MRS literature, attributable to magnetic field distortions (Juchem & Graaf, 2017) resulting from the proximity of the region to the sinuses (recent example: (Rideaux et al., 2022)). Nevertheless, we chose the frontal control region rather than a parietal voxel, given the potential neurochemical changes in multisensory regions of the parietal cortex due to blindness. Such reorganization would be less likely in frontal areas associated with higher cognitive functions. Further, prior MRS studies of the visual cortex have used the frontal cortex as a control region as well (Pitchaimuthu et al., 2017; Rideaux et al., 2022).

      In the present study, we checked that the frontal cortex datasets for Glx and GABA+ concentrations were of sufficient quality: the fit error was below 8.31% in both groups (Supplementary Material S3). For reference, Mikkelsen et al. reported a mean GABA+ fit error of 6.24 +/- 1.95% from a posterior cingulate cortex voxel across 8 GE scanners, using the Gannet pipeline. No absolute cutoffs have been proposed for fit errors. However, MRS studies in special populations (I/E ratio assessed in narcolepsy (Gao et al., 2024), GABA concentration assessed in Autism Spectrum Disorder (Maier et al., 2022)) have used frontal cortex data with a fit error of <10% to identify differences between cohorts (Gao et al., 2024; Pitchaimuthu et al., 2017). Based on the literature, MRS data from the frontal voxel of the present study would have been of sufficient quality to uncover group differences.

      In the revised manuscript, we will add the recently published MRS quality assessment form to the supplementary materials. Additionally, we would like to point to our a priori prediction of group differences for the visual cortex, but not for the frontal cortex voxel.

      (1.4) Because of the direction of the difference in E/I, the authors interpret their findings as representing signatures of sight improvement after surgery without further evidence, either within the study or from the literature. However, the literature suggests that plasticity and visual deprivation drive the E/I index up rather than down. Decreasing GABA+ is thought to facilitate experience-dependent remodelling. What evidence is there that cortical inhibition increases in response to a visual cortex that is over-sensitised due to congenital cataracts? Without further experimental or literature support this interpretation remains very speculative.

      Indeed, higher inhibition was not predicted, which we attempt to reconcile in our discussion section. We base our discussion mainly on the non-human animal literature, which has shown evidence of homeostatic changes after prolonged visual deprivation in the adult brain (Barnes et al., 2015). It is also interesting to note that after monocular deprivation in adult humans, resting GABA+ levels decreased in the visual cortex (Lunghi et al., 2015). Assuming that after delayed sight restoration, adult neuroplasticity mechanisms must be employed, these studies would predict a “balancing” of the increased excitatory drive following sight restoration by a commensurate increase in inhibition (Keck et al., 2017). Additionally, the EEG results of the present study allowed for speculation regarding the underlying neural mechanisms of an altered E/I ratio. The aperiodic EEG activity suggested higher spontaneous spiking (increased intercept) and increased inhibition (steeper aperiodic slope between 1-20 Hz) in CC vs SC individuals (Ossandón et al., 2023).

      In the revised manuscript, we will more clearly indicate that these speculations are based primarily on non-human animal work, due to the lack of human studies on the subject.

      (1.5) Heterogeneity in the patient group. Congenital cataract (CC) patients experienced a variety of duration of visual impairment and were of different ages. They presented with co-morbidities (absorbed lens, strabismus, nystagmus). Strabismus has been associated with abnormalities in GABAergic inhibition in the visual cortex. The possible interactions with residual vision and confounds of co-morbidities are not experimentally controlled for in the correlations, and not discussed.

      The goal of the present study was to assess whether we would observe changes in E/I ratio after restoring vision at all. We would not have included patients without nystagmus in the CC group of the present study, since it would have been unlikely that they experienced congenital patterned visual deprivation. Amongst diagnosticians, nystagmus or strabismus might not be considered genuine “comorbidities” that emerge in people with congenital cataracts. Rather, these are consequences of congenital visual deprivation, which we employed as diagnostic criteria. Similarly, absorbed lenses are clear signs that cataracts were congenital. As in other models of experience-dependent brain development (e.g. the extant literature on congenital permanent blindness, including anophthalmic individuals (Coullon et al., 2015; Weaver et al., 2013)), some uncertainty remains regarding whether the (remaining, in our case) abnormalities of the eye, or the blindness they caused, are the factors driving neural changes. In the case of people with reversed congenital cataracts, at least the retina is considered to be intact, as they would otherwise not receive cataract removal surgery.

      However, we consider it unlikely that strabismus caused the group differences, because the present study shows group differences in the Glx/GABA+ ratio at rest, regardless of eye opening or eye closure, for which strabismus would have caused distinct effects. By contrast, the link between GABA concentration and, for example, interocular suppression in strabismus, has so far been documented during visual stimulation (Mukerji et al., 2022; Sengpiel et al., 2006), and differed in direction depending on the amblyopic vs. non-amblyopic eye. Further, one MRS study did not find group differences in GABA concentration between the visual cortices of 16 amblyopic individuals and sighted controls (Mukerji et al., 2022), supporting that the differences in Glx/GABA+ concentration which we observed were driven by congenital deprivation, and not amblyopia-associated visual acuity or eye movement differences.

      In the revised manuscript, we will discuss the inclusion criteria in more detail, and the aforementioned reasons why our data remains interpretable.

      (1.6) Multiple exploratory correlations were performed to relate MRS measures to visual acuity (shown in Supplementary Materials), and only specific ones were shown in the main document. The authors describe the analysis as exploratory in the 'Methods' section. Furthermore, the correlation between visual acuity and E/I metric is weak, and not corrected for multiple comparisons. The results should be presented as preliminary, as no strong conclusions can be made from them. They can provide a hypothesis to test in a future study.

      In the revised manuscript, we will clearly indicate that the exploratory correlation analyses are reported to put forth hypotheses for future studies.

      (1.7) P.16 Given the correlation of the aperiodic intercept with age ("Age negatively correlated with the aperiodic intercept across CC and SC individuals, that is, a flattening of the intercept was observed with age"), age needs to be controlled for in the correlation between neurochemistry and the aperiodic intercept. Glx has also been shown to negatively correlate with age.

      The correlation between chronological age and aperiodic intercept was observed across groups, but the correlation between Glx and the intercept of the aperiodic EEG activity was seen only in the CC group, even though the SC group was matched for age. Thus, such a correlation was very unlikely to be predominantly driven by an effect of chronological age.

      In the revised manuscript, we will add the linear regressions with age as a covariate included below, for the relationship between aperiodic intercept and Glx concentration in the CC group. 

      a. A linear regression was conducted within the CC group to predict the intercept during visual stimulation, based on age and visual cortex Glx concentration. The results of the regression analysis indicated that the model explained a significant proportion of the variance in the aperiodic intercept, R² = 0.82, F(2,7) = 16.1, p = 0.0024. Note that the coefficient for age was not significant, β = 0.007, t(7) = 0.82, p = 0.439. The regression coefficients and their respective statistics are presented in Author response table 1.

      Author response table 1.

      Regression Analysis Summary for Predicting Aperiodic Intercept (Visual Stimulation) in the CC group

      b. A linear regression was conducted to predict the intercept during eye opening at rest, based on age and visual cortex Glx concentration. The results of the regression analysis indicated that the model explained a significant proportion of the variance in the aperiodic intercept, R² = 0.842, F(2,7) = 18.6, p = 0.00159. Note that the coefficient for age was not significant, β = −0.005, t(7) = −0.90, p = 0.400. The regression coefficients and their respective statistics are presented in Author response table 2.

      Author response table 2.

      Regression Analysis Summary for Predicting Aperiodic Intercept (Eyes Open) in the CC group

      c. Given that the Glx coefficient is significant in both models and age does not significantly predict either outcome, it can be concluded that Glx independently predicts the intercept of the aperiodic activity.
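
      The model-level statistics in (a) and (b) can be cross-checked from the reported R² values. The sketch below assumes the statistic with (2, 7) degrees of freedom is the usual overall F test for a regression with n = 10 participants and k = 2 predictors (age and Glx); the function names are ours and purely illustrative, and this is not the authors' analysis code. Small deviations from the reported values are expected because R² is rounded.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic of a linear regression from its R^2.

    n = number of observations, k = number of predictors (intercept excluded).
    Returns (F, df1, df2).
    """
    df1, df2 = k, n - k - 1
    f = (r2 / df1) / ((1.0 - r2) / df2)
    return f, df1, df2


def p_value_df1_equals_2(f, df2):
    """Upper-tail p-value of an F(2, df2) statistic.

    This closed form holds only when the numerator degrees of freedom are 2.
    """
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


# Model (a): visual stimulation; model (b): eyes open at rest.
for label, r2, reported_f in (("a", 0.82, 16.1), ("b", 0.842, 18.6)):
    f, df1, df2 = f_from_r2(r2, n=10, k=2)
    p = p_value_df1_equals_2(reported_f, df2)
    print(f"model {label}: F({df1},{df2}) from R^2 = {f:.2f}, "
          f"p for reported F = {p:.5f}")
```

      Both recomputed F values land close to the reported ones, which is what would be expected if the (2, 7) statistic is indeed the overall model test.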

      (1.8) Multiple exploratory correlations were performed to relate MRS to EEG measures (shown in Supplementary Materials), and only specific ones were shown in the main document. Given the multiple measures from the MRS, the correlations with the EEG measures were exploratory, as stated in the text, p.16, and in Figure 4. Yet the introduction said that there was a prior hypothesis "We further hypothesized that neurotransmitter changes would relate to changes in the slope and intercept of the EEG aperiodic activity in the same subjects." It would be great if the text could be revised for consistency and the analysis described as exploratory.

      In the revised manuscript, we will improve the phrasing. We consider the correlation analyses as exploratory due to our sample size and the absence of prior work. However, we did hypothesize that both MRS and EEG markers would concurrently be altered in CC vs SC individuals.

      (1.9) The analysis for the EEG needs to take more advantage of the available data. As far as I understand, only two electrodes were used, yet far more were available as seen in their previous study (Ossandon et al., 2023). The spatial specificity is not established. The authors could use the frontal cortex electrode (FP1, FP2) signals as a control for spatial specificity in the group effects, or even better, all available electrodes and correct for multiple comparisons. Furthermore, they could use the aperiodic intercept vs Glx in SC to evaluate the specificity of the correlation to CC.

      The aperiodic intercept and slope did not differ between CC and SC individuals for Fp1 and Fp2, suggesting the spatial specificity of the results. In the revised manuscript, we will add this analysis to the supplementary material.

      Author response image 1.

      Aperiodic intercept (top) and slope (bottom) for congenital cataract-reversal (CC, red) and age-matched normally sighted control (SC, blue) individuals. Distributions of these parameters are displayed as violin plots for three conditions; at rest with eyes closed (EC), at rest with eyes open (EO) and during visual stimulation (LU). Aperiodic parameters were calculated across electrodes Fp1 and Fp2. Solid black lines indicate mean values, dotted black lines indicate median values. Coloured lines connect values of individual participants across conditions.

      Further, Glx concentration in the visual cortex did not correlate with the aperiodic intercept in the SC group (Figure 4), suggesting that this relationship was indeed specific to the CC group.

      The data from all electrodes has been analyzed and published in other studies as well (Pant et al., 2023; Ossandón et al., 2023).

      Reviewer #2 (Public Review):

      Summary:

      The manuscript reports non-invasive measures of activity and neurochemical profiles of the visual cortex in congenitally blind patients who recovered vision through the surgical removal of bilateral dense cataracts. The declared aim of the study is to find out how restoring visual function after several months or years of complete blindness impacts the balance between excitation and inhibition in the visual cortex.

      Strengths:

      The findings are undoubtedly useful for the community, as they contribute towards characterising the many ways this special population differs from normally sighted individuals. The combination of MRS and EEG measures is a promising strategy to estimate a fundamental physiological parameter - the balance between excitation and inhibition in the visual cortex, which animal studies show to be heavily dependent upon early visual experience. Thus, the reported results pave the way for further studies, which may use a similar approach to evaluate more patients and control groups.

      Weaknesses:

      (2.1) The main issue is the lack of an appropriate comparison group or condition to delineate the effect of sight recovery (as opposed to the effect of congenital blindness). Few previous studies suggested an increased excitation/Inhibition ratio in the visual cortex of congenitally blind patients; the present study reports a decreased E/I ratio instead. The authors claim that this implies a change of E/I ratio following sight recovery. However, supporting this claim would require showing a shift of E/I after vs. before the sight-recovery surgery, or at least it would require comparing patients who did and did not undergo the sight-recovery surgery (as common in the field).

      Longitudinal studies would indeed be the best way to test the hypothesis that the lower E/I ratio in the CC group observed by the present study is a consequence of sight restoration. However, longitudinal studies involving neuroimaging are an effortful challenge, particularly in research conducted outside of major developed countries and dedicated neuroimaging research facilities. Crucially, however, had CC and SC individuals, as well as permanently congenitally blind vs SC individuals (Coullon et al., 2015; Weaver et al., 2013), not differed on any neurochemical markers, such a longitudinal study might have been trivial. Thus, in order to justify and better tailor longitudinal studies, cross-sectional studies are an initial step.

      (2.2) MR Spectroscopy shows a reduced GLX/GABA ratio in patients vs. sighted controls; however, this finding remains rather isolated, not corroborated by other observations. The difference between patients and controls only emerges for the GLX/GABA ratio, but there is no accompanying difference in either the GLX or the GABA concentrations. There is an attempt to relate the MRS data with acuity measurements and electrophysiological indices, but the explorative correlational analyses do not help to build a coherent picture. A bland correlation between GLX/GABA and visual impairment is reported, but this is specific to the patients' group (N=10) and would not hold across groups (the correlation is positive, predicting the lowest GLX/GABA ratio values for the sighted controls - the opposite of what is found). There is also a strong correlation between GLX concentrations and the EEG power at the lowest temporal frequencies. Although this relation is intriguing, it only holds for a very specific combination of parameters (of the many tested): only with eyes open, only in the patient group.

      We interpret these findings differently, that is, in the context of experiments from non-human animals and the larger MRS literature.

      Homeostatic control of E/I balance assumes that the ratio of excitation (reflected here by Glx) and inhibition (reflected here by GABA+) is regulated. Like prior work (Gao et al., 2024, 2024; Narayan et al., 2022; Perica et al., 2022; Steel et al., 2020; Takado et al., 2022; Takei et al., 2016), we assumed that the ratio of Glx/GABA+ is indicative of E/I balance rather than solely the individual neurotransmitter levels. One of the motivations for assessing the ratio vs the absolute concentration is that as per the underlying E/I balance hypothesis, a change in excitation would cause a concomitant change in inhibition, and vice versa, which has been shown in non-human animal work (Fang et al., 2021; Haider et al., 2006; Tao & Poo, 2005) and modeling research (Vreeswijk & Sompolinsky, 1996; Wu et al., 2022). Importantly, our interpretation of the lower E/I ratio is based not just on the Glx/GABA+ ratio, but also on the steeper EEG aperiodic slope (1-20 Hz).

      As in the discussion section and response 1.4, we did not expect to see a lower Glx/GABA+ ratio in CC individuals. We discuss the possible reasons for the direction of the correlation with visual acuity and aperiodic offset during passive visual stimulation, and offer interpretations and (testable) hypotheses.

      We interpret the direction of the Glx/GABA+ correlation with visual acuity to imply that patients with the highest (compensatory) balancing of the consequences of congenital blindness (hyperexcitation), in light of visual stimulation, are those who recover best. Note, the sighted control group was selected based on their “normal” vision. Thus, clinical visual acuity measures are not expected to sufficiently vary, nor have the resolution to show strong correlations with neurophysiological measures. By contrast, the CC group comprised patients highly varying in visual outcomes, and thus were ideal to investigate such correlations.

      This holds for the correlation between Glx and the aperiodic intercept, as well. Previous work has suggested that the intercept of the aperiodic activity is associated with broadband spiking activity in neural circuits (Manning et al., 2009). Thus, an atypical increase of spiking activity during visual stimulation, as indirectly suggested by “old” non-human primate work on visual deprivation (Hyvärinen et al., 1981) might drive a correlation not observed in healthy populations.

      In the revised manuscript, we will more clearly indicate in the discussion that these are possible post-hoc interpretations. We argue that given the lack of such studies in humans, it is all the more important that extant data be presented completely, even if the direction of the effects is not as expected.

      (2.3) For these reasons, the reported findings do not allow us to draw firm conclusions on the relation between EEG parameters and E/I ratio or on the impact of early (vs. late) visual experience on the excitation/inhibition ratio of the human visual cortex.

      Indeed, the correlations we have tested between the E/I ratio and EEG parameters were exploratory, and have been reported as such. The goal of our study was not to compare the effects of early vs. late visual experience. The goal was to study whether early visual experience is necessary for a typical E/I ratio in visual neural circuits. We provided clear evidence in favor of this hypothesis. Thus, the present results suggest the necessity of investigating the effects of late visual deprivation. In fact, such research is missing in permanent blindness as well.

      Reviewer #3 (Public Review):

      This manuscript examines the impact of congenital visual deprivation on the excitatory/inhibitory (E/I) ratio in the visual cortex using Magnetic Resonance Spectroscopy (MRS) and electroencephalography (EEG) in individuals whose sight was restored. Ten individuals with reversed congenital cataracts were compared to age-matched, normally sighted controls, assessing the cortical E/I balance and its interrelationship to visual acuity. The study reveals that the Glx/GABA ratio in the visual cortex and the intercept and aperiodic signal are significantly altered in those with a history of early visual deprivation, suggesting persistent neurophysiological changes despite visual restoration.

      My expertise is in EEG (particularly in the decomposition of periodic and aperiodic activity) and statistical methods. I have several major concerns in terms of methodological and statistical approaches along with the (over)interpretation of the results. These major concerns are detailed below.

      (3.1) Variability in visual deprivation:

      - The document states a large variability in the duration of visual deprivation (probably also the age at restoration), with significant implications for the sensitivity period's impact on visual circuit development. The variability and its potential effects on the outcomes need thorough exploration and discussion.

      We work with a rare, unique patient population, which makes it difficult to systematically assess the effects of different visual histories while maintaining stringent inclusion criteria such as complete patterned visual deprivation at birth. Regardless, we considered the large variance in age at surgery and time since surgery as supportive of our interpretation: group differences were found despite the large variance in duration of visual deprivation. Moreover, the existing variance was used to explore possible associations between behavior and neural measures, as well as neurochemical and EEG measures.

      In the revised manuscript, we will detail the advantages and disadvantages of our CC sample, with respect to duration of congenital visual deprivation.

      (3.2) Sample size:

      - The small sample size is a major concern as it may not provide sufficient power to detect subtle effects and/or overestimate significant effects, which then tend not to generalize to new data. One of the biggest drivers of the replication crisis in neuroscience.

      We address the small sample size in our discussion, and make clear that small sample sizes were due to the nature of investigations in special populations. It is worth noting that our EEG results fully align with those of a larger sample of CC individuals (Ossandón et al., 2023), giving us confidence in their validity and reproducibility. Moreover, our MRS results and correlations of those with EEG parameters were spatially specific to occipital cortex measures, as predicted.

      The main problem with the correlation analyses between MRS and EEG measures is that the sample size is simply too small to conduct such an analysis. Moreover, it is unclear from the methods section that this analysis was only conducted in the patient group (which the reviewer assumed from the plots), and not explained why this was done only in the patient group. I would highly recommend removing these correlation analyses.

      We marked the correlation analyses as exploratory; note that we do not base most of our discussion on the results of these analyses. As indicated by Reviewer 1, reporting them allows for deriving more precise hypotheses for future studies. It has to be noted that we investigate an extremely rare population, tested outside of major developed economies and dedicated neuroimaging research facilities. In addition to being a rare patient group, these individuals come from poor communities. Therefore, we consider it justified to report these correlations as exploratory, providing direction for future research.

      (3.3) Statistical concerns:

      - The statistical analyses, particularly the correlations drawn from a small sample, may not provide reliable estimates (see https://www.sciencedirect.com/science/article/pii/S0092656613000858, which clearly describes this problem).

      It would undoubtedly be better to have a larger sample size. We nonetheless think it is of value to the research community to publish this dataset, since 10 multimodal data sets from a carefully diagnosed, rare population, representing a human model for the effects of early experience on brain development, are quite a lot.  Sample sizes in prior neuroimaging studies in transient blindness have most often ranged from n = 1 to n = 10. They nevertheless provided valuable direction for future research, and integration of results across multiple studies provides scientific insights.  

      Identifying possible group differences was the goal of our study, with the correlations being an exploratory analysis, which we have clearly indicated in the methods, results and discussion.

      - Statistical analyses for the MRS: The authors should consider some additional permutation statistics, which are more suitable for small sample sizes. The current statistical model (2x2) design ANOVA is not ideal for such small sample sizes. Moreover, it is unclear why the condition (EO & EC) was chosen as a predictor and not the brain region (visual & frontal) or neurochemicals. Finally, the authors did not provide any information on the alpha level nor any information on correction for multiple comparisons (in the methods section). Finally, even if the groups are matched w.r.t. age, the time between surgery and measurement, the duration of visual deprivation, (and sex?), these should be included as covariates as it has been shown that these are highly related to the measurements of interest (especially for the EEG measurements) and the age range of the current study is large.

      In our ANOVA models, the neurochemicals were the outcome variables, and the conditions were chosen as predictors based on prior work suggesting that Glx/GABA+ might vary with eye closure (Kurcyus et al., 2018). The study was designed based on a hypothesis of group differences localized to the occipital cortex, due to visual deprivation. The frontal cortex voxel was chosen to indicate whether these differences were spatially specific. Therefore, we conducted separate ANOVAs based on this study design.

      In the revised manuscript, we will add permutation analyses for our outcomes, as well as multiple regression models investigating whether the variance in visual history might have driven these results. Note that in the supplementary materials (S6, S7), we have reported the correlations between visual history metrics and MRS/EEG outcomes.
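
      For readers unfamiliar with the permutation approach the reviewer suggests, the following is a minimal sketch of an exact two-sample permutation test on the difference in group means. It is not the authors' analysis: the function name `exact_permutation_p` is ours, and the example values are invented placeholders rather than data from the study.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means.

    Every way of relabelling the pooled observations into groups of the
    original sizes is enumerated; the p-value is the share of relabellings
    whose absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    n_relabellings = 0
    n_extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so ties are not lost to floating-point rounding.
        if diff >= observed - 1e-12:
            n_extreme += 1
        n_relabellings += 1
    return n_extreme / n_relabellings


# Hypothetical ratio values for two small groups (NOT study data).
hypothetical_cc = [2.1, 2.4, 2.2, 2.6]
hypothetical_sc = [2.9, 3.1, 2.8, 3.4]
print(exact_permutation_p(hypothetical_cc, hypothetical_sc))
```

      With two groups of ten, full enumeration covers 184,756 relabellings, which is still tractable; for larger samples a random subset of relabellings is commonly drawn instead.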

      The alpha level used for the ANOVA models specified in the methods section was 0.05. The alpha level for the exploratory analyses reported in the main manuscript was 0.008, after correcting for (6) multiple comparisons using the Bonferroni correction, also specified in the methods. Note that the p-values following correction are expressed as multiplied by 6, due to most readers assuming an alpha level of 0.05 (see response regarding large p-values).

      We used a control group matched for age and sex. Moreover, the controls were recruited and tested in the same institutes, using the same setup. We feel that we followed the gold standards for recruiting a healthy control group for a patient group.

      - EEG statistical analyses: The same critique as for the MRS statistical analyses applies to the EEG analysis. In addition: was the 2x3 ANOVA conducted for EO and EC independently? This seems to be inconsistent with the approach in the MRS analyses, in which the authors chose EO & EC as predictors in their 2x2 ANOVA.

      The 2x3 ANOVA was not conducted independently for the eyes open/eyes closed condition. The ANOVA conducted on the EEG metrics was 2x3 because it had group (CC, SC) and condition (eyes open (EO), eyes closed (EC) and visual stimulation (LU)) as predictors.

      - Figure 4: The authors report a p-value of >0.999 with a correlation coefficient of -0.42 with a sample size of 10 subjects. This can't be correct (it should be around: p = 0.22). All statistical analyses should be checked.

      As specified in the methods and figure legend, the reported p values in Figure 4 have been corrected using the Bonferroni correction, and therefore multiplied by the number of comparisons, leading to the seemingly large values.
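
      The arithmetic behind this exchange can be sketched as follows: the uncorrected two-sided p-value for r = -0.42 with n = 10 is close to the reviewer's estimate, and multiplying it by six comparisons and capping at 1 yields the large reported value. The helper names are ours, the t distribution is integrated numerically with Simpson's rule to avoid third-party dependencies, and the sketch assumes a standard Pearson correlation test, which may differ from the authors' exact procedure.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def pearson_p_two_sided(r, n, steps=2000):
    """Two-sided p-value for a Pearson correlation r from n observations.

    Uses t = |r| * sqrt(df / (1 - r^2)) with df = n - 2 and integrates the
    t density from 0 to t with Simpson's rule (steps must be even).
    """
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and t
    return 1 - 2 * central_half


def bonferroni_adjust(p, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_comparisons)


p_raw = pearson_p_two_sided(-0.42, 10)
p_adj = bonferroni_adjust(p_raw, 6)
alpha_per_test = 0.05 / 6
print(f"uncorrected p = {p_raw:.3f}, adjusted p = {p_adj:.3f}, "
      f"per-test alpha = {alpha_per_test:.4f}")
```

      Reporting adjusted p-values against 0.05 and reporting raw p-values against 0.05/6 lead to the same decisions; the two conventions differ only in presentation.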

      Additionally, to check all statistical analyses, we put the manuscript through an independent Statistics Check (Nuijten & Polanin, 2020) (https://michelenuijten.shinyapps.io/statcheck-web/) and will upload the consistency report with the revised supplementary material.

      - Figure 2c. Eyes closed condition: The highest score of the *Glx/GABA ratio seems to be ~3.6. In subplot 2a, there seem to be 3 subjects that show a Glx/GABA ratio score > 3.6. How can this be explained? There is also a discrepancy for the eyes-closed condition.

      The three subjects that show the Glx/GABA+ ratio > 3.6 in subplot 2a are in the SC group, whereas the correlations plotted in figure 2c are only for the CC group, where the highest score is indeed ~3.6.

      (3.4) Interpretation of aperiodic signal:

      - Several recent papers demonstrated that the aperiodic signal measured in EEG or ECoG is related to various important aspects such as age, skull thickness, electrode impedance, as well as cognition. Thus, currently, very little is known about the underlying effects which influence the aperiodic intercept and slope. The entire interpretation of the aperiodic slope as a proxy for E/I is based on a computational model and simulation (as described in the Gao et al. paper).

      Apart from the modeling work from Gao et al., multiple papers, which we have also cited, used ECoG, EEG and MEG and showed concomitant changes in aperiodic activity with pharmacological manipulation of the E/I ratio (Colombo et al., 2019; Molina et al., 2020; Muthukumaraswamy & Liley, 2018). Further, several prior studies have interpreted changes in the aperiodic slope as reflective of changes in the E/I ratio, including studies of developmental groups (Favaro et al., 2023; Hill et al., 2022; McSweeney et al., 2023; Schaworonkow & Voytek, 2021) as well as patient groups (Molina et al., 2020; Ostlund et al., 2021).

      In the revised manuscript, we will cite those studies not already included in the introduction.

      - Especially the aperiodic intercept is a very sensitive measure to many influences (e.g. skull thickness, electrode impedance...). As crucial results (correlation aperiodic intercept and MRS measures) are facing this problem, this needs to be reevaluated. It is safer to make statements on the aperiodic slope than intercept. In theory, some of the potentially confounding measures are available to the authors (e.g. skull thickness can be computed from T1w images; electrode impedances are usually acquired alongside the EEG data) and could be therefore controlled.

      All electrophysiological measures indeed depend on parameters such as skull thickness and electrode impedance. As in the extant literature using neurophysiological measures to compare brain function between patient and control groups, we used a control group matched in age and sex, recruited in the same region, tested with the same devices, and analyzed with the same analysis pipeline. For example, impedance was kept below 10 kOhm for all subjects. There is no evidence available suggesting that congenital cataracts are associated with changes in skull thickness that would cause the observed pattern of group results. Moreover, we cannot think of how any of the exploratory correlations between neurophysiological measures and MRS measures could be accounted for by a difference in, for example, skull thickness.

      - The authors wrote: "Higher frequencies (such as 20-40 Hz) have been predominantly associated with local circuit activity and feedforward signaling (Bastos et al., 2018; Van Kerkoerle et al., 2014); the increased 20-40 Hz slope may therefore signal increased spontaneous spiking activity in local networks. We speculate that the steeper slope of the aperiodic activity for the lower frequency range (1-20 Hz) in CC individuals reflects the concomitant increase in inhibition." The authors confuse the interpretation of periodic and aperiodic signals. This section refers to the interpretation of the periodic signal (higher frequencies). This interpretation cannot simply be translated to the aperiodic signal (slope).

      Prior work has not always separated the aperiodic and periodic components, making it unclear what might have driven these effects in our data. The interpretation of the higher frequency range was intended to contrast with the interpretation of the lower frequency range, in order to speculate as to why the two aperiodic fits might go in differing directions. We will clarify our interpretation in the revised manuscript. Note that Ossandón et al. reported highly similar results (group differences for CC individuals and for permanently congenitally blind humans) for the aperiodic activity between 20-40 Hz and oscillatory activity in the gamma range. We will allude to these findings in the revised manuscript.

      - The authors further wrote: "We used the slope of the aperiodic (1/f) component of the EEG spectrum as an estimate of E/I ratio (Gao et al., 2017; Medel et al., 2020; Muthukumaraswamy & Liley, 2018)." This is a highly speculative interpretation with very little empirical evidence. These papers were conducted with ECoG data (mostly in animals) and mostly under anesthesia. Thus, these studies allow only indirect inferences about what actually influences the 1/f slope in EEG measurements.

      Note that Muthukumaraswamy et al. (2018) used different types of pharmacological manipulations and analyzed periodic and aperiodic MEG activity in addition to monkey ECoG. Medel et al. (2020) (now published as Medel et al., 2023) compared EEG activity in addition to ECoG data after propofol administration. The interpretation of our results is in line with a number of recent studies in developing (Hill et al., 2022; Schaworonkow & Voytek, 2021) and special populations using EEG. As mentioned above, several prior studies have used the slope of the 1/f component/aperiodic activity as an indirect measure of the E/I ratio (Favaro et al., 2023; Hill et al., 2022; McSweeney et al., 2023; Molina et al., 2020; Ostlund et al., 2021; Schaworonkow & Voytek, 2021), including studies using scalp-recorded EEG. We will make clearer in the introduction of the revised manuscript that this metric is indirect.

      While a full understanding of aperiodic activity still needs to be provided, some convergent ideas have emerged. We think that our results contribute to this enterprise, since our study is, to the best of our knowledge, the first to assess both MRS-measured neurotransmitter levels and EEG aperiodic activity.

      (3.5) Problems with EEG preprocessing and analysis:

      - It seems that the authors did not identify bad channels nor address the line noise issue (even a problem if a low pass filter of below-the-line noise was applied).

      As pointed out in the methods and Figure 1, we only analyzed data from two channels, O1 and O2, neither of which was rejected for any participant. Channel rejection was performed for the larger dataset, published elsewhere (Ossandón et al., 2023; Pant et al., 2023).

      In both published works, we did not consider frequency ranges above 40 Hz to avoid any possible contamination with line noise. Here, we focused on activity between 0 and 20 Hz, definitely excluding line noise contamination. The low pass filter (FIR, 1-45 Hz) guaranteed that any spill-over effects of line noise would be restricted to frequencies just below the upper cutoff frequency.

      Additionally, a prior version of the analysis used the cleanline.m function to remove line noise before filtering, and the group differences remained stable. We will report this analysis in the supplementary material of the revised manuscript. Further, both groups were measured in the same lab, making line noise as an account for the observed group effects highly unlikely. Finally, any of the exploratory MRS-EEG correlations would be hard to explain if the EEG parameters were contaminated with line noise.

      - What was the percentage of segments that needed to be rejected due to the 120μV criteria? This should be reported specifically for EO & EC and controls and patients.

      The mean percentage of 1-second segments rejected for each resting state condition is given below. The mean percentage of 6.25-second segments rejected in each group for the visual stimulation condition is also included, and will be added to the revised manuscript:

      Author response table 3.

      - The authors downsampled the data to 60 Hz "to match the stimulation rate". What is the intention of this? Because the subsequent spectral analyses are conflated by this choice (see Nyquist theorem).

      These data were collected as part of a study designed to evoke alpha activity with visual white noise, which ranged in luminance with equal power at all frequencies from 1-60 Hz, restricted by the refresh rate of the monitor on which stimuli were presented (Pant et al., 2023). This paradigm and method were developed by VanRullen and colleagues (Schwenk et al., 2020; Vanrullen & MacDonald, 2012), wherein the analysis requires the same sampling rate between the presented frequencies and the EEG data. The downsampling function used here automatically applies an anti-aliasing filter (EEGLAB 2019).
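      The reviewer's Nyquist concern can be made concrete with a small sketch. At a 60 Hz sampling rate only frequencies up to 30 Hz are represented unambiguously; anything above that folds back into the lower band unless it is filtered out beforehand. The function below is a generic illustration of that folding, not the EEGLAB routine, and its name is hypothetical.

```python
def alias_frequency(f_signal_hz, fs_hz):
    """Apparent frequency of a sinusoid at f_signal_hz after sampling at fs_hz."""
    # Fold the true frequency around the nearest multiple of the sampling rate.
    nearest_multiple = round(f_signal_hz / fs_hz)
    return abs(f_signal_hz - nearest_multiple * fs_hz)


fs = 60.0            # sampling rate after downsampling
nyquist = fs / 2.0   # highest unambiguous frequency

for f in (10.0, 25.0, 40.0, 50.0):
    # Frequencies above Nyquist show up at a lower, aliased frequency.
    print(f, alias_frequency(f, fs), f <= nyquist)
```

      This is why an anti-aliasing low-pass filter has to precede downsampling: components above 30 Hz would otherwise contaminate the 1-20 Hz range analyzed here.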

      - "Subsequently, baseline removal was conducted by subtracting the mean activity across the length of an epoch from every data point." The actual baseline time segment should be specified.

      The time segment was the length of the epoch, that is, 1 second for the resting state conditions and 6.25 seconds for the visual stimulation conditions. This will be explicitly stated in the revised manuscript.
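      As a minimal sketch of this baseline step (plain Python, with a hypothetical function name and a made-up three-sample epoch rather than real EEG), the mean over the whole epoch is subtracted from every sample:

```python
def remove_epoch_mean(epoch):
    """Subtract the mean over the whole epoch from every sample."""
    epoch_mean = sum(epoch) / len(epoch)
    return [sample - epoch_mean for sample in epoch]


# Toy epoch in microvolts; values are illustrative only.
toy_epoch = [1.0, 2.0, 3.0]
baselined = remove_epoch_mean(toy_epoch)
print(baselined)
```

      After this step each epoch has zero mean, so the "baseline" is the full 1-second or 6.25-second epoch rather than a pre-stimulus window.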

      - "We excluded the alpha range (8-14 Hz) for this fit to avoid biasing the results due to documented differences in alpha activity between CC and SC individuals (Bottari et al., 2016; Ossandón et al., 2023; Pant et al., 2023)." This does not really make sense, as the FOOOF algorithm first fits the 1/f slope, for which the alpha activity is not relevant.

      We did not use the FOOOF algorithm/toolbox in this manuscript. As stated in the methods, we used a 1/f fit to the 1-20 Hz spectrum in the log-log space, and subtracted this fit from the original spectrum to obtain the corrected spectrum. Given the pronounced difference in alpha power between groups (Bottari et al., 2016; Ossandón et al., 2023; Pant et al., 2023), we were concerned it might drive differences in the exponent values. Our analysis pipeline had been adapted from previous publications of our group and other labs (Ossandón et al., 2023; Voytek et al., 2015; Waschke et al., 2017).

      We have conducted the analysis with and without the exclusion of the alpha range, as well as using the FOOOF toolbox both in the 1-20 Hz and 20-40 Hz ranges (Ossandón et al., 2023). The findings of a steeper slope in the 1-20 Hz range as well as lower alpha power in CC vs SC individuals remained stable. In Ossandón et al., the comparison between the piecewise fits and FOOOF fits led the authors to use the former as it outperformed the FOOOF algorithm for their data.
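      The fitting procedure described in this response can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes an ordinary least-squares line in log10-log10 space, treats the 8-14 Hz exclusion band as inclusive at both edges, and uses a synthetic spectrum; all function names are hypothetical.

```python
import math


def fit_aperiodic(freqs, power, fit_range=(1.0, 20.0), exclude=(8.0, 14.0)):
    """Least-squares line through log10(power) versus log10(frequency).

    Returns (offset, slope). Bins inside `exclude` are left out of the fit.
    Inclusive band edges are an assumption of this sketch.
    """
    xs, ys = [], []
    for f, p in zip(freqs, power):
        in_fit_range = fit_range[0] <= f <= fit_range[1]
        in_excluded_band = exclude[0] <= f <= exclude[1]
        if in_fit_range and not in_excluded_band:
            xs.append(math.log10(f))
            ys.append(math.log10(p))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return offset, slope


def corrected_spectrum(freqs, power, offset, slope):
    """Log power minus the fitted aperiodic line, one value per frequency bin."""
    return [math.log10(p) - (offset + slope * math.log10(f))
            for f, p in zip(freqs, power)]


# Synthetic spectrum (not study data): 10 * f^-1.5 plus a narrow bump at 10 Hz.
freqs = [float(f) for f in range(1, 21)]
power = [10.0 * f ** -1.5 * (1.0 + 5.0 * math.exp(-((f - 10.0) ** 2) / 0.5))
         for f in freqs]

offset, slope = fit_aperiodic(freqs, power)
residual = corrected_spectrum(freqs, power, offset, slope)
print(round(slope, 3), round(offset, 3))
```

      Because the alpha bins are excluded, the simulated 10 Hz peak does not pull the fitted line upward; it reappears only in the corrected spectrum as a positive residual.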

      - The model fits of the 1/f fitting for EO, EC, and both participant groups should be reported.

      In Figure 3 of the manuscript, we depicted the mean spectra and 1/f fits for each group. We will add the fit quality metrics and show individual subjects’ fits in the revised manuscript.

      (3.6) Validity of GABA measurements and results:

      - According to a newer study by the authors of the Gannet toolbox (https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/abs/10.1002/nbm.5076), the reliability and reproducibility of the gamma-aminobutyric acid (GABA) measurement can vary significantly depending on acquisition and modeling parameters. Thus, did the authors address these challenges?

      We took care of data quality while acquiring MRS data by ensuring appropriate voxel placement and linewidth prior to scanning. Acquisition as well as modeling parameters were constant for both groups, so they cannot have driven group differences.

      The linked article compares the reproducibility of GABA measurement using Osprey, which was released in 2020 and uses linear combination modeling to fit the peak as opposed to Gannet’s simple peak fitting (Hupfeld et al., 2024). The study finds better test-retest reliability for Osprey compared to Gannet’s method.

      As the present work was conceptualized in 2018, we used Gannet 3.0, which was the state-of-the-art edited spectral analysis toolbox at the time, and still is widely used. In the revised manuscript, we will include a supplementary section reanalyzing the main findings with Osprey.

      - Furthermore, the authors wrote: "We confirmed the within-subject stability of metabolite quantification by testing a subset of the sighted controls (n=6) 2-4 weeks apart." Looking at the supplementary Figure 5 (which would be better plotted as ICC or Bland-Altman plots), the within-subject stability compared to between-subject variability seems not to be great. Furthermore, I don't think such a small sample size qualifies for a rigorous assessment of stability.

      Indeed, we did not intend to provide a rigorous assessment of within-subject stability. Rather, we aimed to confirm that data quality/concentration ratios did not systematically differ between the same subjects tested longitudinally; driven, for example, by scanner heating or time of day. As with the phantom testing, we attempted to give readers an idea of the quality of the data, as they were collected from a primarily clinical rather than a research site.

      In the revised manuscript we will remove the statement regarding stability, and add the Bland-Altman plot.
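      For readers unfamiliar with Bland-Altman analysis, the quantities behind such a plot can be sketched as below. The conventional definition is assumed (bias as the mean of the paired differences, limits of agreement as bias ± 1.96 standard deviations). The six paired values are invented placeholders, not measurements from this study, and the function name is hypothetical.

```python
import statistics


def bland_altman(session1, session2):
    """Return per-subject means, differences, bias, and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(session1, session2)]
    means = [(a + b) / 2.0 for a, b in zip(session1, session2)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits


# Placeholder test-retest values for six subjects (NOT data from the study).
session1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
session2 = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]

means, diffs, bias, limits = bland_altman(session1, session2)
print(round(bias, 3), [round(x, 3) for x in limits])
```

      The plot itself shows each subject's difference against the corresponding mean, with horizontal lines at the bias and the two limits; with n=6 the limits are only rough estimates.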

      - "Why might an enhanced inhibitory drive, as indicated by the lower Glx/GABA ratio" Is this interpretation really warranted, as the results of the group differences in the Glx/GABA ratio seem to be rather driven by a decreased Glx concentration in CC rather than an increased GABA (see Figure 2).

      We used the Glx/GABA+ ratio as a measure, rather than individual Glx or GABA+ concentration, which did not significantly differ between groups. As detailed in Response 2.2, we think this metric aligns better with an underlying E/I balance hypothesis and has been used in many previous studies (Gao et al., 2024; Liu et al., 2015; Narayan et al., 2022; Perica et al., 2022).

      Our interpretation of an enhanced inhibitory drive additionally comes from the combination of aperiodic EEG (1-20 Hz) and MRS measures, which, when considered together, are consistent with a decreased E/I ratio.

      In the revised manuscript, we will rephrase this sentence accordingly. 

      - Glx concentration predicted the aperiodic intercept in CC individuals' visual cortices during ambient and flickering visual stimulation. Why specifically investigate the Glx concentration, when the paper is about E/I ratio?

      As stated in the methods, we exploratorily assessed the relationship between all MRS parameters (Glx, GABA+ and Glx/GABA+ ratio) with the aperiodic parameters (slope, offset), and corrected for multiple comparisons accordingly. We think this is a worthwhile analysis considering the rarity of the dataset/population (see 1.2, 1.6, 2.1 and reviewer 1’s comments about future hypotheses). We only report the Glx – aperiodic intercept correlation in the main manuscript as it survived correction for multiple comparisons.

      (3.7) Interpretation of the correlation between MRS measurements and EEG aperiodic signal:

      - The authors wrote: "The intercept of the aperiodic activity was highly correlated with the Glx concentration during rest with eyes open and during flickering stimulation (also see Supplementary Material S11). Based on the assumption that the aperiodic intercept reflects broadband firing (Manning et al., 2009; Winawer et al., 2013), this suggests that the Glx concentration might be related to broadband firing in CC individuals during active and passive visual stimulation." These results should not be interpreted (or only with great caution) for several reasons (see also problem with influences on aperiodic intercept and small sample size). This is a result of the exploratory analyses of correlating every EEG parameter with every MRS parameter. This requires well-powered replication before any interpretation can be provided. Furthermore and importantly: why should this be specifically only in CC patients, but not in the SC control group?

      We indicate clearly in all parts of the manuscript that these correlations are presented as exploratory. Further, we interpret the Glx-aperiodic offset correlation, and none of the others, as it survived the Bonferroni correction for multiple comparisons. We offer a hypothesis in the discussion section as to why such a correlation might exist in the CC but not the SC group (see response 2.2), and do not speculate further.
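      The Bonferroni logic referred to here can be sketched as follows: with m tests, each p-value is compared against alpha / m. The number of tests and every p-value below are hypothetical placeholders chosen for illustration (three MRS parameters by two aperiodic parameters gives six tests); they are not the values from this study, and the function name is an assumption.

```python
def bonferroni_survivors(p_values, alpha=0.05):
    """Return the corrected threshold and which tests fall at or below it."""
    n_tests = len(p_values)
    threshold = alpha / n_tests
    survives = {name: p <= threshold for name, p in p_values.items()}
    return threshold, survives


# Hypothetical p-values for 3 MRS x 2 aperiodic parameters (NOT study results).
example_p = {
    "Glx vs offset": 0.004,
    "Glx vs slope": 0.030,
    "GABA+ vs offset": 0.200,
    "GABA+ vs slope": 0.450,
    "Glx/GABA+ vs offset": 0.060,
    "Glx/GABA+ vs slope": 0.700,
}

threshold, survives = bonferroni_survivors(example_p)
print(round(threshold, 5), [name for name, ok in survives.items() if ok])
```

      In this toy example a nominally significant p = 0.03 does not survive correction, which illustrates why only correlations that clear the stricter threshold are interpreted.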

      (3.8) Language and presentation:

      - The manuscript requires language improvements and correction of numerous typos. Over-simplifications and unclear statements are present, which could mislead or confuse readers (see also interpretation of aperiodic signal).

      In the revision, we will check that speculations are clearly marked and typos are removed.

      - The authors state that "Together, the present results provide strong evidence for experience-dependent development of the E/I ratio in the human visual cortex, with consequences for behavior." The results of the study do not provide any strong evidence, because of the small sample size and exploratory analyses approach and not accounting for possible confounding factors.

      We disagree with this statement and allude to convergent evidence of both MRS and neurophysiological measures. The latter link to corresponding results observed in a larger sample of CC individuals (Ossandón et al., 2023).

      - "Our results imply a change in neurotransmitter concentrations as a consequence of *restoring* vision following congenital blindness." This is a speculative statement to infer a causal relationship on cross-sectional data.

      As mentioned under 2.1, we conducted a cross-sectional study which might justify future longitudinal work. To advance science, we put forward new testable hypotheses at the end of the manuscript.

      In the revised manuscript we will add “might imply” to better indicate the hypothetical character of this idea.

      - In the limitation section, the authors wrote: "The sample size of the present study is relatively high for the rare population, but undoubtedly, overall, rather small." This sentence should be rewritten, as the study is plainly underpowered. The further justification "We nevertheless think that our results are valid. Our findings neurochemically (Glx and GABA+ concentration), and anatomically (visual cortex) specific. The MRS parameters varied with parameters of the aperiodic EEG activity and visual acuity. The group differences for the EEG assessments corresponded to those of a larger sample of CC individuals (n=38) (Ossandón et al., 2023), and effects of chronological age were as expected from the literature." These statements do not provide any validation or justification of small samples. Furthermore, the current data set is a subset of an earlier published paper by the same authors "The EEG data sets reported here were part of data published earlier (Ossandón et al., 2023; Pant et al., 2023)." Thus, the statement "The group differences for the EEG assessments corresponded to those of a larger sample of CC individuals (n=38)" is a circular argument and should be avoided.

      Our intention was not to justify having a small sample, but to justify why we think the results might be valid as they align with/replicate existing literature.

      In the revised manuscript, we will add a figure showing that the EEG results of the 10 subjects considered here correspond to those of the 28 other subjects of Ossandón et al. We will adapt the text accordingly, clearly stating that the pattern of EEG results of the ten subjects reported here replicates that of the 28 additional subjects of Ossandón et al. (2023).

      References

      Barnes, S. J., Sammons, R. P., Jacobsen, R. I., Mackie, J., Keller, G. B., & Keck, T. (2015). Subnetwork-specific homeostatic plasticity in mouse visual cortex in vivo. Neuron, 86(5), 1290–1303. https://doi.org/10.1016/J.NEURON.2015.05.010

      Bernabeu, A., Alfaro, A., García, M., & Fernández, E. (2009). Proton magnetic resonance spectroscopy (1H-MRS) reveals the presence of elevated myo-inositol in the occipital cortex of blind subjects. NeuroImage, 47(4), 1172–1176. https://doi.org/10.1016/j.neuroimage.2009.04.080

      Bottari, D., Troje, N. F., Ley, P., Hense, M., Kekunnaya, R., & Röder, B. (2016). Sight restoration after congenital blindness does not reinstate alpha oscillatory activity in humans. Scientific Reports. https://doi.org/10.1038/srep24683

      Colombo, M. A., Napolitani, M., Boly, M., Gosseries, O., Casarotto, S., Rosanova, M., Brichant, J. F., Boveroux, P., Rex, S., Laureys, S., Massimini, M., Chieregato, A., & Sarasso, S. (2019). The spectral exponent of the resting EEG indexes the presence of consciousness during unresponsiveness induced by propofol, xenon, and ketamine. NeuroImage, 189, 631–644. https://doi.org/10.1016/j.neuroimage.2019.01.024

      Consideration of Sample Size in Neuroscience Studies. (2020). Journal of Neuroscience, 40(21), 4076–4077. https://doi.org/10.1523/JNEUROSCI.0866-20.2020

      Coullon, G. S. L., Emir, U. E., Fine, I., Watkins, K. E., & Bridge, H. (2015). Neurochemical changes in the pericalcarine cortex in congenital blindness attributable to bilateral anophthalmia. Journal of Neurophysiology. https://doi.org/10.1152/jn.00567.2015

      Fang, Q., Li, Y. T., Peng, B., Li, Z., Zhang, L. I., & Tao, H. W. (2021). Balanced enhancements of synaptic excitation and inhibition underlie developmental maturation of receptive fields in the mouse visual cortex. Journal of Neuroscience, 41(49), 10065–10079. https://doi.org/10.1523/JNEUROSCI.0442-21.2021

      Favaro, J., Colombo, M. A., Mikulan, E., Sartori, S., Nosadini, M., Pelizza, M. F., Rosanova, M., Sarasso, S., Massimini, M., & Toldo, I. (2023). The maturation of aperiodic EEG activity across development reveals a progressive differentiation of wakefulness from sleep. NeuroImage, 277. https://doi.org/10.1016/J.NEUROIMAGE.2023.120264

      Gao, Y., Liu, Y., Zhao, S., Liu, Y., Zhang, C., Hui, S., Mikkelsen, M., Edden, R. A. E., Meng, X., Yu, B., & Xiao, L. (2024). MRS study on the correlation between frontal GABA+/Glx ratio and abnormal cognitive function in medication-naive patients with narcolepsy. Sleep Medicine, 119, 1–8. https://doi.org/10.1016/j.sleep.2024.04.004

      Haider, B., Duque, A., Hasenstaub, A. R., & McCormick, D. A. (2006). Neocortical network activity in vivo is generated through a dynamic balance of excitation and inhibition. Journal of Neuroscience. https://doi.org/10.1523/JNEUROSCI.5297-05.2006

      Hill, A. T., Clark, G. M., Bigelow, F. J., Lum, J. A. G., & Enticott, P. G. (2022). Periodic and aperiodic neural activity displays age-dependent changes across early-to-middle childhood. Developmental Cognitive Neuroscience, 54, 101076. https://doi.org/10.1016/J.DCN.2022.101076

      Hupfeld, K. E., Zöllner, H. J., Hui, S. C. N., Song, Y., Murali-Manohar, S., Yedavalli, V., Oeltzschner, G., Prisciandaro, J. J., & Edden, R. A. E. (2024). Impact of acquisition and modeling parameters on the test–retest reproducibility of edited GABA+. NMR in Biomedicine, 37(4), e5076. https://doi.org/10.1002/nbm.5076

      Hyvärinen, J., Carlson, S., & Hyvärinen, L. (1981). Early visual deprivation alters modality of neuronal responses in area 19 of monkey cortex. Neuroscience Letters, 26(3), 239–243. https://doi.org/10.1016/0304-3940(81)90139-7

      Juchem, C., & Graaf, R. A. de. (2017). B0 magnetic field homogeneity and shimming for in vivo magnetic resonance spectroscopy. Analytical Biochemistry, 529, 17–29. https://doi.org/10.1016/j.ab.2016.06.003

      Keck, T., Hübener, M., & Bonhoeffer, T. (2017). Interactions between synaptic homeostatic mechanisms: An attempt to reconcile BCM theory, synaptic scaling, and changing excitation/inhibition balance. Current Opinion in Neurobiology, 43, 87–93. https://doi.org/10.1016/J.CONB.2017.02.003

      Kurcyus, K., Annac, E., Hanning, N. M., Harris, A. D., Oeltzschner, G., Edden, R., & Riedl, V. (2018). Opposite Dynamics of GABA and Glutamate Levels in the Occipital Cortex during Visual Processing. Journal of Neuroscience, 38(46), 9967–9976. https://doi.org/10.1523/JNEUROSCI.1214-18.2018

      Liu, B., Wang, G., Gao, D., Gao, F., Zhao, B., Qiao, M., Yang, H., Yu, Y., Ren, F., Yang, P., Chen, W., & Rae, C. D. (2015). Alterations of GABA and glutamate-glutamine levels in premenstrual dysphoric disorder: A 3T proton magnetic resonance spectroscopy study. Psychiatry Research - Neuroimaging, 231(1), 64–70. https://doi.org/10.1016/J.PSCYCHRESNS.2014.10.020

      Lunghi, C., Berchicci, M., Morrone, M. C., & Russo, F. D. (2015). Short‐term monocular deprivation alters early components of visual evoked potentials. The Journal of Physiology, 593(19), 4361. https://doi.org/10.1113/JP270950

      Maier, S., Düppers, A. L., Runge, K., Dacko, M., Lange, T., Fangmeier, T., Riedel, A., Ebert, D., Endres, D., Domschke, K., Perlov, E., Nickel, K., & Tebartz van Elst, L. (2022). Increased prefrontal GABA concentrations in adults with autism spectrum disorders. Autism Research, 15(7), 1222–1236. https://doi.org/10.1002/aur.2740

      Manning, J. R., Jacobs, J., Fried, I., & Kahana, M. J. (2009). Broadband shifts in local field potential power spectra are correlated with single-neuron spiking in humans. Journal of Neuroscience, 29(43), 13613–13620. https://doi.org/10.1523/JNEUROSCI.2041-09.2009

      McSweeney, M., Morales, S., Valadez, E. A., Buzzell, G. A., Yoder, L., Fifer, W. P., Pini, N., Shuffrey, L. C., Elliott, A. J., Isler, J. R., & Fox, N. A. (2023). Age-related trends in aperiodic EEG activity and alpha oscillations during early- to middle-childhood. NeuroImage, 269, 119925. https://doi.org/10.1016/j.neuroimage.2023.119925

      Medel, V., Irani, M., Crossley, N., Ossandón, T., & Boncompte, G. (2023). Complexity and 1/f slope jointly reflect brain states. Scientific Reports, 13(1), 21700. https://doi.org/10.1038/s41598-023-47316-0

      Medel, V., Irani, M., Ossandón, T., & Boncompte, G. (2020). Complexity and 1/f slope jointly reflect cortical states across different E/I balances. bioRxiv, 2020.09.15.298497. https://doi.org/10.1101/2020.09.15.298497

      Molina, J. L., Voytek, B., Thomas, M. L., Joshi, Y. B., Bhakta, S. G., Talledo, J. A., Swerdlow, N. R., & Light, G. A. (2020). Memantine Effects on Electroencephalographic Measures of Putative Excitatory/Inhibitory Balance in Schizophrenia. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 5(6), 562–568. https://doi.org/10.1016/j.bpsc.2020.02.004

      Mukerji, A., Byrne, K. N., Yang, E., Levi, D. M., & Silver, M. A. (2022). Visual cortical γ−aminobutyric acid and perceptual suppression in amblyopia. Frontiers in Human Neuroscience, 16. https://doi.org/10.3389/fnhum.2022.949395

      Muthukumaraswamy, S. D., & Liley, D. T. (2018). 1/f electrophysiological spectra in resting and drug-induced states can be explained by the dynamics of multiple oscillatory relaxation processes. NeuroImage, 179, 582–595. https://doi.org/10.1016/j.neuroimage.2018.06.068

      Narayan, G. A., Hill, K. R., Wengler, K., He, X., Wang, J., Yang, J., Parsey, R. V., & DeLorenzo, C. (2022). Does the change in glutamate to GABA ratio correlate with change in depression severity? A randomized, double-blind clinical trial. Molecular Psychiatry, 27(9), 3833–3841. https://doi.org/10.1038/s41380-022-01730-4

      Nuijten, M. B., & Polanin, J. R. (2020). “statcheck”: Automatically detect statistical reporting inconsistencies to increase reproducibility of meta-analyses. Research Synthesis Methods, 11(5), 574–579. https://doi.org/10.1002/jrsm.1408

      Ossandón, J. P., Stange, L., Gudi-Mindermann, H., Rimmele, J. M., Sourav, S., Bottari, D., Kekunnaya, R., & Röder, B. (2023). The development of oscillatory and aperiodic resting state activity is linked to a sensitive period in humans. NeuroImage, 275, 120171. https://doi.org/10.1016/J.NEUROIMAGE.2023.120171

      Ostlund, B. D., Alperin, B. R., Drew, T., & Karalunas, S. L. (2021). Behavioral and cognitive correlates of the aperiodic (1/f-like) exponent of the EEG power spectrum in adolescents with and without ADHD. Developmental Cognitive Neuroscience, 48, 100931. https://doi.org/10.1016/j.dcn.2021.100931

      Pant, R., Ossandón, J., Stange, L., Shareef, I., Kekunnaya, R., & Röder, B. (2023). Stimulus-evoked and resting-state alpha oscillations show a linked dependence on patterned visual experience for development. NeuroImage: Clinical, 103375. https://doi.org/10.1016/J.NICL.2023.103375

      Perica, M. I., Calabro, F. J., Larsen, B., Foran, W., Yushmanov, V. E., Hetherington, H., Tervo-Clemmens, B., Moon, C.-H., & Luna, B. (2022). Development of frontal GABA and glutamate supports excitation/inhibition balance from adolescence into adulthood. Progress in Neurobiology, 219, 102370. https://doi.org/10.1016/j.pneurobio.2022.102370

      Pitchaimuthu, K., Wu, Q. Z., Carter, O., Nguyen, B. N., Ahn, S., Egan, G. F., & McKendrick, A. M. (2017). Occipital GABA levels in older adults and their relationship to visual perceptual suppression. Scientific Reports, 7(1). https://doi.org/10.1038/S41598-017-14577-5

      Rideaux, R., Ehrhardt, S. E., Wards, Y., Filmer, H. L., Jin, J., Deelchand, D. K., Marjańska, M., Mattingley, J. B., & Dux, P. E. (2022). On the relationship between GABA+ and glutamate across the brain. NeuroImage, 257, 119273. https://doi.org/10.1016/J.NEUROIMAGE.2022.119273

      Schaworonkow, N., & Voytek, B. (2021). Longitudinal changes in aperiodic and periodic activity in electrophysiological recordings in the first seven months of life. Developmental Cognitive Neuroscience, 47. https://doi.org/10.1016/j.dcn.2020.100895

      Schwenk, J. C. B., VanRullen, R., & Bremmer, F. (2020). Dynamics of Visual Perceptual Echoes Following Short-Term Visual Deprivation. Cerebral Cortex Communications, 1(1). https://doi.org/10.1093/TEXCOM/TGAA012

      Sengpiel, F., Jirmann, K.-U., Vorobyov, V., & Eysel, U. T. (2006). Strabismic Suppression Is Mediated by Inhibitory Interactions in the Primary Visual Cortex. Cerebral Cortex, 16(12), 1750–1758. https://doi.org/10.1093/cercor/bhj110

      Steel, A., Mikkelsen, M., Edden, R. A. E., & Robertson, C. E. (2020). Regional balance between glutamate+glutamine and GABA+ in the resting human brain. NeuroImage, 220. https://doi.org/10.1016/J.NEUROIMAGE.2020.117112

      Takado, Y., Takuwa, H., Sampei, K., Urushihata, T., Takahashi, M., Shimojo, M., Uchida, S., Nitta, N., Shibata, S., Nagashima, K., Ochi, Y., Ono, M., Maeda, J., Tomita, Y., Sahara, N., Near, J., Aoki, I., Shibata, K., & Higuchi, M. (2022). MRS-measured glutamate versus GABA reflects excitatory versus inhibitory neural activities in awake mice. Journal of Cerebral Blood Flow & Metabolism, 42(1), 197. https://doi.org/10.1177/0271678X211045449

      Takei, Y., Fujihara, K., Tagawa, M., Hironaga, N., Near, J., Kasagi, M., Takahashi, Y., Motegi, T., Suzuki, Y., Aoyama, Y., Sakurai, N., Yamaguchi, M., Tobimatsu, S., Ujita, K., Tsushima, Y., Narita, K., & Fukuda, M. (2016). The inhibition/excitation ratio related to task-induced oscillatory modulations during a working memory task: A multimodal-imaging study using MEG and MRS. NeuroImage, 128, 302–315. https://doi.org/10.1016/J.NEUROIMAGE.2015.12.057

      Tao, H. W., & Poo, M. M. (2005). Activity-dependent matching of excitatory and inhibitory inputs during refinement of visual receptive fields. Neuron, 45(6), 829–836. https://doi.org/10.1016/J.NEURON.2005.01.046

      Vanrullen, R., & MacDonald, J. S. P. (2012). Perceptual echoes at 10 Hz in the human brain. Current Biology. https://doi.org/10.1016/j.cub.2012.03.050

      Voytek, B., Kramer, M. A., Case, J., Lepage, K. Q., Tempesta, Z. R., Knight, R. T., & Gazzaley, A. (2015). Age-related changes in 1/f neural electrophysiological noise. Journal of Neuroscience, 35(38). https://doi.org/10.1523/JNEUROSCI.2332-14.2015

      Vreeswijk, C. V., & Sompolinsky, H. (1996). Chaos in neuronal networks with balanced excitatory and inhibitory activity. Science, 274(5293), 1724–1726. https://doi.org/10.1126/SCIENCE.274.5293.1724

      Waschke, L., Wöstmann, M., & Obleser, J. (2017). States and traits of neural irregularity in the age-varying human brain. Scientific Reports, 7(1), 1–12. https://doi.org/10.1038/s41598-017-17766-4

      Weaver, K. E., Richards, T. L., Saenz, M., Petropoulos, H., & Fine, I. (2013). Neurochemical changes within human early blind occipital cortex. Neuroscience. https://doi.org/10.1016/j.neuroscience.2013.08.004

      Wu, Y. K., Miehl, C., & Gjorgjieva, J. (2022). Regulation of circuit organization and function through inhibitory synaptic plasticity. Trends in Neurosciences, 45(12), 884–898. https://doi.org/10.1016/J.TINS.2022.10.006

    1. eLife assessment

      This study by Cuaya et al. reveals and characterizes two distinct forms of spike timing-dependent long-term depression (t-LTD) at the synapses between excitatory afferents from lateral (LPP) and medial (MPP) perforant pathways to granule cells (GC) of the dentate gyrus (DG) in mice. The findings are valuable for the field of synaptic physiology and are based on solid electrophysiological data. The study extends current knowledge by elucidating additional plasticity mechanisms at PP-GC synapses, complementing existing literature.

    2. Reviewer #1 (Public Review):

      Summary:

      The study characterized the cellular and molecular mechanisms of spike timing-dependent long-term depression (t-LTD) at the synapses between excitatory afferents from lateral (LPP) and medial (MPP) perforant pathways to granule cells (GC) of the dentate gyrus (DG) in mice.

      Strengths:

      The electrophysiological experiments are thorough. The experiments are systematically reported and support the conclusions drawn.
      This study extends current knowledge by elucidating additional plasticity mechanisms at PP-GC synapses, complementing existing literature.

      Weaknesses:

      To more conclusively define the pivotal role of astrocytes in modulating t-LTD at MPP and LPP GC synapses through SNARE protein-dependent glutamate release, as posited in this study, the authors could adopt additional methods, such as alternative mouse models designed to regulate SNARE-dependent exocytosis, as well as optogenetic or chemogenetic strategies for precise astrocyte manipulation during t-LTD induction. This would provide more direct evidence of the influence of astrocytic activity on synaptic plasticity.

    3. Reviewer #2 (Public Review):

      Summary:

      This work reports the existence of spike timing-dependent long-term depression (t-LTD) of excitatory synaptic strength at two synapses of the dentate gyrus granule cell, which are differently connected to the entorhinal cortex via either the lateral or medial perforant pathways (LPP or MPP, respectively). Using patch-clamp electrophysiological recording of t-LTD in combination with either pharmacology or a genetically modified mouse model, the authors provide information on the differences in the molecular mechanism underlying this t-LTD at the two synapses.

      Strengths:

      The two synapses analyzed in this study have been understudied. This new data thus provides interesting new information on a plasticity process at these synapses, and the authors demonstrate subtle differences in the underlying molecular mechanisms at play. Experiments are in general well controlled and provide robust data that are properly interpreted.

      Weaknesses:

      - Caution should be taken when extrapolating the results to the adult brain, as the data were obtained in P13-21 mice, a period during which synapses are still maturing and highly plastic.
      - In experiments where the drugs FK506 or thapsigargin are loaded intracellularly, the concentrations used are as high as for extracellular application. Could there be an error of interpretation when stating that the targeted actors are necessarily in the postsynaptic neuron? Is it not possible for the drug to diffuse out of the cell, given that it can evidently enter the cell when applied extracellularly?
      - The experiments implicating glutamate release from astrocytes in t-LTD would require additional controls to better support the conclusions made by the authors. As the data stand, it is not clear how the authors identified astrocytes to load BAPTA, or whether dnSNARE expression in astrocytes indirectly perturbs glutamate release in neurons.

      Significance:

      While this is the first report of t-LTD at these synapses, this plasticity process has been mechanistically well investigated at other synapses in the hippocampus and in the cortex. Nevertheless, this new data suggests that mechanistic differences in the induction of t-LTD at these two DG synapses could contribute to the differences in the physiological influence of the LPP and MPP pathways.

    4. Reviewer #3 (Public Review):

      Coatl et al. investigated the mechanisms of synaptic plasticity of two important hippocampal synapses, the excitatory afferents from lateral and medial perforant pathways (LPP and MPP, respectively) of the entorhinal cortex (EC) connecting to granule cells of the hippocampal dentate gyrus (DG). They find that these two different EC-DG synaptic connections in mice show a presynaptically expressed form of long-term depression (LTD) requiring postsynaptic calcium, eCB synthesis, CB1R activation, astrocyte activity, and metabotropic glutamate receptor activation. Interestingly, LTD at MPP-GC synapses requires ionotropic NMDAR activation, whereas LTD at LPP-GC synapses is NMDAR-independent. Thus, they discovered two novel forms of t-LTD that require astrocytes at EC-GC synapses. Although plasticity of EC-DG granule cell (GC) synapses has been studied using classical protocols, this is the first analysis of the synaptic plasticity induced by spike timing-dependent protocols at these synapses. Interestingly, the data also indicate that t-LTD at each type of synapse requires a different group I mGluR, with LPP-GC t-LTD dependent on mGluR5 and MPP-GC t-LTD requiring mGluR1.

      Using several complementary approaches (failure rate, PPRs, coefficient of variation (CV) of the EPSP slopes, and mEPSP frequency and amplitude analysis), the authors demonstrate a decrease in the probability of neurotransmitter release and a presynaptic locus for these two forms of LTD at both types of synapses. By using elegant electrophysiological experiments and taking advantage of the conditional dominant-negative (dn) SNARE mice, in which doxycycline administration blocks exocytosis and impairs vesicle release by astrocytes, they demonstrate that both LTD forms require the release of gliotransmitters from astrocytes. These data add in an interesting way to the ongoing discussion on whether LTD induced by STDP participates in refining synapses, potentially weakening excitatory synapses under the control of different astrocytic networks. The conclusions of this paper are mostly well supported by data, but some aspects of the results must be clarified and extended.

      (1) It should be clarified whether the present results were obtained with or without functional inhibitory synapse activation. It is not clear whether GABAergic synapses are blocked. If GABAergic synapses are not blocked, the authors must discuss whether the LTD of the EPSPs is due to a decrease in glutamatergic receptor activation or an increase in GABAergic receptor activation. Moreover, it is recommended to analyze not only the EPSPs but also the EPSCs to address whether the decrease in synaptic transmission is caused by a decrease in the input resistance or by a decrease in the space constant (lambda).

      (2) The authors show that thapsigargin loaded in the postsynaptic neuron prevents the induction of LTD at both synapses. Analyzing the effects of blocking postsynaptic IP3Rs (heparin in the patch pipette) and ryanodine receptors (ruthenium red in the patch pipette) is recommended for a deeper analysis of the mechanism implicated in the induction of these novel forms of LTD in the hippocampus.

      (3) The authors nicely demonstrate that CB1R activation is required in these forms of LTD by blocking CB1Rs with AM251; however, an interesting unanswered question is whether CB1R activation is sufficient to induce this synaptic plasticity. This reviewer suggests studying whether applying puffs of the CB1R agonist WIN 55,212-2 could induce these forms of LTD.

      (4) Finally, adding a final figure with a cartoon summarizing the proposed model of action in these novel forms of LTD would add value and make the manuscript easier to read, especially in those aspects related to the discussion of the results.

      Extending these results would improve the manuscript, which provides interesting results showing two novel forms of presynaptic t-LTD at brain synapses, with different mechanisms of action that are probably implicated in different aspects of information processing.

  2. www.researchsquare.com
    1. eLife assessment

      In this important study, the authors use a genetically engineered mouse model to reveal a tumor suppressive role for focal adhesion kinase in right-sided colon cancer. The evidence in support of the authors' claims is generally solid, although the data supporting the mechanism through which FAK deletion promotes tumorigenesis are incomplete. This work will be of interest to cancer researchers and others studying the biological consequences of tuning signal transduction pathways.

    2. Reviewer #1 (Public Review):

      Summary:

      Using a mouse model, supporting in vitro analyses, and analysis of clinical samples, the authors provide solid evidence that loss of Fak increases the development of BRAF V600E-induced dysplastic lesions and carcinomas in the cecum via downregulation of Egfr-mediated Erk phosphorylation. This fine-tuning of Erk phosphorylation increases Lgr4 mRNA expression and promotes Lgr4 protein stability through downregulation of the E3 ubiquitin ligase Nedd4. The high Lgr4 expression correlates with an increased intestinal stem cell transcriptional signature that the authors suggest drives higher rates of transformation. This provides important insight that factors such as FAK may be able to modulate MAPK-driven tumorigenesis in specific circumstances. The data presented here are largely specific to the cecum. While these specific findings may ultimately have practical implications for human CRC outside the cecum and even therapeutic implications, these remain unexplored and will be a point for future investigations.

      Strengths:

      The authors use a mouse model (intestinal specific BRAF V600E +/- Fak knockout) as well as supporting in vitro analyses and clinical sample characterization to support their model. For both in vitro and in vivo studies, the authors use a combination of genetic and pharmacologic (including EGFR, FAK, and MEK inhibitors) tools to modulate the MAPK pathway. They also use a combination of transcriptional (RNA-Seq) and protein (IHC and Western blotting) readouts to support their proposed model. Importantly, they use a distinct mouse model (mutant Kras) to demonstrate their findings with Fak loss are specific to instances where EGFR can modulate ERK activation, providing strong evidence for their model. Finally, they also correlate their findings in the murine model with patient samples and with trends in the TCGA database. Collectively, these create a solid and convincing basis for their proposed model.

      Weaknesses:

      (1) The murine data is largely confined to the cecum. While the analysis of the cecum is appropriate based on the cecum specificity of their phenotype, they often use these findings to make broader generalizations about the nature of tumorigenesis in the intestinal epithelia and in CRC more generally. In my opinion, there was insufficient evidence presented supporting the extension of the proposed model beyond the cecum. While this is a weakness, it could be part of a growing effort to characterize left and right-sided malignancies as related but separate disease processes.

      (2) The authors generally do a good job of focusing their analysis on the cecum and supporting their model. For example, Figure 5A examines different colon compartments, including the cecum. However, the authors fail to demonstrate that Fak loss only promotes Lgr4 upregulation in the cecum, where they observe an increase in BRAF V600E dysplasia and carcinoma. This is again seen in Figure 6A, where they only characterize Nedd4 expression in the cecum and not other compartments of the colon.

      (3) The authors evaluate a broad range of tissues, including normal colonic mucosa, polyps, pre-cancerous dysplastic lesions, adenocarcinomas, and adenocarcinoma cell lines. While this breadth is a strength of the paper, the authors, at times, equate experimental observations in each of these conditions, despite the difference in the biology of these tissues/cells. For example, in their mouse model, they equate the development of dysplastic lesions and carcinoma lesions. This makes it difficult to accurately interpret their data and conclusions.

      (4) The experiment in Figure 5i was only completed in one cell line (HT29), despite the conclusion that Lgr4 expression is increased by decreased ERK phosphorylation due to protein stabilization. HT29 cells are a transformed human CRC cell line, quite different from a pre-malignant cecal intestinal epithelial cell. While the result is convincing, the authors could have performed this key experiment in non-transformed murine cecal organoids (as they did for other experiments in Figure 5E), which would better recapitulate the mouse and pre-malignant setting to explain their mouse phenotype.

      (5) While a large portion of the discussion focusses on the therapeutic implications of these findings, the authors only really investigate tumorigenesis. They likely have additional investigations planned for future manuscripts.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Gao et al. describes a study identifying the role of FAK in fine-tuning the activation levels of ERK signaling in BRAF-V600E-driven colorectal cancer. The authors generated new mouse models combining Vill-Cre mediated BRAF-V600E expression with FAK deletion. Analyses of intestinal tumor phenotypes revealed that FAK loss promotes BRAF-V600E-induced tumor formation, specifically in the cecum. Interestingly, these tumors closely resemble human sessile serrated adenoma/polyps. Using bioinformatics analysis, the authors found that FAK deletion upregulates the intestinal stem cell and fetal-type transcriptomic signatures compared to mice expressing BRAF-V600E alone. In addition, FAK loss decreases the phosphorylation of ERK, whereas it increases the expression of Lgr4 at both mRNA and protein levels. To mechanistically connect FAK-mediated downregulation of ERK and upregulation of Lgr4 in the context of BRAF-V600E mutation, results from biochemical experiments showed that MEK inhibitor treatment decreases the expression of NEDD4, a previously identified ubiquitin E3 ligase of Lgr4, which coincides with increased Lgr4 protein expression both in cells and in vivo. Moreover, the FAK-dependent modulation of ERK signaling is specific to BRAF-V600E-driven tumorigenesis, as knockout of FAK has no effect in Vill-Cre/KRAS-G12D mice. Collectively, the authors proposed a "just right" model in which tunable FAK expression controls the optimal level of ERK pathway output needed for BRAF-V600E-induced cecal tumor formation.

      Strengths:

      This study provides new insights into the mechanisms underlying the serrated pathway-driven tumorigenesis in colorectal cancer. The newly established mouse model with compound mutations of BRAF and FAK offers a useful resource for future studies of the serrated pathway. The conclusions of this paper are mostly supported by data.

      Weaknesses:

      However, some aspects of the paper can be strengthened with additional mechanistically focused experiments.

      (1) Some of the conclusions of the paper rely mainly on bioinformatic analyses of RNA-seq data. For example, it has been noted in several places in the paper that the knockout of FAK in Vill-Cre/BRAF-V600E mice does not affect the transcriptional outcome downstream of ERK while ERK phosphorylation levels are decreased. This statement is based on the lack of significant difference in the MAPK signature according to GSEA. However, whereas a significant enrichment of certain pathways can be used as supporting evidence, the lack of enrichment does not necessarily indicate that those pathways are not involved. Additional experiments examining the expression of ERK target genes are needed to confirm this. Similarly, the upregulation of the fetal stem cell signature in FAK knockout mice needs to be verified using other methods besides GSEA.

      (2) According to Figure 5i, the half-life of Lgr4 is around 48 hours in HT29 cells. However, it has been reported by at least two other publications cited in this paper (Ref. 44 and 45) that the half-life of Lgr4 is much shorter. This discrepancy is not explained.

      (3) The effect of decreased ERK signaling on NEDD4 expression has only been briefly explored in Figure 6. The mechanisms by which FAK loss and/or inhibition of MEK/ERK activity regulate NEDD4 expression are currently unclear. Moreover, the levels of NEDD4 expression are only analyzed in one mouse per group in Figure 6a. Quantitative analysis of NEDD4 as well as Lgr4 expression in additional mice would provide more solid support for the inverse correlation between NEDD4 and Lgr4 proteins. In addition, since MEK inhibitor treatment also increases Lgr4 mRNA expression, as shown in Figure 5f-g, the relative contributions of this altered mRNA expression and of NEDD4L-mediated ubiquitination have not been investigated.

      (4) It is an interesting finding that knockout of FAK has no effect on KRAS-G12D-driven hyperplasia, as shown in Figure 7. However, additional studies are needed to further explore the potential mechanisms by which FAK loss specifically decreases EGFR/ERK signaling in the context of BRAF-V600E mutation.

    4. Reviewer #3 (Public Review):

      Summary:

      Right-sided colorectal cancer (CRC) is very different from left-sided CRC. Therefore, it is important to model this cancer in mice and find new molecular targets. A broad set of data exists on FAK (focal adhesion kinase) being important in colorectal cancer. However, this work has focussed on APC mutant CRC, which tends to be left-sided. BRAF mutation is common in right-sided CRC (and rarely occurs together with APC mutation). Therefore, the authors have tested whether FAK is important in this context. The authors show that FAK deletion surprisingly accelerates BRAF mutant CRC. Tumours arise in the proximal colon (which recapitulates BRAF mutant right-sided CRC). These tumours are low for Lgr5 and high for foetal programmes. Mechanistically, they suggest a pathway from FAK to NEDD4 to Lgr4 may underpin this phenotype.

      Strengths:

      Strong genetic data show that FAK deletion accelerates tumourigenesis; the mice now develop proximal colon tumours and can be viewed as a good model of right-sided CRC.

      The expression data between humans and mice is strong.

      Weaknesses:

      The functional mechanism of how FAK loss promotes tumourigenesis is still quite correlative. An alternative hypothesis is that it drives inflammation in the proximal colon that drives tumourigenesis.

      We still do not know the functional role of LGR4 (its loss leads to a loss of Paneth cells in homeostasis), so I'm not sure you can hypothesise a stem cell role.

    5. Author response:

      We thank the editor and reviewers for the time invested in our manuscript and their valuable and insightful critiques. However, we believe that the results justified our conclusions in the manuscript well; therefore, we have decided not to revise it.