  1. Apr 2024
    1. Reviewer #2 (Public Review):

      Summary:

      This is an interesting manuscript that describes a series of molecular dynamics studies on the peptide transporter PepT2 (SLC15A2). The authors examine, in particular, the effect on the transport cycle of protonation of various charged amino acids within the protein. They then validate their conclusions in cell-based transport assays by mutating two of the residues that they predict to be critical for transport. The study suggests a series of protonation steps that are necessary for transport to occur in PepT2. Comparison with bacterial proteins from the same family shows that while the overall architecture of the proteins and likely mechanism are similar, the residues involved in the mechanism may differ.

      Strengths:

      This is an interesting and rigorous study that uses various state-of-the-art molecular dynamics techniques to dissect the transport cycle of PepT2 with nearly 1 ms of sampling. It gives insight into the transport mechanism, investigating how the protonation of selected residues can alter the energetic barriers between various states of the transport cycle. The authors have, in general, been very careful in their interpretation of the data.

      Weaknesses:

      Interestingly, the authors suggest that there is an additional protonation event that may take place as the protein goes from occluded to inward-facing, but they have not identified the residue involved. Some things are a little unclear. For instance, where does the state that they have defined as occluded sit on the diagram in Figure 1a? Is it truly the occluded state as shown on the diagram, or does it tend toward inward- or outward-facing? The pKa calculations and their interpretation are also somewhat unclear. Firstly, it is unclear whether all the data are used in calculating the histograms or only selected data and, if the latter, on what basis the selection was made. Secondly, they dismiss the pKa of E53 in the outward-facing form as unaffected by peptide binding, yet conclude that the pKa of E56 is affected, even though the histograms appear to show a similar change in profile for both residues.

    2. Reviewer #3 (Public Review):

      Summary:

      Lichtinger et al. have used an extensive set of molecular dynamics (MD) simulations to study the conformational dynamics and transport cycle of an important member of the proton-coupled oligopeptide transporters (POTs), namely SLC15A2 or PepT2. This protein is one of the best-studied mammalian POT transporters and provides a good model, with enough insight and structural information to be studied computationally using the advanced enhanced sampling methods employed in this work. The authors have used microsecond-level MD simulations, constant-pH MD, and alchemical binding free energy calculations along with cell-based transport assay measurements; however, the most important part of this work is the use of enhanced sampling techniques to study the conformational dynamics of PepT2 under different conditions.

      The study attempts to identify links between conformational dynamics and chemical events such as proton binding, ligand-protein interactions, and intramolecular interactions. The ultimate goal is of course to understand the proton-coupled peptide and drug transport by PepT2 and homologous transporters in the solute carrier family.

      Some of the key results include:

      (1) Protonation of H87 and D342 initiates the occluded (Occ) to outward-facing (OF) state transition.

      (2) In the OF state, through engaging R57, substrate entry increases the pKa value of E56 and thermodynamically facilitates the movement of protons further down.

      (3) E622 is not only essential for peptide recognition; its protonation also facilitates substrate release and contributes to intracellular gate opening. In addition, cell-based transport assays show that mutation of residues such as H87 and D342 significantly decreases transport activity, as expected from the simulations.

      Strengths:

      (1) This is an extensive MD-based study of PepT2, which is beyond the typical MD studies both in terms of the sheer volume of simulations as well as the advanced methodology used. The authors have not limited themselves to one approach and have appropriately combined equilibrium MD with alchemical free energy calculations, constant-pH MD, and geometry-based free energy calculations. Each of these 4 methods provides a unique insight regarding the transport mechanism of PepT2.

      (2) The authors have not limited themselves to computational work and have performed experiments as well. The cell-based transport assays clearly establish the importance of the residues that have been identified as significant contributors to the transport mechanism using simulations.

      (3) The conclusions made based on the simulations are mostly convincing and provide useful information regarding the proton pathway and the role of important residues in proton binding, protein-ligand interaction, and conformational changes.

      Weaknesses:

      (1) Some of the statements made in the manuscript are not convincing and do not abide by the standards that are mostly followed in the manuscript. For instance, on page 4, it is stated that "the K64-D317 interaction is formed in only ≈ 70% of MD frames and therefore is unlikely to contribute much to extracellular gate stability." I do not agree that 70% is negligible. Particularly, Figure S3 does not include the time series so it is not clear whether the 30% of the time where the salt bridge is broken is in the beginning or the end of simulations. For instance, it is likely that the salt bridge is not initially present and then it forms very strongly. Of course, this is just one possible scenario but the point is that Figure S3 does not rule out the possibility of a significant role for the K64-D317 salt bridge.

      (2) Similarly, on page 4, it is stated that "whether by protonation or mutation - the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed (Figure S5)." I do not agree with this assessment. The authors need to be aware of the limitations of this approach. Consider "WT H87-prot" and "D342A H87-prot": when the D342 residue is mutated, the gate opens within 1 µs in one out of 3 simulations. When the D342 residue is not mutated, the gate does not open in any of the 3 simulations within 1 µs. It is quite likely that with 10 simulations rather than 3, or with 10 µs rather than 1 µs per simulation, the 0/3 versus 1/3 outcome would change significantly. I do not find this argument and conclusion compelling at all.

      (3) While the MEMENTO methodology is novel and interesting, the method is presented as flawless in the manuscript, which is not true at all. It is stated on page 5 with regard to the path generated by MEMENTO that "These paths are then by definition non-hysteretic." I think it is too strong a claim to say that the paths generated by MEMENTO are non-hysteretic by definition. This claim is not even made in the original MEMENTO paper. What is stated there is that linear interpolation generates a hysteresis-free path by definition. There are two important problems here: (a) MEMENTO uses linear interpolation as an initial step but later modifies the intermediates significantly, so they are no longer linearly interpolated structures and thus the path is no longer hysteresis-free; (b) a more serious problem is the attribution of by-definition hysteresis-free features to the linearly interpolated states. This is based on conflating the concepts of "hysteresis-free" and "unique". Hysteresis in MD-based enhanced sampling is related to the presence of barriers in orthogonal space. For instance, one may use a non-linear interpolation of any type and get a unique pathway, which could be substantially different from the one coming from linear interpolation. None of these paths will necessarily be hysteresis-free once subjected to MD-based enhanced sampling techniques.
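
      To make the sampling concern in point (2) concrete, the short Python sketch below applies a two-sided Fisher exact test to the replica counts quoted above (gate opening in 1 of 3 "D342A H87-prot" runs versus 0 of 3 "WT H87-prot" runs). Treating each 1 µs replica as an independent binary trial is an assumption made here for illustration only, and the function name `fisher_exact_two_sided` is hypothetical; this is not an analysis taken from the manuscript.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: D342A H87-prot, WT H87-prot. Columns: gate opened, gate stayed closed.
p_value = fisher_exact_two_sided(1, 2, 0, 3)
print(f"two-sided p = {p_value:.2f}")
```

      Under these assumptions the test gives p = 1.00, so the two conditions cannot be distinguished statistically with three replicas each, which is in line with the concern raised in point (2).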

    1. Reviewer #2 (Public Review):

      Summary:

      In this article, Kumar et al. report on a previously unappreciated mechanism of translational regulation whereby p130Cas induces LLPS condensates that then traffic out from focal adhesions into the cytoplasm to modulate mRNA translation. Specifically, the authors employed EGFP-tagged p130Cas constructs, endogenous p130Cas, and p130Cas knockouts and mutants in cell-based systems. These experiments, in conjunction with various imaging techniques, revealed that p130Cas drives assembly of LLPS condensates in a manner that is largely independent of tyrosine phosphorylation. This was followed by in vitro EGFP-tagged p130Cas-dependent induction of LLPS condensates and determination of their composition by mass spectrometry, which revealed enrichment of proteins involved in RNA metabolism in the condensates. The authors excluded the possibility that p130Cas-containing condensates co-localize with stress granules or P-bodies. Next, the authors determined the mRNA compendium of p130Cas-containing condensates, which revealed that they are enriched in transcripts encoding proteins implicated in cell cycle progression, survival, and cell-cell communication. These findings were followed by a puromycylation assay demonstrating that p130Cas-containing condensates may be implicated in the suppression of protein synthesis. Altogether, it was found that this study significantly advances the understanding of the molecular underpinnings of the role of p130Cas, and more broadly of focal adhesions, in cellular function. To this end, it is likely that this report will be of interest to a broad range of scientists from a wide spectrum of biomedical disciplines, including cell, molecular, developmental, and cancer biologists.

      Strengths:

      Altogether, this study was found to be of potentially broad interest inasmuch as it delineates a hitherto unappreciated link between p130Cas, LLPS, and regulation of mRNA translation. More broadly, this report provides unique molecular insights into the previously unappreciated mechanisms of the role of focal adhesions in regulating protein synthesis. Overall, it was thought that the provided data sufficiently supported most of the authors' conclusions. It was also thought that this study incorporates an appropriate balance of imaging, cell and molecular biology, and biochemical techniques, whereby the methodology was found to be largely appropriate.

      Weaknesses:

      Two major weaknesses of the study were noted. The first issue is related to the experiments establishing the role of p130Cas-driven condensates in translational suppression, whereby it remained unclear whether these effects act on global mRNA translation or are specific to the mRNAs contained in the condensates. Moreover, some of the results in this section (e.g., experiments using cycloheximide) may be open to alternative interpretation. The second issue is the apparent lack of functional studies: although the authors speculate that the described mechanism is likely to mediate the effects of focal adhesions on, e.g., quiescence, experimental testing of this tenet was lacking.

    2. eLife assessment

      In this valuable study, Kumar et al. provide evidence suggesting that p130Cas drives the formation of condensates that sprout from focal adhesions into the cytoplasm and suppress translation. Pending further substantiation, this study was found to be likely to provide previously unappreciated insights into the mechanisms linking focal adhesions to the regulation of protein synthesis and was thus considered to be of broad general interest. However, the evidence supporting the proposed model was incomplete; additional evidence is warranted to substantiate the relationship between p130Cas condensates and mRNA translation and establish corresponding functional consequences.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors demonstrated the phenomenon of p130Cas, a protein primarily localized at focal adhesions, and its formation of condensates. They identified the constituents within the condensates, which include other focal adhesion proteins, paxillin, and RNAs. Furthermore, they proposed a link between p130Cas condensates and translation.

      Strengths:

      Adhesion components undergo rapid exchange with the cytoplasm for some unclear biological functions. Given that p130Cas is recognized as a prominent mechanical focal adhesion component, investigating its role in condensate formation, particularly its impact on the translation process, is intriguing and significant.

      Weaknesses:

      The authors identified the disordered region of p130Cas and investigated the formation of p130Cas condensate. They attempted to demonstrate that p130Cas condensates inhibit translation, but the results did not fully support this assertion. There are several comments below:

      (1) Despite isolating p130Cas-GFP protein using GFP-trap beads, the authors cannot conclusively eliminate the possibility of isolating p130Cas from focal adhesions. While the characterization of the GFP-tagged pull-downs can reveal the proteins and RNAs associated with p130Cas, the authors need to clarify the intramolecular mechanism of their localization within p130Cas droplets. Whether the protein condensates retain their liquid phase or these GFP-p130Cas pull-downs represent protein aggregates remains uncertain.

      (2) The authors utilized hexanediol and ammonium acetate to highlight the phenomenon of p130Cas condensates. Although hexanediol is an inhibitor of hydrophobic interactions and ammonium acetate is a salt, a more thorough explanation of the intramolecular mechanisms underlying p130Cas protein-protein interaction is required. Additionally, given that the size of p130Cas condensates can exceed 100 µm², a classification is needed to differentiate between p130Cas condensates and protein aggregation.

      (3) The connection between p130Cas condensates and translation inhibition appears tenuous. The data only suggest a correlation between p130Cas expression and translation inhibition. Further evidence is required to bolster this hypothesis.

    1. eLife assessment

      The manuscript presents a useful model for the field of endosome maturation, providing perspective on the role of the deubiquitinating enzyme USP-50/USP8 in the process. The evidence presented in the paper is clear, incorporating well-designed experiments that suggest the dual actions of USP-50 and USP8 in the conversion of early endosomes into late endosomes. Overall, the work is solid and centers on an intriguing subject.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript focuses on the role of the deubiquitinating enzyme USP-50/USP8 in endosome maturation. The authors aimed to clarify how this enzyme drives the conversion of early endosomes into late endosomes. Overall, they achieved their aim of shedding light on the mechanisms by which USP-50/USP8 regulates endosome maturation. The results support their conclusions that USP-50 acts by dissociating RABX-5 from early endosomes to deactivate RAB-5 and by recruiting SAND-1/Mon1 to activate RAB-7. This work is commendable and will have a significant impact on the field. The methods and data presented here will be useful to the community in advancing our understanding of endosome maturation and identifying potential therapeutic targets for diseases related to endosomal dysfunction. It is worth noting that further investigation is required to fully understand the complexities of endosome maturation. However, the findings presented in this manuscript provide a solid foundation for future studies.

      Strengths:

      The major strengths of this work lie in the well-designed experiments used to examine the effects of USP-50 loss. The authors employed confocal imaging to obtain a picture of the consequences of USP-50 loss. Their findings indicated enlarged early endosomes and MVB-like structures in cells deficient in USP-50/USP8.

      Weaknesses:

      There is a need for further investigation to accurately characterize the anomalous structures detected in the usp-50 mutant. Also, the correlation between the presence of these abnormal structures and ESCRT-0 is yet to be addressed, and the current working model needs to be revised to prevent any confusion between enlarged early endosomes and MVBs.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, the authors examine how the deubiquitinase USP8 regulates endosome maturation in C. elegans and mammalian cells. The authors have isolated USP8 mutant alleles in C. elegans and used multiple in vivo reporter lines to demonstrate the impact of USP8 loss-of-function on endosome morphology and maturation. They show that in USP8 mutant cells, the early endosomes and MVB-like structures are enlarged while the late endosomes and lysosomal compartments are reduced. They elucidate that USP8 interacts with Rabx5, a guanine nucleotide exchange factor (GEF) for Rab5, and show that USP8 likely targets a specific lysine residue of Rabx5 to dissociate it from early endosomes. They also find that the localization of USP8 to early endosomes is disrupted in Rabx5 mutant cells. They observe that in both Rabx5 and USP8 mutant cells, the Rab7 GEF SAND-1 puncta, which likely represent late endosomes, are diminished, although Rabex5 is accumulated in USP8 mutant cells. The authors provide evidence that USP8 regulates endosomal maturation in a similar fashion in mammalian cells. Based on their observations, they propose that USP8 dissociates Rabex5 from early endosomes and enhances the recruitment of SAND-1 to promote endosome maturation.

      Strengths:

      The major highlights of this study include the direct visualization of endosome dynamics in a living multi-cellular organism, C. elegans. The high-quality images provide clear in vivo evidence to support the main conclusions. The authors have generated valuable resources to study mechanisms involved in endosome dynamics regulation in both the worm and mammalian cells, which would benefit many members of the cell biology community. The work identifies a fascinating link between USP8 and the Rab5 guanine nucleotide exchange factor Rabx5, which expands the targets and modes of action of USP8. The findings make a solid contribution toward the understanding of how endosomal trafficking is controlled.

      Weaknesses:

      - The authors utilized multiple fluorescent protein reporters, including those generated by themselves, to label endosomal vesicles. Although these are routine and powerful tools for studying endosomal trafficking, these results cannot tell whether the endogenous proteins (Rab5, Rabex5, Rab7, etc.) are affected in the same fashion.

      - The authors clearly demonstrated a link between USP8 and Rabx5, and they showed that cells deficient in both factors displayed similar defects in late endosomes/lysosomes. However, the authors didn't confirm whether and/or to what extent USP8 regulates endosome maturation through Rabx5. Additional genetic and molecular evidence might be required to better support their working model.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors were trying to elucidate the role of USP8 in the endocytic pathway. Using C. elegans epithelial cells as a model, they observed that when USP8 function is lost, the cells have a decreased number and size of lysosomes. Since USP8 was already known to be a protein linked to ESCRT components, they looked into what role USP8 might play in connecting lysosomes and multivesicular bodies (MVB). They observed fewer ESCRT-associated vesicles but an increased number of "abnormal" enlarged vesicles when USP8 function was lost. At this specific point, it's not clear what the objective of the authors was. What would have been their hypothesis addressing whether the reduced lysosomal structures in USP8 (-) animals were linked to MVB formation? Then they observed that the abnormally enlarged vesicles, marked by the PI3P biosensor YFP-2xFYVE, are bigger but in the same number in USP8 (-) compared to wild-type animals, suggesting homotypic fusion. They confirmed this result by knocking down USP8 in a human cell line, and they observed enlarged vesicles marked by YFP-2xFYVE as well. At this point, there is quite an important issue. The use of YFP-2xFYVE to detect early endosomes requires the transfection of the cells, which has already been demonstrated to produce differences in the distribution, number, and size of PI3P-positive vesicles (doi.org/10.1080/15548627.2017.1341465). The enlarged vesicles marked by YFP-2xFYVE would not necessarily be due to the loss of USP8. In any case, it appears relatively clear that USP8 localizes to early endosomes, and the authors claim that this localization is mediated by Rabex-5 (or Rabx-5). They finally propose that USP8 dissociates Rabx-5 from early endosomes, facilitating endosome maturation.

      Weaknesses:

      The weaknesses of this study are, on one side, that the results are almost exclusively dependent on the overexpression of fusion proteins. While useful in the field, this strategy does not represent the optimal way to dissect a cell biology issue. On the other side, the way the authors construct the rationale for each approach is somewhat difficult to follow. Finally, the use of two models, C. elegans and a mammalian cell line, which would strengthen the observations, contributes to the difficulty in reading the manuscript.

      The findings are useful but do not clearly support the idea that USP8 mediates Rab5-Rab7 exchange and endosome maturation. In contrast, they appear to be incomplete and open new questions regarding the complexity of this process and the precise role of USP8 within it.

    1. eLife assessment

      This important study reports the formation of a new organelle, called giant unilocular vacuole (GUVac), in mammary epithelial cells through a macropinocytosis-like process. The evidence supporting conclusions is solid, using state-of-the-art cell biology techniques. This work will be of interest to cell biologists and contribute to the understanding of cell survival mechanisms against anoikis.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors found that the loss of cell-ECM adhesion leads to the formation of giant unilocular vacuoles in mammary epithelial cells. This occurs through a macropinocytosis-like process and involves PI3 kinase. They further identified dynamin and septin as essential machinery for this process. Interestingly, this process is reversible and appears to protect cells from cell death.

      Strengths:

      The data are clean and convincing to support the conclusions. The analysis is comprehensive, using multiple approaches such as SIM and TEM. The discussion on lactation is plausible and interesting.

      Weaknesses:

      As the first paper describing this phenomenon, it is adequate. However, the elucidation of the molecular mechanisms is not as exciting as it does not describe anything new. It is hoped that novel mechanisms will be elucidated in the future. In particular, the molecules involved in the reversing process could be quite interesting. Additionally, the relationship to conventional endocytic compartments, such as early and late endosomes, is not analyzed.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript "Formation of a giant unilocular vacuole via macropinocytosis-like process confers anoikis resistance" describes an interesting observation and provides initial steps towards understanding the underlying molecular mechanism.

      The manuscript describes that the majority of non-tumorigenic mammary gland epithelial cells (MCF-10A) in suspension initiate entosis. A smaller fraction of cells form a single giant unilocular vacuole (hereafter referred to as a GUVac). GUVac appeared to be empty and did not contain invading (entotic) cells. The formation of GUVac could be promoted by disrupting actin polymerisation with LatB and CytoD. The formation of GUVacs correlated with resistance to anoikis. GUVac formation was detected in several other epithelial cells from secretory tissues.

      The authors then use electron microscopy and super-resolution imaging to describe the biogenesis of GUVac. They find that GUVac formation is initiated by a macropinocytosis-like phenomenon (that is independent of actin polymerisation). This process leads to the formation of large plasma membrane invaginations that pinch off from the PM to form larger vesicles, which fuse with each other into GUVacs.

      Inhibition of actin polymerisation in suspended MCF-10A cells leads to the recruitment of Septin 6 to the PM via its amphipathic helix. Treatment with FCF (a septin polymerisation inhibitor) blocked GUVac biogenesis, as did pharmacological inhibition of dynamin-mediated membrane fission. The fusion of these vesicles into GUVacs required (perhaps not surprisingly) PI3P.

      Strengths:

      The authors have made an interesting and potentially important observation. They describe the formation of an endo-lysosomal organelle (a giant unilocular vacuole - GUVac) in suspended epithelial cells and correlate the formation of GUVacs with resistance to anoikis.

      Weaknesses:

      My major concern is the experimental strategy that is used throughout the paper to induce and study the formation of GUVacs. Almost every experiment is conducted in suspended cells that were treated with actin depolymerising drugs (e.g. LatB) and thus almost all key conclusions are based on the results of these experiments. I only have a few suggestions that would improve these experiments or change their outcome and interpretation.

      Yet, I believe it is essential to identify the endogenous pathway leading to the actin depolymerisation that drives the formation of GUVacs in detached epithelial cells (or alternatively to figure out how it is suppressed in most detached cells). A first step in that direction would be to investigate the polymerization status of actin in MCF-10A cells that 'spontaneously' form GUVacs and to test if these cells also become resistant to anoikis.

      Also, it would be great (and I believe reasonably easy) to better characterise molecular markers of GUVacs (LAMPs, Rabs, cathepsins, etc.) to discriminate them from other endosomal organelles.

    4. Reviewer #3 (Public Review):

      Summary:

      Loss of cell attachment to extracellular matrix (ECM) triggers anoikis (a type of programmed cell death), and resistance to anoikis plays a role in cancer development. However, mechanisms underlying anoikis resistance, and the precise role of F-actin, are not fully known.

      Here the authors describe the formation of a new organelle, giant unilocular vacuole (GUVac), in cells whose F-actin is disrupted during loss of matrix attachment. GUVac formation (diameter >500 nm) resulted from a previously unrecognised macropinocytosis-like process, characterized by inwardly curved micron-sized plasma membrane invaginations, dependent on F-actin depolymerization, septin recruitment, and PI(3)P. Finally, the authors show GUVac formation after loss of matrix attachment promotes resistance to anoikis.

      From these results, the authors conclude that GUVac formation promotes cell survival in environments where F-actin is disrupted and conditions of cell stress.

      Strengths:

      The manuscript is clear and well-written, and the figures are all presented at a very high level.

      A variety of cutting-edge cell biology techniques (eg time-lapse imaging, EM, super-resolution microscopy) are used to study the role of the cytoskeleton in GUVac formation. It is discovered that: (i) a macropinocytosis-like process dependent on F-actin depolymerisation, SEPT6 recruitment, and PI(3)P contributes to GUVac formation, and (ii) GUVac formation is associated with resistance to cell death.

      Weaknesses:

      The manuscript is highly reliant on the use of drugs, or combinations of drugs, for long periods of time (6 hr, 18 hr, etc.). Wherever possible, the authors should also test conclusions drawn from experiments involving drugs using other canonical cell biology approaches (e.g. siRNA, CRISPR). Although suggestive as a first approach, it is not reliable to draw conclusions from experiments where only drug combinations are being advanced (e.g. LatB + FCF).

      F-actin is well known to play a wide variety of roles in cell death and other canonical cell death pathways (PMID: 26292640). The authors show using pharmacological inhibition that F-actin is key for GUVac formation. However, especially when testing for physiological relevance, how can these other roles for F-actin be ruled out?

      To test the role of septins in GUVac formation, only recruitment studies are performed, with no direct functional work. The drug forchlorfenuron (FCF) is used, but this is well known to have off-target effects (PMID: 27473917).

      Cells that possess GUVac are resistant to anoikis, but how are these cells resistant? This report is focused on mechanisms underlying GUVac formation and does not directly test for mechanisms underlying anoikis resistance.

    1. Reviewer #1 (Public Review):

      The paper 'Structural Analysis of the Dynamic Ribosome-Translocon Complex,' authored by Lewis et al., meticulously explores various conformations and states of the ribosome-translocon complex. Employing advanced techniques such as cryoEM structural determination and AlphaFold modeling, the study delves into the dynamic nature of the ribosome-translocon complex. The findings from these analyses unveil crucial insights, significantly advancing our understanding of the co-translational translocation process in cellular mechanisms.

      To begin with, the authors employed a construct comprising the first two transmembrane domains of rhodopsin as a model for studying protein translocation. They conducted in vitro translation, followed by the purification of the ribosome-translocon complex, and determined its cryoEM structures. An in-depth analysis of their ribosome-translocon complex structure revealed that the nascent chain can pass through the lateral gate of translocon Sec61, akin to the behavior of a signal peptide. Additionally, Sec61 was found to interact with 28S rRNA helix 24 and the ribosomal protein uL24. In summary, their structural model aligns with the through-pore model of insertion, contradicting the sliding model.

      Secondly, the authors successfully identified RAMP4 in their ribosome-translocon complex structure. Notably, the transmembrane domain of RAMP4 mimics the binding of a signal peptide at the lateral gate of Sec61, albeit without unplugging. Intriguingly, RAMP4 is exclusively present in the non-multipass translocon ribosome-translocon complex, not in those containing multipass translocon. This observation suggests that co-translational translocation specifically occurs in the Sec61 channel that includes bound RAMP4. Additionally, the authors discovered an interaction between the C-tail of ribosomal protein uL22 and the translocon Sec61, providing valuable insights into the nascent chain's behavior.

      Moving on to the third point, the focused classification unveiled TRAP complex interactions with various components. The authors propose that the extra density observed in their novel ribosome-translocon complex can be attributed to calnexin, a major binder of TRAP according to previous studies. Furthermore, the new structure reveals a TRAP-OSTA interaction. This newly identified TRAP-OSTA interaction offers a potential explanation for why patients with TRAP delta defects exhibit congenital disorders of glycosylation.

      In conclusion, this paper presents a robust contribution to the field with its thorough structural and modeling analyses. The significance of the findings is evident, providing valuable insights into the intricate mechanisms of protein co-translational translocation. The well-crafted writing, meticulous analyses, and clear figures collectively contribute to the overall strength of the paper.

      Major points:

      (1) The identification of RAMP4 is a pivotal discovery in this paper. The sophisticated AlphaFold prediction, de novo model building of RAMP4's RBD domain, and sequence analyses provide strong evidence supporting the inclusion of RAMP4 in the ribosome-translocon complex structure.

      However, it is crucial to ensure the presence of RAMP4 in the purified sample. Particularly, a validation step such as western blotting for RAMP4 in the purified samples would strengthen the assertion that the ribosome-translocon complex indeed contains RAMP4. This is especially important given the purification steps involving stringent membrane solubilization and affinity column pull-down.

      (2) Despite the comprehensive analyses conducted by the authors, it is challenging to accept the assertion that the extra density observed in TRAP class 1 corresponds to calnexin. The additional density in TRAP class 1 appears to be less well-resolved, and the evidence for assigning it as calnexin is insufficient. The extra density could be any protein that binds to TRAP. It is recommended that the authors examine the density on the ER lumen side. An investigation into whether calnexin's N-globular domain and P-domain are present in the ER lumen in TRAP class 1 would provide a clearer understanding.

      (3) In the section titled 'TRAP competes and cooperates with different translocon subunits,' the authors present a compelling explanation for why TRAP delta defects can lead to congenital disorders of glycosylation. To enhance this explanation, it would be valuable if the authors could provide additional analyses based on mutations mentioned in the references. Specifically, examining whether these mutations align with the TRAP delta-OSTA structure models would strengthen the link between TRAP delta defects and the observed congenital disorders of glycosylation.

    2. eLife assessment

      This fundamental study offers new structural insights into the form and functions of the ribosome-translocon complex. Through a combination of in vitro translation, cryoEM imaging, and comprehensive AlphaFold comparative modeling, the authors offer convincing support for the lateral gate model of co-translational ER protein biogenesis, including the location of RAMP4 near the Sec61 lateral gate and the plausible role of helix 59 of the 28S ribosomal RNA as a determinant of the positive-inside rule. While the reviewers identified minor limitations, such as the need to validate RAMP4 presence with orthogonal measures, these results will be broadly impactful.

    3. Reviewer #2 (Public Review):

      Summary:

      In the manuscript 'Structural analysis of the dynamic ribosome-translocon complex' Lewis and Hegde present a structural study of the ribosome-bound multipass translocon (MPT) based on re-analysis of cryo-EM single particle data of ribosome-MPTs processing the multipass transmembrane substrate RhoTM2 from a previous publication (Smalinskaité et al, Nature 2022) and AlphaFold2 multimer modeling. Detailed analysis of the laterally open Sec61 is obtained from PAT-less particles.

      The following major claims are made:

      - TMs can bind similarly to the Sec61 lateral gate as signal peptides.

      - Ribosomal H59 is in immediate proximity to basic residues of TMs and signal peptides, suggesting it may contribute to the positive-inside rule.

      - RAMP4/SERP1 binds to the Sec61 lateral gate and the ribosome near 28S rRNA's helices 47, 57, and 59 as well as eL19, eL22, and eL31.

      - uL22 C-terminal tail binds H24/47 blocking a potential escape route for nascent peptides to the cytosol.

      - TRAP and BOS compete for binding to Sec61 hinge.

      - Calnexin TM binds to TRAPg.

      - NOMO wedges between TRAP and MPT.

      Strengths:

      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.

      Weaknesses:

      A minor downside of the manuscript is the sheer volume of analyses and mechanistic hypotheses, which makes it sometimes difficult to follow. The authors might consider offloading some analyses based on weaker evidence to the supplement to maximize impact.

    1. eLife assessment

      This study provides a useful strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) with serum derived from mCSCC-exposed mice. The exploration of serum-derived antibodies as a potential therapy for curing cancer is particularly promising, but the study provides inadequate evidence for specific effects of mCSCC-binding serum antibodies. This study will be of interest to scientists seeking a novel immunotherapeutic strategy in cancer therapy.

    2. Joint Public Review:

      Summary:

      This study presents an immunotherapeutic strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) using serum from mice inoculated with mCSCC. The author hypothesizes that antibodies in the generated serum could aid the immune system in tumor volume reduction. The study results showed a reduction in tumor volume and altered expression of several cancer markers (p53, Bcl-xL, NF-κB, Bax) suggesting the potential effectiveness of this approach.

      Strengths:

      The approach shows a potential effect in preventing tumor progression, judged by both tumor size and cancer biomarker expression levels, bringing attention to the potential role of antibodies and B cell responses in cancer therapy.

      Weaknesses:

      These are some of the specific things that the author could consider to strengthen the evidence supporting the claims in their study.

      (1) The study fails to provide evidence of the specific effect of mCSCC-antibodies on mCSCC. The study utilized serum, which also contains many immune response factors, such as cytokines, that could contribute to tumor reduction. There is no information on serum centrifugation conditions, which makes it unclear whether immune components like antigen-specific T cells, activated NK cells, or other immune cells were removed from the serum. The study does not provide evidence of neutralizing antibodies through isolation, analysis of B cell responses, or efficacy testing against specific cancer epitopes. To affirm the specific antibodies' role in the observed immune response, isolating antibodies rather than employing whole serum could provide more conclusive evidence. Purifying the serum to isolate mCSCC-binding antibodies, for example by protein A purification, and quantifying them by ELISA would have been more useful for measuring the immune response. It would be interesting to investigate the types of epitopes targeted following direct tumor cell injection. A more thorough characterization of the antibodies, including B cell isolation and/or hybridoma techniques, would strengthen the claim.

      (2) In the study design, the control group does not account for the potential immunostimulatory effects of serum injection itself. A better control would be tumor-bearing mice receiving serum from healthy non-mCSCC-exposed mice. Additionally, employing a completely random process for allocating the treatment groups would be preferable. Also, the study does not explain why intravenous injection of tumor cells would produce superior antibodies compared to those naturally generated in mCSCC-bearing mice.

      (3) In Figure 2B, it would be more helpful if the author could provide raw data/figures of the tumor rather than just the bar graph. Similarly, in Figure 3, the author should show individual data points in addition to the error bars to visualize the actual distribution.

      (4) The author mentioned that different stages of tumor cells have different surface biomarkers. Therefore, experimenting with injecting tumor cells at various stages could reveal the most immunogenic stage. Such an approach would allow for a comparative analysis of immune responses elicited by tumor cells at different stages of development.

      (5) In the abstract, the author mentioned that using mCSCC is a proof-of-concept for this potential cancer treatment strategy. The discussion section should extend to how this strategy might apply to other cancer types beyond carcinoma.

    1. Reviewer #2 (Public Review):

      Summary:

      This combined experimental-theoretical paper introduces a novel two-domain statistical thermodynamic model (primarily Equation 1) to study allostery in generic systems but focusing here on the tetracycline repressor (TetR) family of transcription factors. This model, building on a function-centric approach, accurately captures induction data, maps mutants with precision, and reveals insights into epistasis between mutations.

      Strengths:

      The study contributes innovative modeling, successful data fitting, and valuable insights into the interconnectivity of allosteric networks, establishing a flexible and detailed framework for investigating TetR allostery. The manuscript is generally well-structured and communicates key findings effectively.

      Comments on revised version:

      I am happy with the changes made by the authors.

    2. eLife assessment

      The study presents valuable findings in which a two-domain thermodynamic model for TetR accurately predicts in vivo phenotype changes brought about by various mutations. The evidence provided is compelling and follows the authors' first innovative observations with a computational model that captures the structural behavior much better than the current single-domain models.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors' earlier deep mutational scanning work observed that allosteric mutations in TetR (the tetracycline repressor) and its homologous transcription factors are distributed across the structure instead of along the presumed allosteric pathways as commonly expected. In addition, the loss of allosteric communication caused by those mutations was rescued by additional distributed mutations. Now the authors develop a two-domain thermodynamic model for TetR that explains these compelling data. The model is consistent with the in vivo phenotypes of the mutants with changes in parameters, which permits quantification. Taken together, their work connects intra- and inter-domain allosteric regulation that correlates with structural features. This leads the authors to suggest broader applicability to other multidomain allosteric proteins.

      Here the authors follow their first innovative observations with a computational model that captures the structural behavior, aiming to make it broadly applicable to multidomain proteins. Altogether, an innovative and potentially useful contribution.

      Weaknesses:

      None that I see, except that I hope that in the future, if possible, the authors would follow with additional proteins to further substantiate the model and show its broad applicability. I realize, however, the extensive work that this would entail.

    1. eLife assessment

      This work describes a novel affinity interactomics approach that allows investigators to identify networks of protein-protein interactions in cells. The important findings presented here describe the application of this technique to the SH3 domain of the membrane remodeling Bridging Integrator 1 (BIN1), the truncation of which leads to centronuclear myopathy. The authors present solid evidence that BIN1 SH3 engages with an unexpectedly high number of cellular proteins, many of which are linked to skeletal muscle disease, and evidence is presented to suggest that BIN1 may play a role in mitosis creating the potential for new avenues in drug development efforts. Some of the findings, however, remain rather preliminary, lack sufficient replicates and may require additional experiments to definitively support the conclusions.

    2. Reviewer #1 (Public Review):

      Original review:

      The authors report here interesting data on the interactions mediated by the SH3 domain of BIN1 that expand our knowledge on the role of the SH3 domain of BIN1 in terms of mediating specific interactions with a potentially high number of proteins and how variants in this region alter or prevent these protein-protein interactions. These data provide useful information that will certainly help to further dissect the networks of proteins that are altered in some human myopathies as well as the mechanisms that govern the correct physiological activity of muscle cells.

      The work is mostly based on improved biochemical techniques to measure protein-protein interactions and provides solid evidence that the SH3 domain of BIN1 can establish an unexpectedly high number of interactions with at least a hundred cellular proteins, among which the authors underline the presence of other proteins known to be causative of skeletal muscle diseases and not known to interact with BIN1. This represents an unexpected and interesting finding relevant to better define the network of interactions established among different proteins that, if altered, can lead to muscle disease. An interesting contribution is also the detailed identification of the specific sites, namely the Proline-Rich Motifs (PRMs), that in the interacting proteins mediate binding to the BIN1 SH3 domain. Less convincing, or too preliminary in my opinion, are the data supporting BIN1 co-localization with PRC1. Indeed, the affinity of PRC1 is significantly lower than that of DNM2, an established BIN1 interacting protein. Thus, this does not provide compelling evidence to support PRC1 as a significant interactor of BIN1. Similarly, the localization data appear somewhat preliminary to substantiate a role of BIN1 in mitotic processes. These findings may necessitate additional experimental work to be more convincing.

      Comments on revision:

      I acknowledge the significant changes made by the authors in the revised manuscript. However, I remain puzzled by the data concerning the interaction between BIN1 and PRC1. While I agree with the authors that even weak interactions among proteins can be significant, I am hesitant to accept a priori that the lack of clear evidence of colocalization between proteins can be justified solely by their low affinity.

      Moreover, the possibility that other mitotic proteins may be potential partners of BIN1 does not inherently support an interaction between BIN1 and PRC1. I suggest that the authors present the interaction with PRC1 as a potential event and emphasize that further studies are needed to definitively establish it.

    3. Reviewer #2 (Public Review):

      Original review:

      Summary:

      In this paper, Zambo and coworkers use a powerful technique, called native holdup, to measure the affinity of the SH3 domain of BIN1 for cellular partners. Using this assay, they combine data using cellular proteins and proline-containing fragments in these proteins to identify 97 distinct direct binding partners of BIN1. They also compare the binding interactome of the BIN1 SH3 domain to the interactome of several other SH3 domains, showing varying levels of promiscuity among SH3 domains. The authors then use pathway analysis of BIN1 binding partners to show that BIN1 may be involved in mitosis. Finally, the authors examine the impact of clinically relevant mutations of the BIN1 SH3 domain on the cellular interactome. The authors were able to compare the interactome of several different SH3 domains and provide novel insight into the cellular function of BIN1. Generally, the data support the conclusions, although the reliance on one technique and the low number of replicates in each experiment are weaknesses of the study.

      Strengths:

      The major strength of this paper is the use of holdup and native holdup assays to measure the affinity of SH3 domains to cellular partners. The use of both assays using cell-derived proteins and peptides derived from identified binding partners allows the authors to better identify direct binding partners. This assay has some complexity but does hold the possibility of being used to measure the affinity of the cellular interactome of other proteins and protein domains. Beyond the utility of the technique, this study also provides significant insight into the cellular function of BIN1. The authors have strong evidence that BIN1 might have an undiscovered function in cellular mitosis, which potentially highlights BIN1 as a drug target. Finally, the study provides outstanding data on the cellular binding properties and partners of seven distinct SH3 domains, showing surprising differences in the promiscuity of these proteins.

      Weaknesses:

      There are three major weaknesses of the study. First, the authors rely completely on a single technique to measure the affinity of the cellular interactome. The native holdup is a relatively new technique that is powerful yet relatively unproven. However, it appears to have the capacity to measure the relative affinity of proteins. Second, the authors appear to use a relatively small number of replicates for the holdup assays. There is no information in the legends about the number of replicates, but the materials and methods suggest the native holdup data are from a single experimental replicate with multiple technical replicates. Finally, the authors' data using cellular proteins and fragments show that the affinity of the whole proteins is 5-20 fold lower than individual proline-containing fragments. The authors state that this difference suggests that there is cooperativity between different proline-rich sites of the binding partners of BIN1, yet BIN1 only has one SH3 domain. It is unclear what the molecular mechanism of the cooperative interaction would be exactly, since there would be only one SH3 domain to bind the partner. An alternative interpretation would be that the BIN1 SH3 domain requires sequences outside of the short proline-rich regions for high-affinity interactions with cellular partners, a hypothesis that is supported by other studies.

      Comments on revision:

      I thank the authors for their thoughtful response. I have additional comments.

      I appreciate that this is not a techniques paper and that the authors have done more detailed work in a separate publication. It would be helpful to readers not familiar with this new method to more fully describe this technique in this manuscript.

      I also thank the authors for their description of why they performed only 1 biological replicate of the experiment. However, I still believe that multiple biological replicates will provide more rigorous and reproducible data. The data the authors provide actually argue for the inclusion of more biological replicates. They state they performed 2 separate nHU replicates using different mass spectrometers. It is unclear if these data use the same lysates and protein preparations, but according to the data, the two methods detected a total of 207 distinct binding partners. Only 29 of these were significant binders in both replicates and only 90 were detected binders in both replicates. 117 binding partners were found in only one replicate, suggesting significant differences between replicates. Different batches of SH3 domains can have different activities and different replicates of cell lysates can vary, even when made from the same cell line. Thus, there can still be significant differences between replicates in this method. I appreciate the difficulty of performing and analyzing multiple biological replicates, but it is the most rigorous way to identify potential cellular partners.
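
      The counts quoted above can be put into proportions with a short calculation. The following is only an illustrative sketch: the variable names are hypothetical, and the only inputs are the three numbers stated in this comment (207 partners in total, 90 detected in both replicates, 29 significant in both).

```python
# Replicate overlap, using only the counts quoted in the comment above.
# Variable names are illustrative and do not come from the manuscript.
total_partners = 207    # distinct binding partners across both nHU replicates
detected_both = 90      # detected as binders in both replicates
significant_both = 29   # significant binders in both replicates

# Partners seen in only one replicate follow from the first two counts.
single_only = total_partners - detected_both

pct_detected_both = 100 * detected_both / total_partners
pct_single_only = 100 * single_only / total_partners
pct_significant_both = 100 * significant_both / total_partners

print(f"only one replicate:   {single_only} ({pct_single_only:.1f}%)")
print(f"detected in both:     {detected_both} ({pct_detected_both:.1f}%)")
print(f"significant in both:  {significant_both} ({pct_significant_both:.1f}%)")
```

      On these numbers, more than half of the detected partners appear in only one replicate, which is the basis for the concern about reproducibility.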

      I also thank the authors for including the mechanistic discussion about the differences between peptides and whole proteins. There is literature showing that regions outside of the short PxxP regions drive binding to SH3 domains, especially for the GRB2 family of adaptor proteins.

    1. eLife assessment

      This manuscript presents valuable findings on the identification of epigenetically mediated control for the recognition of dihydropyrimidine dehydrogenase (DPYD) gene expression that is linked with cancer treatment resistance using 5-fluorouracil. The evidence is compelling, supported by data from patient-derived specimens and direct assessment of 5-fluorouracil sensitivity, which provides confidence in the proposed mechanisms. The model is additionally supported by genome data from a population with high "compromised allele frequency". This work will interest those studying drug resistance in cancer therapy.

    2. Joint Public Review:

      Zhang et al. present compelling results that support the identification of epigenetically mediated control for the recognition of dihydropyrimidine dehydrogenase (DPYD) gene expression that is linked with cancer treatment resistance to 5-fluorouracil. The experimental approach was developed and pursued with in vitro and in vivo strategies. Combining molecular, cellular, and biochemical approaches, the authors identify a germline variant with compromised enhancer control. Several lines of evidence were presented that are consistent with increased CEBP recruitment to the DPYD regulatory domain with consequential modifications in promoter-enhancer interactions that are associated with compromised 5-fluorouracil resistance. Functional identification of promoter and enhancer elements was validated by CRISPRi and CRISPRa assays. ChIP and qPCR documented histone marks that can account for the control of DPYD gene expression. Consistency with data from patient-derived specimens and direct assessment of 5-fluorouracil sensitivity provides confidence in the proposed mechanisms. The model is additionally supported by genome data from a population with high "compromised allele frequency". It would be informative to directly demonstrate DPYD promoter-enhancer interactions. However, the genetic variants support the integration of regulatory activities.

    1. eLife assessment

      This study presents valuable new structures of a carbamylation-mimetic K125E mutant of the Cx26 gap junction channel, uncovering the cytoplasmic loop structure and information about the closed state of the channel. The cryo-EM maps are of high quality and serve as strong foundations for dissecting the gating mechanism by CO2, providing convincing evidence in support of a mechanism where CO2-mediated carbamylation of Lys125 shifts the conformational equilibrium towards a state where the N-terminus occludes the pore of the channel. This information will be of interest to biochemists, cell biologists and biophysicists interested in the function of gap-junction channels in health and disease.

    2. Reviewer #1 (Public Review):

      Gap junction channels establish gated intercellular conduits that allow the diffusion of solutes between two cells. Hexameric connexin26 (Cx26) hemichannels are closed under basal conditions and open in response to CO2. In contrast, when forming a dodecameric gap-junction, channels are open under basal conditions and close with increased CO2 levels. Previous experiments have implicated Cx26 residue K125 in the gating mechanism by CO2, which is thought to become carbamylated by CO2. Carbamylation is a labile post-translational modification that confers negative charge to the K125 side chain. How the introduction of a negative charge at K125 causes a change in gating is unclear, but it has been proposed that carbamylated K125 forms a salt bridge with the side chain at R104, causing a conformational change in the channel. It is also unclear how overall gating is controlled by changes in CO2, since there is significant variability between structures of gap-junction channels and the cytoplasmic domain is generally poorly resolved. Structures of WT Cx26 gap-junction channels determined in the presence of various concentrations of CO2 have suggested that the cytoplasmic N-terminus changes conformation depending on the concentration of the gas, occluding the pore when CO2 levels are high.

      In the present manuscript, Deborah H. Brotherton and collaborators use an intercellular dye-transfer assay to show that Cx26 gap-junction channels containing the K125E mutation, which mimics carbamylation caused by CO2, are constitutively closed even at CO2 concentrations where WT channels are open. Several cryo-EM structures of WT and mutant Cx26 gap junction channels were determined at various conditions and using classification procedures that extracted more than one structural class from some of the datasets. Together, the features on each of the different structures are generally consistent with previously obtained structures at different CO2 concentrations and support the mechanism that is proposed in the manuscript. The most populated class for K125E channels determined at high CO2 shows a pore that is constricted by the N-terminus, and a cytoplasmic region that was better resolved than in WT channels, suggesting increased stability. The K125E structure closely resembles one of the two major classes obtained for WT channels at high CO2. These findings support the hypothesis that the K125E mutation biases channels towards the closed state, while WT channels are in an equilibrium between open and closed states even in the presence of high CO2. Consistently, a structure of K125E obtained in the absence of CO2 appeared to also represent a closed state but at a lower resolution, suggesting that CO2 has other effects on the channel beyond carbamylation of K125 that also contribute to stabilizing the closed state. Structures determined for K125R channels, which are constitutively open because arginine cannot be carbamylated, and would be predicted to represent open states, yielded apparently inconclusive results.

      A non-protein density was found to be trapped inside the pore in all structures obtained using both DDM and LMNG detergents, suggesting that the density represents a lipid rather than a detergent molecule. It is thought that the lipid could contribute to the process of gating, but this remains speculative. The cytoplasmic region in the tentatively closed structural class of the WT channel obtained using LMNG was better resolved. An additional portion of the cytoplasmic face could be resolved by focusing classification on a single subunit, which had a conformation that resembled the AlphaFold prediction. However, this single-subunit conformation was incompatible with a C6-symmetric arrangement. Together, the results suggest that the identified states of the channel represent open states and closed states resulting from interaction with CO2. Therefore, the observed conformational changes illuminate a possible structural mechanism for channel gating in response to CO2.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Brotherton et al. describes a structural study of connexin-26 (Cx26) gap junction channel mutant K125E, which is designed to mimic the CO2-inhibited form of the channel. In the wild-type Cx26, exposure to CO2 is presumed to close the channel through carbamylation of the residue K125. The authors mutated K125 to a negatively charged residue to mimic this effect and observed by cryo-EM analysis of the mutated channel that the pore of the channel is constricted. The authors were able to observe conformations of the channel with resolved density for the cytoplasmic loop (in which K125 is located). Based on the observed conformations and on the position of the N-terminal helix, which is involved in channel gating and in controlling the size of the pore, the authors propose the mechanisms of Cx26 regulation.

      Strengths:

      This is a very interesting and timely study, and the observations provide a lot of new information on connexin channel regulation. The authors use state-of-the-art cryo-EM analysis and 3D classification approaches to tease out the conformations of the channel that can be interpreted as "inhibited", with important implications for our understanding of how the conformations of the connexin channels are controlled.

      Weaknesses:

      The revised version of the manuscript is improved, and the authors have addressed the review comments/criticisms in a satisfactory manner.

    4. Reviewer #3 (Public Review):

      Summary:

      The mechanism underlying the well-documented CO2-regulated activity of connexin 26 (Cx26) remains poorly understood. This is largely due to the labile nature of CO2-mediated carbamylation, making it challenging to visualize the effects of this reversible posttranslational modification. This paper by Brotherton et al. aims to address this gap by providing structural insights through cryo-EM structures of a carbamylation-mimetic mutant of the gap junction protein.

      Strength:

      The combination of the mutation, elevated PCO2, and the use of LMNG detergent resulted in high-resolution maps that revealed, for the first time, the structure of the cytoplasmic loop between transmembrane helix (TM) 2 and 3.

      Weaknesses:

      While the structure of the TM2-TM3 loop may suggest a mechanism for stabilizing the closed conformation, the EM density is not strong enough to support direct interaction with carbamylated or mutated K125.

      Overall, the cryo-EM structures presented in this study support the authors' proposed mechanism in which carbamylation at K125 promotes Cx26 gap junction closure. Through careful control of the pH and PCO2 for each cryo-EM sample, the current study substantiated that the more closed conformation observed in high PCO2 is independent of pH but likely triggered by carbamylation. This was unclear from their prior cryo-EM map of wildtype Cx26 at high PCO2.

      While the new structures successfully visualize the TM2-TM3 loop, which likely plays significant roles in CO2-regulated Cx26 activity, further studies are necessary to understand the underlying mechanism. For instance, the current study lacks an explanation of what propels the movement of the N-terminal helix, how carbamylated K125 interacts with the TM2-TM3 loop, the importance of the lipids visualized in the map, or the reason why gap junctions are constitutively open while hemichannels are closed under normal PCO2 levels.

    1. eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

    2. Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

      I think the authors should state how many parameters require fitting to the data vs the total number of model parameters. It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

    3. Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model's ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch, and it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although it is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

    1. Reviewer #2 (Public Review):

      In this revised manuscript, Aguillon and collaborators convincingly demonstrate that CLK is required for free-running behavioral rhythms under constant conditions in the cnidarian Nematostella. The results also convincingly show that CLK impacts rhythmic gene expression in this organism. This original work thus demonstrates that CLK was recruited very early during animal evolution into the circadian clock mechanism to align behavior and gene expression with the time of day. The manuscript could still benefit from some improvements so that it is more accessible to a wide readership.

    2. eLife assessment

      This fundamental study for the first time defines genetically the role of the Clock gene in basal metazoa, using the cnidarian Nematostella vectensis. With convincing evidence, the study provides insight into the early evolution of circadian clocks. Clock in this species is important for daily rhythms under constant conditions, but not under a rhythmic light/dark cycle, suggesting that the major role of the circadian oscillator in this species could be a stabilizing function under non-rhythmic environmental conditions.

    1. eLife assessment

      BMP signaling plays a vital role in skeletal tissues, and the importance of its role in microtia prevention is novel and promising. This important study sheds light on the role of BMP signaling in preventing microtia in the ear, with solid data broadly supporting the claims of the authors.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Ruichen Yang et al. investigated the importance of BMP signaling in preventing microtia. The authors showed that Cre recombinase-mediated deletion of Bmpr1a using the skeletal stem cell-specific Prx1Cre driver leads to microtia in adult and young mice. In these mice, the distal auricle is more affected than the middle and proximal regions. In these Bmpr1a floxed Prx1Cre mice, auricle chondrocytes start to differentiate into osteoblasts through an increase in PKA signaling. The authors also showed human single-cell RNA-Seq data sets in which they observed increased PKA signaling in microtia patients, which resembles their animal model experiments.

      Strengths:

      Although the importance of BMP signaling in skeletal tissues has been previously reported, the importance of its role in microtia prevention is novel and very promising to study in detail. The authors satisfied the experimental questions by performing correct methods and explaining the results in detail.

    3. Reviewer #2 (Public Review):

      The authors (Yang et al.) present a well-executed study of a mouse model of Bmpr1a focusing on microtia development and pathogenesis.

      The authors report that the generation of the Bmpr1a in Prrx1+ cells in adult mice helps characterize the developmental progression of the external ear.

      The authors explain how auricular chondrocytes differ from growth plate or other chondrocytes, and how BMP-Smad1/5/9 activation is required to maintain chondrocyte fate in the distal part of the ear. The authors explain with evidence how BMP signaling actively maintains auricle cartilage in the post-developmental stage.

      Elegant immunofluorescence staining, excellent histology preparations and dissections, excellent microscopy, sufficient experimental sample size, and good statistical analyses support the results. The study is well grounded in extensively reviewed and cited existing literature. This report sets the stage for a comprehensive interrogation of Bmpr1a deficiency and ear defects.

    1. eLife assessment

      This study presents useful findings on an unresolved question of cerebellar physiology: Do synapses between Purkinje cells and granule cells, made by the ascending part of the granule cells' axon, have different properties than those made by parallel fibers? The authors conducted patch-clamp recordings on rat cerebellar slices and found a new type of plasticity in the synapses of the ascending part of granule cell axons. While the finding may contribute to a better understanding of cerebellar function, the results are still incomplete because the shift in the baseline recording may have influenced the readout of long-term plasticity.

    2. Reviewer #1 (Public Review):

      In this study, the authors address a fundamental unresolved question in cerebellar physiology: do synapses between granule cells (GCs) and Purkinje cells (PCs) made by the ascending part of the axon (AA) have different synaptic properties from those made by parallel fibers? This is an important question, as GCs integrate sensorimotor information from numerous brain areas with a precise and complex topography.

      Summary:

      The authors argue that GCs located close to PCs essentially contact PC dendrites via the ascending part of their axons. They demonstrate that joint high-frequency (100 Hz) stimulation of distant parallel fibers and local GCs potentiates AA-PC synapses, while parallel fiber-PC synapses are depressed. On the basis of paired-pulse ratio analysis, they concluded that evoked plasticity was postsynaptic. When individual pathways were stimulated alone, no LTP was observed. This associative plasticity appears to be sensitive to timing, as stimulation of parallel fibers first results in depression, while stimulation of the AA pathway has no effect. NMDA, mGluR1 and GABAA receptors are involved in this plasticity.

      Strengths:

      Overall, the associative modulation of synaptic transmission is convincing, and the experiments carried out support this conclusion. However, weaknesses limit the scope of the results.

      Weaknesses:

      One of the main weaknesses of this study is the suggestion that high-frequency parallel-fiber stimulation cannot induce long-term potentiation unless combined with AA stimulation. Although we acknowledge that the stimulation and recording conditions were different from those of other studies, according to the literature (e.g. Bouvier et al 2016, Piochon et al 2016, Binda et al, 2016, Schonewille et al 2021 and others), high-frequency stimulation of parallel fibers leads to long-term postsynaptic potentiation under many different experimental conditions (blocked or unblocked inhibition, stimulation protocols, internal solution composition). Furthermore, in vivo experiments have confirmed that high-frequency parallel fiber stimulation is likely to induce long-term potentiation (Jorntell and Ekerot, 2002; Wang et al, 2009). This article provides further evidence that long-term plasticity (LTP and LTD) at this connection is a complex and subtle mechanism underpinned by many different transduction pathways. It would therefore have been interesting to test different protocols or conditions to explain the discrepancies observed in this dataset.

      Another important weakness is the lack of evidence that the AAs were stimulated. Indeed, without filling the PC with fluorescent dye or biocytin during the experiment, and without reconstructing the anatomical organization, it is difficult to assess whether the stimulating pipette is positioned in the GC cluster that is potentially in contact with the PC via the AAs. According to electron microscopy, AAs account for 3% of the total number of synapses in a PC, which could represent a significant number of synapses. Although the idea that AAs repeatedly contact the same Purkinje cell has been propagated, to the best of this reviewer's knowledge, no direct demonstration of this hypothesis has yet been published. In fact, what has been demonstrated (Walter et al 2009; Spaeth et al 2022) is that GCs have a higher probability of being connected to nearby PCs, but are not necessarily associated with AAs.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors describe a form of synaptic plasticity at synapses from granule cells onto Purkinje cells in the rat cerebellum, which is specific to synapses proximal to the cell body but not to distal ones. This plasticity is induced by the paired or associative stimulation of the two types of synapses because it is not observed with stimulation of one type of synapse alone. In addition, this form of plasticity is dependent on the order in which the stimuli are presented, and is dependent on NMDA receptors, metabotropic glutamate receptors and to some degree on GABAA receptors. However, under all experimental conditions described, there is a progressive weakening or run-down of synaptic strength. Therefore, plasticity is not relative to a stable baseline, but relative to a process of continuous decline that occurs whether or not there is any plasticity-inducing stimulus.

      Strengths:

      The focus of the authors on the properties of two different synapse-types on cerebellar Purkinje cells is interesting and relevant, given previous results that ascending and parallel fiber synapses might be functionally different and undergo different forms of plasticity. In addition, the interaction between these two synapse types during plasticity is important for understanding cerebellar function. The demonstration of timing and order-dependent potentiation of only one pathway, and not another, after associative stimulation of both pathways, changes our understanding of potential plasticity mechanisms. In addition, this observation opens up many new questions on underlying intracellular mechanisms as well as on its relevance for cerebellar learning and adaptation.

      Weaknesses and suggested improvements:

      A concern with this study is that all recordings demonstrate "rundown", a progressive decrease in the amplitude of the EPSC, starting during the baseline period and continuing after the plasticity-induction stimulus. In the absence of a stable baseline, it is hard to know what changes in strength actually occur at any set of synapses. Moreover, the issues that are causing rundown are not known and may or may not be related to the cellular processes involved in synaptic plasticity. This concern applies in particular to all the experiments where there is a decrease in synaptic strength.

      The authors should consider changes in the shape of the EPSC after plasticity induction, as in Fig 1 (orange trace) as this could change the interpretation.

      In addition, the inconsistency with previous results is surprising and is not explained; specifically, that no PF-LTP was induced by PF-alone repeated stimulation.

      The authors test the role of NMDARs, GABAARs and mGluRs in the phenotype they describe. The data suggest that the form of plasticity described here is dependent on any one of the three receptors. However, the location of these receptors varies between the Purkinje cells, granule cells and interneurons. The authors do not describe a convincing hypothetical model in which this dependence can be explained. They suggest that there is crosstalk between AA and PF synapses via endocannabinoids downstream of mGluR or NO downstream of NMDARs. However, it is not clear how this could lead to the long-term potentiation that they describe. Also, there is no long-lasting change in paired-pulse ratio, suggesting an absence of changes in presynaptic release.

      Is the synapse that undergoes plasticity correctly identified? In this study, since GABAergic inhibition is not blocked for most experiments, PF stimulation can result in both a direct EPSC onto the Purkinje cell and a disynaptic feedforward IPSC. The authors do address this issue with Supplementary Fig 3, where the impact of the IPSC on the EPSC within the EPSC/IPSC sequence is calculated. However, a change in waveform would complicate this analysis. An experiment with pharmacological blockade will make the interpretation more robust. The observed dependence of the plasticity on GABAA receptors is an added point in favor of the suggested additional experiments.

      A primary hypothesis of this study is that proximal, or AA, and distal, or PF, synapses are different and that their association is specifically what drives plasticity. The alternative hypothesis is that the two synapse-types are the same. Therefore, a good control for pairing AA with PF would be to pair AA with AA and PF with PF, thereby demonstrating that pairing with each other is different from pairing with self.

      It is hypothesized that the association of a PF input with an AA input is similar to the association of a PF input with a CF input. However, the two are very different in terms of cellular location, with the CF input being in a position to directly interact with PF-driven inputs. Therefore, there are two major issues with this hypothesis: 1) how can sub-threshold activity at one set of synapses affect another located hundreds of micrometers away on the same dendritic tree? 2) There is evidence that the CF encodes teaching/error or reward information, which is functionally meaningful as a driver of plasticity at PF synapses. The AA synapse on one set of Purkinje cells is carrying exactly the same information as the PF synapses on another set of Purkinje cells further up and down the parallel fiber beam. It is suggested that the two inputs carry sensory vs. motor information, which is why this form of plasticity was tested. However, the granule cells that lead to both the AA and PF synapses are receiving the same modalities of mossy fiber information. Therefore, one needs to presuppose different populations of granule cells for sensory and motor inputs or receptive field and contextual information. As a consequence, which granule cells lead to AA synapses and which to PF synapses will change depending on which Purkinje cell you're recording from. And that's inconsistent with there being a timing dependence of AA-PF pairing in only one direction. Overall, it would be helpful to discuss the functional implications of this form of plasticity.

    4. Reviewer #3 (Public Review):

      Granule cells' axons bifurcate to form parallel fibers (PFs) and ascending axons (AAs). While the significance of PFs on cerebellar plasticity is widely acknowledged, the importance of AAs remains unclear. In the current paper, Conti and Auger conducted electrophysiological experiments in rat cerebellar slices and identified a new form of synaptic plasticity in the AA-Purkinje cell (PC) synapses. Upon simultaneous stimulation of AAs and PFs, AA-PC EPSCs increased, while PFs-EPSCs decreased. This suggests that synaptic responses to AAs and PFs in PCs are jointly regulated, working as an additional mechanism to integrate motor/sensory input. This finding may offer new perspectives in studying and modeling cerebellum-dependent behavior. Overall, the experiments are performed well. However, there are two weaknesses. First, the baseline of electrophysiological recordings is influenced significantly by run-down, making it difficult to interpret the data quantitatively. The amplitude of AA-EPSCs is relatively small and the run-down may mask the change. The authors should carefully reexamine the data with appropriate controls and statistics. Second, while the authors show AA-LTP depends on mGluR, NMDA receptors, and GABA-A receptors, which cell types express these receptors and how they contribute to plasticity is not clarified. The recommended experiments may help to improve the quality of the manuscript.

    1. eLife assessment

      This paper presents a valuable automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system. The authors have developed a technique for analyzing cells that grow in suspension and used their method to look at different tumor cell lines that grow in suspension and determine the effect of drugs that directly affect the cell cycle. They show solid evidence that the method can be applied to both adherent and non-adherent cell lines. This paper will be of interest to cell biologists investigating cell cycle effects.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes a series of steps using the FIJI environment. The authors have created a plugin for the initial steps of the process: merging images into an RGB stack, converting to HSV, and then using brightness for reference and hue to distinguish the phases of the cycle. Then, the well-known TrackMate plugin was used to identify single cells and extract intensities. The data were further post-processed in R, where a series of steps (smoothing, scaling, and addressing missing frames) were used to train a random forest. Hard-coded values of hue were used to distinguish G1, S, and G2/M. The process was validated with a score comparing the quality of the tracks, and the authors reported the successful measurement of the cell cycles.
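
      To make the hue-based step concrete, the following is a minimal sketch rather than the authors' plugin. It assumes a classic two-colour FUCCI readout (red-dominant in G1, both reporters present around the G1/S transition, green-dominant in S/G2/M), intensities already normalised to the range 0 to 1, and purely hypothetical hue boundaries and brightness floor; the function name `fucci_phase` is also illustrative.

```python
import colorsys

# Hypothetical thresholds chosen for illustration only; they are NOT the
# hard-coded values used in the manuscript under review.
G1_MAX_HUE = 30.0      # degrees; hue below this is treated as red-dominant
SG2M_MIN_HUE = 90.0    # degrees; hue at or above this is treated as green-dominant
MIN_VALUE = 0.05       # brightness floor below which no phase is assigned


def fucci_phase(red, green):
    """Assign a phase label from normalised red and green intensities (0..1).

    The two channels are merged into an RGB triple with an empty blue
    channel, converted to HSV, and the hue is compared with fixed cut-offs.
    """
    hue, _saturation, value = colorsys.rgb_to_hsv(red, green, 0.0)
    if value < MIN_VALUE:
        # Both reporters are too dim for the hue to be meaningful.
        return "undetermined"
    hue_deg = hue * 360.0
    if hue_deg < G1_MAX_HUE:
        return "G1"
    if hue_deg >= SG2M_MIN_HUE:
        return "S/G2/M"
    return "G1/S"
```

      With an empty blue channel the hue stays between 0 and 120 degrees, so no wrap-around handling is needed in this sketch. A cell with two dim but balanced reporters falls below the brightness floor and is left unlabelled, which is one reason the brightness channel is useful as a reference.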

      Strengths:

      The implementation of the pipeline seems easy, although it requires two separate platforms: Fiji and R. A similar approach could be implemented in a single programming environment like Python or Matlab and there would not be any need to export from one to the other. However, many labs have similar setups and that is not necessarily a problem.

      Weaknesses:

      I found two important weaknesses in the proposal:

      (1) The pipeline relies on a large number of hard-coded conditions: size of Gaussian blur (Gaussian should be written in uppercase), values of contrast, size of filters, levels of intensity, etc. Presumably, the authors followed a heuristic approach and tried values of these and concluded that the ones proposed were optimal. A proper sensitivity analysis should be performed. That is, select a range of values of the variables and measure the effect on the output.

      (2) Linked to the previous comment: other researchers who want to follow the pipeline would either have to have exactly the same acquisition conditions as the manuscript or start playing with values to compensate for any difference in their data (cell diameter, fluorescence intensity, etc.) to see if they can match the results of the manuscript.
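
      The kind of sensitivity analysis requested in point (1) could be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the classifier, the threshold names, the parameter ranges and the five synthetic cells are all hypothetical, and the output measure (fraction of cells whose label changes relative to a reference setting) is just one possible choice.

```python
import colorsys


def classify(red, green, g1_max_hue, sg2m_min_hue, min_value=0.05):
    """Hypothetical hue-threshold classifier for normalised intensities (0..1)."""
    hue, _saturation, value = colorsys.rgb_to_hsv(red, green, 0.0)
    if value < min_value:
        return "undetermined"
    hue_deg = hue * 360.0
    if hue_deg < g1_max_hue:
        return "G1"
    if hue_deg >= sg2m_min_hue:
        return "S/G2/M"
    return "G1/S"


def sensitivity(cells, reference, variants):
    """For each parameter set, return the fraction of cells whose label
    differs from the label obtained with the reference parameters."""
    ref_labels = [classify(r, g, *reference) for r, g in cells]
    result = {}
    for params in variants:
        labels = [classify(r, g, *params) for r, g in cells]
        changed = sum(a != b for a, b in zip(ref_labels, labels))
        result[params] = changed / len(cells)
    return result


# Synthetic (red, green) intensities, invented purely for illustration.
cells = [(0.9, 0.1), (0.6, 0.4), (0.5, 0.5), (0.3, 0.7), (0.1, 0.9)]
reference = (30.0, 90.0)
variants = [(30.0, 90.0), (45.0, 90.0), (30.0, 100.0), (45.0, 100.0)]
changes = sensitivity(cells, reference, variants)
```

      Reporting such a table for each hard-coded parameter (blur size, contrast, filter size, intensity levels) would show how far a new user can deviate from the published settings before the phase assignments start to change.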

    3. Reviewer #2 (Public Review):

      Summary:

      This paper presents an automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system. It applies the method to different tumor cell lines that grow in suspension to determine their cell cycle profile and the effect of drugs that directly affect the cell cycle on progression through the cell cycle over a 72-hour period.

      Strengths:

      This is a METHODS paper. The one potentially novel finding is that they can identify cells that are at the G1-S transition by the change in color as one protein starts to go up and the other one goes down, similar to the change seen as cells enter G2/M.

      Weaknesses:

      They did not clearly indicate whether the G1/S cells are identified automatically or need to be identified by the person reviewing the data. In Figures 1 and S1, the movie shows cells with no color at a time corresponding to what is about the G1/S transition. Their assigned cell cycle phase is shown in Figure 1 but not in Figure S1. None of these pictures show the G1/S cells that they talk about being able to detect with a different color.

    4. Author Response:

      We greatly appreciate the insightful feedback provided by the reviewers and the editor on our manuscript titled "Automated workflow for the cell cycle analysis of non-adherent and adherent cells using a machine learning approach".  We will provide a revised version of the manuscript aiming to address the comments and recommendations provided by the reviewers to enhance the quality and clarity of our work. In detail:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes a series of steps using the FIJI environment. The authors have created a plugin for the initial steps of the process: merging images into an RGB stack, converting to HSV, and then using brightness for reference and hue to distinguish the phases of the cycle. Then, the well-known TrackMate plugin was used to identify single cells and extract intensities. The data were further post-processed in R, where a series of steps (smoothing, scaling, and addressing missing frames) were used to train a random forest. Hard-coded values of hue were used to distinguish G1, S, and G2/M. The process was validated with a score comparing the quality of the tracks, and the authors reported the successful measurement of the cell cycles.

      Strengths:

      The implementation of the pipeline seems easy, although it requires two separate platforms: Fiji and R. A similar approach could be implemented in a single programming environment like Python or Matlab and there would not be any need to export from one to the other. However, many labs have similar setups and that is not necessarily a problem.

      Weaknesses:

      I found two important weaknesses in the proposal:

      (1) The pipeline relies on a large number of hard-coded conditions: size of Gaussian blur (Gaussian should be written in uppercase), values of contrast, size of filters, levels of intensity, etc. Presumably, the authors followed a heuristic approach and tried values of these and concluded that the ones proposed were optimal. A proper sensitivity analysis should be performed. That is, select a range of values of the variables and measure the effect on the output.

      (2) Linked to the previous comment: other researchers who want to follow the pipeline would either have to have exactly the same acquisition conditions as the manuscript or start playing with values to compensate for any difference in their data (cell diameter, fluorescence intensity, etc.) to see if they can match the results of the manuscript.

      We thank Reviewer #1 for the insightful comments. We acknowledge the importance of ensuring the reproducibility and robustness of our pipeline across different sample types and acquisition conditions and, consequently, image S/N ratio and resolution. To address the concerns regarding the reliance on hard-coded conditions and the impact of varying parameter values on the output, we will complete the Methods section of the manuscript and the “Usage” section of the README file in the GitHub repository (https://github.com/ieoresearch/cellcycle-image-analysis), providing a summary of best practices that should be applied in the pre-processing part of the analysis. As an example, the usable image filter types and their settings for cells with different sizes, fluorescence intensities and acquisition conditions will be analysed in detail and general guidelines will be provided.

      Moreover, we will provide detailed documentation on the acquisition conditions required for reproducibility in the README file and Methods section.

      For the Tracking Analysis part, we will refer to the well documented TrackMate tutorial to adapt the tracking analysis to different cell types, image resolution and intensities.

      Reviewer #2 (Public Review):

      Summary:

      This paper presents an automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system. It applies the method to different tumor cell lines that grow in suspension to determine their cell cycle profile and the effect of drugs that directly affect the cell cycle on progression through the cell cycle over a 72-hour period.

      Strengths:

      This is a METHODS paper. The one potentially novel finding is that they can identify cells that are at the G1-S transition by the change in color as one protein starts to go up and the other one goes down, similar to the change seen as cells enter G2/M.

      Weaknesses:

      They did not clearly indicate whether the G1/S cells are identified automatically or need to be identified by the person reviewing the data. In Figures 1 and S1, the movie shows cells with no color at a time corresponding to what is about the G1/S transition. Their assigned cell cycle phase is shown in Figure 1 but not in Figure S1. None of these pictures show the G1/S cells that they talk about being able to detect with a different color.

      Thank you for your valuable feedback regarding the identification of G1/S cells in our pipeline. To clarify, the G1/S phase identification process is entirely automated within our pipeline. We apologize for any confusion caused by the lack of explicit indication in our manuscript. We will update the manuscript to clearly state that the identification of G1/S cells is performed automatically by our algorithm, eliminating the need for manual intervention.

      Regarding the visualization of G1/S cells in Figures 1 and S1, we will revise the figures to include all the available frames covering the G1/S transition. It is important to note that during this transition, fluorescence intensities for both the green and the red channels are dimmer than their intensity levels during the G2/M transition. This can result in frames that seem visually darker, despite both colors coexisting at the same time point. In our revised figures, we will include all available frames relevant to the G1/S transition and provide a clearer representation of this phenomenon.

      In response to Reviewer #2's recommendation, we plan to conduct additional experiments to further validate our observations. We will utilize the EdU technology to highlight the S-phase in FUCCI cells, allowing for better discrimination between the red and green fluorescence of the FUCCI reporter during the initial S-phase.

      Additionally, we acknowledge that the link to the Docker container (https://hub.docker.com/r/emanuelsoda/rf_semi_sup)  was not included in the manuscript. We apologize for this oversight, and it will be included in the revised version of the paper.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      A summary of what the authors were trying to achieve.

      The authors cultured pre- and post-vaccine PBMCs with overlapping peptides encoding S protein in the presence of IL-2, IL-7, and IL-15 for 10 days, and extensively analyzed the T cells expanded during the culture, including by scRNAseq, scTCRseq, and examination of reporter cell lines expressing the dominant TCRs. They were able to identify 78 S epitopes with HLA restrictions (which by itself represents a major achievement) together with their subset, based on their transcriptional profiling. By comparing T cell clonotypes between pre- and post-vaccination samples, they showed that a majority of pre-existing S-reactive CD4+ T cell clones did not expand by vaccinations. Thus, the authors concluded that highly-responding S-reactive T cells were established by vaccination from rare clonotypes.

      An account of the major strengths and weaknesses of the methods and results.

      Strengths:

      Selection of 4 "Ab sustainers" and 4 "Ab decliners" from 43 subjects who received two shots of mRNA vaccinations.

      Identification of S epitopes of T cells together with their transcriptional profiling. This allowed the authors to compare the dominant subsets between sustainers and decliners.

      Weaknesses were properly addressed in the revised manuscript, and I do not have any additional concerns.

      We appreciate the reviewer for the constructive comments and recommendations, which were a great help for us to improve our manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The paper aims to investigate the relationship between anti-S protein antibody titers and the phenotypes and clonotypes of S-protein-specific T cells in people who receive SARS-CoV2 mRNA vaccines. To do this, the paper recruited a cohort of Covid-19-naive individuals who received the SARS-CoV2 mRNA vaccines and collected sera and PBMC samples at different timepoints. They then mainly generated three sets of data: 1) Anti-S protein antibody titers at all timepoints. 2) A single-cell RNAseq/TCRseq dataset for divided T cells after stimulation by S protein for 10 days. 3) Corresponding epitopes for each expanded TCR clone. After analyzing these results, the paper reports two major findings and claims: A) Individuals with a sustained anti-S protein antibody response also have more so-called Tfh cells in their single-cell dataset. B) S-reactive T cells do exist before vaccination, but they seem unable to respond to Covid-19 vaccination properly.

      The paper's strength is that it uses a very systematic and thorough strategy to dissect the relationship between antibody titers, T cell phenotypes, TCR clonotypes and corresponding epitopes, and indeed it reports several interesting findings about the relationship of Tfh clonotypes/sustained antibody and about the S-reactive clones that exist before the vaccination. The conclusion is solid in general but some claims are overstated. My suggestion is that the authors should further limit their claims in the abstract, for example,

      "Even before vaccination, S-reactive CD4+ T cell clonotypes did exist, most of which (MAY) cross-reacted with environmental or symbiotic bacteria" -- The paper does not have experimental evidence to show that these TCR clones respond to these epitopes.

      We thank the reviewer for pointing out the insufficient demonstration of experimental evidence. We have added the relevant data to Fig. S5 in the newly revised manuscript.

      "These results suggest that de novo acquisition of memory Tfh-like cells upon vaccination (LIKELY) contributes to the longevity of anti-S antibody titers." -- Given the small sample size and the lack of statistical significance, this claim is overstated.

      "S-reactive T cell clonotypes detected immediately after 2nd vaccination polarized to follicular helper T (Tfh)-like cells (UNDER IN VITRO CULTURE)". -- The conclusion is based on in vitro cultured cells, which has limitations.

      We thank the reviewer for the helpful suggestion. We have corrected some sentences in line with these suggestions in the newly revised manuscript.

      Recommendations for the authors:

      Please note: Though most of the overstatement was removed from the original manuscript, authors still need to modify some of the statements in "Abstract".

      We thank the reviewer for carefully reading our manuscript and giving us detailed suggestions. We have modified these statements in “Abstract” accordingly in the newly revised manuscript.

    2. Reviewer #1 (Public Review):

      • A summary of what the authors were trying to achieve.

      The authors cultured pre- and post-vaccine PBMCs with overlapping peptides encoding S protein in the presence of IL-2, IL-7, and IL-15 for 10 days, and extensively analyzed the T cells expanded during the culture using scRNAseq, scTCRseq, and examination of reporter cell lines expressing the dominant TCRs. They were able to identify 78 S epitopes with HLA restrictions (which by itself represents a major achievement) together with their subset, based on their transcriptional profiling. By comparing T cell clonotypes between pre- and post-vaccination samples, they showed that a majority of pre-existing S-reactive CD4+ T cell clones did not expand upon vaccination. Thus, the authors concluded that highly-responding S-reactive T cells were established by vaccination from rare clonotypes.

      • An account of the major strengths and weaknesses of the methods and results.

      Strengths

      • Selection of 4 "Ab sustainers" and 4 "Ab decliners" from 43 subjects who received two shots of mRNA vaccinations.
      • Identification of S epitopes of T cells together with their transcriptional profiling. This allowed the authors to compare the dominant subsets between sustainers and decliners.

      Weaknesses were adequately addressed in the revised manuscript, and I do not have any additional concerns.

    3. eLife assessment

      This important study by Lu et al aimed to determine the key factors of T cell responses associated with durable antibody responses following the initial two shots of COVID-19 mRNA vaccinations. By comparing the SARS-CoV-2 spike protein (S)-specific T cell subsets between "Ab sustainers" and "Ab decliners" that were present post-vaccination, the authors concluded that S-specific CD4+ T cells in "Ab sustainers" were enriched with Tfh cells. There is solid evidence as the authors applied multiple methods and approaches to address the key questions, and the presented data are robust.

    4. Reviewer #3 (Public Review):

      The paper aims to investigate the relationship between anti-S protein antibody titers and the phenotypes and clonotypes of S-protein-specific T cells in people who receive SARS-CoV2 mRNA vaccines. The paper recruited a cohort of COVID-19 naive individuals who received the SARS-CoV2 mRNA vaccines and collected sera and PBMC samples at different time points. Then, three sets of data were generated: 1) Anti-S protein antibody titers at all time points. 2) Single-cell RNAseq/TCRseq analysis for divided T cells after in vitro stimulation by S-protein. 3) Peptide epitopes for each expanded TCR clone. Based on these, the paper reports two major findings: A) Individuals having a more sustained anti-S protein antibody response also have more Tfh-featured S-specific cells in their blood after 2nd-dose vaccination. B) S-specific cross-reactive T cells exist in COVID-19 naive individuals, but most of these T cell clones are not expanded after SARS-CoV-2 vaccination.

      The paper's strength is that it uses a very systematic strategy to dissect the relationship between antibody titers, T cell phenotypes, TCR clonotypes and corresponding epitopes. The conclusion is solid in general. However, the weaknesses include the relatively small sample size (4 sustainers vs. 4 decliners) and the use of in vitro stimulated cells for analysis, which may 'blur' the classification of T cell subsets. Nevertheless, it may have a great impact on future vaccine design because it demonstrated that promoting Tfh differentiation is crucial for the longevity of the antibody response. Additionally, this paper nicely showed that most cross-reactive clones that are specific to environmental/symbiotic microbes did not expand post-vaccination, providing important fundamental insights into the establishment of T-cell responses after SARS-CoV-2 vaccination.

    1. Author Response

      The following is the authors’ response to the current reviews.

      At this stage the referees had only minor comments. Referee #1 asked whether archerfish indeed generalize in egocentric rather than allocentric coordinates. The current results may not rule out the possibility that archerfish, unaware of changes in body position, simply continue with previously successful actions, which would appear as egocentric generalization. We agree with referee #1; we updated lines 255-260 in the results and added lines 329-336 in the discussion to mention this possibility. Referee #2 noted that a portion of the fish did not make it to the final test, which raises the question of whether all individuals are able to solve the task. We agree with referee #2 and added a paragraph to the discussion section to mention this point (lines 384-388). We also added the salinity of the water in the water tanks (line 98), as suggested by Referee #2. Referee #2 suggested using a different term than “washout” in the behavioral experiments. Since the term “washout” is standard in the field, we kept the term in the text.


      The following is the authors’ response to the original reviews.

      eLife assessment

      This useful study explores how archerfish adapt their shooting behavior to environmental changes, particularly airflow perturbations. It will be of interest to experts interested in mechanisms for motor learning. While the evidence for an internal model for adaptation is solid, evidence for adaptation to light refraction, as initially hypothesized, is inconclusive. As such, the evidence supporting an egocentric representation might be caused by alternative mechanisms to airflow perturbations.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors examined whether archerfish have the capacity for motor adaptation in response to airflow perturbations. Through two experiments, they demonstrated that archerfish could adapt. Moreover, when the fish flipped its body position with the perturbation remaining constant, it did not instantaneously counteract the error. Instead, the archerfish initially persisted in correcting for the original perturbation before eventually adapting, consistent with the notion that the archerfish's internal model has been adapted in egocentric coordinates.

      Evaluation:

      The results of both experiments were convincing, given the observable learning curve and the clear aftereffect. The ability of these fish to correct their errors is also remarkable. Nonetheless, certain aspects of the experiment's motivation and conclusions temper my enthusiasm.

      (1) The authors motivated their experiments with two hypotheses, asking whether archerfish can adapt to light refractions using an innate look-up table as opposed to possessing a capacity to adapt. However, the present experiments are not designed to arbitrate between these ideas. That is, the current experiments do not rule out the look-up table hypothesis, which predicts, for example, that motor adaptation may not generalize to de novo situations with arbitrary action-outcome associations. Such look-up table operations may also show set-size effects, whereas other mechanisms might not. Whether their capacity to adapt is innate or learned was also not directly tested, as noted by the authors in the discussion. Could the authors clarify how they see their results positioned in light of the two hypotheses noted in the Introduction?

      We agree with the referee that look up tables only confuse the issue. The question we tested is whether or not the fish uses adaptation mechanisms to correct its shooting. We have now changed the introduction both to eliminate the entire question of look up tables and also to clarify that both innate mechanisms and learning mechanisms can contribute to fish shooting, and that our research focuses on the question of whether the fish can adapt to a perturbation in its shooting caused by a change in its physical environment.

      (2) The authors claim that archerfish use egocentric coordinates rather than allocentric coordinates. However, the current experiments do not make clear whether the archerfish are "aware" that their position was flipped (as the authors noted, no visual cues were provided). As such, for example, if the fish were "unaware" of the switch, can the authors still assert that generalization occurs in egocentric coordinates? Or simply that, when archerfish are ostensibly unaware of changes in body position, they continue with previously successful actions.

      The fish has access to the body position switch: there are cues in the water tank that can help the fish orient inside it. Additionally, there are no cues to the presence or direction of the air flow above the water tank. Moreover, previous experience has shown that the fish is sensitive to the visual cues and uses them to achieve consistent orientation within the tank when possible. These points have been added to the main text [lines 143-144, 254-257].

      (3) The experiments offer an opportunity to examine whether archerfish demonstrate any savings from one session to another. Savings are often attributed to a faster look-up table operation. As such, if archerfish do not exhibit savings, it might indicate a scenario where they do not possess a refined look-up table and must rely on implicit mechanisms to relearn each time.

      This is an important question. Indeed, we looked for the ‘saving’ effect in the data, but its noisy nature prevented us from drawing a concrete conclusion. We now mention this in lines 247-249.

      We have also eliminated the discussion of look up tables from the article.

      (4) The authors suggest that motor adaptation in response to wind may hint at mechanisms used to adapt to light refraction. However, how strong of a parallel can one draw between adapting to wind versus adapting to light refraction? This seems important given the claims in this paper regarding shared mechanisms between these processes. As a thought experiment, what would the authors predict if they provided a perturbation more akin to light refraction (e.g., a film that distorts light in a new direction, rather than airflow)?

      This is an important point. Indeed, our project started by looking for options to distort the refraction index or distort the light in a new direction. However, given the available ways of distorting the light to a new direction, it is hard to achieve that on the technical level. Initially, we tried using prism goggles, however the archerfish found it hard to shoot with the heavy load on the head. We have also explored oil on the water surface. However, given the available oils and the width of the film above water, it is hard to achieve considerable perturbation.

      The fish's response to the perturbation matches what would be expected for a change in light refraction. A light refraction perturbation does not change with the change in fish body position relative to the target. However, in response to (and in agreement with) the referees, we have generalized the context in which we see our results and discuss the results in terms of adaptation of the fish shooting behavior to changes in physical factors including light refraction, wind, fatigue, and others.

      (5) The number of fish excluded was greater than those included. This raises the question as to whether these fish are merely elite specimens or representative of the species in general.

      The filtering of the fish was in the training stage. The requirements were quite strict: the fish had to produce enough shots each day in the experimental setup. Very few fish succeeded. But all fish that got to the stage of perturbation exhibited the adaptation effect. We do not see a reason to think that the motivation to shoot will have a strong interaction with the shooting adaptation mechanisms.

      Reviewer #2 (Public Review):

      Summary:

      The work of Volotsky et al presented here shows that adult archerfish are able to adjust their shooting in response to their own visual feedback, taking consistent alterations of their shot, here by an air flow, into account. The evidence provided points to an internal mechanism of shooting adaptation that is independent of external cues, such as wind. The authors provide evidence for this by forcing the fish to shoot from 2 different orientations to the external alteration of their shots (the airflow). This paper thus provides behavioral evidence of an internal correction mechanism that underlies adaptive motor control of this behavior. It does not provide direct evidence of refractive index-associated shot adjustment.

      Strengths:

      The authors have used a high number of trials and strong statistical analysis to analyze their behavioral data.

      Weaknesses:

      While the introduction, the title, and the discussion are associated with the refraction index, the latter was not altered, and neither was the position of the target. The "shot" was altered; this is a simple motor adaptation task and not a question related to the refractive index. The title, abstract, and the introduction are thus misleading. The authors appear to deduce from their data that the wind is not taken into account and thus conclude that the fish perceive a different refractive index. This might be based on the assumption that fish always hit their target, which is not the case. The airflow does not alter the position of the target, thus the airflow does not alter the refractive index. The fish likely does not perceive the airflow, thus alteration of its shooting abilities is likely assumed to be an "internal problem" of shooting. I am sorry but I am not able to understand the conclusion they draw from their data.

      This is an important point. Indeed, our project started by looking for options to distort the refraction index or distort the light in a new direction. However, given the available ways of distorting the light to a new direction, it is hard to achieve that on the technical level. Initially, we tried using prism goggles, however the archerfish found it hard to shoot with the heavy load on the head. We have also explored oil on the water surface. However, given the available oils and the width of the film above water, it is hard to achieve considerable perturbation.

      The fish's response to the perturbation matches what would be expected for a change in light refraction. A light refraction perturbation does not change with the change in fish body position relative to the target. However, in response to (and in agreement with) the referees, we have generalized the context in which we see our results and discuss the results in terms of adaptation of the fish shooting behavior to changes in physical factors including light refraction, wind, fatigue, and others.

      Reviewer #2 (Recommendations For The Authors):

      I have had a hard time trying to understand how the authors concluded that the RI is important here as it is not altered. Thus I did not understand the conclusions drawn from this paper. The experiments are well described, but the conclusions are not to me. Maybe schematics would help to clarify. I am from outside the field and represent a naïve reader with an average intellect. The authors need to do a better job of explaining their results if they want others to understand their conclusions.

      See response to the public comments.

      Minor comments:

      Line 9: omit the "an".

      Done.

      Line 11: this sentence would fit way better if it followed the next one.

      Done.

      Line 15: and all the rest of the paper: washout is a strange term and for me associated with pharmacological manipulations - might only be me. I suggest using recovery instead throughout the manuscript.

      The term ‘washout’ is often used in the field of motor adaptation to describe the return to original condition. For example:

      Kluzik J, Diedrichsen J, Shadmehr R, Bastian AJ (2008) Reach adaptation: what determines whether we learn an internal model of the tool or adapt the model of our arm? J Neurophysiol 100:1455-64. doi: 10.1152/jn.90334.2008

      Donchin O, Rabe K, Diedrichsen J, Lally N, Schoch B, Gizewski ER, Timmann D (2012) Cerebellar regions involved in adaptation to force field and visuomotor perturbation. J Neurophysiol 107:134-47

      Line 19: the fish does not expect the flow, it expects that it shoots too short- no?

      Done.

      Line 35: fix the citation - in your reference manager.

      Done.

      Line 52: provide some examples of the mechanisms you think of or papers of it for naive readers. Otherwise, this sentence is not helpful for the reader.

      Done.

      Line 183: it's unclear which parameter you mean. Rephrase.

      Done.

      Line 197: should read to test "the" - same sentence: you repeat yourself- rephrase the sentence.

      Done.

      Figure 4: it was unclear to me why the figure was differentiating between fishes until I read the legend. Why not include direct information in the figure? A schematic maybe? Legend: you have a double "that" in C.

      We added the title for each column with the information about the direction of air.

      Figures: in all figures, perturbation is wrongly spelled! Change the term washout to recovery.

      Done, except that we kept the term ‘washout’.

    1. Author response:

      We are grateful to reviewer #1 for the positive evaluation of our work and for providing valuable comments that will significantly enhance the presentation of our results. We understand reviewer #2's negative assessment because we did not discuss an alternative model of dosage compensation in Drosophila. We will address this omission in the Introduction section of the revised manuscript and remove any controversial statements from other parts of the text. However, it is important to clarify that our study does not focus on the mechanisms of dosage compensation. The main goal of the manuscript was to investigate the assembly of the MSL complex and its specific binding to the Drosophila X chromosome. We utilized male survival data to demonstrate the efficacy of MSL complex binding to the X chromosome, a relationship that has been supported by numerous independent studies. We understand that Reviewer #2 agrees that disruption of the MSL complex binding results in male lethality. As far as we understand, Reviewer #2 suggests that the MSL complex does not activate transcription of X chromosome genes, but instead facilitates the recruitment of MOF protein and potentially other general transcription factors to the X chromosome. This could explain the decrease in autosomal gene expression due to a reduction in activating factors like MOF at autosomal promoters. In the upcoming revision, we aim to strike a balance between the two models that elucidate dosage compensation in Drosophila. We appreciate your feedback and look forward to enhancing the clarity and coherence of our manuscript based on your insightful comments.

      Reviewer #2 (Public Review):

      Summary:

      A deletion analysis of the MSL1 gene to assess how different parts of the protein product interact with the MSL2 protein and roX RNA to affect the association of the MSL complex with the male X chromosome of Drosophila was performed.

      Strengths:

      The deletion analysis of the MSL1 protein and the tests of interaction with MSL2 are adequate.

      We thank the reviewer for the positive assessment of the experimental work done.

      This reviewer does not adhere to the basic premise of the authors that the MSL complex is the primary mediator of dosage compensation of the X chromosome of Drosophila.

      We completely agree with this reviewer's claim. In the Introduction section we’ll attempt to make clear that there are two models for the functional role of specific recruitment of the MSL complex to the X chromosome in males.

      Several lines of evidence from various laboratories indicate that it is involved in sequestering the MOF histone acetyltransferase to the X chromosome but there is a constraint on its action there. When the MSL complex is disrupted, there is no overall loss of compensation but there is an increase in autosomal expression. Sun et al (2013, PNAS 110: E808-817) showed that ectopic expression of MSL2 does not increase expression of the X and indeed inhibits the effect of acetylation of H4Lys16 on gene expression. Aleman et al (2021, Cell Reports 35: 109236) showed that dosage compensation of the X chromosome can be robust in the absence of the MSL complex. Together, these results indicate that the MSL complex is not the primary mediator of X chromosome dosage compensation. The authors use sex-specific lethality as a measure of disruption of dosage compensation, but other modulations of gene expression are the likely cause of these viability effects.

      Sun et al (2013, PNAS 110: E808-817) showed that recruitment of the MSL complex-specific subunit MSL2 or the MOF protein to the UAS promoter resulted in recruitment of the entire MSL complex in males but not transcriptional activation. This important result argues that the MSL complex does not activate transcription. However, it must be taken into account that the GAL4 DNA binding region used to recruit the chimeric MSL2 protein to the UAS promoter was directly fused to the MSL2 RING domain, which is critical for interaction of MSL2 with MSL1 and its ubiquitination activity (this activity could potentially be involved in transcription activation). It also remains poorly understood what happens to the MSL complex after recruitment to the promoters or HAS on the X chromosome. Subcomplex MSL1/MSL3/MOF can acetylate TF and H4K16 during RNA polymerase II elongation, resulting in increased transcription. The separate role of MSL2 and MSL1 in the activation of transcription of gene promoters is also shown. Sun et al. showed that in females, recruitment of MOF to the UAS promoter leads to a strong increase in transcription, which is associated with the inclusion of MOF in the non-specific lethal (NSL) complex, which is bound to promoters and is required for strong transcription activation. In males, MOF is preferentially recruited to the UAS promoter in the full MSL complex or perhaps in the MSL1/MSL3/MOF subcomplex, which stimulates transcription during RNA polymerase II elongation much less strongly than the NSL complex. The same result was obtained in Prestel et al. 2010 (Mol Cell 38:815-26). In this study the GAL4 binding sites were inserted upstream of the lacZ and mini-white genes. Activation of transcription after recruitment of GAL4-MOF to the GAL4 sites was studied in males and females. As in Sun et al. 2013, strong activation of the reporter was observed in females. A weak transcriptional activation of the reporter gene in males was shown, and the MOF protein was detected not only on the promoter, but also in the coding and 3’ regions of the reporter.

      We do not understand how the paper by Aleman et al (Cell Reports 35: 109236, 2021) is consistent with the hypothesis that the MSL complex is not involved in the transcriptional activation of X chromosomal genes. The main conclusions of this paper are: 1) Inactivation of Mtor leads to selective activation of the male X chromosome; 2) Mtor-driven attenuation of male X occurs in broad domains linked by the MSL complex; 3) Mtor genetically interacts with MSL components and reduces male mortality; 4) Mtor restrains dose-compensated expression at the level of nascent transcription. Thus, the paper shows that the MSL complex has an activator activity that is partially inhibited by Mtor. Accordingly, inactivation of Mtor only partially restored the survival of males in which dosage compensation was not completely inactivated.

      A detailed explanation was provided by Birchler and Veitia (2021, One Hundred Years of Gene Balance: How stoichiometric issues affect gene expression, genome evolution, and quantitative traits. Cytogenetics and Genome Research 161: 529-550).

      We agree that an alternative model of the dosage compensation mechanism is reasonable. We can assume that both mechanisms can function jointly to provide effective dosage compensation in Drosophila males. Following the reviewer's suggestion to reconsider the entire context of the article, we will make many small changes throughout the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Overall, I found the text well written and the figures logically organized (especially Figure 5, which had the potential to confuse). The authors especially excelled in bringing together the decades of literature in the Discussion.

      I offer several suggestions to improve the readability:

      Consider presenting the coiled-coil domain homology in Figure 1A as a contrast for the N-terminal region, which the authors claim is poorly conserved.

      We’ll add the coiled-coil domain homology to Figure 1A in the new version of the manuscript.

      It is difficult to visualize the red MSL2 in Figure 2; the green and red panels should be presented separately in the main text, as they are in the Supplemental Figure 2.

      We’ll prepare Figure 2 with separate green and red panels.

      The ChIP-seq experiments for MSL proteins are well presented, but in my opinion, add little to the overall conclusions:

      Figure 6 mostly recapitulates what has already been published and utilized by several groups, most recently the authors themselves (Tikhonova 2019): that MSL expressed in females targets the X/HAS, similar to in males. While these are nice supporting data for the female transgenic system, I do not believe this figure should be prominently featured as if this is a novelty of the current study.

      We fully agree with the reviewer's comment about the limitation of scientific novelty in Figure 6. It has an auxiliary meaning. Therefore, we decided to transfer this figure to Supplementary material.

      The ChIP experiments in Figure 7 agree with the conclusions in Figures 2 and 3 (polytene chromosome immunostaining) when it comes to X/autosome localization. I believe it would help with the flow of the paper if these experiments were combined or at least placed closer together in the narrative, rather than falling at the end.

      We’ll move Figure 7 closer to the polytene chromosome immunostaining. We agree with the reviewer that this placement of the figure will make it easier to follow the article as a whole.

      I find Figure 8 difficult to understand, especially since the "clusters" are not annotated in the figure, but are described in the text. I struggled to follow the authors' conclusions based on these data. The authors could clarify the figure with annotations, although to be honest I do not currently see the value of this analysis/figure.

      In the new version of the article, we will try to make this figure more understandable: we will add explanations to the figure and a legend to it, and we will also try to place emphasis more clearly in the text of the article.

    2. eLife assessment

      In this paper, the male-specific lethal (MSL) complex of proteins and RNA is studied through a domain analysis of one of its components, MSL1, and its interaction with others. While these results could be useful to researchers in the field, several studies have shown that the view that the MSL complex mediates dosage compensation is no longer considered tenable. Since there are many ways to alter viability, claims based on sex-specific viability as a reflection of dosage compensation should be viewed with much caution, and the evidence is currently considered inadequate to support the claims.

    3. Reviewer #1 (Public Review):

      Summary:

      Babosha et al. deeply investigate the N-terminal region of the Drosophila dosage compensation protein MSL1. Much of the prior research into the dosage compensation complex has focused on the male-specific MSL2 protein. However, the authors point out prior evidence that the N-terminus of MSL1 is important for protein function, including interaction with MSL2. Through a series of transgenic deletions and substitutions, the authors pinpoint two regions: N-terminal amino acids 3-7 and 41-65, which are critical for the binding of MSL1 to the X-chromosome and recruitment of MSL2. To deepen these observations, the authors perform well-controlled immunoprecipitation experiments to test the interaction of mutant MSL1 proteins with the lncRNA roX2, which is critical for the stability and localization of the dosage compensation complex. Through immunoprecipitation, the authors discover that the interaction of their mutant MSL1 proteins with roX2 is compromised. They suggest that the roX-MSL1 interaction is mediated by the N-terminal amino acids and is also critical for interaction with MSL2 and X-specific localization. This agrees with previous models that MSL1 and MSL2 directly interact through other regions.

      This work lays the foundation for future investigations into the overall structure of the dosage compensation machinery, which allows this unique complex to specifically target the X-chromosome through still unclear mechanisms.

      Strengths:

      The data provided by the authors is of high quality and supports the authors' conclusions, which are nicely contextualized in the text with previous models. The novelty of this study is specifically pinpointing the amino acid regions of MSL1 that interact with roX. The authors point out that, surprisingly, the N-terminal region of MSL1 is not particularly well conserved, indicating that the interactions outlined in this study might be Drosophila/Diptera-specific.

      The major strength of this study is that the authors find agreement between multiple dimensions of experimentation: the regions of MSL1 that are required for roX2 interaction (immunoprecipitation experiments) are also the regions that are critical for MSL1 localization to polytene chromosomes in an artificial female in vivo system, which are also critical for male-specific survival. The authors later suggest that it is the roX2 interaction that is responsible for the latter observations, although there is no direct evidence for this suggestion.

      Weaknesses:

      A minor weakness of the study is that it largely supports, and incrementally expands, the existing model in the field: that roX RNAs mediate the assembly of the complex on chromatin. I hesitate to call this a weakness, as supporting an existing model is still strong scientifically. However, the current study does not dramatically push the model forward.

    4. Reviewer #2 (Public Review):

      Summary:

      A deletion analysis of the MSL1 gene to assess how different parts of the protein product interact with the MSL2 protein and roX RNA to affect the association of the MSL complex with the male X chromosome of Drosophila was performed.

      Strengths:

      The deletion analysis of the MSL1 protein and the tests of interaction with MSL2 are adequate.

      Weaknesses:

      This reviewer does not adhere to the basic premise of the authors that the MSL complex is the primary mediator of dosage compensation of the X chromosome of Drosophila. Several lines of evidence from various laboratories indicate that it is involved in sequestering the MOF histone acetyltransferase to the X chromosome but there is a constraint on its action there. When the MSL complex is disrupted, there is no overall loss of compensation but there is an increase in autosomal expression. Sun et al (2013, PNAS 110: E808-817) showed that ectopic expression of MSL2 does not increase expression of the X and indeed inhibits the effect of acetylation of H4Lys16 on gene expression. Aleman et al (2021, Cell Reports 35: 109236) showed that dosage compensation of the X chromosome can be robust in the absence of the MSL complex. Together, these results indicate that the MSL complex is not the primary mediator of X chromosome dosage compensation. The authors use sex-specific lethality as a measure of disruption of dosage compensation, but other modulations of gene expression are the likely cause of these viability effects.

      A detailed explanation was provided by Birchler and Veitia (2021, One Hundred Years of Gene Balance: How stoichiometric issues affect gene expression, genome evolution, and quantitative traits. Cytogenetics and Genome Research 161: 529-550). The relevant portions of that article that pertain to Drosophila are quoted below. The cited references can be found in that publication.

      "In Drosophila, the sex chromosomes consist of an X and a Y. The Y in this species contains only a few genes required for male fertility (Zhang et al., 2020). The X consists of approximately 20% of the genome. Thus, females have two X chromosomes and males have one. Muller (1932) found that the expression of genes between the two sexes was similar but when individual genes on the X were varied in dosage they exhibited a proportional dosage effect. Each copy in a male was expressed at about twice the level as each copy in a female. Females with three X chromosomes are highly inviable but when they do survive to the adult stage, Stern (1960) found that they too exhibited dosage compensation in that the expression in the triple X genotype was similar to normal females and males. Studies in triploid flies found that dosage compensation also occurred among X; AAA, XX; AAA, and XXX; AAA genotypes via upregulation of the Xs, where X indicates the dosage of the X and A indicates the triploid nature of the autosomes (see Birchler, 2016 for further discussion). Diploid and triploid females have a similar per-gene expression but the other five genotypes each must modulate gene expression by different amounts equivalent to an inverse relationship between the X versus autosomal dosage to achieve a balanced expression between the X and the A (Birchler, 1996).

      Some years ago, mutations were sought in Drosophila that were lethal to males but viable in females. A number of such mutations were found and termed Male Specific Lethal (MSL) loci (Belote and Lucchesi, 1980). Once the products of these genes were identified, they were found to be at high concentrations on the male X chromosome (Kuroda et al., 1991). One of these genes encodes a histone acetyltransferase that acetylates Lysine16 of Histone H4 (Bone et al., 1994; Hilfiker et al., 1997). The recognition of the MSL complex and its association with the male X was an important set of contributions to an understanding of sex chromosome evolution in Drosophila (Kuroda et al., 2016). Thus, the hypothesis arose that the MSL complex accumulated this chromatin modifier on the male X to activate the expression about two-fold to bring about dosage compensation. Other data that contributed to this hypothesis were that when autoradiography of nascent transcription on salivary gland polytene chromosomes was examined in the MSL maleless mutation, the ratio of the number of grains over the X versus an autosomal region was reduced compared to the normal ratio (Belote and Lucchesi, 1980).

      It has been pointed out (Hiebert and Birchler, 1994; Bhadra et al., 1999; Pal Bhadra et al., 2005; Sun et al., 2013a; Birchler, 2016), however, that the grain counts over the X and the autosomes when considered in absolute terms rather than as a ratio show that the X more or less retained dosage compensation and the autosomal numbers are about doubled, i.e. exhibit an inverse dosage effect. The same situation occurs with the msl3 mutation (Okuno et al., 1984), another MSL gene, in that the autoradiographic grain numbers as an absolute measure show retention of X dosage compensation and an autosomal increase. The data treatment to produce an X to A ratio seemed reasonable in the context of the time when all regulation in eukaryotes was considered positive. However, when studies were conducted in such a manner as to assay the absolute effect on gene expression in the maleless mutation, in adults (Hiebert and Birchler, 1994), larvae (Hiebert and Birchler, 1994; Bhadra et al., 1999; 2000; Pal Bhadra et al., 2005), and embryos (Pal Bhadra et al., 2005), the trend was for retention of dosage compensation of X linked genes and an increase in expression of autosomal genes.

      In global studies, if the X to autosomal expression does not change between mutant and normal, one can conclude that dosage compensation is operating. However, a lower X to A ratio could be a loss of compensation or an increased transcriptome size from the increase of the autosomes, as suggested by the absolute data of Belote and Lucchesi (1980) and Okuno et al (1984) and was visualized directly in embryos (Pal Bhadra et al., 2005). The transcriptome size in aneuploids can change, which cannot be detected in RNA-seq analyses alone (Yang et al., 2021), so it is an important consideration for studies of dosage compensation. It was recently acknowledged that in MSL2 knockdowns the relative X expression is decreased and a moderate autosomal increase is found (Valsecchi et al., 2021b). A similar trend is evident in the microarray data on MSL2 knockdown in SL2 tissue culture cells (Hamada et al., 2005) and in the roX RNA (noncoding RNAs essential for MSL localization on the male X) mutants (Deng and Meller, 2006). This trend is in fact consistent with the absolute data that suggest an increase in the transcriptome size (Figure 7). A global change in transcriptome size can cause a generalized dosage compensation of a single chromosome to appear as a proportional dosage effect (loss of compensation) to some degree (Figure 7).

      Examination of expression in triple X metafemales, where there is no MSL complex, found that X-linked genes generally show dosage compensation but there is a generalized inverse effect on the autosomes, which could account for the detrimental effects of metafemales (Birchler et al., 1989; Sun et al., 2013b). An examination in metafemales of alleles of the white eye color gene that do or do not exhibit dosage compensation in males, showed the same response, namely, increased expression if there was no dosage compensation in males and no difference from normal females for the male dosage-compensated alleles (Birchler, 1992). This experiment demonstrated a relationship between the mechanism of dosage compensation in males and metafemales and implicated the inverse dosage effect in both. An involvement of the inverse effect in Drosophila dosage compensation provides an explanation for how the five levels of gene expression can be explained (Birchler, 1996), whereas an all-or-none presence of a complex on the X does not. The stoichiometric relationship of regulatory gene products provides a means to read the relative dosage at multiple doses to produce the appropriate inverse level.

      What then is the function of the MSL complex? It was discovered that the MSL complex will actually constrain the effect of H4 lysine16 acetylation to prevent it from causing overexpression of genes (Bhadra et al., 1999; 2000; Pal Bhadra et al., 2005; Sun and Birchler 2009; Sun et al., 2013a). Indeed, in the chromatin remodeling Imitation Switch (ISWI) mutants, the male X chromosome was specifically overexpressed suggesting that its normal function is needed for the constraint to occur (Pal Bhadra et al., 2005). Independently, the Mtor nuclear pore component shows a similar specific male X upregulation when Mtor is knocked down and this effect was shown to operate on the transcriptional level (Aleman et al., 2021). Interestingly, the increased expression of the X in the Mtor knockdown is accompanied by an inverse modulation of a substantial subset of autosomal genes, illustrating why the constraining process evolved to counteract male X overexpression. The constraining effect might involve a number of gene products (Birchler, 2016) and is an interesting direction for further study.

      Furthermore, when the H4Lys16 acetylase was individually targeted to reporter genes, there was an increase in expression (Sun et al., 2013a). However, when other members of the MSL complex were present in normal males or ectopically expressed, this increase did not occur (Sun et al., 2013a). It thus appears that the function of the MSL complex is to sequester the acetylase from the autosomes and constrain it on the X (Bhadra et al., 1999; 2000; Pal Bhadra et al., 2005; Sun and Birchler, 2009; Sun et al., 2013a). Indeed, in the Mtor knockdowns, the X-linked genes with the greatest upregulation were those with the greatest association with the acetylase and the H4K16ac histone mark (Aleman et al 2021), supporting the idea of a constraining activity that becomes released in the Mtor knockdown. When the MSL complex is disrupted, there is an inverse effect on the autosomes that occurs but in normal circumstances the sequestration mutes this effect. The MSL complex disruption releases the acetylase to be uniformly distributed across all chromosomes as determined cytologically (Bhadra et al., 1999) or via ChIPseq for H4Lys16ac (Valsecchi et al., 2021a). Indeed, the quantity of the H4Lys16ac mark only has a proportional effect on gene expression when the constraining activity is disrupted (Aleman et al., 2021) or when the MSL complex is not present (Sun et al., 2013a). Thus, in normal flies, there is a more or less equalized expression of the X and autosomes despite the monosomy for 20% of the genome.

      The component of the complex that is expressed in males and thought to organize the complex to the male X, MSL2, was recently found to also be associated with autosomal dosage-sensitive regulatory genes (Valsecchi et al., 2018). MSL2 was found to modulate these autosomal dosage-sensitive genes in various directions, which illustrates that MSL2 has a role in dosage balance that goes beyond the X chromosome. This finding is consistent with the evolutionary scenario that the initial attraction of the complex to the X chromosome was to upregulate dosage-sensitive genes in hemizygous regions as the progenitor Y became deleted for them, with the constraining activity evolving to prevent an overexpression as the amount of acetylase on the male X increased with time (Birchler, 2016).

      The MSL hypothesis takes an X-centric view that does not accommodate what is now known about dosage effects across the whole genome. The idea that dissolution of the MSL complex would cause a reduction in expression of the male X-linked genes without any consequences for the autosomes is not consistent with current knowledge of gene regulatory networks and their dosage sensitivity. Indeed, the finding of dosage compensation in large autosomal aneuploids that operates on the transcriptional level (Devlin et al., 1982; 1984; Birchler et al., 1990; Sun et al., 2013c), as well as a predominant inverse effect by the same (Devlin, et al., 1988; Birchler et al., 1990), argues that one must consider the inverse effect for an understanding of the evolution of dosage compensation in Drosophila (and other species). Further discussion of models of Drosophila compensation has been published (Birchler, 2016).

      What is likely to be the most critical issue with sex chromosome evolution is the consequences for dosage-sensitive regulatory genes. This fact is nicely illustrated by the retention of these types of genes in different independent vertebrate sex chromosome evolutions (Bellott and Page, 2021). In Drosophila, by contrast, dosage compensation is more of a blanket effect on most but not all X-linked genes despite the fact that many genes on the X are unlikely to have dosage detrimental effects, although dosage-sensitive genes might have played a role as noted above. The particularly large size of the X in Drosophila compared to the whole genome is potentially a contributing factor because such a large genomic imbalance is likely to modulate most genes across the genome. Also, there is no evidence of a WGD in Drosophila as there is in other species for which the inverse effect has been documented (maize, Arabidopsis, yeast, mice, human). These other species have various numbers of retained duplicate dosage-sensitive regulatory genes from WGDs. Thus, the relative change of regulatory genes in aneuploids in these species will not be as great compared to some of their interactors in the remainder of the genome, which could result in lesser magnitudes of some trans-acting effects, similar to how aneuploids in ascending ploidies have fewer effects as described above. The absence of duplicate regulatory genes in Drosophila would predict a stronger inverse effect in general and that could have been capitalized upon to produce dosage compensation of most genes on the X chromosome despite many of them not being dosage critical. While sex chromosome evolution must accommodate dosage-sensitive genes for proper development and viability, it could also be capitalized upon to evolve sexual dimorphisms in expression (Sun et al., 2013c)."

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      I have only a few comments that I think will improve the manuscript and help readers better appreciate the context of the reported results.

      We would like to thank the Reviewer for their time in reviewing our manuscript. We appreciate the helpful feedback and assistance in ensuring the highest quality publication possible.

      One paradox, that the authors point out, is that the drastic effects of TALK-1 L114P on plasma membrane potential do not result in a complete loss of insulin secretion. One important consideration is the role of intracellular stores in insulin secretion at physiological levels of hyperglycemia. This needs to be discussed more thoroughly, especially in the light of recent papers like Postic et al 2023 AJP and others. The authors do show an upregulation of IP3-induced Ca release. It is not clear whether they think this is a direct or indirect effect on the ER. Is there more IP3? More IP3R? Are the stores more full?

      The reviewer brings up an important point. Although we see a significant reduction in glucose-stimulated depolarization in most islets from TALK-1 L114P mice, some glucose-stimulated calcium influx is still present (especially from female islets); this suggests that a subset of islet β-cells are still capable of depolarization. Because our original membrane potential recordings were done in whole islets without identification of the cell type being recorded, we have now repeated these electrical recordings in confirmed β-cells (see Supplemental figure 6). The new data show that 33% of TALK-1 L114P β-cells show action potential firing in 11 mM glucose, which would be predicted to stimulate insulin secretion from a third of all TALK-1 L114P β-cells; this could be responsible for the remaining glucose-stimulated insulin secretion observed from TALK-1 L114P islets. However, ER calcium store release could also allow for some of the calcium response in the TALK-1 L114P islets. We have now detailed this in the discussion, which now describes the Postic et al. study showing that glucose-stimulated beta-cell calcium increases involve ER calcium release, as they occur in the presence of voltage-dependent calcium channel inhibition. Future studies can assess this using SERCA inhibitors and determining if glucose-stimulated calcium influx in TALK-1 L114P islets is lost. We also find that muscarinic-stimulated calcium influx from ER stores is greater in TALK-1 L114P mice. We currently do not have data to support the mechanism for this enhancement of muscarinic-induced islet calcium responses from islets expressing TALK-1 L114P. Our hypothesis is that greater TALK-1 current on the ER membrane is enhancing ER calcium release in response to IP3R activation. There is an equivalent IP3R expression in control and TALK-1 L114P islets based on transcriptome analysis, which is now included in the manuscript. However, whether there is greater IP3 production, greater ER calcium storage, and/or greater ER calcium release requires further analysis. Because this finding was not directly related to the metabolic characterization of this TALK-1 L114P MODY mutation, we are planning to examine the ER functions of TALK-1 L114P thoroughly in a future manuscript.

      The authors point to the possible roles of TALK-1 in alpha and delta cells. A limitation of the global knock-in approach is that the cell type specificity of the effects can't easily be determined. This should be more explicitly described as a limitation.

      We thank the reviewer for this suggestion and have added this to the discussion. This is now included in a paragraph at the end of the discussion detailing the limitations of this manuscript.

      The official gene name for TALK-1 is KCNK16. This reviewer wonders whether it wouldn't be better for this official name to be used throughout, instead of switching back and forth. The official name is used for Abcc8 for example.

      We thank the reviewer for this suggestion and have revised the manuscript to include Kcnk16 L114P. The instances of TALK-1 L114P that remain in the manuscript are in cases where the text specifically discusses TALK-1 channel function.

      There are several typos and mistakes in editing. For example, on page 5 it looks like "PMID:11263999" has not been inserted. I suggest an additional careful proofreading.

      We have revised this reference, thoroughly proofread the revised manuscript, and corrected typos.

      The difference in lethality between the strains is fascinating. Might be good to mention other examples of ion channel genes where strain alters the severe phenotypes? Additional speculation on the mechanism could be warranted. It also offers the opportunity to search for genetic modifiers. This could be discussed.

      We thank the reviewer for this suggestion and have added details on mutations where strain alters lethality.

      The sex differences are interesting. Of course, estrogen plays a role as mentioned at the bottom of page 16, but there have been more involved analyses of islet sex differences, including a recent paper from the Rideout group. Is there a sex difference in the islet expression of KCNK16 mRNA or protein, in mice or humans?

      We thank the reviewer for the important comments on the TALK-1 L114P sex differences. We have revised the manuscript to include greater discussion about female β-cell resilience to stress, which may allow greater insulin secretion in the presence of the TALK-1 L114P channels; this is based on the Brownrigg et al. study pointed out by the reviewer (PMID: 36690328). Because these sex differences in islet function were examined in mice, we looked at KCNK16 expression in mouse beta-cells. There was a trend toward greater KCNK16 expression in sorted male beta-cells (average RPKM 6296.25 +/- 953.84) compared to sorted female beta-cells (average RPKM 5148.25 +/- 1013.22). Similarly, there was a trend toward greater KCNK16 expression in male HFD-treated mouse beta-cells (average RPKM 8020.75 +/- 1944.41) compared to female HFD-treated mouse beta-cells (average RPKM 7551 +/- 2952.70). We have now added this to the text.

      Page 15-16 "Indeed, it has been well established that insulin signaling is required for neonatal survival; for example, a similar neonatal lethality phenotype was observed in mice without insulin receptors (Insr-/-) where death results from hyperglycemia and diabetic ketoacidosis by P3 (40)." Formally, the authors are not examining insulin signaling. A better comparison is that of the Ins1/Ins2 double knockout model of complete hypoinsulinemia.

      We thank the reviewer for suggesting this as the appropriate comparison model and have now revised the manuscript to detail the 48-hour average life expectancy of Ins1/Ins2 double knockout mice (PMID: 9144203).

      There are probably too many abbreviations in the paper, making it harder to read by nonspecialists. I recommend writing out GOF, GSIS, WT, K2P, etc.

      We thank the reviewer for this suggestion and have revised the manuscript to reduce the use of most abbreviations.

      Reviewer #2:

      We would like to thank the Reviewer for their time in reviewing our manuscript. We appreciate the helpful feedback and assistance in ensuring the highest quality publication possible. We have thoroughly addressed all the reviewer’s comments and revised the manuscript accordingly. These changes have strengthened the manuscript and are summarized below.

      (1) The authors perform an RNA-sequencing showing that the cAMP amplifying pathway is upregulated. Is this also true in humans with this mutation? Other follow-up comments and questions from this observation:

      a) Will this mean that the treatment with incretins will improve glucose-stimulated insulin secretion and Ca2+ signalling and lower blood glucose? The authors should at least present data on glucose-stimulated insulin secretion and/or Ca2+ signalling in the presence of a compound increasing intracellular cAMP.

      b) Will an OGTT give different results than the IPGTT performed due to the fact that the cAMP pathway is upregulated?

      c) Is the increased glucagon area and glucagon secretion a compensatory mechanism that increases cAMP? What happens if glucagon receptors are blocked?

      We thank the reviewer for the suggestions. Although cAMP pathways were upregulated in the TALK-1 L114P islets, the changes in expression were only modest as examined by qRT-PCR. Thus, we are not sure if this plays a role in secretion. Only a small number of patients with this mutation have been identified, and no islets have been isolated from them. Therefore, we do not know whether the cAMP amplifying pathway is upregulated in humans with the MODY-associated TALK-1 L114P mutation. We have performed the suggested experiment assessing calcium from TALK-1 L114P islets in response to liraglutide (see Supplemental figure 10); there was no liraglutide response in TALK-1 L114P islets. We have also performed the OGTT experiments as suggested and these have now been added to the manuscript (see Supplemental figure 3). We do not believe that the increased glucagon is a compensatory response, because: 1. TALK-1 deficient islets have less glucagon secretion due to reduced SST secretion (see PMID: 29402588); 2. There is no change in insulin secretion at 7 mM glucose; however, glucagon secretion is significantly elevated from islets isolated from TALK-1 L114P mice; 3. TALK-1 is highly expressed in delta-cells, and in these cells TALK-1 L114P would be predicted to cause significant hyperpolarization and significant reductions in calcium entry as well as SST secretion. Thus, reduced SST secretion may be responsible for the elevation of glucagon secretion. We plan to investigate delta-cells within islets from TALK-1 L114P mice in future studies to determine if changes in SST secretion are responsible for the elevated glucagon secretion from TALK-1 L114P islets.

      (2) The performance of measurements in both male and female mice is praiseworthy. However, despite differences in the response, the authors do not investigate the potential reason for this. Are hormonal differences of importance?

      We thank the reviewer for this important point. It is indeed becoming clear that there are many differences between male and female islet function and responses to stress. Thus, we have revised the manuscript to include greater discussion about these differences, such as female β-cell resilience to stress, which may allow greater insulin secretion in the presence of the TALK-1 L114P channels; this is based on the Brownrigg et al. study pointed out by reviewer 1 (PMID: 36690328). While the differences in islet function and GTT between male and female L114P mice are clear, they both show diminished islet calcium handling, defective hormone secretion, and development of glucose intolerance. This manuscript was intended to demonstrate how the MODY-causing TALK-1 L114P mutation causes glucose dyshomeostasis, which we have determined in both male and female mice. The differences between male and female mice and islets with TALK-1 L114P could have multiple potential causes (as detailed in PMID: 36690328); thus, we believe that comprehensive studies are required to thoroughly determine how the TALK-1 L114P mutation differently impacts male and female mice and islets, which we plan to complete in a future manuscript.

      (3) MINOR: Page 5 .." channels would be active at resting Vm PMID:11263999.." The actual reference has not been added using the reference system.

      We thank the reviewer for noticing this mistake, which has now been corrected.

      Reviewer #3:

      The manuscript is overall clearly presented and the experimental data largely support the conclusions. However, there are a number of issues that need to be addressed to improve the clarity of the paper.

      We would like to thank the Reviewer for their time in reviewing our manuscript. We appreciate the helpful feedback and assistance in ensuring the highest quality publication possible. We have thoroughly addressed all the reviewer’s comments and revised the manuscript accordingly. These changes have strengthened and improved the clarity of the manuscript.

      Specific comments:

      (1) Title: The terms "transient neonatal diabetes" and "glucose dyshomeostasis in adults" are used to describe the TALK-1 L114P mutant mice. Transient neonatal diabetes gives the impression that diabetes is resolved during the neonatal period. The authors should clarify the criteria used for transient neonatal diabetes, and the difference between glucose dyshomeostasis and MODY. Longitudinal plasma glucose and insulin data would be very informative and help readers to follow the authors' narrative.

      We appreciate the helpful comment and have added longitudinal plasma glucose from neonatal mice to address this (see Supplemental figure 2). The new data now show that the TALK-1 L114P mutant mice undergo transient hyperglycemia that resolves by P10 and then occurs again at week 15. Insulin secretion from P4 islets is also included and shows that male animals homozygous for the TALK-1 L114P mutation have the largest impairment in glucose-stimulated insulin secretion, followed by male heterozygous TALK-1 L114P P4 islets, which also have impaired insulin secretion (see Figure 1). The amount of hyperglycemia correlates with the defects in neonatal islet insulin secretion.

      (2) Another concern for the title is the term "α-cell overactivity." This could be taken to mean that individual α-cells are more active and/or that there are more α-cells to secrete glucagon. The study does not provide direct evidence that individual α-cells are more active. This should be clarified.

      We appreciate the helpful comment and have revised the manuscript title accordingly.

      (3) In the Introduction, it is stated that because TALK-1 activity is voltage-dependent, the GOF mutation is less likely to cause neonatal diabetes, yet the study shows the L114P TALK-1 mutation actually causes neonatal diabetes by completely abolishing glucose-stimulated Ca2+ entry. This seems to imply TALK-1 activity (either in the plasma membrane or ER membrane) has more impact on Vm or cytosolic Ca2+ in neonates than initially predicted. Some discussion on this point is warranted.

      These are important points and we have added details to the discussion about this. For example, the discussion now states that, “This suggests a greater impact of TALK-1 L114P in neonatal islets compared to adult islets. Future studies during β-cell maturation are required to determine if TALK-1 activity is greater on the plasma membrane and/or ER membrane compared with adult β-cells.” The introduction has also been revised to clarify the voltage dependence of TALK-1.

      (4) What is the relative contribution of defects in plasma membrane depolarization versus ER Ca2+ handling on defective insulin secretion response?

      We thank the reviewer for bringing up this important point. TALK-1 L114P islets show blunted glucose-stimulated depolarization and glucose-stimulated calcium entry; however, the L114P islets show equivalent Ca2+ entry to control islets in response to high KCl (Figure 5G and H). As the KCl-stimulated Ca2+ influx is similar between control and TALK-1 L114P islets, this indicates that plasma membrane TALK-1 L114P has a hyperpolarizing role that significantly blunts glucose-stimulated depolarization and reduces activation of voltage-dependent calcium channels. We have further tested this by looking at glucose-stimulated β-cell membrane potential depolarization in TALK-1 L114P islets, which is significantly blunted (Figure 4A and B; Supplemental figure 6). However, 33% of TALK-1 L114P β-cells showed glucose-stimulated electrical excitability (Supplemental figure 6), which likely accounts for the modest GSIS from TALK-1 L114P islets. New data have also been included showing that KCl stimulation causes a significant depolarization of β-cells from TALK-1 L114P islets (Supplemental figure 6). Because plasma membrane TALK-1 L114P is largely responsible for the hyperpolarized membrane potential and blunted glucose-stimulated Ca2+ entry, this suggests that TALK-1 L114P on the plasma membrane is primarily responsible for the altered insulin secretion. The discussion has been revised to reflect this.

      (5) The Jacobson group has previously shown that another K2P channel TASK-1 is also involved in ER Ca2+ homeostasis and that TASK inhibitors restored ER Ca2+ in TASK-1 expressing cells. Is TASK-1 expressed in β-cell ER membrane? Can the mishandling of Ca2+ caused by TALK-1 L114P be reversed by TASK-1 inhibitors?

      We thank the reviewer for bringing up this important point in relation to ER calcium handling by K2P channels. We have found that TASK-1 channels expressed in alpha-cells enhance ER calcium release and that inhibitors of TASK-1 channels elevate alpha-cell ER calcium storage. We did not observe any significant changes in expression of the gene (Kcnk3) encoding TASK-1 between islets from control and TALK-1 L114P mice, which has now been added to the manuscript. However, the TALK-1 L114P-mediated reduction of glucose-stimulated depolarization and inhibition of calcium entry are both prevented in the presence of high KCl (see Figure X); this strongly suggests that TALK-1 L114P K+ flux at the membrane is hyperpolarizing the membrane potential and limiting depolarization and calcium entry. This suggests that TALK-1 L114P control of ER calcium handling is not the primary contributor to the blunted glucose-stimulated calcium handling. Furthermore, acetylcholine stimulation of both control and TALK-1 L114P islets elicited ER calcium release, which indicates that ER calcium release is for the most part still responsive to the cues that control it, although the responses are altered. Taken together, this suggests that the TALK-1 L114P impact on ER calcium is not the primary mediator of blunted glucose-stimulated islet calcium entry and insulin secretion.

      (6) The electrical recording experiments were conducted using whole islets. The authors should comment on how the cells were identified as β-cells, especially in mutant islets in which there is an increased number of α-cells.

      The reviewer brings up an important point. As indicated, the original membrane potential recordings were conducted using whole islets. While most of the recorded cells were likely β-cells, given that mouse islets typically contain >80% β-cells, there is a possibility that some of the cells included in these recordings were α-cells or δ-cells (especially because of the noted α-cell hyperplasia in TALK-1 L114P islets). Thus, we have now included data from β-cells that were identified with an adenoviral construct containing a rat insulin promoter driving a fluorescent reporter. This allowed the fluorescent β-cells to be monitored with electrophysiological membrane potential recordings. The new data (see Supplemental figure 6) show a significant reduction in glucose-stimulated depolarization in 67% of β-cells with the L114P mutation compared to controls.

      Minor:

      (1) Some references need formatting.

      The references have been revised accordingly.

      (2) Please define glucose-stimulated phase 0 Ca2+ response for non-expert readers.

      This has been defined accordingly.

      (3) Page 14 bottom: The sentence "Unlike the only other MODY-associated.........., TALK-1 is not inhibited by sulfonylureas" seems out of place and lacks context.

      We thank the reviewer for this suggestion and have deleted this sentence.

      (4) Figure 6: It would be helpful to provide a protein name for the genes shown in panel D.

      The protein names for the genes have now been included in the discussion of these genes.

    2. Reviewer #2 (Public Review):

      Summary:

      This work follows previous work from the group where they have demonstrated the role of TASK1 in the regulation of glucose-stimulated insulin secretion. Moreover, a recent study links a mutation in KCNK16, the gene encoding TALK-1 channels, to MODY. Here the authors have constructed a mouse model with the specific mutation (TALK-1 L114P mutation) and investigated the phenotype. They had to perform a couple of breeding tricks to find a model that is lethal in adult, which might complicate the conclusions; however, the heterozygote model used has a MODY-like phenotype. The study is convincing and solid.

      Strengths:

      (1) The work is a natural follow-up from previous studies from the groups.
      (2) The authors present convincing and solid data that in the long perspective will help patients with these mutations.
      (3) Both in vivo and in vitro data are presented to give the full picture of the phenotype.
      (4) Data from both female and male mice are presented.

      Weaknesses:

      The authors have answered all my comments in the revised version and I find no more weaknesses. Some questions still remain but have been clearly discussed in the new version of the manuscript.

    3. eLife assessment

      This study characterizes how a point mutation in the TALK-1 potassium channel, encoded by the KCNK16 gene, causes MODY diabetes. The mutation, L114P, causes a gain-of-function to increase K+ currents and inhibit glucose-stimulated insulin secretion. Increased glucagon likely results from paracrine effects in the islets. The data are convincing and the work will be valuable for understanding islet function.

    4. Reviewer #1 (Public Review):

      Summary:

      This paper focuses on the effects of an L114P mutation in the TALK-1 channel on islet function and diabetes. This mutation is clinically relevant and a cause of MODY diabetes. This work employs a mouse model with heterozygous and homozygous mutants. The mutation is homozygous lethal owing to severe hyperglycemia. The work shows that the mutation increases K+ currents and inhibits insulin secretion. This is a very nice paper with mechanistic insight and clear clinical importance. It is generally well written and the data is well presented.

      Comments on revision:

      I have no further comments to add at this time. The authors have adequately addressed my concerns.

    5. Reviewer #3 (Public Review):

      Summary

      The L114P gain of function mutation in the K2P channel TALK-1 encoded by KCNK16 has been associated with maturity-onset diabetes of the young (MODY). In this study, Nakhe et al. generated mice carrying L114P TALK-1 and evaluated the impact of the mutation on pancreatic islet functions and glucose homeostasis. The authors report that the mutation increases neonatal lethality, owing to hyperglycemia caused by a lack of glucose-stimulated Ca2+ influx and insulin secretion. Adult mutant mice showed glucose intolerance and fasting hyperglycemia, which is attributed to blunted glucose-stimulated insulin secretion as well as increased glucagon secretion. Interestingly, male mice were more affected than female mice. Islets from adult mutant mice were found to have reduced Ca2+ entry upon glucose stimulation but also enhanced IP3-induced ER Ca2+ release, consistent with previous studies from the group showing a role of TALK-1 in ER Ca2+ homeostasis. Finally, comparison of bulk RNA sequencing results from WT and mutant islets revealed altered expression of genes involved in β-cell identity, function and signaling, which also contributes to the observed islet dysfunction.

      Strengths

      This is a well-executed and rigorous study that will be of great interest to the diabetes and islet biology communities. The findings provide convincing evidence supporting a causal role of the L114P gain of function TALK-1 mutation in glucose-stimulated insulin secretion defects and diabetes. The neonatal diabetes phenotype and the gender difference uncovered by the study have important clinical implications. The complexity of TALK-1 expression and hormone secretion in different endocrine cell types and how it impacts glucose homeostasis is elegantly illustrated in the L114P TALK-1 mouse model. The authors carefully and thoroughly addressed limitations of their study and discussed future directions. The importance of TALK-1 in β-cell and islet function demonstrated by this study will prompt future efforts targeting this important channel for diabetes treatment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We appreciate the thoughtful review of our manuscript by the reviewers, along with their valuable suggestions for enhancing our work. In response to these suggestions, we conducted additional experiments and made significant revisions to both the text and figures. In the following sections, we first highlight the major changes made to the manuscript, and thereafter address each reviewer's comments point-by-point. We hope these additional data and revisions have improved the robustness and clarity of the study and manuscript. Please note that as part of a suggested revision we have changed the manuscript title to be: Bacterial vampirism mediated through taxis to serum.

      Major revisions and new data:

      (1) We conducted additional experiments testing taxis to serum using a swine ex vivo enterohemorrhagic lesion model in which we competed wildtype versus chemotaxis-deficient strains (Fig. 8). We selected swine for these experiments due to their similarity in gastrointestinal physiology to humans. In these experiments we see that chemotaxis, and the chemoreceptor Tsr, mediate localization to, and migration into, the lesion. We also tested, and confirmed, taxis to serum from swine and serum from horse, supporting that serum attraction is relevant in other host-pathogen systems.

      (2) We present additional experimental data and quantification of chemotaxis responses to human serum treated with serine-racemase (Fig. S3). This treatment reduces wildtype chemoattraction and the wildtype no longer possesses an advantage over the tsr strain, providing further evidence that L-serine is the specific chemoattractant responsible for Tsr-mediated attraction to serum.

      (3) We present additional data in the form of 17 videos of chemotaxis experiments with norepinephrine and DHMA showing null-responses under various conditions. These data provide additional support to the conclusion that these chemicals are not responsible for bacterial attraction to serum. We have included these raw data as a new supplementary file (Data S1) for those in the field that are interested in these chemicals.

      (4) Based on comments from Reviewer 2 regarding whether the position of the ligand and ligand-binding site residues in the previously-reported EcTsr LBD structure are incorrect, or whether these differences are due to the proteins being from different organisms, we performed paired crystallographic refinements to determine which positions result in model improvement (Fig. 7J). Altering the EcTsr structure to have the ligand and ligand-binding site positions from our new higher resolution and better-resolved structure of Salmonella Typhimurium Tsr results in a demonstrably better model, with both Rwork and Rfree lower by about 1% (Fig. 7J). These data support our conclusion that the correct positions for both structures are as we have modeled them in the S. Typhimurium Tsr structure. We also solved an additional crystal structure of SeTsr LBD captured at neutral pH (7-7.5) that confirms our structure captured with elevated pH (7.5-9.7) has no major changes in structure or ligand-binding interactions (Fig. S6, Table S2).

      (5) Based on comments from Reviewer 2 on the accuracy of the diffusion calculations, we present a new analysis (Fig. S2) comparing the experimentally determined diffusion of A488 with its calculated diffusion. We found that:

      [line 111]: “As a test case of the accuracy of the microgradient modeling, we compared our calculated values for A488 diffusion to the normalized fluorescence intensity at time 120 s. We determined the concentration to be accurate within 5% over the distance range 70-270 µm (Fig. S2). At smaller distances (<70 µm) the measured concentration is approximately 10% lower than that predicted by the computation. This could be due to advection effects near the injection site that would tend to enhance the effective local diffusion rate.”
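
      As an illustration of the kind of comparison described in this passage, the following is a minimal sketch and not the model used in the manuscript. It assumes the textbook solution for a continuous point source in an unbounded 3D medium, C(r, t) = q / (4πDr) · erfc(r / (2√(Dt))). The function names, the diffusion coefficient, and the injection rate are hypothetical placeholders chosen only for illustration.

```python
import math


def point_source_concentration(r_um, t_s, d_um2_s, q_per_s):
    """Concentration at distance r_um (µm) and time t_s (s) from a continuous
    point source switched on at t = 0 in an unbounded 3D medium.

    Assumed textbook form: C = q / (4*pi*D*r) * erfc(r / (2*sqrt(D*t))).
    Units of the result follow the units of q_per_s (amount per second).
    """
    if r_um <= 0 or t_s <= 0 or d_um2_s <= 0:
        raise ValueError("r_um, t_s and d_um2_s must be positive")
    steady_state = q_per_s / (4.0 * math.pi * d_um2_s * r_um)
    return steady_state * math.erfc(r_um / (2.0 * math.sqrt(d_um2_s * t_s)))


def relative_deviation(measured, predicted):
    """Signed fractional deviation of a measurement from the model prediction."""
    return (measured - predicted) / predicted


if __name__ == "__main__":
    D = 400.0   # µm^2/s, hypothetical diffusion coefficient for a small dye
    Q = 1.0     # arbitrary injection rate; only relative profiles are compared
    T = 120.0   # s, the time point quoted in the passage above
    reference = point_source_concentration(70.0, T, D, Q)
    for r in (70.0, 120.0, 170.0, 220.0, 270.0):
        c = point_source_concentration(r, T, D, Q)
        # Normalize to the 70 µm value, as one might for fluorescence intensity.
        print(f"r = {r:5.0f} µm  normalized C = {c / reference:.3f}")
```

      With a normalized fluorescence profile in hand, `relative_deviation` could be applied at each distance to check whether measurement and prediction agree within a chosen tolerance such as the 5% quoted above.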

      (6) Both reviewers asked us to better justify why we focused on the chemoreceptor Tsr, and had questions about why we did not investigate Tar. The low concentration of Asp in serum suggests Tar could have some effect, but less so than Trg or Tsr (see Fig. 4A). We have revised the text throughout to better convey that we agree multiple chemoreceptors are involved in the response and clarify our rationale for studying the role of Tsr:

      [line 178]: “We modeled the local concentration profile of these effectors based on their typical concentrations in human serum (Fig. 4B). Of these, by far the two most prevalent chemoattractants in serum are glucose (5 mM) and L-serine (100-300 µM) (Fig. 4B-F). This suggested to us that the chemoreceptors Trg and/or Tsr could play important roles in serum attraction.”

      [line 186]: “Since tsr mutation diminishes serum attraction but does not eliminate it, we conclude that multiple chemoattractant signals and chemoreceptors mediate taxis to serum. To further understand the mechanism of this behavior we chose to focus on Tsr as a representative chemoreceptor involved in the response, presuming that serum taxis involves one, or more, of the chemoattractants recognized by Tsr that is present in serum: L-serine, NE, or DHMA.”

      [line 468] “Serum taxis occurs through the cooperative action of multiple bacterial chemoreceptors that perceive several chemoattractant stimuli within serum, one of these being the chemoreceptor Tsr through recognition of L-serine (Fig. 4).”

      Point-by-point responses to reviewer comments:

      Reviewer #1:

      (1) Presumably in the stomach, any escaping serum will be removed/diluted/washed away quite promptly? This effect is not captured by the CIRA assay but perhaps it might be worth commenting on how this might influence the response in vivo. Perhaps this could explain why, even though the chemotaxis appears rapid and robust, cases of sepsis are thankfully relatively rare.

      To clarify, the Enterobacteriaceae species we have tested here are colonizers of the intestines, not the stomach, and cases of bacteremia from these species are presumably due to bloodstream entry through intestinal lesions. Whether or not intestinal flow acts as a barrier to bloodstream entry is not something we test here, and so we have not commented on this idea in the manuscript. We do demonstrate that attraction to serum occurs within seconds-to-minutes of exposure. We expect that the major protective effects against sepsis are the host antibacterial factors in serum, which are well-described in other work. We have been careful to state throughout the text that we see attraction responses, and growth benefits, to serum that is diluted in an aqueous media, which is different than bacterial growth in 100% serum or in the bloodstream.

      (2) The authors refer to human serum as a chemoattractant numerous times throughout the study (including in the title). As the authors acknowledge, human serum is a complex mixture and different components of it may act as chemoattractants, chemo-repellents (particularly those with bactericidal activities) or may elicit other changes in motility (e.g. chemokinesis). The authors present convincing evidence that cells are attracted to serine within human serum - which is already a well-known bacterial chemoattractant. Indeed, their ability to elucidate specific elements of serum that influence bacterial motility is a real strength of the study. However, human serum itself is not a chemoattractant and this claim should be re-phrased - bacteria migrate towards human serum, driven at least in part by chemotaxis towards serine.

      Throughout the text we have changed these statements, including in the title, to either be ‘taxis to serum’ or ‘serum attraction.’ On the timescales we tested, our data support that chemotaxis, not chemokinesis or other forms of directed motility, is what drives rapid serum attraction, since a motile but non-chemotactic cheY mutant cannot localize to serum (Fig. 4). We present evidence of one of these chemotactic interactions (L-Ser).

      (3) Linked to the previous point, several bacterial species (including E. coli - one of the bacterial species investigated here) are capable of osmotaxis (moving up or down gradients in osmolality). Whilst chemotaxis to serine is important here, could movement up the osmotic gradient generated by serum injection play a more general role? It could be interesting to measure the osmolality of the injected serum and test whether other solutions with similar osmolality elicit a similar migratory response. Another important control here would be to treat human serum with serine racemase and observe how this impacts bacterial migration.

      As addressed above, we have added additional experiments of serum taxis treated with serine racemase showing competition between WT and cheY, and WT and tsr (Fig. S3). These data support a role for L-serine as a chemoattractant driving attraction to serum. The idea of osmotaxis is interesting, but outside the scope of this work since we focus on chemoattraction to L-serine as one of the mechanisms driving serum attraction, and have multiple lines of evidence to support that.

      (4) The migratory response of E. coli looks striking when quantified (Fig. 6C) but is really unclear from looking at Panel B - it would be more convincing if an explanation was offered for why these images look so much less striking than analogous images for other species (E.g. Fig. 6A).

      We agree that the E. coli taxis to serum response is less obvious. We have brightened those panels to hopefully make them clearer to interpret (more cells in field of view over time). Also, as stated in the y-axes of these plots, this quantification was performed by enumerating the number of cells in the field of view, and the Citrobacter and Escherichia responses are shown on separate y-axes (now Fig. 8C). As indicated, the experiments have different numbers of starting motile cells, which we presume accounts for the difference in attraction magnitude. When investigating diverse bacterial systems we found there to be differences in motility under the culturing and experimental conditions we employed, for multiple reasons, and so for these data we thought it best to report raw cell numbers rather than data normalized to the starting number of bacteria, as we do elsewhere. In the specific case of these E. coli responding to serum, please view Supplementary Movie S3, which clearly shows both the attraction response and that the bacteria grew in a longer, semi-filamentous form that seems to impair their swimming speed.

      (5) It is unclear why the fold-change in bacterial distribution shows an approximately Gaussian shape with a peak at a radial distance of between 50 -100 um from the source (see for example Fig. 2H). Initially, I thought that maybe this was due to the presence of the microcapillary needle at the source, but the CheY distribution looks completely flat (Fig. 3I). Is this an artifact of how the fold-change is being calculated? Certainly, it doesn't seem to support the authors' claim that cells increase in density to a point of saturation at the source. Furthermore, it also seems inappropriate to apply a linear fit to these non-linear distributions (as is done in Fig. 2H and in the many analogous figures throughout the manuscript).

      We have revised the text to address this point, and removed the comment about cells increasing in density to a point of saturation: [Line 138] “We noted that in some experiments the population peak is 50-75 µm from the source, possibly due to a compromise between achieving proximity to nutrients in the serum and avoidance of bactericidal serum elements, but this behavior was not consistent across all experiments. Overall, our data show S. enterica serovars that cause disease in humans are exquisitely sensitive to human serum, responding to femtoliter quantities as an attractant, and that distinct reorganization at the population level occurs within minutes of exposure (Fig. 3, Movie 2).”

      We can confirm that this is not an artifact of quantification. Please refer to the videos of these responses, which demonstrates this point (Movies 1-5).

      (6) The authors present several experiments where strains/ serovars competed against each other in these chemotaxis assays. As mentioned, these are a real strength of the study - however, their utility is not always clear. These experiments are useful for studying the effects of competition between bacteria with different abilities to climb gradients.

      However, to meaningfully interpret these effects, it is first necessary to understand how the different bacteria climb gradients in monoculture. As such, it would be instructive to provide monoculture data alongside these co-culture competition experiments.

      Thank you for this suggestion. We agree that the coculture experiments showing strains competing for the same source of effector give a different perspective than monoculture. These experiments allow us to confirm taxis deficiencies or advantages with greater sensitivity, and ensure that the bacteria in competition have experienced the same gradient. This type of competition experiment is often used in in vivo experimentation for the same advantages. We note that in the gut the bacteria are not in monoculture and chemotactic bacteria do have to compete against each other for access to nutrients. Repeating all of the experiments we present to show both the taxis responses in coculture and monoculture would be an extraordinary amount of work that we do not believe would meaningfully change the conclusions of this study.

      (7) Linked to the above point, it would be especially instructive to test a tsr mutant's response in monoculture. Comparing the bottom row of Fig. 3G to Fig. 3I suggests that when in co-culture with a cheY mutant, the tsr mutant shows a higher fold-change in radial distribution than the WT strain. Fig. 4G shows that a tsr mutant can chemotaxis towards aspartate at a similar, but reduced rate to WT. This could imply that (like the trg mutant), a tsr mutant has a more general motility defect (e.g. a speed defect), which could explain why it loses out when in competition with the WT in gradients of human serum, but actually seems to migrate strongly to human serum when in co-culture with a cheY mutant. This should be resolved by studying the response of a tsr mutant in monoculture.

      Addressed above.

      (8) In Fig. 4, the response of the three clinical serovars to serine gradients appears stronger than the lab serovar, whilst in Fig. 1, the response to human serum gradients shows the opposite trend with the lab serovar apparently showing the strongest response. Can the authors offer a possible explanation for these slightly confusing trends?

      We suspect this relates to the fact that pure L-serine is a chemoattractant, whereas treatment with serum exposes the bacteria both to chemoattractants and, likely, chemorepellents. Strains may navigate the landscape of these stimuli differently for a variety of reasons that are not simple to tease apart. The final magnitude of change in bacterial localization depends on multiple factors including swimming speed, adaptation, sensitivity of chemoattraction, and cooperative signaling of the chemoreceptor nanoarray. Thus, we cannot state with certainty how and why these strains are different across all experiments, but we can state that they are attracted to both serum and L-serine.

      (9) In Fig. S2, it seems important to present quantification of the effect of serine racemase and the reported lack of response to NE and DHMA - the single time-point images shown here are not easy to interpret.

      As suggested, we present quantification of the serine racemase-treated samples (now Fig. S3). To assist in the interpretation of these max projections, Fig. S3 now notes the chemotactic response (chemoattraction for L-serine, null-response for NE/DHMA). Further, we revised the text to state: [line 209]: “We observed robust chemoattraction responses to L-serine, evident by the accumulation of cells toward the treatment source (Fig. S3E, Movie 4), but no response to NE or DHMA, with the cells remaining randomly distributed even after 5 minutes of exposure (Fig. S3F-I, Movie 5, Movie S1).”

      (10) Importantly, the authors detail how they controlled for the effects of pH and fluid flow (Line 133-136). Did the authors carry out similar controls for the dual-species experiments where fluorescent imaging could have significantly heated the fluid droplet driving stronger flow forces?

      Most of our microfluidics experiments were performed in a temperature-controlled chamber (see Methods). Since the strains in the coculture experiments experienced the same experimental conditions, we have no evidence that fluorescence-imaging-induced temperature changes have impacted whether or not the bacteria are attracted to serum or the effectors we investigated.

      (11) The inference of the authors' genetic analysis combined with the migratory response of E. coli and C. koseri to human serum shown in Fig. 6 is that Tsr drives movement towards human serum across a range of Enterobacteriaceae species. The evidence for the importance of Tsr here is currently correlative - more causal evidence could be presented by either studying the response of tsr mutants in these two species (certainly these should be readily available for E. coli) or by studying the response of these two species to serine gradients.

      We have revised the text to state: [line 402] “Without further genetic analyses in these strain backgrounds, the evidence for Tsr mediating serum taxis for these bacteria remains circumstantial. Nevertheless, taxis to serum appears to be a behavior shared by diverse Enterobacteriaceae species and perhaps also Gammaproteobacteria priority pathogen genera that possess Tsr such as Serratia, Providencia, Morganella, and Proteus (Fig. 8B).”

      We note that other work has thoroughly investigated E. coli serine taxis.

      Figure Suggestions

      (1) Fig. 2 - The inset bar charts in panels H-J and the font size in their axes labels are too small - this suggestion also applies to all analogous figures throughout the manuscript.

      We have increased the size of the text for these inset plots. We have also broken up some of the larger figures.

      (2) Panel 2F - the cartoon bacterial cell and 'number of bacteria' are confusing and seem to contradict the y-axis label. This also applies to several other figures throughout the manuscript where the significance of this cartoon cell is quite hard to interpret.

      As suggested, we have removed this cartoon.

      (3) Panels G-I in Fig. 3 are currently tricky to interpret - it would be easier if the authors were to use three different colours for the three different strains shown across these panels.

      We have broken up Figure 2 (which also had these types of plots) so that hopefully these labels are more clear. For the Figure in question (now Fig. 4), due to the many figures and different types of data and comparisons it was difficult to find a color scheme for these strains that would be consistent across the manuscript. These colors also reflect the fluorescence markers. We note that not only do we use color to indicate the strain but also text labels.

      (4) Panels 3B-F would be best moved to a supplementary figure as this figure is currently very busy. Similarly, I would potentially consider presenting only the bottom row of panels in Panels G-I in the main figure (which would then be consistent with analogous data presented elsewhere).

      We have opted to keep these panels in the main text (now Fig. 4) as they are relevant to understanding (1) our justification for why to pursue certain chemoeffector-chemoreceptor interactions and not others, and (2) how the chemoattraction response can be understood both in terms of bacterial population distribution and relevant cells over time.

      (5) Fig. 4 and possibly elsewhere - perhaps best not to use Ser as an abbreviation for Serine here because it could potentially be confused with an abbreviation for serum.

      It is unfortunate that these two words are so similar. However, Ser is the canonical abbreviation for the amino acid serine. Serum does not have a canonical abbreviation.

      (6) Fig. 4 - I would move panels H - K to a separate supplementary figure - currently, they are too squished together and it is hard to make out the x-axis labels. I would also consider moving panels E-G to supplementary as well so that the microscopy images presented elsewhere in the figure can be presented at an appropriate size.

      Since we are allowed more figures, we could also break some of these figures up into multiple ones.

      (7) Similarly, I would move some panels from Fig. 5 to supplementary as the figure is currently quite busy.

      We have rearranged the figure (now Fig. 7) to move the bioinformatics data to Fig. 8 to allow more space for the panels.

      Other suggestions

      (8) Line 179 - how do the concentrations quote for serine and glucose compare to aspartate? This would be helpful to justify the authors' decision not to investigate Tar as a potential chemoreceptor.

      This is addressed in our comments above and in Fig. 4A and Fig. 4B-F. L-Asp is present in human serum at a much lower concentration (about 20-fold).

      (9) Line 282 - Serine levels in serum are quantified at 241 uM, but this is only discussed in the context of serum growth effects. Could this information be better used to design/ inform the serine gradients that were tested in chemotaxis assays?

      We tested a wide range of serine concentrations and show that even much lower sources of serine than are present in serum are sufficient for chemoattraction. Also, the K1/2 for serine is 105 uM (Fig. S4), which is surpassed by the concentration in serum (Fig. S5).
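
      To make the comparison between the K1/2 and the serum concentration concrete, the sketch below assumes a simple hyperbolic (Hill coefficient of 1) dose-response. That functional form and the function name are assumptions made for illustration and are not taken from the manuscript. Only the two concentrations, 105 uM and 241 uM, come from the text above.

```python
def fractional_response(conc_um, k_half_um, hill=1.0):
    """Fraction of the maximal response at a given ligand concentration,
    assuming a Hill-type curve: c^n / (K^n + c^n).

    hill=1.0 gives a simple hyperbolic curve (an assumption, not a fitted value).
    """
    if conc_um < 0 or k_half_um <= 0:
        raise ValueError("conc_um must be >= 0 and k_half_um must be > 0")
    return conc_um ** hill / (k_half_um ** hill + conc_um ** hill)


if __name__ == "__main__":
    K_HALF = 105.0        # uM, K1/2 for serine quoted above
    SERUM_SERINE = 241.0  # uM, serum serine level quoted by the reviewer
    # By definition the response is half-maximal at the K1/2.
    print(fractional_response(K_HALF, K_HALF))
    # At the serum concentration the assumed curve is above half-maximal.
    print(fractional_response(SERUM_SERINE, K_HALF))
```

      Under this assumed curve, the serum serine concentration sits above the half-maximal point, at roughly 0.7 of the maximal response.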

      (10) The word 'potent' in the title might be too vague, especially as the strength of the response varies between strains/species. It may perhaps be more useful to focus on the rapidity/sensitivity of the response. However, presumably the sensitivity of the response will be driven by the sensitivity of the response to serine (which is already known for E. coli at least). Also, as noted in the public review, human serum itself is not a chemoattractant so I would consider re-phrasing this in the title and elsewhere.

      As suggested, and discussed above, we have implemented this change.

      (11) Typo line 59 'context of colonizing of a healthy gut'.

      Addressed.

      (12) Typo line 538 - there is an extra full stop here.

      Addressed.

      Reviewer #2:

      (1) This study is well executed and the experiments are clearly presented. These novel chemotaxis assays provide advantages in terms of temporal resolution and the ability to detect responses from small concentrations. That said, it is perhaps not surprising these bacteria respond to serum as it is known to contain high levels of known chemoattractants, serine certainly, but also aspartate. In fact, the bacteria are shown to respond to aspartate and the tsr mutant is still chemotactic. The authors do not adequately support their decision to focus exclusively on the Tsr receptor. Tsr is one of the chemoreceptors responsible for observed attraction to serum, but perhaps, not the receptor. Furthermore, the verification of chemotaxis to serum is a useful finding, but the work does not establish the physiological relevance of the behavior or associate it with any type of disease progression. I would expect that a majority of chemotactic bacteria would be attracted to it under some conditions. Hence the impact of this finding on the chemotaxis or medical fields is uncertain.

      We agree that the data we show are mostly mechanistic and further work is required to learn whether this bacterial behavior is relevant in vivo and during infections. We present new data using an ex vivo intestinal model which supports the feasibility of serum taxis mediating invasion of enterohemorrhagic lesions (Fig. 8).

      (2) The authors also state that "Our inability to substantiate a structure-function relationship for NE/DHMA signaling indicates these neurotransmitters are not ligands of Tsr." Both norepinephrine (NE) and DHMA have been shown previously by other groups to be strong chemoattractants for E. coli (Ec), and this behavior was mediated by Tsr (e.g. single residue changes in the Tsr binding pocket block the response). Given the 82% sequence identity between the Se and Ec Tsr, this finding is unexpected (and potentially quite interesting). To validate this contradictory result the authors should test E. coli chemotaxis to DHMA in their assay. It may be possible that Ec responds to NE and DHMA and Se doesn't. However, currently, the data is not strong enough to rule out Tsr as a receptor to these ligands in all cases. At the very least the supporting data for Tsr being a receptor for NE/DHMA needs to be discussed.

      Addressed above. The focus of this study is serum attraction and the mechanisms thereof. We never saw any evidence to support the idea that NE/DHMA drive attraction to serum, or that they are chemoeffectors for Salmonella, and we provide these null-results in Data S2.

      (3) The authors also determine a crystal structure of the Se Tsr periplasmic ligand binding domain bound to L-Ser and note that the orientation of the ligand is different than that modeled in a previously determined structure of lower resolution. I agree that the SeTsr ligand binding mode in the new structure is well-defined and unambiguous, but I think it is too strong to imply that the pose of the ligand in the previous structure is wrong. The two conformations are in fact quite similar to one another and the resolution of the older structure, is, in my view, insufficient to distinguish them. It is possible that there are real differences between the two structures. The domains do have different sequences and, moreover, the crystal forms and cryo-cooling conditions are different in each case. It's become increasingly apparent that temperature, as manifested in differential cooling conditions here, can affect ligand binding modes. It's also notable that full-length MCPs show negative cooperativity in binding ligands, which is typically lost in the isolated periplasmic domains. Hence ligand binding is sensitive to the environment of a given domain. In short, the current data is not convincing enough to say that a previous "misconception" is being corrected.

      Thank you for this comment, which spurred us to investigate this idea more rigorously. As described above we performed new refinements of the E. coli structure edited to have the positions of the ligand and ligand-binding site as modeled in our new Tsr structure from Salmonella (Fig. 7J). The best model is obtained with these poses. Along with the poor fit of the E. coli model to the density, the best interpretations for these positions, for both structures, are as we have modeled them in the Salmonella Tsr structures.

      Figure suggestions

      (1) Figure 2 looks busy and unorganized. Fig 2C could be condensed into one image where there are different colored rings coming from the source point that represent different time points.

      Addressed above. Fig. 2 has been broken apart to help improve clarity.

      (2) What is the second (bottom) graph of 2D? I think only the top graph is necessary.

      We have added an explanation to the figure legend that the top graph shows the means and the bottom shows SEM. The plots cannot easily be overlaid.

      (3) Similarly, Fig 2E doesn't need to have so many time points. Perhaps 4 at maximum.

      As the development of the response over time is a key take-home of the study, we do not wish to reduce the timepoints shown.

      (4) The legend for Figure 2F uses the unit 'µM' to mean micrometers but should use 'µm'.

      Corrected.

      (5) In Figures 2H-J, the lime green text is difficult to read. The word "serum" does not need to be at the top of each panel. I recommend shortening the y-axis titles on the graphs so you can make the graphs themselves larger.

      Addressed above.

      (6) In Figures 2H-J, I am confused about what is being shown in the inset graph. The legend says it's the AUC for the data shown. However, in the third panel (S. Typhimurium vs. S. Enteritidis) the data appears to be much more disparate than the inset indicates. I don't think that this inset is necessary either.

      The point of this inset graph is to quantify the response through integration of the curve, i.e., area under the curve, which is a common way to quantify complex curves and compare responses as single values. We are using this method to calculate the statistical significance of the response compared to a null response. We have added further clarification to the figure legend regarding these plots: Inset plots show fold-change AUC of strains in the same experiment relative to an expected baseline of 1 (no change). p-values shown are calculated with an unpaired two-sided t-test comparing the means of the two strains, or a one-sided t-test to assess statistical significance in terms of change from 1-fold (stars).
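      As an illustration of the fold-change AUC quantification described in this response, the following is a minimal sketch rather than the authors' analysis code. It assumes trapezoidal integration, a flat baseline curve, and a one-sample t statistic against an expected value of 1; the data values and all function names are hypothetical. Converting the t statistic to a p-value would additionally require the t distribution, which is omitted here.

```python
import math
import statistics


def trapezoid_auc(times, values):
    """Area under a sampled curve by the trapezoidal rule."""
    return sum(
        (t1 - t0) * (v0 + v1) / 2.0
        for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:])
    )


def fold_change_auc(times, response, baseline):
    """AUC of the response divided by AUC of a baseline curve (1.0 = no change)."""
    return trapezoid_auc(times, response) / trapezoid_auc(times, baseline)


def one_sample_t(samples, expected=1.0):
    """t statistic and degrees of freedom for H0: mean == expected."""
    n = len(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)
    return (statistics.mean(samples) - expected) / sem, n - 1


# Hypothetical data: three replicate time courses (arbitrary units, seconds).
times = [0, 60, 120, 180, 240, 300]
baseline = [1.0] * len(times)
replicates = [
    [1.0, 1.5, 2.0, 2.4, 2.6, 2.7],
    [1.0, 1.4, 1.9, 2.2, 2.5, 2.6],
    [1.0, 1.6, 2.1, 2.5, 2.8, 2.9],
]

fold_changes = [fold_change_auc(times, r, baseline) for r in replicates]
t_stat, dof = one_sample_t(fold_changes, expected=1.0)
print(fold_changes, t_stat, dof)
```

      Reducing each curve to a single fold-change value makes replicates directly comparable, at the cost of discarding the shape of the time course.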

      (7) Line 154, change "relevant for" to "observed in".

      Changed.

      (8) Line 171, according to the Mist4 database, Salmonella enterica has seven chemoreceptors. Why are only Tar, Tsr, and Trg mentioned? Why were only Tsr and Trg tested?

      Addressed above.

      (9) Line 192, be clear that you are referring to genes and not proteins, as italics are used.

      Revised to make this distinction clear.

      (10) Line 193, have other studies found a Trg deletion strain to be non-chemotactic? If so, cite this source here.

      We state that the Trg deletion strain had deficiencies in motility, and also have revised the text to include the clarification that this was not noted in earlier work with this strain: [line 173]: We were surprised to find that the trg strain had deficiencies in swimming motility (data not shown). This was not noted in earlier work but could explain the severe infection disadvantage of this mutant 34. Because motility is a prerequisite for chemotaxis, we chose not to study the trg mutant further, and instead focused our investigations on Tsr.

      (11) Why wasn't a Tar deletion mutant also analyzed? The authors say that based on the known composition of serum, serine and glucose are the most abundant. However, the serum does have aspartate at 10s of micromolar concentrations.

      Addressed above.

      (12) “The Tsr deletion strain still exhibits an obvious chemoattraction to serum. There are other protein(s) involved in chemoattraction to serum but the text does not discuss this.”

      Addressed above.

      (13) “In Figure 3B-F, the text is very difficult to read even when zoomed in on.”

      We have increased the font size of these panels.

      (14) “All of the text in Figure 5 is extremely small and difficult to read.”

      Addressed above. We split this figure in two to help improve clarity.

      (15) “I wonder about the accuracy of the concentration modeling. It seems like there are a lot of variables that could affect the diffusion rates, including the accuracy of the delivery system. Could the concentrations be verified by the dye experiments?”

      Addressed above. We provide a new analysis comparing experimental diffusion of A488 dye compared to calculations (Fig. S2).

    2. eLife assessment

      This work uses an interdisciplinary approach combining microfluidics, structural biology, and genetic analyses to provide important findings that show that pathogenic enteric bacteria exhibit taxis toward human serum. The data are compelling and show that the behavior utilizes the bacterial chemotaxis system and the chemoreceptor Tsr, which senses the amino acid L-serine. The work provides an ecological context for the role of serine as a bacterial chemoattractant and could have clinical implications for bacterial bloodstream invasion during episodes of gastrointestinal bleeding.

    3. Reviewer #1 (Public Review):

      Updated summary:

      Glenn et al. present solid evidence that both lab and clinical Salmonella enterica serovars rapidly migrate towards human serum using an exciting approach that combines microfluidics, structural biology and genotypic analysis. The authors succeed in bringing to light a novel context for the role of serine as a bacterial chemoattractant as well as documenting what is likely to be a key step in bloodstream entry for some of the main sepsis-associated pathogens during gastrointestinal bleeding. They illustrate the generality of their findings through phylogenetic analysis, testing additional species within the Enterobacteriaceae family and showing attraction towards swine and equine serum. Their interdisciplinary approach here greatly increases the scope of their findings.

      I would also like to note that, whilst I enjoyed the interdisciplinary scope of this study, I am personally not well placed to review the protein structural aspects of this work.

      Additional strengths of the revised manuscript:

      All weaknesses raised in my review of the original manuscript have been satisfactorily addressed in the revised manuscript. It is interesting to note that the accumulation pattern of the bacteria 50-75 µm from the source of serum could, as the authors now note, be due to the avoidance of bactericidal serum elements. Alternative explanations, however, could include chemoreceptor saturation (i.e. close to the serum source, high ligand concentrations could saturate chemoreceptors, preventing further chemotaxis) or Weber's Law considerations (a cell's ability to detect a given change in chemical concentration diminishes with increasing background concentration; thus, as cells get closer to the serum source, their ability to chemotax decreases).
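      The two alternative explanations above can be made concrete with a small numerical sketch. It assumes simple one-site binding for receptor occupancy and uses a hypothetical dissociation constant and concentration step; none of these values come from the manuscript, and real chemoreceptor arrays adapt and behave cooperatively, which this toy calculation ignores.

```python
def receptor_occupancy(c, k_d):
    """Fraction of receptors bound at ligand concentration c (one-site binding)."""
    return c / (c + k_d)


def occupancy_change(c, delta_c, k_d):
    """Change in occupancy when concentration rises from c to c + delta_c."""
    return receptor_occupancy(c + delta_c, k_d) - receptor_occupancy(c, k_d)


def weber_fraction(c, delta_c):
    """Relative concentration change delta_c / c, the quantity Weber's Law refers to."""
    return delta_c / c


K_D = 10.0      # hypothetical dissociation constant, in µM
DELTA_C = 1.0   # fixed absolute concentration step, in µM

# The same absolute step produces a smaller occupancy change and a smaller
# relative change as the background concentration rises.
for background in (1.0, 10.0, 100.0, 1000.0):
    print(
        background,
        occupancy_change(background, DELTA_C, K_D),
        weber_fraction(background, DELTA_C),
    )
```

      In both measures the signal available from a fixed concentration step shrinks towards the source, which is why either mechanism could produce accumulation at a distance from it.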

      The authors have also added new experimental data and analyses and these constitute major new strengths of the revised manuscript:

      - The authors show that the competitive advantage of WT cells relative to a tsr mutant is removed when serum is treated with serine-racemase and this provides strong evidence that chemotaxis towards serine is responsible for the reduced attraction of the tsr mutant towards serum (i.e. rather than any possible pleiotropic effects).

      - New experimental data showing Salmonella enterica is also attracted to swine and equine serum (including an ex vivo swine model) is a useful addition that hints at the potential generality of the response reported here.

      - The authors now include additional data to back up the intriguing lack of a movement response towards norepinephrine and DHMA reported here.

      Additional weaknesses of the revised manuscript:

      - The addition of an ex vivo swine model is an exciting new inclusion in the updated manuscript. However, information regarding biological and technical replication here is currently unclear or missing.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript characterizes a chemoattractant response to human serum by pathogenic bacteria, focusing on pathogenic strains of Salmonella enterica (Se). The researchers conduct the chemotaxis assays using a micropipette injection method that allows real-time tracking of bacterial population densities. They found that clinical isolates of several Se strains present a chemoattractant response to human serum. The specific chemoattractant within the serum is identified as L-serine, a highly characterized and ubiquitous chemoattractant that is sensed by the Tsr receptor. They further show that chemoattraction to serum is impaired in a mutant strain devoid of Tsr. X-ray crystallography is then used to determine the structure of L-serine in the Se Tsr ligand binding domain, which differs slightly from a previously determined structure of a homologous domain. They went on to identify other pathogens that have a Tsr domain through a bioinformatics approach and show that these identified species also present a chemoattractant response to serum.

      Strengths and Weaknesses:

      This study is well executed and the experiments are clearly presented. These novel chemotaxis assays provide advantages in terms of temporal resolution and ability to detect responses from small concentrations. That said, it is perhaps not surprising these bacteria respond to serum as it is known to contain high levels of known chemoattractants, serine certainly, but also aspartate. In fact, the bacteria are shown to respond to aspartate and the tsr mutant is still chemotactic. The authors do not adequately support their decision to focus exclusively on the Tsr receptor. Tsr is one of the chemoreceptors responsible for observed attraction to serum, but perhaps, not the receptor. Furthermore, the verification of chemotaxis to serum is a useful finding, but the work does not establish the physiological relevance of the behavior or associate it with any type of disease progression. I would expect that a majority of chemotactic bacteria would be attracted to it under some conditions. Hence the impact of this finding on the chemotaxis or medical fields is uncertain.

      The authors also state that "Our inability to substantiate a structure-function relationship for NE/DHMA signaling indicates these neurotransmitters are not ligands of Tsr." Both norepinephrine (NE) and DHMA have been shown previously by other groups to be strong chemoattractants for E. coli (Ec), and this behavior was mediated by Tsr (e.g. single residue changes in the Tsr binding pocket block the response). Given the 82% sequence identity between the Se and Ec Tsr, this finding is unexpected (and potentially quite interesting). To validate this contradictory result the authors should test E. coli chemotaxis to DHMA in their assay. It may be possible that Ec responds to NE and DHMA and Se doesn't. However, currently the data is not strong enough to rule out Tsr as a receptor to these ligands in all cases. At the very least the supporting data for Tsr being a receptor for NE/DHMA needs to be discussed.

      The authors also determine a crystal structure of the SeTsr periplasmic ligand binding domain bound to L-Ser and note that the orientation of the ligand is different than that modeled in a previously determined structure of lower resolution. I agree that the SeTsr ligand binding mode in the new structure is well-defined and unambiguous, but I think it is too strong to imply that the pose of the ligand in the previous structure is wrong. The two conformations are in fact quite similar to one another and the resolution of the older structure is, in my view, insufficient to distinguish them. It is possible that there are real differences between the two structures. The domains do have different sequences and, moreover, the crystal forms and cryo-cooling conditions are different in each case. It's become increasingly apparent that temperature, as manifested in differential cooling conditions here, can affect ligand binding modes. It's also notable that full-length MCPs show negative cooperativity in binding ligands, which is typically lost in the isolated periplasmic domains. Hence ligand binding is sensitive to the environment of a given domain. In short, the current data is not convincing enough to say that a previous "misconception" is being corrected.

    1. Reviewer #2 (Public Review):

      In this study, Wang et al. report the significance of XAP5L and XAP5 in spermatogenesis, where they are involved in transcriptional regulation of ciliary genes in the testes. In previous studies, the authors demonstrated that XAP5 is a transcription factor required for flagellar assembly in Chlamydomonas. Continuing from their previous study, the authors examine the conserved role of XAP5 and XAP5L, which are the orthologue pair in mammals.

      XAP5 and XAP5L are expressed ubiquitously and testis-specifically, respectively, and their absence in the testes causes male infertility with defective spermatogenesis. Interestingly, XAP5 deficiency arrests germ cell development at the pachytene stage, whereas XAP5L absence causes impaired flagellar formation. RNA-seq analyses demonstrated that XAP5 deficiency suppresses ciliary gene expression, including Foxj1 and Rfx family genes, in early testis. By contrast, under XAP5L deficiency, expression of Foxj1 and Rfx genes abnormally persists in mature sperm. From the results, the authors conclude that XAP5 and XAP5L are antagonistic transcription factors that function upstream of Foxj1 and Rfx family genes.

      This reviewer thinks the overall experiments are performed well and that the manuscript is clear. However, the current results do not directly support the authors' conclusion. For example, the transcriptional function of XAP5 and XAP5L requires more evidence. In addition, this reviewer wonders about the conserved XAP5 function of ciliary/flagellar gene transcription in mammals - the gene is ubiquitously expressed despite its functional importance in flagellar assembly in Chlamydomonas. Thus, this reviewer thinks authors are required to show more direct evidence to clearly support their conclusion with more descriptions of its role in ciliary/flagellar assembly.

    2. eLife assessment

      This study reports useful data suggesting the critical roles of two ancient proteins, XAP5 and XAP5L, in controlling the transcriptional program of ciliogenesis during mouse spermatogenesis. However, this study is considered incomplete because the data only partially support the conclusion. This work will be of interest to biomedical researchers who work on ciliogenesis and reproduction.

    3. Reviewer #1 (Public Review):

      Summary:

      Wang et al. generate XAP5 and XAP5L knockout mice and find that they are male infertile due to meiotic arrest and reduced sperm motility, respectively. RNA-Seq was subsequently performed and the authors concluded that XAP5 and XAP5L are antagonistic transcription factors of ciliogenesis (in XAP5-KO P16 testis: 554 genes were upregulated and 1587 genes were downregulated; in XAP5L-KO sperm: 2093 genes were upregulated and 267 genes were downregulated).

      Strengths:

      Knockout mouse models provided strong evidence to indicate that XAP5 and XAP5L are critical for spermatogenesis and male fertility.

      Weaknesses:

      The key conclusions are not supported by evidence. First, the authors claim that XAP5 and XAP5L transcriptionally regulate sperm flagella development; however, detailed molecular experiments related to transcription regulation are lacking. How do XAP5 and XAP5L regulate their targets? RNA-Seq alone is not enough. Second, the authors declare that XAP5 and XAP5L are antagonistic transcription factors; however, how do XAP5 and XAP5L regulate sperm flagella development antagonistically? Again, RNA-Seq alone is not enough. Third, I am concerned about whether XAP5 really regulates sperm flagella development. XAP5 is specifically expressed in spermatogonia and XAP5-cKO mice are in meiotic arrest, indicating that XAP5 regulates meiosis rather than sperm flagella development.

    1. eLife assessment

      This important study demonstrated that ablation of astrocytes in the lumbar spinal cord not only reduced neuropathic pain but also caused microglia activation. The findings presented add considerable value to the current understanding of the role of astrocyte elimination in neuropathic pain, offering convincing evidence that supports existing hypotheses and insights into the interactions between astrocytes and microglial cells, likely through IFN-mediated mechanisms. This study may also offer a new therapeutic strategy for the treatment of debilitating neuropathic pain in patients with SCI.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study the authors demonstrated that ablation of astrocytes in lumbar spinal cord not only reduced neuropathic pain but also caused microglia activation. Furthermore, RNA sequencing and bioinformatics revealed an activation of STING/type I IFNs signal pathway in spinal cord microglia after astrocyte ablation.

      Strengths:

      The findings are novel and interesting and provide new insights into astrocyte-microglia interaction in neuropathic pain. This study may also offer a new therapeutic strategy for the treatment of debilitating neuropathic pain in patients with SCI.

      Weaknesses:

      More details are needed to justify the sample size, statistics, and sex of animals.

    3. Reviewer #2 (Public Review):

      Summary:

      In the manuscript, Zhao et al. have carried out a thorough examination of the effects of targeted ablation of resident astrocytes on behavior, cellular responses, and gene expression after spinal cord injury. Employing transgenic mouse models alongside pharmacogenetic techniques, the authors have successfully achieved the selective removal of these resident astrocytes. This intervention led to a notable reduction in neuropathic pain and induced a shift in microglial cell reactivation states within the spinal cord, significantly altering transcriptome profiles predominantly associated with interferon (IFN) signaling pathways.

      Strengths:

      The findings presented add considerable value to the current understanding of the role of astrocyte elimination in neuropathic pain, offering convincing evidence that supports existing hypotheses and valuable insights into the interactions between astrocytes and microglial cells, likely through IFN-mediated mechanisms. This contribution is highly relevant and suggests that further exploration in this direction could yield meaningful results.

      Weaknesses:

      The methodology and evidence underpinning the study are solid, yet some areas would benefit from further clarification, particularly concerning methodological details and the choice of statistical analyses. Additionally, the manuscript's organization and clarity could be improved, as certain figures and schematics appear inconsistent or misleading.

    1. eLife assessment

      This study presents a valuable finding that the blood-brain barrier functionality changes with age and differs between males and females. The analysis is solid, comprising a large and racially diverse dataset, and utilizes a contrast-agent-free MRI method. Since limited work has been done in the MRI field on the blood-brain barrier using this method, this study is of great interest to neuroimaging researchers and clinicians.

    2. Reviewer #1 (Public Review):

      Summary:

      This work revealed an important finding that the blood-brain barrier (BBB) functionality changes with age and is more pronounced in males. The authors applied a non-invasive, contrast-agent-free approach of MRI called diffusion-prepared arterial spin labeling (DP-pCASL) to a large cohort of healthy human volunteers. DP-pCASL works by tracking the movement of magnetically labeled water (spins) in blood as it perfuses brain tissue. It probes the molecular diffusion of water, which is sensitive to microstructural barriers, and characterizes the signal coming from fast-moving spins as blood and slow-moving spins as tissue, using different diffusion gradients (b-values). This differentiation is then used to assess the water exchange rates (kw) across the BBB, which acts as a marker for BBB functionality. The main finding of the authors is that kw decreases with age, and in some brain regions, kw decreases faster in males. The neuroprotective role of the female sex hormone, estrogen, on BBB function is discussed as one of the explanations for this finding, supported by literature. The study also shows that BBB function remains stable until the early 60s and remarkably decreases thereafter.

      Strengths:

      The two main strengths of the study are the MRI method used and the amount of data. The authors employed a contrast-agent-free MRI method called ASL, which offers the opportunity to repeat such experiments multiple times without any health risk - a significant advantage of ASL. Since ASL is an emerging field that requires further exploration and testing, a study evaluating blood-brain barrier functionality is of great importance. The authors utilized a large dataset of healthy humans, where volunteer data from various studies were combined to create a substantial pool. This strategy is effective for statistically evaluating differences in age and gender.

      Weaknesses:

      Gender-related differences are only present in some brain regions, not in the whole brain or gray matter - which is usually the assumption unless stated otherwise. From the title, this was not clear. Including simulations could increase readers' understanding related to model fitting and the interdependence of parameters, if present. The discussion follows a clear line of argument supported by literature; however, focusing solely on AQP4 channels and missing a critical consideration of other known/proven changes in transport mechanisms through the BBB and their effects substantially weakens the discussion.

    3. Reviewer #2 (Public Review):

      Summary:

      This study used a novel diffusion-weighted pseudo-continuous arterial spin labelling (pCASL) technique to simultaneously explore age- and sex-related differences in brain tissue perfusion (i.e., cerebral blood flow (CBF) & arterial transit time (ATT) - a measure of CBF delivery to brain tissue) and blood-brain barrier (BBB) function, measured as the water exchange (kw) across the BBB. While age- and sex-related effects on CBF are well known, this study provides new insights to support the growing evidence of these important factors in cerebrovascular health, particularly in BBB function. Across the brain, the decline in CBF and BBB function (kw) and elevation in ATT were reported in older adults, after the age of 60, and more so in males compared to females. This was also evident in key cognitive regions including the insular, prefrontal, and medial temporal regions, stressing the consideration of age and sex in these brain physiological assessments.

      Strengths:

      Simultaneous assessment of CBF with BBB along with transit time and at the voxel-level helped elucidate the brain's vulnerability to age and sex-effects. It is apparent that the investigators carefully designed this study to assess regional associations of age and sex with attention to exploring potential non-linear effects.

      Weaknesses:

      It appears that no brain region showed concurrent CBF and BBB dysfunction (kw), based on the results reported in the main manuscript and supplemental information. Was an association analysis between CBF and kw performed? There is a potential effect of the level of formal education on CBF (PMID: 12633147; 15534055), which could have been considered and accounted for as well, especially for a cohort with stated diversity (age, race, sex).

    1. eLife assessment

      This work substantially advances our understanding of pharmacological inhibition of SWI/SNF as a therapeutic approach for cancer. The study is well-written and provides compelling evidence, including comprehensive datasets, compound screens, gene expression analysis, epigenetics, as well as animal studies. This study provides a fundamental advance for the uveal melanoma research field that might be exploited to target this deadly cancer and more generally for targeting transcriptional dependency in cancers.

    2. Reviewer #1 (Public Review):

      Summary:

      The presented study by Centore and colleagues investigates the inhibition of BAF chromatin remodeling complexes. The study is well-written, and includes comprehensive datasets, including compound screens, gene expression analysis, epigenetics, as well as animal studies. This is an important piece of work for the uveal melanoma research field, and sheds light on a new inhibitor class, as well as a mechanism that might be exploited to target this deadly cancer for which no good treatment options exist.

      Strengths:

      This is a comprehensive and well-written study.

      Weaknesses:

      There are minimal weaknesses.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors generate an optimized small molecule inhibitor of SMARCA2/4 and test it in a panel of cell lines. All uveal melanoma (UM) cell lines in the panel are growth-inhibited by the inhibitor, making them the focus of the paper. This inhibition is correlated with the loss of promoter occupancy of key melanocyte transcription factors e.g. SOX10. SOX10 overexpression and a point mutation in SMARCA4 can rescue growth inhibition exerted by the SMARCA2/4 inhibitor. Treatment of a UM xenograft model results in growth inhibition and regression which correlates with reduced expression of SOX10 but no discernible toxicity in the mice. Collectively the data suggest a novel treatment of uveal melanoma.

      Strengths:

      There are many strengths of the study including the strong challenge of the on-target effect, the assays used, and the mechanistic data. The results are compelling as are the effects of the inhibitor. The in vivo data is dose-dependent and doses are low enough to be meaningful and associated with evidence of target engagement.

      Weaknesses:

      The authors introduce the field stating that SMARCA4 inhibitors are more effective in SMARCA2 deficient cancers and the converse. Since the desirable outcome of cancer therapy would be synthetic lethality it is not clear why a dual inhibitor is desirable. Wouldn't this be associated with more side effects? It is not known how the inhibitor developed here impacts normal cells, in particular T cells which are essential for any durable response to cancer therapies in patients. Another weakness is that the UM cell lines used do not molecularly resemble metastatic UM. These UM most frequently have mutations in the BAP1 tumor suppressor gene. It is not clear if the described SMARCA2/4 inhibitor is efficacious in BAP1 mutant UM cell lines in vitro or BAP1 mutant patient-derived xenografts in vivo.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript reports the discovery of new compounds that selectively inhibit SMARCA4/SMARCA2 ATPase activity and work through a different mode than previously developed SMARCA4/SMARCA2 inhibitors. They also demonstrate the anti-tumor effects of the compounds on uveal melanoma cell proliferation and tumor growth. The findings indicate that the drugs exert their effects by altering chromatin accessibility at binding sites for lineage-specific transcription factors within gene enhancer regions. In uveal melanoma, altered expression of the transcription factor SOX10 and of SOX10 target genes underlies the anti-proliferative effects of the compounds. This study is significant because the discovery of new SMARCA4/SMARCA2 inhibitory compounds that can abrogate uveal melanoma tumorigenicity has therapeutic value. In addition, the findings provide evidence for the therapeutic use of these compounds in other transcription factor-dependent cancers.

      Strengths:

      The strengths of this manuscript include biochemical evidence that the new compounds are selective for SMARCA4/SMARCA2 over other ATPases and that the mode of action is distinct from a previously developed compound, BRM014, which binds the RecA lobe of SMARCA2. There is also strong evidence that FHT1015 suppresses uveal melanoma proliferation by inducing apoptosis. The in vivo suppression of tumor growth without toxicity validates the potential therapeutic utility of one of the new drugs. The conclusion that FHT1015 primarily inhibits SMARCA4 activity and thereby suppresses chromatin accessibility at lineage-specific enhancers is substantiated by ATAC-seq and ChIP-seq studies.

      Weaknesses:

      The weaknesses include a lack of more precise information on which SMARCA4/SMARCA2 residues the drugs bind. Although the I1173M/I1143M mutations are evidence that the critical residues for binding reside outside the RecA lobe, this site is conserved in CHD4, which is not affected by the compounds. Hence, this site may be necessary but not sufficient for drug binding or specifying selectivity. A more precise evaluation of the region specifying the effect of the new compounds would strengthen the evidence that they work through a novel mode and that they are selective. Another concern is that the mechanisms by which FHT1015 promotes apoptosis rather than simply cell cycle arrest are not clear. Does SOX10 or another lineage-specific transcription factor underlie the apoptotic effects of the compounds?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) It is nice that the authors compared their model to the one "without lookahead" in Figure 4, but this comparison requires more evidence in my opinion, as I explain in this comment. The model without lookahead is closely related or possibly equivalent to standard predictive coding. In predictive coding, one can make the network follow the stimulus rapidly by reducing the time constant tau. However, as the time constant decreases, the network would become unstable both in simulations (due to the limited integration time step) and in physical implementation (due to noise). Therefore I wonder if the proposed model has an advantage over standard predictive coding with an optimized time constant. Hence I suggest also adding a comparison between the proposed model and predictive coding, with parameters (such as tau) optimized independently for each model. Of course, we know that the time constant of biological neurons is fixed, but biological neurons might have had different time constants (by changing leak conductance) and such analysis could shed light on the question of why the neurons are organized the way they are.

      The comparison with a predictive network for which the neuronal time constants shrink towards 0 is in fact helpful. We added two new subsections in the SI that formally compare the NLA with other approaches, Equilibrium Propagation and the Latent Equilibrium, with a version of Equilibrium Propagation also covering the standard predictive coding you describe (SI, Sect. C and D). Subsection C concludes: “In the Equilibrium propagation we cannot simply take the limit τ → 0 since then the dynamics either disappears (when τ remains on the left, τ Δu → 0) or explodes (when τ is moved to the right, dt/τ → ∞), leading to either too small or too big jumps.”
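      The numerical side of this instability can be illustrated with a generic leaky integrator, tau du/dt = -u + target, discretized by forward Euler. This is only a sketch under that assumption; it is not the NLA, Equilibrium Propagation, or any model from the manuscript, and the function name and parameter values are hypothetical. Each Euler step multiplies the remaining error by (1 - dt/tau), so the update is stable only while dt/tau < 2.

```python
def simulate_leaky_integrator(tau, dt, steps, target=1.0, u0=0.0):
    """Forward-Euler integration of tau * du/dt = -u + target."""
    u = u0
    trace = [u]
    for _ in range(steps):
        # The error (target - u) is scaled by (1 - dt / tau) at every step.
        u = u + (dt / tau) * (target - u)
        trace.append(u)
    return trace


DT = 1.0  # fixed integration step (arbitrary time units)

# Shrinking tau at fixed dt first speeds up convergence, then destabilizes it.
for tau in (10.0, 1.0, 0.4):
    trace = simulate_leaky_integrator(tau, DT, steps=20)
    print(tau, abs(trace[-1] - 1.0))
```

      With tau = 10 the state approaches the target slowly, with tau = 1 it lands on the target in a single step, and with tau = 0.4 the error grows by a factor of 1.5 per step, the discrete counterpart of the dynamics "exploding" as dt/τ grows.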

      We have also expanded the passage on predictive coding in the main text, comparing our instantaneous network processing (up to a remaining time constant τ_in) with experimental data from humans (see page 10 of the revised ms). The new paragraph ends with:

      “Notice that, from a technical perspective, making the time constants of individual cortical neurons arbitrarily short leads to network instabilities and is unlikely the option chosen by the brain (see SI Sect. C, Comparison to the Equilibrium Propagation).”
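
      The reviewer's remark about instability at a limited integration time step can be illustrated with a minimal sketch. It is not taken from the manuscript: it assumes a single leaky unit obeying tau * du/dt = -u + drive, integrated with forward Euler, and the function name and parameter values are hypothetical. For this linear equation the forward-Euler update is stable only when dt/tau < 2.

```python
def simulate_leaky_unit(tau, dt=1.0, steps=200, drive=1.0):
    """Forward-Euler integration of tau * du/dt = -u + drive, starting at u = 0."""
    u = 0.0
    trace = []
    for _ in range(steps):
        # Each step multiplies the deviation (u - drive) by (1 - dt/tau).
        u += (dt / tau) * (-u + drive)
        trace.append(u)
    return trace


# dt/tau = 0.1: the deviation shrinks by a factor 0.9 per step and u settles at the drive.
slow = simulate_leaky_unit(tau=10.0)

# dt/tau = 2.5: the deviation is multiplied by -1.5 per step and the trace blows up.
fast = simulate_leaky_unit(tau=0.4)

print(slow[-1], fast[-1])
```

      Shrinking tau at a fixed step therefore makes the simulated unit follow its input faster only up to a point, after which the numerical solution oscillates with growing amplitude.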

      A new formal definition of the moving equilibrium in the Methods (Sect. F) helps to understand this notion of being in a balanced equilibrium state during the dynamics. This formal definition directly leads to the contraction analysis in the SI, Sect. D, showing why the Latent Equilibrium is always contractive, while the current form of the NLA may show jumps at the corner of a ReLU (since a second-order derivative of the transfer function enters in the error propagation).

      The reviewer perhaps has additional simulations in mind that compare the robustness of the different models. However, as this paper is more about presenting a novel concept with a comprehensive theory (summing up to 45 pages), we prefer to not add more than the simulations necessary to check the statements of the theorems.

      (2) I found this paper difficult to follow, because the Results sections went straight into details, and various elements of the model were introduced without explaining why they are necessary. Furthermore, the neural implementation was introduced after the model simulations. I suggest reorganizing the manuscript, to describe the model following Marr's levels of description and then presenting the results of simulations. In particular, I suggest starting the Results section by explaining what computation the network is trying to achieve (describe the setup, function L, define its integral over time, and explain that the goal is to find a model minimizing this integral). Then, I suggest presenting the algorithm the neurons need to employ to minimize this integral, i.e. their dynamics and plasticity (I wonder if r=rho(u) + tau rho(u)' is a consequence of action minimization or a necessary assumption - please clarify it). Next please explain how the algorithms could be implemented in biological neurons. Afterward please present the results of the simulation.

      We are sorry to realize that we could not convey the main message clearly enough. After rewriting the paper and straightening the narrative, we hope it is simpler to understand now.

      The paper does not suggest a new model to solve a task, and writing down the function to be minimized is not enough. The point of the NLA is that the time integral of our Lagrangian is minimized with respect to the prospective coordinates, i.e. the discounted future voltage. It is about the question of how dynamic equations in biology are derived. Of course, we also solve these equations, prove theorems and perform simulations. But the main point is that biology seems to deal with time differently than physics does. Biology “thinks” in terms of future quantities, physics “thinks” in terms of current quantities. We have tried to explain this better in the Introduction, the Results (e.g. after Eq. 5) and the Methods.

      (3) Understanding the paper requires background knowledge that most readers of eLife are unlikely to have, even if they are mathematically minded. For example, I am from the field of computational neuroscience, and I have never heard about Least Action principle from physics or the Euler-Lagrange equation. I felt lost after reading this paper, and to be able to write this review I needed to watch videos on the Euler-Lagrange equation. To help other readers, I have two suggestions: First, I feel that Eq 4-6 could be moved to the methods, because I found the concept of u~ difficult to understand, and it does not appear in the algorithm. Second, I advise to write in the Introduction, what knowledge is required to follow this paper, and point the readers to resources where they can find the required information. The authors may specify what background is required to follow the main text, and what is required to understand the methods.

      We hope that after explaining the rationale better, it becomes clear that we cannot skip the equations for the prospective coordinates. Likewise, the Euler-Lagrange equations need to be presented in the abstract form, since these are the equations that are eventually transformed into the “model”. We tried to give the basic intuition for this in the main text. As we explained above, the equations we were asked to skip represent the essence of the proposal, which is about how to derive the model equations.

      Moreover, we give more explanations in the Methods to understand the derivations, and we refer to the specific sections in the SI for further details. We are aware that a full understanding of the theory requires some basic knowledge of the calculus of variations.

      We hesitate to write in the Introduction what type of knowledge is required to understand the paper. An understanding can be on various levels. Moreover, the materials that are considered helpful depend on the background: for some it is YouTube, for some Wikipedia, and for others a textbook from which specific ingredients can be extracted. But we do cite two textbooks in the Results and more in the SI, Sect. F, when referring to the principle of least action in physics and the mathematics, including weblinks.

      Minor comments

      Eq.3: The Authors refer to this equation as a Lagrangian. Could you please clarify why? Is the logic to minimize the energy subject to a constraint that Cost = 0?

      Thanks for asking. The cost is not really a constraint; it is globally minimized, in parallel steps. We explain this right after Eq. 3: “We `prospectively' minimize L locally across a voltage trajectory, so that, as a consequence, the local synaptic plasticity for W will globally reduce the cost along the trajectory (Theorem 1 below).”

      We added an explanation of why the function in Eq. 3 is called a Lagrangian: “While in classical energy-based approaches L is called the total energy, we call it the `Lagrangian' because it will be integrated along real and virtual voltage trajectories as done in variational calculus (leading to the Euler-Lagrange equations, see below and SI, Sect. F).”
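
      For readers unfamiliar with variational calculus, the generic textbook statement behind this terminology can be sketched as follows. The coordinate q and the integration limits t_1, t_2 are generic placeholders rather than the manuscript's notation; in the manuscript the variation is taken with respect to the prospective voltage ũ instead of the instantaneous voltage u.

```latex
% Action as the time integral of a Lagrangian L(q, \dot q)
A[q] = \int_{t_1}^{t_2} L\bigl(q(t), \dot q(t)\bigr)\,\mathrm{d}t ,
\qquad
% Stationarity under small variations \delta q that vanish at t_1 and t_2
\delta A = 0
\;\Longrightarrow\;
\frac{\partial L}{\partial q} - \frac{\mathrm{d}}{\mathrm{d}t}\,\frac{\partial L}{\partial \dot q} = 0 .
```

      The right-hand relation is the Euler-Lagrange equation: the condition that a trajectory must satisfy for the action to be stationary.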

      p.4, below Eq. 5 - Please explain the rationale behind NLA, i.e. why is it beneficial that "the trajectory u˜(t) keeps the action A stationary with respect to small variations δu˜"? I guess you wish to minimize L integrated over time, but this is not evident from the text.

      Hmm, yes and no. We wish to minimize the cost, and on the way there minimize the action. Since the global minimization of C is technically difficult, one looks for a stationary trajectory as defined in the cited sentence, while minimizing L with respect to W, to eventually minimize the cost.

      In the text we now explain after Eq. 5:

      “The motivation to search for a trajectory that keeps the action stationary is borrowed from physics. The motivation to search for a stationary trajectory by varying the near-future voltages ũ instead of u is assigned to the evolutionary pressure in biology to 'think ahead of time'. To not react too late, internal delays involved in the integration of external feedback need to be considered and eventually need to be overcome. In fact, only for the 'prospective coordinates' defined by looking ahead into the future, even when only virtually, will a real-time learning from feedback errors become possible (as expressed by our Theorems below).”
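
      The idea of overcoming an internal delay by looking ahead can be illustrated with a minimal sketch. It is not the manuscript's model: it assumes a linear transfer function, a single low-pass filter tau * du/dt = -u + x as the only source of delay, and a forward-difference derivative; the function names and parameter values are hypothetical. Applying the look-ahead operator (1 + tau * d/dt) to the delayed signal u then recovers the undelayed input x.

```python
import math


def low_pass(x, tau, dt):
    """Forward-Euler low-pass filter: tau * du/dt = -u + x, with u(0) = 0."""
    u = [0.0]
    for x_n in x[:-1]:
        u.append(u[-1] + (dt / tau) * (x_n - u[-1]))
    return u


def look_ahead(u, tau, dt):
    """Apply (1 + tau * d/dt) to u, using a forward difference for the derivative."""
    return [u[n] + tau * (u[n + 1] - u[n]) / dt for n in range(len(u) - 1)]


dt, tau = 0.1, 10.0
# Sinusoidal input with period 50 (same time units as tau), sampled 2000 times.
x = [math.sin(2 * math.pi * n * dt / 50.0) for n in range(2000)]

u = low_pass(x, tau, dt)    # delayed and attenuated copy of x
r = look_ahead(u, tau, dt)  # prospective signal built from u and its rate of change

lag_error = max(abs(u[n] - x[n]) for n in range(len(r)))
prospective_error = max(abs(r[n] - x[n]) for n in range(len(r)))
print(lag_error, prospective_error)
```

      In this discretization the recovery is exact up to floating-point rounding, because the forward difference undoes the forward-Euler update step by step; with noise or a mismatched time constant the compensation would only be approximate.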

      Bottom of page 8. The authors say that in the case of single equilibrium and strong nudging the model reduced to the Least Control Principle. Does it also reduce to Predictive coding for supervised learning? If so, it would be helpful to state so.

      Yes, in this case the prediction error in the apical dendrite becomes the one of predictive coding. We now state this right at the end of the cited sentence:

      “In the case of strong nudging and a single steady-state equilibrium, the NLA principle reduces to the Least-Control Principle (Meulemans et al., 2022) that minimizes the mismatch energy E^M for a constant input and a constant target, with the apical prediction error becoming the prediction error from standard predictive coding (Rao & Ballard, 1999).”

      In the Discussion we also added a further point (iv) to compare the NLA principle with predictive coding. Both “improve” the sensory representation, but the NLA does so in favor of an output, and predictive coding in favor of the sensory prediction itself (see Discussion).

      Whenever you refer to supplementary materials, please specify the section, so it is easier for the reader to find it.

      Done. Sorry to not have done it earlier. We now also indicate specific sections when referring to the Methods.

      Reviewer #2 (Recommendations For The Authors):

      There are no major issues with this article, but I have several considerations that I think would greatly improve the impact, clarity, and validity of the claims.

      (1) Unifying the narrative. There are many many ideas put forward in what feels like a deluge. While I appreciate the enthusiasm, as a reader I found it hard to understand what it was that the authors thought was the main breakthrough. For instance, the abstract, results, introduction, and discussion all seem to provide different answers to that question. The abstract seems to focus on the motor error idea. The introduction seems to focus on the novel prospective+predictive setup of the energy function. The discussion lists the different perks of the theory (delay compensation, moving equilibrium, microcircuit) without referring to the prospective+predictive setup of the energy function.

      Thanks much for these helpful hints. Yes, the paper became an agglomerate of many ideas, also owing to the fact that we wish to show how the NLA principle can be applied to explain various phenomena in neuroscience. We have now simplified the narrative to this one point of providing a novel theoretical framework for neuroscience, and explaining why this is novel and why it “suddenly works” (the prospective minimization of the energy).

      As you can see from the dominating red in the revised pdf, we fully rewrote the Abstract, Introduction and Discussion under the narrative of the NLA and prospective coding.

      (2) Laying out the organization of the notation clearly. There are quite a few subtle distinctions of what is meant by the different weight matrices (omnibus matrix then input vs recurrent then layered architecture), different temporal horizon formalisms (bar, not bar, tilde), different operators (L, curly L, derivative version, integral version). These different levels are introduced on the fly, which makes it harder to grasp. The fact that there are many duplicate notations for the same quantities does not help the reader. For instance u_0 becomes equal to u_N at one point (above Eq 25). Another example is the constant flipping between integrated and 'current input' pictures. So laying out the multiple layers early, making a table or a figure for the notation, or sticking with one level would help convey the idea to a wide readership.

      Thanks for the hints. We included the table you suggested, but placed it in the SI as it became a full page itself. We dropped the curly L abbreviating the look-ahead operator.

      The “change of notation” you are alluding to is tricky, though. In a recurrent layer, the index of the output neuron is called o. In a forward network with N layers, the index of the output neurons becomes the last layer N. One has to introduce the layer index l anyway for the deeper layers l < N, and we found it more consistent to explain that, while switching from the recurrent to the forward network, the voltage of the output layer now becomes u_o = u_N. There are more of these examples, like the weight matrix W splitting into an intrinsic network part W_net across which errors backpropagate, and a part conveying the input, W_in, that has to be excluded when writing the backpropagation formula for general networks. Again, in the case of feedforward networks, the notation reduces to W_l, with index l coding for the layer. Presenting the general approach and a specific example may appear as if we were duplicating notations; we haven’t found a solution here.

      (3) Separate the algorithm from the implementation level. I particularly struggled with separating the ideas that belonged to the algorithm level (cost function, optimization objectives) and the biophysics. The two are interwoven in a way that does not have to be. Particularly, some of the normative elements may be implemented by other types of biophysics than the authors have in mind. It is for this reason that I think that separating more clearly what belongs to the implementation and algorithm levels would help make the ideas more widely understood. On this point, a trigger point for me was the definition of the 'prospective input rates' e_i, which comes in the second paragraph.

      We are very sorry to have made you think that the 'prospective input rates' would be e_i. The prospective input rates are r_i. The misunderstanding likely arose from an unclear formulation on our side that is now corrected (see first and second paragraph of the Results where we introduce r_i and e_i).

      From a biophysical perspective, it is quite arbitrary to define the input to be the difference between the basal input and the somatic (prospective) potential. It sounds like it comes from some unclear normative picture at this point. But the authors seem to have in mind to use the fact that the somatic potential is the sum of apical and basal input, that's the biophysical picture.

      We hope to have disentangled the normative and biophysical view in the 2nd and 3rd paragraph of the Results, respectively. We introduce the prospective error e_i as an abstract notion in the first paragraph, while explaining that it will be interpreted as somato-dendritic mismatch error in neuron i in the next paragraph. The second paragraph contains the biophysical details with the apical and basal morphology.

      (4) Experts and non-expert would appreciate an explanation of why/how the choice of state variables matters in the NLA. The prospective coding state variables cannot be said to be the naïve guess. Why does the simple u, dot{u} not work as state variables applied on the same energy function, as would be a naïve application of the Lagrangian ideas?

      We are very glad for this hint to present an intuition behind the variation of the action with respect to a prospective state, instead of the state itself. The simple L(u, dot{u}) does not work because one does not obtain the first-order voltage dynamics compatible with the biophysics. We made an effort to explain the intuition to non-experts and experts in an additional paragraph right after presenting the voltage and error dynamics (Eq. 7 on page 4).

      Here is how the paragraph starts (not displaying the formulas here):

      “From the point of view of theoretical physics, where the laws of motion derived from the least-action principle contain an acceleration term (as in Newton's law of motion, like … for a harmonic oscillator), one may wonder why no second-order time derivative appears in the NLA dynamics. As an intuitive example, consider driving into a bend. Looking ahead in time helps us to reduce the lateral acceleration by braking early enough, as opposed to braking only when the lateral acceleration is already present. This intuition is captured by minimizing the neuronal action A with respect to the discounted future voltages ũ_i instead of the instantaneous voltages u_i.

      Keeping up an internal equilibrium in the presence of a changing environment requires to look ahead and compensate early for the predicted perturbations.

      Technically, …”

      More details are given in the Methods after Eq. 20. Moreover, in the last part of the SI, Sect. F, we have made the link to the least-action principle in physics more explicit. There we show how the voltage dynamics can be derived from the physical least-action principle by including the Rayleigh dissipation (Eq. 92 and 95).

      (5) Specify that the learning rules have not been observed. Though the learning rules are Hebbian, the details of the rules have not to my knowledge been observed. Would be worth mentioning as this is a sticking point of most related theories.

      We agree, and we now explicitly write in the Discussion that the learning rule still awaits experimental testing.

      (6) Some relevant literature. Chalk et al. PNAS (2018) have explored the relationship between temporal predictive coding and Rao & Ballard predictive coding based on the parameters of the cost function. Harkin et al. eLife (2023) have shown that 'prospective coding' also takes place in the serotonergic system, while Kim ... Ma (2021) have put forward similar ideas for dopamine, both may participate in setting the cost function. Instantaneous voltage propagation is also a focus of Greedy et al. (2023). The authors cite Zenke et al. for spiking error propagation, but there are biological references to that end.

      Thanks much for these hints. We now cite the book of Gerstner & Kistler on spiking neurons and, more specifically, the spike-based approach for learning to represent signals (Brendel, .., Machens, Denève, PLoS CB, 2020). Otherwise, we had difficulty incorporating the other literature, which seems to us not directly related to our approach even when related notions come up (like predictive coding and temporal processing in Chalk et al. (2018), where the coding efficiency of various temporal coding schemes is studied as a function of the signal-to-noise ratio, or the apical activities in Greedy et al. (2022), where bursting, multiplexing and synaptic facilitation arise). We felt that citing these papers too would confuse more than help (we already cite 95 papers).

      (7) In the main text, theorem two is presented as proof without assumptions on the level of nudging, but the actual proof uses strong assumptions in that respect, relying on numerical ad hoc observations for the general case.

      Thanks for pointing this out. We agree it is better style to state all the critical assumptions in the theorem itself, rather than deferring them to the Methods. We now state: “Then, for suitable top-down nudging, learning rates, and initial conditions, the ….weights …evolve such that…”.

      (8) In the discussion regarding error-backpropagation, it seems to me that it could be clarified that the current algorithm asks for a weight alignment between FF and FB matrices as well as between FB and interneuron circuit matrices. Whether all of these matrices can be learned together remains to be shown; neither Akrout, Kunin nor Max et al. have shown this explicitly. Particularly when there are other inputs to the apical dendrites from other areas.

      Yes, it is difficult to learn to align all in parallel. Nevertheless, our simulations do in fact align the lateral and vertical circuits, as is also claimed in Theorem 2. Yet this holds, as specified in the theorem, “for suitable learning rates” (that were all the same, but were commonly reduced after some training time, as previously explained in the Methods, Details for Fig. 5).

      In the Discussion we now emphasize that, in general, simulating all the circuitries jointly from scratch in a single phase is tricky. We write:

      “A fundamental difficulty arises when the neuronal implementation of the Euler-Lagrange equations requires an additional microcircuit with its own dynamics. This is the case for the suggested microcircuit extracting the local errors. Formally, the representation of the apical feedback errors first needs to be learned before the errors can teach the feedforward synapses on the basal dendrites. We showed that this error learning can itself be formulated as minimizing an apical mismatch energy. What the lateral feedback through interneurons cannot explain away from the top-down feedback remains as apical prediction error.

      Ideally, while the network synapses targeting the basal tree are performing gradient descent on the global cost, the microcircuit synapses involved in the lateral feedback are performing gradient descent on local error functions, both at any moment in time.

      The simulations show that this intertwined system can in fact learn simultaneously with a common learning rate that is properly tuned. The cortical model network of inter- and pyramidal neurons learned to classify handwritten digits on the fly, with 10 digit samples presented per second. Yet, the overall learning is more robust if the error learning in the apical dendrites operates in phases without output teaching but with corresponding sensory activity, as may arise during sleep (see e.g. Deperrois et al., 2022 and 2023).”

      (9) The short-term depression model is assuming a slow type of short-term depression, not the fast types that are the focus of much recent experimental literature (like Campagnola et al. Science 2022).

      This assumption should be specified.

      Thanks for pointing us to this literature, which we were not aware of. We now cite the release-independent plasticity (Campagnola et al. 2022) in the context of our synaptic depression model.

      (10) There seems to be a small notation issue: Eq 21 combines vectors of the size of the full network (bar{e}) and the size of the readout network (bar{e}star).

      Well, for notational convenience we set the target error to e*=0 for non-output neurons. This way we can write the total error for an arbitrary network neuron as the sum of the backpropagated error plus the putative target error (if the neuron is an output neuron). Otherwise we would always have to distinguish between network neurons that may be output neurons, and those that are not. We did say this in the main text, but now repeat it right after Eq. 21. Notations are often the result of a trade-off.
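
      The zero-padding convention described here can be sketched in a few lines. This is an illustration only, not Eq. 21 itself: the function name, the argument names and the numerical values are hypothetical, and any weighting factors that the manuscript may apply to the two error terms are omitted.

```python
def total_error(backprop_error, target_error, output_indices):
    """Sum of back-propagated and target error, with e* = 0 for non-output neurons."""
    # Extend the target error to the full network by padding with zeros.
    padded_target = [0.0] * len(backprop_error)
    for idx, e_star in zip(output_indices, target_error):
        padded_target[idx] = e_star
    # Every neuron now uses the same expression, whether or not it is an output neuron.
    return [b + t for b, t in zip(backprop_error, padded_target)]


# Four network neurons; only the last one (index 3) is an output neuron.
e_bar = total_error(
    backprop_error=[0.1, -0.2, 0.0, 0.3],
    target_error=[0.5],
    output_indices=[3],
)
print(e_bar)
```

      With the padding in place, vectors of the full-network size and of the readout size can be combined in a single equation without a case distinction.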

    2. eLife assessment

      This manuscript describes a potentially important theoretical framework to link predictive coding, error-based learning, and neuronal dynamics. The provided evidence is solid, but some details would benefit from additional clarification. The exposition of the manuscript is targeted for a specialist audience.

    3. Reviewer #1 (Public Review):

      The manuscript considers a hierarchical network of neurons, of the type that can be found in sensory cortex, and assumes that they aim to constantly predict sensory inputs that may change in time. The paper describes the dynamics of neurons and rules of synaptic plasticity that minimize the integral of prediction errors over time.

      The manuscript describes and analyses the model in great detail, and presents multiple and diverse simulations illustrating the model's functioning. However, the manuscript could be made more accessible and easier to read. The paper may help to understand the organization of cortical neurons, their properties, as well as the function of its particular components (such as apical dendrites).

    4. Reviewer #2 (Public Review):

      Neuroscientists often state that we have no theory of the brain. The example of theoretical physics is often cited, where numerous and quite complex phenomena are explained by a compact mathematical description. Lagrangian and Hamiltonian pictures provide such a powerful 'single equation'. These frameworks are referred to as 'energy', an elegant way to turn numerous differential equations into a single compact relationship between observable quantities (state variables like position and speed) and scaling constants (like the gravity constant or the Planck constant). Such energy pictures have been used in theoretical neuroscience since the 1980s.

      The manuscript "neuronal least-action principle for real-time learning in cortical circuits" by Walter Senn and collaborators describes a theoretical framework to link predictive coding, error-based learning, and neuronal dynamics. The central concept is that an energy function combining self-supervised and supervised objectives is optimized by realistic neuronal dynamics and learning rules when considering the state of a neuron as a mixture of the current membrane potential and its rate of change. As compared with previous energy functions in theoretical neuroscience, this theory captures a more extensive range of observations while satisfying normative constraints. Particularly, no theory had to my knowledge related adaptive dynamics widely observed in the brain (referred to as prospective coding in the text, but is sometimes referred to as adaptive coding or redundancy reduction) with the dynamics of learning rules.

      The manuscript first exposes the theory of two previously published papers by the same group on somato-dendritic error with apical and basal dendrites. These dynamics are then related to an energy function, whose optimum recovers the dynamics. The rest of the manuscript illustrates how features of this model fit either normative or observational constraints. Learning follows a combination of self-supervised learning (learning to predict the next step) and supervised learning (learning to predict an external signal). The credit assignment problem is solved by an apical-compartment projecting set of interneurons with learning rules whose role is to align many weight matrices to avoid having to do multiplexing. An extensive method section and supplementary material expand on mathematical proofs and make more explicit the mathematical relationship between different frameworks.

      Experts would say that much of the article agglomerates previous theoretical papers by the same authors that have been published recently either in archival servers or in conference proceedings. A number of adaptations to previous theoretical results were necessary, so the present article is not easily reduced to a compendium of previous pre-prints. However, the manuscript is by no means easy to read. Also, there remain a few thorny assumptions (unobserved details of the learning rules or soma-dendrites interactions), but the theory is likely going to be regarded as an important step towards a comprehensive theory of the brain.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript presents a compelling model to explain the impact of mosaicism in preimplantation genetic testing for aneuploidies.

      Strengths:

      A new view of mosaicism is presented with a computational model, that brings new insights into an "old" debate in our field. It is a very well-written manuscript.

      Weaknesses:

      Although the manuscript is very well written, this is in a way that assumes that the reader has existing knowledge about specific terms and topics. This was apparent through a lack of definitions and minimal background/context to the aims and conclusions for some of the author's findings.

      There is a need for some examples to connect real evidence and scenarios from clinical reports with the model.

      We thank the reviewer for their assessment. Some background was condensed for space, and we wrote the manuscript to be understood by readers with existing reproductive genetics background. We will add more detail and explain terminology more clearly. There are a number of published case studies that can link real-life clinical data with the model’s findings. We will include a summary of them in the text.

      Reviewer #2 (Public Review):

      Summary:

      Although an oversimplification of the biological complexities, this modeling work does add, in a limited way, to the current knowledge on the theoretical difficulties of detecting mosaicism in human blastocysts from a single trophectoderm biopsy in PGT. However, many of the premises that the modeling was built on are theoretical and based on unproven biological and clinical assumptions that could yet prove to be untrue. Therefore, the work should be considered only as a simplified model that could assist in further understanding of the complexities of preimplantation embryo mosaicism, but assumptions of real-world application are, at this stage, premature and should not be considered as evidence in favour of any clinical strategies.

      Strengths:

      The work has presented an intriguing theoretical model for elaborating on the interpretation of complex and still unclear biological phenomena such as chromosomal mosaicism in preimplantation embryos.

      We thank the reviewer for this detailed review and are glad that they see the value of theoretical modelling. We agree that this model makes simplifications; we took this simplified approach to focus on the core contradiction between clinical experience and previous modelling. Expanding the model to consider additional aspects of balanced mitotic nondisjunctions and technical accuracy is something we want to address; we are discussing whether this can practically be added to this manuscript, or whether it will involve enough work that it should be developed as a further study.

      Weaknesses:

      Lines 134-138: The spatial modeling of mitotic errors in the embryo was oversimplified in this manuscript. There is only limited (and non-comprehensive) evidence that meiotic errors leading to chromosome mosaicism arise from chromosome loss or gain only (e.g. anaphase lag). This work did not take into account the (more recognised) possibility of mitotic nondisjunction where following the event there would be clones of cells with either one more or one less of the same chromosome. Although addressed in the discussion (lines 572-574), not including this in the most basic of modeling is a significant oversight that, based on the simple likelihood, could significantly affect results.

      As above, we certainly plan to address this in future modelling, developing the model to account for this while also incorporating the technical uncertainty, arising from sequencing, in the state of each cell in the biopsy.

      General comment: the premise of the manuscript is that an embryologist (embryology laboratory) is aware of and can accurately quantify the number of cells in a blastocyst or TE biopsy. The reality is that it is not possible to accurately do this without the destruction of the sample which is obviously not clinically applicable. Based on many assumptions the findings show that taking small biopsies poorly classifies mosaic embryos, which is not disputed. However, extrapolating this to the clinic and making suggestions to biopsy a certain amount of cells (lines 539-540) is careless and potentially harmful by suggesting the introduction of potential change in clinical practice without validation. Additionally, no embryologist in the field can tell how many cells are present in a clinical TE biopsy, making this suggestion even more impractical.

      We will revise this to make the technical limitations of clinical TE biopsies clearer.
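
      The undisputed point that small biopsies classify mosaic embryos poorly can be illustrated with a minimal, non-spatial sketch. It is not the manuscript's model: it assumes aneuploid cells are scattered at random (no clonal clustering), that every biopsied cell is read without technical error, and that cell numbers are known exactly, which the reviewer notes is not the case in the clinic. The function name and the cell counts are hypothetical. Under these assumptions the number of aneuploid cells in a biopsy follows a hypergeometric distribution.

```python
from math import comb


def biopsy_distribution(total_cells, aneuploid_cells, biopsy_size):
    """P(k aneuploid cells in the biopsy), k = 0..biopsy_size, sampling without replacement."""
    denom = comb(total_cells, biopsy_size)
    euploid_cells = total_cells - aneuploid_cells
    return [
        comb(aneuploid_cells, k) * comb(euploid_cells, biopsy_size - k) / denom
        for k in range(biopsy_size + 1)
    ]


# Hypothetical embryo: 200 cells, 40 of them (20%) aneuploid, 5-cell biopsy.
probs = biopsy_distribution(total_cells=200, aneuploid_cells=40, biopsy_size=5)

p_all_euploid = probs[0]     # biopsy looks fully euploid despite the mosaic embryo
p_all_aneuploid = probs[-1]  # biopsy looks fully aneuploid
p_mixed = 1.0 - p_all_euploid - p_all_aneuploid
print(p_all_euploid, p_mixed, p_all_aneuploid)
```

      In this hypothetical case roughly a third of biopsies contain no aneuploid cell at all, so the embryo would be reported as euploid. Spatial clustering of aneuploid clones, which the manuscript models, changes these numbers and is deliberately left out here.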

      On a more general clinical consideration, the authors should acknowledge that when reporting findings of unproven clinical utility and unknown predictive values this inevitably results in negative consequences for infertile couples undergoing IVF. It is proven and established that when couples face the decision on how to manage a putative mosaicism finding, the vast majority decide on embryo disposal. It was recently reported in an ESHRE survey that about 75% of practitioners in the field consider discarding or donating to research embryos with reported mosaicism. A prospective clinical trial showed that about 30% live birth rate reduction can be expected if mosaic embryos are not considered (Capalbo et al., AJHG 2021). The real-world experience is that when mosaicism is reported, embryos with almost normal reproductive potential are discarded. The authors should be more careful with the clinical interpretation and translation of these theoretical findings.

      The clinical potential of mosaic embryos is much more nuanced than a simple ‘they should be discarded’ or ‘they should be treated like euploid embryos’. While the study mentioned by the reviewer (Capalbo et al., AJHG 2021) does indeed suggest that embryos with putative low level mosaicism have good potential, it also suggests that embryos with putative high level mosaicism are largely to be considered aneuploid and should therefore be discarded. Therefore, even the mentioned study supports a ‘ranking’ of embryos by their mosaic result. Furthermore, large controlled retrospective studies have indicated that even high level mosaic embryos have reproductive potential (Viotti Fertility & Sterility 2021 and Viotti F&S 2023). Recent case reports have shown that mosaicism can occasionally persist from embryo to late gestation and even birth, at times associating with negative medical findings. Therefore, while the true clinical potential of embryos classified as mosaic is still being defined, here we are merely suggesting that from a modelling standpoint, the features of mosaicism detected with PGT-A can help guide clinical decisions (complementing the observations reported in the clinical studies).

      There is a robust consensus within the field of clinical genetics and genomics regarding the necessity to exclusively report findings that possess well-established clinical validity and utility. This consensus is grounded in the imperative to mitigate misinterpretation and ineffective actions in patient care. However, the clinical framework delineated in this manuscript diverges from the prevailing consensus in clinical genetics. Clinical genetics and genomics prioritize the dissemination of findings that have undergone rigorous validation processes and have demonstrated clear clinical relevance and utility. This emphasis is crucial for ensuring accurate diagnosis, prognosis, and therapeutic decision-making in patient care. By adhering to established standards of evidence and clinical utility, healthcare providers can minimize the potential for misinterpretation and inappropriate interventions. The framework proposed in this manuscript appears to deviate from the established principles guiding clinical genetics practice. It is imperative for clinical frameworks to align closely with the consensus guidelines and recommendations set forth by professional organizations and regulatory bodies in the field. This alignment not only upholds the integrity and reliability of genetic testing and interpretation but also safeguards patient well-being and clinical outcomes.

      References:

      ACMG Board of Directors. (2015). Clinical utility of genetic and genomic services: a position statement of the American College of Medical Genetics and Genomics. Genetics in Medicine, 17(6), 505-507. https://doi.org/10.1038/gim.2014.194.

      Richards, S., Aziz, N., Bale, S., Bick, D., Das, S., Gastier-Foster, J., ... ACMG Laboratory Quality Assurance Committee. (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine, 17(5), 405-424. https://doi.org/10.1038/gim.2015.30

      We will update where necessary to match these references.

      Line 61: "Self correction" - This terminology is unfortunately indiscriminately used in the field for PGT when referring to mosaicism and implies that the embryo can actively correct itself from a state of inherent abnormality. Apart from there being no evidence to suggest that there is an active process by which the embryo itself can correct chromosomal errors, most presumed euploid/aneuploid mosaic embryos will have been euploid zygotes and therefore "self-harm" may be a better explanation. True self-correction in the form of meiotic trisomy/monosomy rescue is of course theoretically possible but not at all clinically significant. The concept being conveyed in this part of the manuscript is not disputed but it is strongly suggested that the term "self correction" is not used in this context, nor in the rest of the manuscript, to prevent the perpetuation of misinformation in the field and instead use a better description.

      This is a good point. We have used ‘self correction’ as a shorthand, but the reality is more nuanced. It will often be a passive process in which aneuploid cell lineages fail to proliferate over time (‘aneuploidy depletion’). The idea of ‘self harm’ is interesting; aneuploidy arising from a healthy euploid embryo. We can also see a further situation where the gametes suffered damage (e.g. DNA fragmentation, unresolved crossovers, persistence of meiotic breaks) leading to mitotic errors. In that case, the embryo would suffer the consequences of harm in the gametes, and ‘aneuploidy rescue’ may be a useful term also. We will discuss this further and reword the terminology along these lines.

      Lines 69-73: The ability to quantify aneuploidy in known admixtures of aneuploid cells is indeed well established. However, the authors claim that the translation of this to embryo biopsy samples is inferred with some confidence and that if a biopsy shows an intermediate chromosome copy number (ICN), the biopsy and the embryo are mosaic. There are no references provided here and indeed the only evidence in the literature relating to this is to the contrary. Multifocal biopsy studies have shown that an ICN result in a single biopsy is often not seen in other biopsies from the same embryo (Capalbo et al., 2021; Kim et al., 2022; Girardi et al., 2023; Marin, Xu, and Treff 2021). Multifocal biopsies showing reciprocal gain and loss, which would provide stronger validation for the presence of true mosaicism, are also rare. In this work, the entire manuscript is based on the accuracy of ICN in a biopsy being reflective of mosaicism in the embryo. The evidence, however, points to a large proportion of ICN detected in embryo biopsy potentially being technical artifacts (misdiagnosing both constitutionally normal and abnormal (meiotic aneuploid) embryos as mosaic). Therefore, although results from the modelling provide insight into theoretical results, these cannot be used to inform clinical decision-making at all.

      We thank the reviewer for raising this important conceptual point, which needs to be addressed. The fact that mosaicism is often not observed in serial biopsies of the same embryo is precisely an inherent feature of mosaicism and is an invalid argument to discount the original diagnosis as false. The detection of ICN is not trivial and certain PGT-A platforms might not have the capability to discern noise from true ICN, hence the need for proper validation of the technology. The most stringent validation method for mosaicism detection remains the admixture experiment, such that when ICN patterns are detected the most obvious conclusion is that the biopsy contained a mosaic mix of cells. We aim to add wording regarding these points in the manuscript.

      Lines 87-89: The authors make the claim that emerging evidence is suggestive that the majority of embryos are mosaic to some degree. If in fact, mosaicism is the norm, the clinical importance may be limited.

      If the majority of embryos are mosaic to some degree, it is important to understand the impacts that this may have on PGT-A biopsies and how informative such biopsies may be. Returning to the point the reviewer made above about mitotic aneuploidies as an important consideration: a mitotic nondisjunction at the first cleavage would result in an embryo that was entirely aneuploid. A mitotic nondisjunction occurring at the second cleavage would result in an embryo with 50% aneuploid cells, and at the third cleavage, 25% aneuploid cells. If these aneuploid cells fail to proliferate, or are removed (either actively or passively), the level of aneuploidy will fall over time. While mosaicism is binary (an embryo is or is not a mosaic of karyotypes), even if most embryos are mosaic, the clinical importance will depend on the level of aneuploidy.
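
      The arithmetic in this reply can be sketched in a few lines. This is only an illustration under simplifying assumptions that are ours and not taken from the manuscript: a single nondisjunction occurs in one cell, both of its daughters are aneuploid, all cells divide synchronously, and no aneuploid cells have yet been lost. The function name aneuploid_fraction is hypothetical.

```python
from fractions import Fraction


def aneuploid_fraction(cleavage: int) -> Fraction:
    """Fraction of aneuploid cells immediately after a single mitotic
    nondisjunction at the given cleavage division (1 = first cleavage).

    Assumes (for illustration only) synchronous divisions and that both
    daughters of the missegregating cell are aneuploid.
    """
    if cleavage < 1:
        raise ValueError("cleavage must be 1 or greater")
    total_cells = 2 ** cleavage  # cells present once this division completes
    aneuploid_cells = 2          # the two daughters of the one affected cell
    return Fraction(aneuploid_cells, total_cells)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"cleavage {n}: {float(aneuploid_fraction(n)):.0%} aneuploid")
```

      Under these assumptions the starting fraction halves with each later cleavage; any subsequent depletion of aneuploid lineages would lower it further, which the sketch does not model.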

      Lines 102-103: The statement that data show that the live birth rate per ET is generally lower in mosaic embryos than euploid embryos is from retrospective cohort studies that suffer from significant selection bias. The authors have ignored non-selection study results (Capalbo et al., AJHG 2021) that suggest that putative mosaicism has limited predictive value when assessed prospectively and blinded.

      We will add the referenced multifocal biopsy study, but in contrast to the reviewer we see the data it contains as supporting our position in this paper. Capalbo et al. performed rebiopsies of trophectoderm and a biopsy of inner cell mass and found that high level mosaic or aneuploid trophectoderm tended to correlate with abnormal karyotypes in the inner cell mass while low level mosaics correlated with a normal inner cell mass. This supports our point that measuring levels of aneuploidy in the trophectoderm is relevant, and that this gives useful information for ranking embryos.

      Lines 94-98: The authors have misrepresented the works they have presented as evidence for biopsy result accuracy (Kim et al., 2023; Victor et al 2019; Capalbo et al., 2021; Girardi et al., 2023, and any others). These studies show that a mosaic biopsy is not representative of the whole embryo and can actually be from embryos where the remainder of the embryo shows no evidence of mosaicism. There is also a missing key reference of Capalbo et al, AJHG 2021, and Girardi et al., HR 2023 where multifocal biopsies were taken.

      As above, we will add more information on these multifocal biopsy studies; we believe these studies also support our position: that individual biopsies are not predictive of aneuploidy level in an embryo. If mosaicism is detected in the biopsy, then the embryo is mosaic, but if the remainder of the embryo is euploid then that single biopsy was not an accurate representation of the embryo. This could also apply in reverse - if mosaicism is not detected in the biopsy, it does not mean there is no mosaicism in the embryo, only that mosaicism could not be identified.

      Lines 371-372: "Selecting the embryo with the lowest number of aneuploid cells in the biopsy for transfer is still the most sensible decision". Where is the evidence for this other than the modeling which is affected by oversimplification and unproven assumptions? Although the statement seems logical at face value, there is no concrete evidence that the proportion of aneuploid cells within a biopsy is valuable for clinical outcomes, especially when co-evaluated with other more relevant clinical information.

      We made this statement as part of a thought experiment to explain the difference between the concepts of absolute measurements versus embryo ranking. This section is not a result of the model, or clinical advice; it is a statement that in the specific example embryos given, the embryo with the fewest aneuploid cells in the biopsy would still be the embryo with the fewest aneuploid cells overall, and thus transferring this embryo (in the absence of any other differences of embryo quality) would remain sensible.

      Lines 431-463: In this section, the authors discuss clinical outcome data from the transfer of putative mosaic embryos and make conclusions about the relationship between ICN level in biopsy and successful pregnancy outcomes. The retrospective and selective nature of the data used in forming the results has the potential to lead to incorrect conclusions when applied to prospective unselected data.

      We believe the clinical data is a useful biological reality check, and we are discussing how to integrate it better with the modelling.

      Reviewer #3 (Public Review):

      Unfortunately, this study fails to incorporate the most important variable impacting the ability to predict mosaicism, the accuracy of the test. The fact is that most embryos diagnosed as mosaic are not mosaic. There may be 4 cases out of thousands and thousands of transfers where a confirmation was made. Mosaicism has become a category of diagnosis in which embryos with noisy NGS profiles are placed. With VeriSeq NGS it is not possible to routinely distinguish true mosaicism from noise. An analysis of NGS noise levels (MAPD) versus the rate of mosaics by clinic using the registry will likely demonstrate this is the case. Without accounting for the considerable inaccuracy of the method of testing the proposed modeling is meaningless.

      We disagree with the reviewer that the modelling is meaningless; we disagree that mosaicism is rare (see our other points). However, if we grant that mosaicism is rare, that almost all embryos are euploid or aneuploid, and that technical noise is the primary factor generating intermediate copy number values, then it is still important to understand how to interpret such intermediate values. Low-level mosaics would more likely represent miscalled euploid embryos, and high-level mosaics would more likely represent miscalled aneuploid embryos. We demonstrate that ranking on these intermediate values correlates with implantation rates and live birth rates, supporting their use. We do agree that technical accuracy of the NGS is an important consideration, and we will be incorporating this into our modelling in the future.

      Recent data using more accurate methods of identifying mosaicism indicate that the prevalence of true preimplantation embryonic mosaicism is only 2%, which is also consistent with findings made post-implantation. This model fails to account for the possibility that, because so few embryos are actually mosaic, there is actually no relevance to clinical care whatsoever. In fact, differences in clinical outcomes of embryos designated as mosaic could be entirely attributed to poor embryo quality resulting in noise levels that make NGS results fall into the "mosaic" category.

      As we also wrote in the point above, we disagree; it is possible that a euploid embryo may be misinterpreted as a mosaic. It is also possible that an aneuploid embryo is misinterpreted as a mosaic. Whether the intermediate copy number values arise through biological or technical reasons, they contain information that is useful to decisions on whether to transfer. We also note a recent paper that performed single-cell dissociation of trophectoderm versus inner cell mass which found that mosaicism in human embryos is very common (Chavli et al, 2024, DOI:10.1172/JCI174483).

      Additional comments:

      “Indeed, as more data emerges, it appears that the majority of embryos from both healthy and infertile couples are mosaic to some degree (Coticchio et al., 2021; Griffin et al., 2022).”

      This statement should be softened as all embryos will be considered mosaic when a method with a 10% false positive rate is applied to 10 more parts of the same embryo. The distinction between artifact and true mosaicism cannot be made with nearly all current methods of testing. When virtually no embryos display uniform aneuploidy in a rebiopsy study, there should be great concern over the accuracy of the testing used. The vast majority of aneuploidy is meiotic in origin.
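
      The compounding effect described in this comment can be illustrated with a short calculation. This is a rough sketch that assumes each additional biopsy is an independent test with the same false positive rate; that independence assumption is ours, and the function name prob_any_false_positive is hypothetical.

```python
def prob_any_false_positive(false_positive_rate: float, n_biopsies: int) -> float:
    """Probability that at least one of n independent biopsies of a
    non-mosaic embryo is falsely called mosaic."""
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must lie in [0, 1]")
    if n_biopsies < 0:
        raise ValueError("n_biopsies must be non-negative")
    # Complement of "every biopsy is called correctly".
    return 1.0 - (1.0 - false_positive_rate) ** n_biopsies


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(f"{n:2d} biopsies: {prob_any_false_positive(0.10, n):.1%}")
```

      With a 10% per-biopsy rate and 10 biopsies, this gives roughly a 65% chance of at least one spurious mosaic call, and the chance keeps rising as more sites are tested.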

      We note that reviewer 2 wrote that mitotic aneuploidy was the key concern, whereas reviewer 3 states meiotic aneuploidy is more common; we argue that both are relevant; a recent study by McCoy et al, 2023 (DOI:10.1186/s13073-023-01231-1) found that both drive arrest of human IVF embryos.

      “Experimental data provides strong evidence that, for the most part, the biopsy result obtained accurately represents the chromosome constitution of the rest of the embryo (Kim et al., 2022; Navratil et al., 2020; Victor et al., 2019).”

      This statement is incorrect, given that a published systematic review of the literature indicates a 10% false positive rate based on rebiopsy results.

      This shows that accurately classifying a mosaic embryo based on a single biopsy is not robust.

      This is exactly why the practice of designating embryo mosaics with intermediate copy numbers should not exist.

      We agree that accurately classifying a mosaic embryo based on a single biopsy is not robust. That is one of the main messages of this paper. What we show here is that biopsies from a mosaic embryo are indeed likely to disagree with each other - but we find that there is still enough information at a population level for this to be an indicator of embryo outcomes. We have not yet performed modelling to explore the effect of technical error, so we will not speculate on the impact, but we reiterate a point made earlier: the most stringent validation method for mosaicism detection remains the admixture experiment, such that when intermediate copy number patterns are detected the most obvious conclusion is that the biopsy contained a mosaic mix of cells.

    1. eLife assessment

      In this useful study, the authors report the efficacy, hematological effects, and inflammatory response of the BPaL regimen (containing bedaquiline, pretomanid, and linezolid) compared to a variation in which Linezolid is replaced with the preclinical development candidate spectinamide 1599, administered by inhalation in tuberculosis-infected mice. The authors provide convincing evidence that supports the replacement of Linezolid in the current standard of care for drug-resistant tuberculosis. However, a limitation of the work is the lack of control experiments with bedaquiline and pretomanid only, to further dissect the relevant contributions of linezolid and spectinamide in efficacy and adverse effects. Although the manuscript is well written overall, a re-formulation of some of the stated hypotheses and conclusions, as well as the addition of text to contextualize translatability, would improve its value.

    2. Reviewer #2 (Public Review):

      Summary:

      Replacing linezolid (L) with the preclinical development candidate spectinamide 1599, administered by inhalation, in the BPaL standard of care regimen achieves similar efficacy, and reduces hematological changes and pro-inflammatory responses.

      Strengths:

      The authors not only measure efficacy but also quantify histological changes, hematological responses, and immune responses, to provide a comprehensive picture of treatment response and the benefits of the L to S substitution.

      The authors generate all data in two mouse models of TB infection, each reproducing different aspects of human histopathology.

      Extensive supplementary figures ensure transparency.

      Weaknesses:

      The articulation of objectives and hypotheses could be improved.

    3. Reviewer #3 (Public Review):

      Summary:

      In this paper, the authors sought to evaluate whether the novel TB drug candidate, spectinamide 1599 (S), given via inhalation to mouse TB models, and combined with the drugs B (bedaquiline) and Pa (pretomanid), would demonstrate similar efficacy to that of the BPaL regimen (where L is linezolid). Because L is associated with adverse events when given to patients long-term, one of which is myelosuppression (bone marrow toxicity), the authors also sought to assess blood parameters, effects on bone marrow, and immune parameters/cell effects following treatment of mice with BPaS and BPaL. They conclude that BPaL and BPaS have equivalent efficacy in both TB models used and that BPaL resulted in weight loss and anemia (whereas BPaS did not) under the conditions tested, as well as effects on bone marrow.

      Strengths:

      The authors used two mouse models of TB that are representative of different aspects of TB in patients (which they describe well), intending to present a fuller picture of the activity of the tested drug combinations. They conducted a large body of work in these infected mice to evaluate efficacy and also to survey a wide range of parameters that could inform the effect of the treatments on bone marrow and on the immune system. The inclusion of BPa controls (in most studies) and also untreated groups led to a large amount of useful data that has been collected for the mouse models per se (untreated) as well as for BPa - in addition to the BPaS and BPaL combinations which are of particular interest to the authors. Many of these findings related to BPa, BPaL, untreated groups, etc corroborate earlier findings and the authors point this out effectively and clearly in their manuscript. To go further, in general, it is a well-written and cited article with an informative introduction.

      Weaknesses:

      The authors performed a large amount of work with the drugs given at the doses and dosing intervals stated, but at present, there is no exposure data available in the paper. It would be of great value to understand the exposures achieved in plasma at least (and in the lung if more relevant for S) in order to better understand how these relate to clinical exposures that are observed at marketed doses for B, Pa, and L as well as to understand the exposure achieved at the doses being evaluated for S. If available as historical data this could be included/cited. Considering the great attempts made to evaluate parameters that are relevant to clinical adverse events, it would add value to understand at what drug exposures effects such as anemia, weight loss, and bone marrow effects are being observed.

      It would also be of value to add an assessment of whether the weight loss, anemia, or bone marrow effects observed for BPaL are considered adverse, and the extent to which we can translate these effects from mouse to patient (i.e. what are the limitations of these assessments made in a mouse study?). For example, is the small weight loss seen as significant, or is it reversible? Is the magnitude of the changes in blood parameters similar to the parameters seen in patients given L?

      In addition, it is always challenging to interpret findings for combinations of drugs, so the addition of language to explain this would add value: for example, how confident can we be that the weight loss seen for only the BPaL group is due to L as opposed to a PK interaction leading to an elevated exposure and weight loss due to B or Pa?

      Turning to the evaluations of activity in mouse TB models, unfortunately, the evaluations of activity in the BALB/c mouse model as well as the spleens of the Kramnik model resulted in CFU below/at the limit of detection and so, to this reviewer's understanding of the data, comparisons between BPaL and BPaS cannot be made and so the conclusion of equivalent efficacy in BALB/c is not supported with the data shown. There is no BPa control in the BALB/c study, therefore it is not possible to discern whether L or S contributed to the activity of BPaL or BPaS; it is possible that BPa would have shown the same efficacy as the 3 drug combinations. It would be valuable to conduct a study including a BPa control and with a shorter treatment time to allow comparison of BPa, BPaS, and BPaL. In the Kramnik lungs, as the authors rightly note, the studies do not support any contribution of S or L to BPa - i.e. the activity observed for BPa, BPaL, and BPaS did not significantly differ. Although the conclusions note equivalency of BPaL and BPaS, which is correct, it would be helpful to also include BPa in this statement; it would be useful to conduct a study dosing for a longer period of time or assessing a relapse endpoint, where it is possible that a contribution of L and/or S may be seen - thus making a stronger argument for S contributing an equivalent efficacy to L. The same is true for the assessment of lesions - unfortunately, there was no BPa control meaning that even where equivalency is seen for BPaL and BPaS, the reader is unable to deduce whether L or S made a contribution to this activity.

    4. Reviewer #1 (Public Review):

      Summary:

      This manuscript is an extension of previous studies by this group looking at the new drug spectinamide 1599. The authors directly compare therapy with BPaL (bedaquiline, pretomanid, linezolid) to a therapy that substitutes spectinamide for linezolid (BPaS). Spectinamide is given by aerosol exposure and the BPaS therapy is shown to be as effective as BPaL without adverse effects. The work is rigorously performed and analyses of the immune responses are consistent with curative therapy.

      Strengths:

      (1) This group uses 2 different mouse models to show the effectiveness of the BPaS treatment.

      (2) Impressively the group demonstrates immunological correlates associated with Mtb cure with the BPaS therapy.

      (3) Linezolid is known to inhibit ribosomes and mitochondria whereas spectinamide does not. The authors clearly demonstrate the lack of adverse effects of BPaS compared to BPaL.

      Weaknesses:

      (1) Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The manuscript by Jingsong Zhou and colleagues tries to uncover the reasons for the resistance of extraocular muscles (EOMs) to degenerative changes induced by amyotrophic lateral sclerosis (ALS). The findings of the study offer valuable information that EOMs are spared in ALS because they produce protective factors for the NMJ and, more specifically, factors secreted by EOM-derived satellite cells. While most of the experimental approaches are convincing, the use of sodium butyrate (NaBu) in this study needs further investigation, as NaBu might have a variety of biological effects. Overall, this work may help develop future therapeutic interventions for patients with ALS.

      We agree with the editor that NaBu has a variety of biological effects that require further investigation. Our team has previously explored the effect of NaBu treatment on intestinal microbiota and intestinal epithelial permeability (DOI: 10.1016/j.clinthera.2016.12.014), on the mitochondrial respiratory function of the NSC-34 motor neuron cell line overexpressing hSOD1G93A (DOI: 10.3390/biom12020333) and on the mitochondrial function of skeletal muscle myofibers of G93A mice (DOI: 10.3390/ijms22147412). Other research teams have also explored the role of NaBu (or HDAC inhibition) in neuronal survival and axonal transport (DOIs: 10.1073/pnas.0907935106; 10.1038/s41467-017-00911-y; 10.15252/embj.2020106177; 10.1093/hmg/ddt028).

      Since the theme of this manuscript is the transcriptomic characteristics of EOM SCs, including data on how NaBu affects cellular/molecular processes of other tissues would deviate somewhat from the theme. It would be more appropriate to develop a separate manuscript focusing on other tissues.

      We appreciate the feedback from the Editors and reviewers. We realized that our previous description of butyrate’s beneficial role might have been overstated in the Abstract. We have made two changes to avoid potential overstatement of our findings: (1) We modified the Abstract to state that the NaBu-induced transcriptomic changes resembling the patterns of EOM SCs “may contribute to” (instead of “underlie”) the beneficial effects observed in G93A mice (Page 1, Line 29); (2) We have edited the corresponding paragraph in the Discussion section to emphasize that the effect of NaBu treatment is multi-faceted (Page 11, Line 459-461).

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      Lines 388-389. The sentence has been corrected but is still not clear. What do the authors mean by "...resulting in higher proportion of COX-deficient myofibers than other muscles"? What other muscles do they refer to?

      Other muscles refer to muscles whose stem cells remain dormant under physiological conditions (uninjured, innervated), such as EDL. We have edited the sentence accordingly. (Page 10, Line 431-432)

      In reference to the results shown in Fig. 2, 7, 8 and 9. Since the experimenters were not blinded, this should be explicitly stated in the Methods section.

      We have added the disclaimer in the current “Data analysis and statistics” section in Methods as follows: “The experimenters were not blinded to the samples in data collection and analysis.” (Page 15, Line 636)

      Figure 7C has been amended, but now the inserted ANOVA values interfere with the correct visualization of Fig. 7D. Can the panels in D be moved down so that they are better separated from the panels in Fig. 7C?

      Thanks for the comment and we have edited Figure 7 accordingly.

      Reviewer #4 (Recommendations For The Authors):

      The authors have revised the manuscript per the reviewer's comments in this study. While most of the concerns were addressed, a few concerns remain.

      The molecular basis of how AAV-mediated delivery of Cxcl12 improves the phenotype of satellite cells is still unclear.

      Thanks for the comment. Cxcl12 is one of the earliest discovered chemokines, and the chemotactic role of the Cxcl12-Cxcr4 axis on cells and cellular processes (such as axons) has been comprehensively investigated in different types of tissues using functional assays ranging from overexpression and protein application to inhibitor application and shRNA knockdown. To list a few examples: the establishment of the correct routing trajectories of mammalian motor axons and oculomotor axons during embryonic development (DOIs: 10.1016/j.neuron.2005.08.011; 10.1167/iovs.18-25190); the regeneration of injured motor axon terminals guided by terminal Schwann cells in adult mice (DOI: 10.15252/emmm.201607257); the migration of neural crest cells to sympathetic ganglia in the formation of the sympathetic nervous system during embryogenesis (DOI: 10.1523/JNEUROSCI.0892-10.2010); and the migration of myoblasts in the process of fusion into myotubes (DOIs: 10.1242/jcs.066241; 10.1111/boc.201200022; 10.1074/jbc.M706730200).

      Because so many detailed mechanistic studies already exist, our goal for this manuscript is not to identify a novel mechanism of how Cxcl12-mediated chemotaxis is achieved. Rather, we used it as one of the proof-of-concept mechanisms contributing to the resistance of EOMs against ALS and the benefits of NaBu treatment. Certainly, it is not the sole mechanism.

      To address the reviewer’s concern, we have expanded discussion about the previous studies regarding the chemotactic effect of Cxcl12 in the discussion section. (Page 10, Line 435-436, Page 11, Line 445-446)

      The NaBu experiments may need additional support from other approaches. NaBu effects may not be directly related to satellite cells or muscle cells. Thus, the animal experiment results need to be carefully interpreted.

      We agree that NaBu has a variety of biological effects that require further investigation. Our team has previously explored the effect of NaBu treatment on intestinal microbiota and intestinal epithelial permeability (DOI: 10.1016/j.clinthera.2016.12.014), on the mitochondrial respiratory function of the NSC-34 motor neuron cell line overexpressing hSOD1G93A (DOI: 10.3390/biom12020333) and on the mitochondrial function of skeletal muscle myofibers of G93A mice (DOI: 10.3390/ijms22147412). Other research teams have also explored the role of NaBu (or HDAC inhibition) in neuronal survival and axonal transport (DOIs: 10.1073/pnas.0907935106; 10.1038/s41467-017-00911-y; 10.15252/embj.2020106177; 10.1093/hmg/ddt028).

      Since the theme of this manuscript is the transcriptomic characteristics of EOM SCs, including data on how NaBu affects cellular/molecular processes of other tissues would deviate somewhat from the theme. It would be more appropriate to develop a separate manuscript specifically addressing the impact of NaBu on other tissues.

      We appreciate the feedback from the reviewers. We realized that our previous description of butyrate’s beneficial role might have been overstated in the Abstract. In response, we have made two changes to avoid potential overstatement of our findings: (1) We modified the Abstract to state that the NaBu-induced transcriptomic changes resembling the patterns of EOM SCs “may contribute to” (instead of “underlie”) the beneficial effects observed in G93A mice (Page 1, Line 29); (2) We edited the corresponding paragraph in the Discussion section to emphasize that the effect of NaBu treatment is multi-faceted (Page 11, Line 459-461).

    2. eLife assessment

      The manuscript by Jingsong Zhou and colleagues uncovers why the extraocular muscles (EOMs) are preserved while other muscles undergo degenerative changes in amyotrophic lateral sclerosis (ALS). In this work, the authors have used a mouse model of familial ALS that carries a G93A mutation in the Sod1 gene to demonstrate that NaBu treatment partially restores the integrity of NMJ in the limb and diaphragm muscles of G93A mice. The findings of the study offer important information that EOMs are spared in ALS because they produce protective factors for the NMJ and, more specifically, factors secreted by EOM-derived satellite cells. While most of the experimental approaches are convincing, the use of sodium butyrate (NaBu) in this study needs further investigation, as NaBu might have a variety of biological effects. Overall, this work may help develop future therapeutic interventions for patients with ALS.

    3. Joint Public Review:

      Summary:

      In their paper, Li et al. investigate the transcriptome of satellite cells obtained from different muscle types, including hindlimb, diaphragm and extraocular muscles (EOM), from wild type and G93A transgenic mice (end stage ALS) in order to identify potential factors involved in the maintenance of the neuromuscular junction. The underlying hypothesis is that, since EOMs are largely spared from this debilitating disease, they may secrete NMJ-protective factors. The results of their transcriptome analysis identified several axon guidance molecules, including the chemokine Cxcl12, which are particularly enriched in EOM-derived satellite cells. Transduction of hindlimb-derived satellite cells with AAV encoding Cxcl12 reverted hindlimb-derived myotubes from the G93A mice into myotubes sharing phenotypic characteristics similar to those of EOM-derived satellite cells. Additionally, the authors were able to demonstrate that EOM-derived satellite cell myotube cultures are capable of enhancing axon extensions and innervation in co-culture experiments.

      Strengths:

      The strength of the paper is that the authors successfully isolated and purified different populations of satellite cells, compared their transcriptomes, identified specific factors released by EOM-derived satellite cells, and overexpressed one of these factors (the chemokine Cxcl12) by AAV-mediated transduction of hindlimb-derived satellite cells. The transduced cells were then able to support axon guidance and NMJ integrity. They also show that administration of Na butyrate to mice decreased NMJ denervation and satellite cell depletion in hind limbs. Furthermore, addition of Na butyrate to hindlimb-derived satellite cell myotube cultures increased Cxcl12 expression. These are impressive results providing important insights for the development of therapeutic targets to slow the loss of neuromuscular function characterizing ALS.

      Comments on latest version:

      The authors have sufficiently acknowledged and discussed the limitations of experiments involving NaBu treatment. The authors have also addressed the use of AAV-mediated delivery of Cxcl12.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Ngo et al. report a peculiar effect where a single base mismatch (CC) can enhance the mechanical stability of a nucleosome. In previous studies, the same group used a similar state-of-the-art fluorescence-force assay to study the unwrapping dynamics of 601-DNA from the nucleosome and observed that force-induced unwrapping happens more slowly for DNA that is more bendable because of changes in sequence or chemical modification. This manuscript appears to be a sequel to this line of projects, where the effect of CC is tested. The authors confirmed that CC is the most flexible mismatch using the FRET-based cyclization assay and found that unwrapping becomes slower when CC is introduced at three different positions in the 601 sequence. The CC mismatch only affects the local unwrapping dynamics of the outer turn of nucleosomal DNA.

      Strengths:

      These results are in good agreement with the previously established correlation between DNA bendability and nucleosome mechanical stability by the same group. This well-executed, technically sound, and well-written experimental study contains novel nucleosome unwrapping data specific to the CC mismatch and 601 sequence, the cyclizability of DNA containing all base pair mismatches, and the unwrapping of 601-DNA from Xenopus and yeast histones. Overall, this work will be received with great interest by the biophysics community and is definitely worth attention.

      Weaknesses:

      The scope and impact of this study are somewhat limited due to the lack of sequence variation. Whether the conclusion from this study can be generalized to other sequences and other bendability-enhancing mismatches needs further investigation.

      Major questions:

      (1) As pointed out by the authors, the FRET signal is not sensitive to nucleosome position; therefore, the increasing unwrapping force in the presence of CC can be interpreted as the repositioning of the nucleosome upon perturbation. It is then also possible that CC-containing DNA is not positioned exactly the same as normal DNA from the start upon nucleosome assembly, leading to different unwrapping trajectories. What is the experimental evidence that supports identical positioning of the nucleosomes before the first stretch?

      We added the following and refer to our recent publication (ref. 1) to address this question.

      “This is consistent with a previous single nucleotide resolution mapping of dyad position from a library of mismatches in all possible positions along the 601 sequence or a budding yeast native sequence, which showed that a single mismatch (A-A or T-T) does not affect the nucleosome position (ref. 27).”

      (2) The authors chose a constant stretching rate in this study. Can the authors provide a more detailed explanation or rationale for why this rate was chosen? At this rate, the authors found hysteresis, which indicates that stretching is faster than quasi-static. But it must have been slow and weak enough to allow for reversible unwrapping and wrapping of a CC-containing DNA stretch longer than one helical turn. Otherwise, such a strong effect of CC at a single location would not be seen. I am also curious about the biological relevance of the magnitude of the force. Can such force arise during nucleosome assembly in vivo?

      To address the comment about the magnitude of force, we added the following paragraph to Introduction. “RNA polymerase II can initiate transcription at 4 pN of hindering force (ref. 2) and its elongation activity continues until it stalls at ~ 10 pN of hindering force (refs. 3, 4). Therefore, the transcription machinery can generate picoNewtons of force on chromatin as long as both the machinery and the chromatin segment in contact are tethered to stationary objects in the nucleus. Another class of motor protein, chromatin remodeling enzymes, was also shown to induce processive and directional sliding of single nucleosomes when the DNA is under a similar amount of tension (~ 5 pN) (ref. 5). Therefore, measurements of nucleosomes at a few pN of force will expand our knowledge of the physiological roles of nucleosome structure and dynamics.”

      To address the comment about the stretching rate, we added the following to Results. We note that the physiological loading rate has been challenging to determine for any biomolecular interactions, and the only quantitative measurement we are aware of is that of an integrin that we are citing.

      “The force increases nonlinearly and the loading rate, i.e. the rate at which the force increases, was approximately in the range of 0.2 pN/s to 6 pN/s, similar to the cellular loading rates for a mechanosensitive membrane receptor (ref. 6).”

      (3) In this study, the CC mismatch is the only change made to the 601 sequence. For readers to truly appreciate its unique effect on unwrapping dynamics as a base pair defect, it would be nice to include the baseline effects of other minor changes to the sequence. For example, how robust is the unwrapping force or dynamics against a single-bp change (e.g., AT to GC) at the three chosen positions?

      Unfortunately, we are unable to perform the suggested unwrapping experiment in a timely manner because the instrument has been disassembled during our recent move. However, we previously performed unwrapping experiments not only as a function of sequence but also as a function of cytosine modification and showed that we can detect even more subtle effects (refs. 7, 8). In addition, please note that we are not claiming that simply changing a base pair at the chosen sites changes the mechanical stability of a nucleosome, so we do not believe the requested experiment is necessary.

      (4) The last section introduces yeast histones. Based on the theme of the paper, I was expecting to see how the effect of CC is or is not preserved with a different histone source. Instead, the experiment only focuses on differences in the unwrapping dynamics. Although the data presented are important, it is not clear how they fit or support the narrative of the paper without the effect of CC.

      We apologize for giving the reviewer a wrong impression. We included the data because we believe that information on how the histone core can determine the translation of DNA mechanics into nucleosome mechanical stability will be of interest to the readers of this manuscript. We now mention explicitly that the observation was made using intact DNA, i.e. no mismatch, in the abstract and elsewhere.

      (5) It is stated that tRNA was excluded in experiments with yeast-expressed nucleosomes. What is the reason for excluding it for yeast nucleosomes? Did the authors rule out the possibility that tRNA causes the measured difference between the two nucleosome types?

      We normally include tRNA because we found that it reduces sticking of beads to the surface over several hours of experiments. In yeast nucleosomes, we found that tRNA causes the nucleosome to disassemble. Therefore, we did not include tRNA in yeast nucleosome experiments. We now mention this in Methods as reproduced below.

      “tRNA, which we normally include to reduce sticking of beads to the surface over the hours of single molecule experiments in a sealed chamber, was excluded in experiments with yeast-expressed nucleosomes because tRNA induced disassembly of nucleosomes assembled using yeast histones.”

      We cannot formally rule out the possibility that tRNA causes the measured difference between Xenopus and yeast nucleosomes. However, we have shown in our previous publication (ref. 7) that the asymmetric unwrapping in Xenopus nucleosomes was modulated by the DNA sequence. When we swapped the sequence of the inner turn between the two sides, while tRNA was included in all experiments, we observed stochastic unwrapping instead. As part of our response to another reviewer’s comments, we also added the following on the relevant differences between the species in Discussion.

      “The crystal structure of the yeast nucleosome suggests that yeast nucleosome architecture is subtly destabilized in comparison with nucleosomes from higher eukaryotes (ref. 9). Yeast histone protein sequences are not well conserved relative to vertebrate histones (H2A, 77%; H2B, 73%; H3, 90%; H4, 92% identities), and this divergence likely contributes to differences in nucleosome stability. Substitution of three residues in yeast H3 α3-helix (Q120, K121, K125) very near the nucleosome dyad with corresponding human H3.1/H3.3 residues (QK…K replaced with MP…Q) caused severe growth defects, elevated nuclease sensitivity, reduced nucleosome positioning and nucleosome relocation to preferred locations predicted by DNA sequence alone (ref. 10). The yeast histone octamer harboring wild type H3 may be less capable of wrapping DNA over the histone core, leading to reduced resistance to the unwrapping force for the more flexible half of the 601 positioning sequence.”

      Reviewer #2 (Public Review):

      Summary:

      Mismatches occur as a result of DNA polymerase errors, chemical modification of nucleotides, during homologous recombination between near-identical partners, as well as during gene editing on chromosomal DNA. Under some circumstances, such mismatches may be incorporated into nucleosomes but their impact on nucleosome structure and stability is not known. The authors use the well-defined 601 nucleosome positioning sequence to assemble nucleosomes with histones on perfectly matched dsDNA as well as on dsDNA with defined mismatches at three nucleosomal positions. They use the R18, R39, and R56 positions situated in the middle of the outer turn, at the junction between the outer turn and inner turn, and in the middle of the inner turn, respectively. Most experiments are carried out with CC mismatches and Xenopus histones. Unwrapping of the outer DNA turn is monitored by single-molecule FRET in which the Cy3 donor is incorporated on the 68th nucleotide from the 5' end of the top strand and the Cy5 acceptor is attached to the 7th nucleotide from the 5' end of the bottom strand. Force is applied to the nucleosomal DNA as FRET is monitored to assess nucleosome unwrapping. The results show that a CC mismatch enhances nucleosome mechanical stability. Interestingly, yeast and Xenopus histones show different behaviors in this assay. The authors use FRET to measure the cyclization of the dsDNA substrates to test the hypothesis that mismatches enhance the flexibility of the 601 dsDNA fragment and find that CC, CA, CT, TT, and AA mismatches decrease looping time, whereas GA, GG, and GT mismatches had little to no effect. These effects correlate with the results from DNA buckling assays reported by Euler's group (NAR 41, 2013) using the same mismatches as an orthogonal way to measure DNA kinking. The authors discuss that substitution rates are higher towards the middle of the nucleosome, suggesting that mismatches/DNA damage at this position are less accessible for repair, consistent with the nucleosome stability results.

      Strengths:

      The single-molecule data show clear and consistent effects of mismatches on nucleosome stability and DNA persistence length.

      Weaknesses:

      It is unclear in the looping assay how the cyclization rate relates to the reported looping time. The biological significance and implications, such as the effect on mismatch repair or nucleosome remodelers, remain untested. It is unclear whether the mutational pattern reflects the behavior of the different mismatches. Such a correlation could strengthen the argument that the observed effects are relevant for mutagenesis.

      Reviewer #3 (Public Review):

      Summary:

      The mechanical properties of DNA wrapped in nucleosomes affect the stability of nucleosomes and may play a role in the regulation of DNA accessibility in eukaryotes. In this manuscript, Ngo and coworkers study how the stability of a nucleosome is affected by the introduction of a CC mismatched base pair, which has been reported to increase the flexibility of DNA. Previously, the group has used a sophisticated combination of single-molecule FRET and force spectroscopy with an optical trap to show that the more flexible half of a 601 DNA segment provides for more stable wrapping as compared to the other half. Here, it is confirmed with a single-molecule cyclization assay that the introduction of a CC mismatch increases the flexibility of a DNA fragment. Consistent with the previous interpretation, it also increased the unwrapping force for the half of the 601 segment in which the CC mismatch was introduced, as measured with single-molecule FRET and force spectroscopy. Enhanced stability was found up to 56 bp into the nucleosome. The intricate role of mechanical stability of nucleosomes was further investigated by comparing force-induced unwrapping profiles of yeast and Xenopus histones. Intriguingly, asymmetric unwrapping was more pronounced for yeast histones.

      Strengths:

      (1) High-quality single-molecule data.

      (2) Novel mechanism, potentially explaining the increased prominence of mutations near the dyads of nucleosomes.

      (3) A clear mechanistic explanation of how mismatches affect nucleosome stability.

      Weaknesses:

      (1) Disconnect between mismatches in nucleosomes and measurements comparing Xenopus and yeast nucleosome stability.

      (2) Convoluted data in cyclization experiments concerning the phasing of mismatches and biotin site.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Specific comments:

      In Figure 1 legend, "the black diamonds on the DNA bends represent the mismatch position with R18 and R39 on minor grooves and R56 on a major groove." Minor and major grooves should be phrased as histone-facing minor and major grooves.

      We fixed the problem.

      In Materials and Methods, the sentence that describes the stretching rate cites reference 1, which does not seem to be relevant.

      We fixed the problem.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the introduction, the authors should also discuss the context of mismatches occurring during homologous recombination in meiosis or somatic cells in non-allelic recombination between near identical repeats.

      Introduction now has the following.

      “DNA base-base mismatches are generated by nucleotide misincorporation during DNA synthesis, meiotic recombination, somatic recombination between nearly identical repeats, or chemical modification such as hydrolytic deamination of cytosine.”

      (2) Generally, it seems counter-intuitive in terms of biology that mismatches containing nucleosomes are more stable, as mismatches require repair and/or detection for heteroduplex rejection during recombination. Some discussion of this apparent paradox should be added.

      To address this comment, we added the following to Discussion.

      “The higher frequency of substitutions in the nucleosomal DNA may be attributed to the difficulty of accessing the extra-stable nucleosomes. We also note that even without an enhanced stability, a mismatch within a nucleosome would be more difficult to detect for mismatch repair machineries compared to a mismatch in a non-nucleosomal DNA. Because mismatch repair machineries accompany the replisome, most nascent mismatches may be detected for repair before nucleosome deposition. Therefore, the decrease in accessibility predicted based on our data here may be important only in rare cases where a mismatch is not detected prior to the deposition of a nucleosome on the nascent DNA or in cases where a mismatch is generated via a non-replicative mechanism.”

      (3) The authors discuss that the substitution rate is higher while the indel (insertion and deletion) rate is lower nearer the center of a positioned nucleosome. Are the differences between individual mismatches reported in Figure 6 reflected in the mutagenic profile?

      We cannot currently compare them because the mutagenic profile, even when it is available, is a complex convolution of mismatch generation, mismatch repair and selection. Mismatch generation occurs through several different processes, and how they are affected by nucleosomes and their mismatch type and sequence context is unknown. The mismatch repair process itself depends on mismatch type and sequence context, as recently shown by a high-throughput in vivo study (ref. 11). And because the population genetics does not simply reflect de novo mutation profiles due to selection, comparison between mismatch-induced DNA mechanical changes and mutagenic profiles is further complicated. We added the following to the revision.

      “If and how the mismatch type-dependent DNA mechanics affects the sequence-dependent mismatch repair efficiency in vivo, as recently determined in a high-throughput study in E. coli (ref. 11), remains to be investigated. Comparison of mismatch-type dependent DNA mechanics to population genetics data is challenging because mutation profiles reflect a combined outcome of mismatch generation, mismatch repair and selection in addition to other mutational processes.”

      (4) The looping assay should be explained better, especially how the cyclization rate is related to the reported looping time.

      We modified Figure 5 to include examples of looping time determination through fitting of the looped fraction vs time, and added the following to the figure caption.

      “To calculate the looping time, the fraction of looped molecules (high FRET) as a function of time is fitted to an exponential function, $e^{-t/(\text{looping time})}$ (right panel for one run of experiments).”

      Furthermore, we added the following sentence to Results.

      “The rate of loop formation, which is the inverse of looping time determined from an exponential fitting of loop fraction vs time, was used as a measure of apparent DNA flexibility influenced by a mismatch (refs. 12, 13).”
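The relation between looping time and looping rate described above can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes, for illustration only, that the unlooped fraction decays as a single exponential exp(-t / looping_time) with no offset or plateau, and it uses noise-free synthetic data generated with a hypothetical looping time of 20 minutes. The function name `looping_time_from_fractions` is likewise hypothetical.

```python
import math


def looping_time_from_fractions(times, looped_fractions):
    """Estimate the looping time from looped-fraction-vs-time data.

    Assumption (not taken from the manuscript): the unlooped fraction
    follows exp(-t / looping_time), so ln(1 - looped) = -t / looping_time.
    The slope is obtained by least squares through the origin.
    """
    num = 0.0
    den = 0.0
    for t, looped in zip(times, looped_fractions):
        unlooped = 1.0 - looped
        # Skip points that carry no information or have an undefined log.
        if t <= 0 or unlooped <= 0:
            continue
        num += t * math.log(unlooped)
        den += t * t
    if den == 0 or num >= 0:
        raise ValueError("data do not show an exponential approach to looping")
    slope = num / den          # equals -1 / looping_time
    return -1.0 / slope


# Synthetic, noise-free example with a hypothetical looping time (minutes).
true_looping_time = 20.0
times = [0.0, 5.0, 10.0, 20.0, 40.0, 60.0]
looped = [1.0 - math.exp(-t / true_looping_time) for t in times]

looping_time = looping_time_from_fractions(times, looped)
looping_rate = 1.0 / looping_time  # the rate is the inverse of the looping time

print(f"looping time = {looping_time:.2f} min, rate = {looping_rate:.4f} per min")
```

A shorter fitted looping time therefore corresponds to a higher looping rate, which is the quantity used as a proxy for apparent DNA flexibility. With real data, a nonlinear fit that also allows for an amplitude or plateau would likely be more appropriate than this log-linear shortcut.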

      Reviewer #3 (Recommendations For The Authors):

      I have some concerns that, when addressed upon revision, would improve the manuscript:

      (1) Page 6 and Supplementary Figure S1C: Though the FRET levels are the same for all nucleosomes, the distribution between the two levels is not. The nucleosomes with CC mismatches appear to have a larger fraction in the low-FRET population. This seems to contradict the higher mechanical stability. A comment on this should clarify it, or make this conundrum explicit.

      Thank you for the comment. The low FRET population also includes the nucleosomes that do not have an active acceptor, the fraction of which varies between preparations. We now note this in the supplementary figure caption.

      (2) It is intriguing that a more stable nucleosome forms after several pulling cycles and it is argued that this might be due to shifting of the nucleosome. This seems reasonable and has important consequences both for the interpretation of the current experimental data and for the general mechanisms involved in nucleosome maintenance and remodeling. It is puzzling though how this would work mechanistically since it only seems to happen when nucleosomes are half-wrapped and when the unwrapped half contains the mismatch. From the previous work of the group and the current manuscript, it seems that shift does not occur in DNA without mismatches (Correct?). Does shifting happen for the 601-R18 and 601-R56 nucleosomes as well?

      The mismatch-containing half is the half that is mechanically less stable in an intact, mismatch-free 601 nucleosome. So indeed, that is the half that is unwrapped in an intact nucleosome. But because the introduction of mismatch makes that half more mechanically stable, it can stay wrapped until higher forces, and the resulting structural distortion may cause the shift although we acknowledge that this interpretation remains speculative. Shifting occurs for all three constructs with a mismatch but not for the intact nucleosome without a mismatch.

      (3) Could the shifting be related to the differences in sub-population distribution observed in Supplementary Figure S1C?

      See our response to comment (1) above.

      (4) The paper would have more impact if the mechanism of possible shifting could be clarified. This can be done experimentally with a fluorescent histone, as suggested in the manuscript. But having a FRET pair on positions in the DNA that would shift to closer proximity upon shifting, either at the ED2 or at the ED1 site will also work, is in line with the current experiments and seems feasible.

      We revised the text as follows in order not to exclude labeling configurations with both fluorophores on the DNA while reporting on the shift. We are also happy to add an appropriate reference if the reviewer can help us identify an existing study that measured dyad position shifts through such a labeling configuration.

      “However, since the FRET values in our DNA construct are not sensitive to the nucleosome position, further experiments with fluorophores conjugated to strategic positions that allow discrimination between different dyad positions (ref. 14) will be required to test this hypothesis.”

      (5) Figures 5 and 6: To appreciate the quality of the data, state the number of molecules that contributed to the cyclization assay, or better, share a figure of the number of looped molecules as a function of time as supplementary data.

      We added the requested figures to Figure 5 and a new supplementary Figure 2, and added the following to Methods.

      “Approximately 2500 – 3500 molecules were quantified at each timestamp during the experiment, and three independent experiments were performed for each sequence (Supplemental Figure S2).”

      (6) Page 8/9: A control is added to confirm that the phasing of the biotin relative to the end affects the observed cyclization rate. However, the mismatch sites were chosen such that they included 5 bp phase shifts. This convolutes the outcomes, as the direction of flexibility due to the phasing of the mismatch relative to the biotin may also influence the rate. Was this checked?

      We would like to clarify that the phasing of the biotin matters not so much with respect to the end as with respect to the full molecule. Static curvature and poloidal angle associated with the DNA molecule (which is ultimately determined by the full chemical composition of the molecule, including its sequence and the mismatch) could make the molecule prefer a looped configuration where the biotin points towards the “inside” of the molecule. Such a configuration would be sterically unfavoured during the single molecule looping reaction, where the biotin is attached to a surface via avidin. However, if the biotin is moved by half the helical repeat (or an odd multiple of half the helical repeat, essentially 16 nt as done in the manuscript), it would now point to the “outside” of the molecule. Therefore, to make sure that the difference between the looping rates of any two DNA constructs (say the 601-RH and 601-R18-RH) is a better reflection of differences in dynamic flexibility, we ensure that the difference persists even when the biotin is moved by an odd multiple of half the helical repeat. We revised the section as follows.

      “For example, moving the location of the biotin tether by half the helical repeat (~ 5 bp) can lead to a large change in cyclization rate (ref. 15), likely due to the preferred poloidal angle of a given DNA (ref. 16) that determines whether the biotin is facing towards the inside of the circularized DNA, thereby hindering cyclization due to steric hindrance caused by surface tethering.”
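The phasing argument can be checked with a short arithmetic sketch. It assumes a helical repeat of about 10.5 bp per turn for B-form DNA, a commonly quoted value that is not stated in the response (which gives only "~ 5 bp" for half a repeat). The function name `rotation_about_helix_axis` is hypothetical.

```python
HELICAL_REPEAT_BP = 10.5  # assumed B-DNA helical repeat (bp per full turn)


def rotation_about_helix_axis(shift_nt, helical_repeat=HELICAL_REPEAT_BP):
    """Return the change in facing angle (degrees, 0-360) of a tether
    when it is moved by `shift_nt` nucleotides along the helix."""
    return (shift_nt * 360.0 / helical_repeat) % 360.0


# Moving the biotin by 16 nt, as described in the response.
shift = 16
angle = rotation_about_helix_axis(shift)
half_repeats = shift / (HELICAL_REPEAT_BP / 2.0)

print(f"{shift} nt shift = {half_repeats:.2f} half-repeats, "
      f"facing angle changes by {angle:.1f} degrees")
```

Under this assumption, a 16 nt shift is close to three half-repeats, so the tether rotates by roughly 180 degrees and ends up on approximately the opposite face of the helix, which is the condition the authors describe as moving the biotin from the "inside" to the "outside" of the loop.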

      (7) Page 9/10: The comparison of yeast vs Xenopus is interesting, albeit a bit disconnected. Since the single-molecule statistics are relatively small, did the nucleosomes show similar bulk FRET distributions, or did they also show a shift in FRET levels?

      We included the data because we believe that information on how the histone core can determine the translation of DNA mechanics into nucleosome mechanical stability will be of interest to the readers of this manuscript. The FRET values were similarly distributed.

      (8) The discussion calls for a more detailed analysis of the structural differences of the histones of the two species to rationalize the observed asymmetry in flexibility dependence: why would yeast nucleosomes be less sensitive to sequence asymmetries?

      We added the following to Discussion to address this comment.

      “The crystal structure of the yeast nucleosome suggests that yeast nucleosome architecture is subtly destabilized in comparison with nucleosomes from higher eukaryotes (ref. 9). Yeast histone protein sequences are not well conserved relative to vertebrate histones (H2A, 77%; H2B, 73%; H3, 90%; H4, 92% identities), and this divergence likely contributes to differences in nucleosome stability. Substitution of three residues in yeast H3 α3-helix (Q120, K121, K125) very near the nucleosome dyad with corresponding human H3.1/H3.3 residues (QK…K replaced with MP…Q) caused severe growth defects, elevated nuclease sensitivity, reduced nucleosome positioning and nucleosome relocation to preferred locations predicted by DNA sequence alone (ref. 10). The yeast histone octamer harboring wild type H3 may be less capable of wrapping DNA over the histone core, leading to reduced resistance to the unwrapping force for the more flexible half of the 601 positioning sequence.”

      (9) It would also be interesting if the increased stability due to the introduction of mismatches observed on Xenopus nucleosomes holds in yeast. Or does the reduced stability remove this effect? This is relevant to substantiate the broad claims in the context of evolution and cancer that are discussed in the manuscript.

      Unfortunately, we are unable to perform the suggested unwrapping experiment in a timely manner because the instrument has been disassembled during our recent move. However, in terms of cancer relevance, our mismatch dependence experiments were performed using vertebrate nucleosomes (Xenopus) so repeating this for yeast nucleosomes would not provide relevant information.

      Minor comments:

      (1) Supplementary Figure S1 misses the label '(C)' in its caption.

      We fixed it.

      (2) The supplementary data sequences for the fleezer measurements contain entries 'R39 construct' and miss the positions of the Cy3 and Cy5 labels; the color code (levels of grey) is not explained.

      We fixed the labeling mistake and added detailed annotations of the highlighted features.

      References

      (1) Park, S., Brandani, G.B., Ha, T. & Bowman, G.D. Bi-directional nucleosome sliding by the Chd1 chromatin remodeler integrates intrinsic sequence-dependent and ATP-dependent nucleosome positioning. Nucleic Acids Res 51, 10326-10343 (2023).

      (2) Fazal, F.M., Meng, C.A., Murakami, K., Kornberg, R.D. & Block, S.M. Real-time observation of the initiation of RNA polymerase II transcription. Nature 525, 274-7 (2015).

      (3) Galburt, E.A., Grill, S.W., Wiedmann, A., Lubkowska, L., Choy, J., Nogales, E., Kashlev, M. & Bustamante, C. Backtracking determines the force sensitivity of RNAP II in a factor-dependent manner. Nature 446, 820-3 (2007).

      (4) Schweikhard, V., Meng, C., Murakami, K., Kaplan, C.D., Kornberg, R.D. & Block, S.M. Transcription factors TFIIF and TFIIS promote transcript elongation by RNA polymerase II by synergistic and independent mechanisms. Proc Natl Acad Sci U S A 111, 6642-7 (2014).

      (5) Kim, J.M., Carcamo, C.C., Jazani, S., Xie, Z., Feng, X.A., Yamadi, M., Poyton, M., Holland, K.L., Grimm, J.B., Lavis, L.D., Ha, T. & Wu, C. Dynamic 1D Search and Processive Nucleosome Translocations by RSC and ISW2 Chromatin Remodelers. bioRxiv (2024).

      (6) Jo, M.H., Meneses, P., Yang, O., Carcamo, C.C., Pangeni, S. & Ha, T. Determination of single-molecule loading rate during mechanotransduction in cell adhesion. Science (in press).

      (7) Ngo, T.T., Zhang, Q., Zhou, R., Yodh, J.G. & Ha, T. Asymmetric unwrapping of nucleosomes under tension directed by DNA local flexibility. Cell 160, 1135-44 (2015).

      (8) Ngo, T.T., Yoo, J., Dai, Q., Zhang, Q., He, C., Aksimentiev, A. & Ha, T. Effects of cytosine modifications on DNA flexibility and nucleosome mechanical stability. Nat Commun 7, 10813 (2016).

      (9) White, C.L., Suto, R.K. & Luger, K. Structure of the yeast nucleosome core particle reveals fundamental changes in internucleosome interactions. EMBO J 20, 5207-18 (2001).

      (10) McBurney, K.L., Leung, A., Choi, J.K., Martin, B.J., Irwin, N.A., Bartke, T., Nelson, C.J. & Howe, L.J. Divergent Residues Within Histone H3 Dictate a Unique Chromatin Structure in Saccharomyces cerevisiae. Genetics 202, 341-9 (2016).

      (11) Kayikcioglu, T., Zarb, J.S., Lin, C.-T., Mohapatra, S., London, J.A., Hansen, K.D., Rishel, R. & Ha, T. Massively parallel single molecule tracking of sequence-dependent DNA mismatch repair in vivo. bioRxiv, 2023.01.08.523062 (2023).

      (12) Jeong, J., Le, T.T. & Kim, H.D. Single-molecule fluorescence studies on DNA looping. Methods 105, 34-43 (2016).

      (13) Jeong, J. & Kim, H.D. Base-Pair Mismatch Can Destabilize Small DNA Loops through Cooperative Kinking. Phys Rev Lett 122, 218101 (2019).

      (14) Blosser, T.R., Yang, J.G., Stone, M.D., Narlikar, G.J. & Zhuang, X. Dynamics of nucleosome remodelling by individual ACF complexes. Nature 462, 1022-7 (2009).

      (15) Basu, A., Bobrovnikov, D.G., Qureshi, Z., Kayikcioglu, T., Ngo, T.T.M., Ranjan, A., Eustermann, S., Cieza, B., Morgan, M.T., Hejna, M., Rube, H.T., Hopfner, K.P., Wolberger, C., Song, J.S. & Ha, T. Measuring DNA mechanics on the genome scale. Nature 589, 462-467 (2021).

      (16) Yoo, J., Park, S., Maffeo, C., Ha, T. & Aksimentiev, A. DNA sequence and methylation prescribe the inside-out conformational dynamics and bending energetics of DNA minicircles. Nucleic Acids Res 49, 11459-11475 (2021).

    2. eLife assessment

      This manuscript reports important data on the stability of nucleosomes with dsDNA substrates containing defined mismatches at three defined nucleosomal positions. Compelling evidence obtained by single-molecule FRET experiments shows that certain mismatches lead to more stable nucleosomes likely because mismatches kink to enhance DNA flexibility leading to higher nucleosome stability. The biological significance and implications of the findings remain unclear.

    3. Reviewer #1 (Public Review):

      In this manuscript, Ngo et al. report a peculiar effect where a single base mismatch (CC) can enhance the mechanical stability of a nucleosome. In previous studies, the same group used a similar state-of-the-art fluorescence-force assay to study the unwrapping dynamics of 601-DNA from the nucleosome and observed that force-induced unwrapping happens more slowly for DNA that is more bendable because of changes in sequence or chemical modification. This manuscript appears to be a sequel to this line of projects, where the effect of CC is tested. The authors confirmed that CC is the most flexible mismatch using the FRET-based cyclization assay and found that unwrapping becomes slower when CC is introduced at three different positions in the 601 sequence. The CC mismatch only affects the local unwrapping dynamics of the outer turn of nucleosomal DNA.

    4. Reviewer #2 (Public Review):

      Mismatches occur as a result of DNA polymerase errors, chemical modification of nucleotides, during homologous recombination between near-identical partners, as well as during gene editing on chromosomal DNA. Under some circumstances, such mismatches may be incorporated into nucleosomes but their impact on nucleosome structure and stability is not known. The authors use the well-defined 601 nucleosome positioning sequence to assemble nucleosomes with histones on perfectly matched dsDNA as well as on dsDNA with defined mismatches at three nucleosomal positions. They use the R18, R39, and R56 positions situated in the middle of the outer turn, at the junction between the outer turn and inner turn, and in the middle of the inner turn, respectively. Most experiments are carried out with CC mismatches and Xenopus histones. Unwrapping of the outer DNA turn is monitored by single-molecule FRET in which the Cy3 donor is incorporated on the 68th nucleotide from the 5'-end of the top strand and the Cy5 acceptor is attached to the 7th nucleotide from the 5' end of the bottom strand. Force is applied to the nucleosomal DNA as FRET is monitored to assess nucleosome unwrapping. The results show that a CC mismatch enhances nucleosome mechanical stability. Interestingly, yeast and Xenopus histones show different behaviors in this assay. The authors use FRET to measure the cyclization of the dsDNA substrates to test the hypothesis that mismatches enhance the flexibility of the 601 dsDNA fragment and find that CC, CA, CT, TT, and AA mismatches decrease looping time, whereas GA, GG, and GT mismatches had little to no effect. These effects correlate with the results from DNA buckling assays reported by Euler's group (NAR 41, 2013) using the same mismatches as an orthogonal way to measure DNA kinking.

      The authors discuss that substitution rates are higher towards the middle of the nucleosome, suggesting that mismatches/DNA damage at this position are less accessible for repair, consistent with the nucleosome stability results.

    5. Reviewer #3 (Public Review):

      The mechanical properties of DNA wrapped in nucleosomes affect the stability of nucleosomes and may play a role in the regulation of DNA accessibility in eukaryotes. In this manuscript, Ngo and coworkers study how the stability of a nucleosome is affected by the introduction of a CC mismatched base pair, which has been reported to increase the flexibility of DNA. Previously, the group has used a sophisticated combination of single-molecule FRET and force spectroscopy with an optical trap to show that the more flexible half of a 601 DNA segment provides for more stable wrapping as compared to the other half. Here, it is confirmed with a single-molecule cyclization assay that the introduction of a CC mismatch increases the flexibility of a DNA fragment. Consistent with the previous interpretation, it also increased the unwrapping force for the half of the 601 segment in which the CC mismatch was introduced, as measured with single-molecule FRET and force spectroscopy. Enhanced stability was found up to 56 bp into the nucleosome. The intricate role of mechanical stability of nucleosomes was further investigated by comparing force-induced unwrapping profiles of yeast and Xenopus histones. Intriguingly, asymmetric unwrapping was more pronounced for yeast histones.

      Note from Reviewing Editor:

      The authors addressed the points in the reviews by making appropriate text additions and clarifications.

    1. eLife assessment

      This important study identifies the anti-inflammatory function of PEGylated PDZ peptides that are derived from the ZO-1 protein. Results from cellular and in vivo experiments tracking key inflammatory markers are compelling. Although the mechanism of action remains largely unknown, this study provides a proof of concept for developing novel strategies against acute inflammatory conditions such as sepsis.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors investigate the potential therapeutic effects of the PEGylated PDZ peptide, derived from the ZO-1 protein, in suppressing LPS-induced systemic inflammation. The authors found that the pretreatment of PEGylated PDZ peptide led to a restoration of tissue injuries in the kidney, liver, and lung, and diminished alterations in biochemical plasma markers induced by LPS. This was accompanied by decreased production of pro-inflammatory cytokines in the plasma and lung BALF of the PDZ-administered mice.

      Strengths:

      - The data presented here is solid and the results provide the groundwork for developing novel anti-inflammatory therapeutic strategies.
      - The authors employ various cells and in vivo models to test the efficacy of the peptide.

      Weaknesses:

      The mechanism of action remains largely unknown.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors investigated systemic inflammation induced by LPS in various tissues and also examined immune cells of the mice using tight junction protein-based PDZ peptide. They explored the mechanism of anti-systemic inflammatory action of PDZ peptides, which enhanced M1/M2 polarization and induced the proliferation of M2 macrophages. Additionally, they insisted on the physiological mechanism that inhibited the production of ROS in mitochondria, thereby preventing systemic inflammation.

      Strengths:

      In the absence of specific treatments for septic shock or sepsis, the study demonstrating that tight junction-based PDZ peptides inhibit systemic inflammation caused by LPS is highly commendable. Whereas previous research focused on antibiotics, this study proves that modifying parts of intracellular proteins can significantly suppress symptoms caused by septic shock. The authors expanded the study of localized inflammation caused by LPS or PM2.5 in the respiratory tract, to systemic inflammation, presenting promising results. They not only elucidated the physiological mechanism by identifying the transcriptome through RNA sequencing but also demonstrated that PDZ peptides inhibit the production of ROS in mitochondria and prevent mitochondrial fission. This research is highly regarded as an excellent study with potential as a treatment for septic shock or sepsis.

      Weaknesses:

      (1) The authors focused intensively on acute inflammation for a short duration instead of chronic inflammation.

      (2) LPS was used to induce septic shock, but administering actual microbes such as E. coli would yield more accurate results.

      (3) The authors used PEGylated peptides, but future research should utilize the optimized peptides to derive the optimal peptide, and further, PK/PD studies are also necessary.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Beyond my general review, some descriptions of the results and methods could be further clarified, which I've outlined below:

      (1) Page 3, Line 118-120: Based on results from Fig 1A, the authors reported 15 nanobodies neutralized both delta and BA.1 out of the 41 tested. However, I only counted 14. Could the authors double check?

      We recounted the nanobodies and confirmed there are 15 as follows:

      (1) RBD-15

      (2) RBD-22

      (3) RBD-24

      (4) RBD-9

      (5) S1-4

      (6) S1-35

      (7) RBD-6

      (8) RBD-5

      (9) RBD-21

      (10) RBD-16

      (11) S1-46

      (12) S1-49dimer

      (13) S2-10dimer

      (14) S2-3

      (15) S2-62

      (2) Page 5, Lines 134-135: the authors described that the heatmap reflects the neutralizing strength of the representative nanobodies from each group. For groups where multiple nanobodies were selected for visualization, how was the neutralization strength calculated? Was the IC50 averaged first before being converted into the neutralization strength?

      This has been made clear in the legend for Fig. 1 as follows “For groups with multiple nanobodies, the average -log10 (IC50) is first calculated for the nanobodies within that group, then normalized to a neutralization score within the 0–100 range using the min and max average -log10 (IC50) for that group. A higher score indicates more potent neutralization of the variant relative to the wild type.”
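
      The normalization quoted above can be sketched in a few lines. This is only an illustration of the described procedure, not the authors' code: the function name, the handling of a group whose averages are all identical, and the example IC50 values are assumptions made here.

```python
import math


def neutralization_scores(ic50_by_variant):
    """Return a 0-100 neutralization score per variant for ONE nanobody group.

    ic50_by_variant maps a variant name to the list of IC50 values (all in the
    same unit, e.g. molar) measured for the nanobodies in that group.
    """
    # Step 1: average -log10(IC50) over the nanobodies in the group.
    averages = {
        variant: sum(-math.log10(value) for value in values) / len(values)
        for variant, values in ic50_by_variant.items()
    }
    lowest, highest = min(averages.values()), max(averages.values())
    if highest == lowest:
        # Assumption: with no spread, every variant gets the top score.
        return {variant: 100.0 for variant in averages}
    # Step 2: min-max normalize the group averages onto the 0-100 range.
    return {
        variant: 100.0 * (average - lowest) / (highest - lowest)
        for variant, average in averages.items()
    }


# Hypothetical IC50 values (molar), for illustration only.
example_group = {"wild type": [1e-9, 1e-9], "delta": [1e-8], "BA.1": [1e-7]}
print(neutralization_scores(example_group))  # roughly 100, 50 and 0
```

      Because the scaling uses each group's own minimum and maximum, scores are comparable across variants within a group but not between groups.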

      (3) Page 5, Lines 138-139: What was the authors' rationale for selecting certain nanobodies over others for structural modeling and visualizing the neutralization heatmap in Fig 1B? Does it introduce bias to the neutralizing epitope map on the spike protein?

      We only focused on nanobodies for which we had enough epitope mapping data to unambiguously generate docked nanobody-spike models, as explained in our previous study (Mast et al., eLife 2021). When multiple nanobodies within the same group had sufficient epitope mapping data available, we selected only representative candidates that had better binding affinity and/or neutralization potency. As epitope mapping via escape mutants relied largely on random point mutagenesis of Spike, there should be little introduced bias.

      Overall, groups I-VII cover an exhaustive set of target areas on the RBD (including the lone glycan site in Group-II), while groups VII and IX are representative areas on NTD and S2. Using group-average IC50s and suitable normalization, as described in our response to point 2 above, further prevents potential biases due to unequal numbers of Nbs modeled from each group.

      We have modified the text with the following:

      “For computational epitope modeling, we selected nanobody candidates using a series of experimentally obtained structural restraints, as described in Mast, Fridy et al. 2021.”

      (4) Page 5, Lines 161-167: It would be good to include Fig S1 as a main figure as it places the epitope landscape of nanobodies being investigated in this manuscript into the broader context of clinically approved monoclonal antibody therapeutics for COVID-19.

      We have amended the figures to accommodate the reviewer's suggestion. Figure S1 is now Figure 2.

      (5) Page 6, Lines 173-175: The neutralization breadth for S1-46 is quite encouraging. Any speculations on why this particular nanobody is so broadly targeting? Any additional thoughts on why its high binding affinity (nM) did not translate into strong neutralization (as it is in the 0.1-1 uM range)?

      S1-46 binds a region on spike that is conserved across all variants observed to date. Its epitope is difficult to access unless the RBD is in the up conformation, which may explain why monoclonal antibodies rarely bind. We state this in the text as follows:

      “S1-46 binds a region on spike that is conserved across all variants to date, but which may be relatively inaccessible and is not targeted by any of the mAbs that previously received EUA by the FDA (Cox, Peacock et al. 2023).”

      Relating neutralization activity to binding activity requires more insight into the mechanisms of binding and activity. Nonetheless, we are also encouraged by S1-46’s breadth and numerous avenues can be pursued to greatly improve its neutralizing activity (e.g. synergistic combinations).

      (6) Page 6, Lines 173-175: For the remaining two nanobodies S1-31 and S1-RBD-11 in group VII, the target epitopes on the spike proteins of either delta or BA.1 do not seem to bear any mutations, at least based on the mutation maps in Fig 1B. Yet their neutralizing capacities against delta and BA.1 variants were abolished. Do the authors have any idea about what is going on here?

      For group VII, only the epitope of S1-46 was mapped whereas S1-31 and S1-RBD-11 were assigned to group VII based on our lower resolution binning experiments. Thus, without knowing precisely where they bind, we can make only limited conclusions at this time. In the absence of supporting structural information, we speculate that the epitopes of RBD-11 and S1-31 may be in a region that overlaps with or is in close proximity to a mutation that could affect the binding of the nanobody enough to result in loss of neutralizing ability.

      (7) Page 7, Line 195-200: Please provide PRNT50 or logPRNT50 for the five nanobodies selected for BA.4/5 PRNT assay.

      We have added this suggested information. Additionally, a supporting table (Table S1) is now provided.

      (8) Page 8, Lines 223-224: Similar to comment 3, what was the rationale here for choosing certain nanobodies over others for structural modeling and visualizing the binding heatmap in Fig 2B?

      The set of nanobodies chosen for structural modeling and visualization of neutralization data is identical to the set of anti-RBD nanobodies chosen for binding.

      (9) Page 11, Lines 326-328: Can the authors include mutation maps as part of Fig 4C to show the mutation distributions on the XBB/BQ.1/BQ.1.1 spikes?

      We have updated and added a supplemental figure to accompany Fig. 5 (called “supplement for Figure 5”) showing the mutation maps.

      (10) Page 14, Line 409-418: This paragraph is well considered. Given the large number of nanobodies assessed in this manuscript, it would be helpful if the authors could highlight some candidate nanobodies as lead candidates for further optimization.

      While our intention in this manuscript was not to provide targeted recommendations for lead candidates, but rather to reiterate the collective potential of a Nb pool originally targeted towards the 2019 Wuhan variant, the reviewer's point is interesting. We speculate that any of the Nbs we have demonstrated to show pan-VoC activity would be prime candidates for further optimization.

      We have added a statement to this effect as follows: “We propose that any of the Nbs we have demonstrated to show pan-VoC activity, would be prime candidates for further optimization.”

      Reviewer #2 (Recommendations For The Authors):

      Major concerns:

      (1) The main message of the article is the prediction that nanobodies that retain binding to the different SARS-CoV-2 variants including early Omicron strains will retain binding and neutralization against currently circulating strains such as XBB and BQ. However, no evidence either via modeling or experimental testing has been provided for that prediction. The study will benefit from mapping amino acid mutations in RBD of XBB and BQ lineages compared to BA.4/5 and demonstrating via computational docking that epitopes of the five nanobodies that retain binding to BA.4/5 RBD are not affected. For example, the crystal structure of XBB.1 RBD PDB:8OIV is available. Binding/neutralization experiments with currently circulating SARS-CoV-2 strains would still be the gold standard test given the fact that only five out of 41 nanobodies retained binding and neutralization to BA.4/5 lineage. Loss of neutralization ability against BA.4/5 without a significant decrease in binding affinity for nanobodies S1-46 and S1-RBD-22 further indicates that neutralization of XBB and BQ lineage should be performed.

      The docking protocol used to predict the spike epitopes uses a C-alpha resolution to represent protein residues, and is data-driven, i.e. it assumes that binding happens in the first place, and then utilizes experimentally obtained structural restraints. So, concluding possible binding from such a docking protocol alone would be noisy. In our revised manuscript we have a new Figure 3B, which shows epitopes of 4 out of the 5 pan-VoC nanobodies, i.e. S1-RBD-{9, 22, 40} and S1-46, mapped to the RBD structures of XBB.1 (8IOU) and BQ.1.1 (8FXC), and we have updated Figure 4 with a supplemental figure showing the mutation maps.

      (2) The described nanobodies are positioned as very potent neutralizers of SARS-CoV-2. However, they are much less potent in neutralization of the ancestral strain as well as early VOCs compared to the mAbs that were approved for COVID-19 treatment. For example, the IC50s for casirivimab and imdevimab are 37.4 pM and 42.1 pM, respectively. That is about 27-fold lower than the IC50 of the most potent nanobody reported in the article, S1-RBD-15.

      This comparison is fraught for several reasons. First, differences between pseudovirus assay systems usually produce significant differences in reported IC50s; an IC50 is not an absolute measure, nor is it ultimately comparable to clinical IC50 values. For this reason, in our original publication (Mast et al., 2021) we tested other nanobodies in our experimental set-up as benchmarks. Second, a typical monoclonal antibody has two binding sites joined by a large structural Fc region, and the whole molecule is ~10 times the size of a nanobody. In a therapeutic setting where monoclonal therapy is provided in g per kg of patient body weight, an equal mass of nanobody therefore offers a 5-fold excess of binding sites over the antibody. Third, we have previously shown that dimerizing our nanobodies (to produce two antigen-binding sites) can dramatically increase potency, by over 100-fold (Mast et al., 2021).

      In order to make this even clearer in the manuscript, we have added the following: “We note that IC50s are not directly comparable across different experimental set-ups because measured values are highly dependent on the experimental conditions. For this reason, we included other published nanobodies as benchmarks in our original publication and have subsequently maintained standard experimental conditions (Mast, Fridy et al. 2021)”.
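
      The 5-fold figure in the response above follows from simple arithmetic on molecular mass and valency. The sketch below assumes a 15 kDa monovalent nanobody and a roughly 150 kDa bivalent antibody; the 150 kDa value is an approximation consistent with the "~10 times the size" statement, not a number given in the response, and the function name is hypothetical.

```python
def binding_sites_per_gram(mass_kda, sites_per_molecule):
    """Moles of antigen-binding sites contained in one gram of protein."""
    grams_per_mole = mass_kda * 1000.0  # 1 kDa corresponds to 1000 g/mol
    return sites_per_molecule / grams_per_mole


# Assumed masses: ~15 kDa nanobody (one site), ~150 kDa antibody (two sites).
nanobody_sites = binding_sites_per_gram(15.0, 1)
antibody_sites = binding_sites_per_gram(150.0, 2)

# Ratio of binding sites delivered by equal masses of the two molecules.
excess = nanobody_sites / antibody_sites
print(f"binding-site excess at equal mass: {excess:.1f}-fold")
```

      Ten times as many molecules per gram, each with half as many binding sites, gives the 5-fold excess of binding capacity at equal dose by mass.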

      (3) Figure 1A. If each dot represents an independent measurement of the same nanobody, the IC50 variation seems too high. For some nanobodies it spans almost an order of magnitude, e.g., S1-RBD-24, S1-RBD-46, S2-3. Why is that?

      We have deliberately explored the full range of effects that could contribute to experimental variability in our pseudovirus assay, using different batches of nanobody and pseudovirus in each replicate to provide as impartial and comprehensive an analysis as possible. While the activity of some nanobodies is remarkably stable from batch to batch, others show the variation noticed by the Reviewer, which is why we performed multiple replicates to define the average IC50 value for our nanobodies.

      (4) The drop in IC50 for BA.1 neutralization is about one log for the majority of tested nanobodies. This should be outlined in the text. For example, for the most potent neutralizer, S1-RBD-15, the drop in IC50 for BA.1 is about 100-fold compared to the IC50 for the Delta and Wuhan strains. It is important to note that, of the 9 nanobodies for which the drop in neutralizing capacity against the BA.1 and Delta variants is less than one order of magnitude, 2 have epitopes in the S2 domain of the SARS-CoV-2 spike. Resistance of mAbs targeting the S2 part of the spike has been extensively described in the literature as being due to the highly conserved structure of this region that facilitates membrane fusion. The presented data demonstrate that >80% of the nanobody repertoire is affected by mutations on the spike protein. Additionally, it would be helpful for readers if the fold-change in IC50 between Wuhan, Delta, and BA.1 were presented in the text or added to Figure 1 or a table.

      We agree with the Reviewer and to make this more explicit we have made the following change: “In comparison, groups I, I/II, I/IV, V, VII, VIII and the anti-S2 nanobodies contained the majority of omicron BA.1 neutralizers, though here the neutralization potency of many nanobodies was generally decreased tenfold compared to wild-type (emphasis added).”
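
      For readers who want the fold-change the reviewer asks about, it is the ratio of the variant IC50 to the reference IC50, and the "log drop" is its base-10 logarithm. The helper names and IC50 values below are hypothetical and only illustrate the arithmetic; they are not measurements from the study.

```python
import math


def fold_change(ic50_variant, ic50_reference):
    """How many times higher the variant IC50 is than the reference IC50."""
    return ic50_variant / ic50_reference


def log10_drop(ic50_variant, ic50_reference):
    """Loss of potency in orders of magnitude (1.0 means a tenfold drop)."""
    return math.log10(fold_change(ic50_variant, ic50_reference))


# Hypothetical values in molar units: 10 nM against wild type, 1 uM against BA.1.
wild_type_ic50 = 1e-8
ba1_ic50 = 1e-6
print(round(fold_change(ba1_ic50, wild_type_ic50)))    # about 100-fold
print(round(log10_drop(ba1_ic50, wild_type_ic50), 2))  # about 2 logs
```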

      (5) The authors should either present the results of the formal correlation analysis or avoid using misleading verbiage such as: "the decrease in neutralization potency largely correlates with the accumulation of omicron BA.1 specific mutations throughout the RBD" or "significant decrease in binding affinity correlated to decreases neutralization potency".

      We thank the Reviewer for this constructive feedback. To address this question, we have performed a correlation analysis using Pearson and Spearman's methods to quantitatively assess the relationship between nanobody neutralization potency (IC50) and binding affinity (KD) across SARS-CoV-2 variants, including the wildtype, delta, and omicron BA.1 variants. Our results indicate a statistically significant correlation for the delta variant (Pearson's PCC: 0.71, p-value: 0.01; Spearman's rho: 0.63, p-value: 0.07), supporting our statement regarding the correlation between decreased neutralization potency and reduced binding affinity for this variant. However, for the wildtype and omicron BA.1 variants, the correlations were not statistically significant (wildtype Pearson's: 0.10, p-value: 0.70; omicron BA.1 Pearson's: 0.27, p-value: 0.31), which we acknowledge does not fully align with the verbiage used in the manuscript. Therefore, we have revised the manuscript to present the correlation analysis data accurately and ensure the discussion is reflective of the statistical evidence as follows:

      “SPR binding assessments to the spike S1 domain or RBD of delta revealed a pattern: nanobodies maintaining binding affinity generally also neutralized the virus with a statistically significant correlation between binding affinity and neutralization efficacy (Pearson's Correlation Coefficient: 0.71, p-value: 0.01; Spearman's rho: 0.63, p-value: 0.07). However, this correlation was not statistically significant for omicron BA.1 (Pearson's Correlation Coefficient: 0.27, p-value: 0.31) (Fig. 3A, Table 1). Notably, while some nanobodies bound to the variants, they did not consistently neutralize them, suggesting additional factors influence neutralization beyond mere binding.”
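
      A correlation analysis of this kind can be sketched with the standard library alone. The response does not state which scale or software was used, so the log-transformed inputs, the function names, and the example numbers below are all assumptions for illustration. The sketch computes coefficients only; p-values like those reported above would come from a statistics package such as SciPy (`scipy.stats.pearsonr`, `scipy.stats.spearmanr`).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = shared_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired measurements for five nanobodies (molar units).
ic50 = [2e-9, 5e-8, 1e-7, 8e-7, 3e-6]
kd = [1e-10, 4e-9, 2e-9, 6e-8, 9e-8]
potency = [-math.log10(v) for v in ic50]   # higher means more potent
affinity = [-math.log10(v) for v in kd]    # higher means tighter binding
print(pearson(potency, affinity), spearman(potency, affinity))
```

      Working on the -log10 scale keeps a few very weak binders from dominating the Pearson coefficient; Spearman's rho is unaffected by that choice because it depends only on rank order.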

      (6) Figure 3 shows approximated curves for live virus neutralization assay with quite a broad 90% CI. It will be helpful to present, at least, in supplementary, primary data for live-virus neutralization that were used to perform non-linear regression.

      We have added the reviewer’s suggestion.

      (7) It is not clear what are the "variant-specific nanobody groups" exactly? A definition/description of the term is not provided. If the nanobody library was generated with the Wuhan strain, how did strain-specific nanobodies that bind/neutralize only Delta, BA.1 or BA.4/5 appear in the repertoire and were isolated? This statement also contradicts data in Table 4 where all nanobodies listed bind and neutralize Wuhan strain.

      We agree with the reviewer. All nanobodies tested bind/neutralize the Wuhan strain as they were selected from our original repertoire of 116 nanobodies (Mast, et al., 2021). To clarify, variant-specific nanobodies are nanobodies that bind only one variant that arose from the original Wuhan strain. They were categorized into variant-specific groups based on whether they were able to bind each variant (other than Wuhan).

      We have thus added to the manuscript, “we define variant-specific nanobodies as nanobodies that bind a single additional variant alongside the original Wuhan strain...”

      (8) Describing the categorization of nanobody epitope groups presented in Figure 4, the authors state that binding to Wuhan, Delta, BA.1, and BA.4/5 predicts that these nanobodies will be "effective binders against current circulating strains of the virus including XBB and BQ lineages". How exactly does this conclusion follow from the data shown?

      The epitopes of XBB and BQ.1 are not divergent enough within the regions we propose the nanobodies to bind, to suggest that nanobodies that bind in those regions will lose binding ability. We hypothesize that the region at which these nanobodies bind represents regions on spike that are vulnerable to our specified nanobodies in Fig. 4. We have generated a new Fig. 3B and added a supporting figure for Fig. 4 to address this.

      (9) Figures 4C and 6 describe how the nanobodies will retain binding to currently circulating strains of XBB lineage. However, epitopes are mapped on the same Wuhan, Delta, BA.1, and BA.4/5 virus strains. The predicted binding of nanobodies to XBB lineage RBD is not actually shown in Figure 6. It is clear from the figure that the nanobody binding footprint (red area) decreases with antigenic distance in every spike projection from Wuhan through the BA.4/5 strain. It is unclear how this indicates that nanobodies will remain active against even more distant XBB, BQ, EU, and CH strains accumulating more mutations in spike protein.

      We have added the following to the manuscript to clarify: “Strikingly, we have in our cohort 8 nanobodies able to bind delta, and the omicron lineages BA.1/BA.4/BA.5/XBB/BQ.1.1 (Fig. 5B). We further predict these 8 nanobodies will be effective binders against current circulating strains of the virus including omicron EG.5 and HV.1 as the epitope regions (or predicted epitopes) of these nanobodies do not vary significantly from omicron lineages XBB and BQ.1.1 (Fig. 5C and Supplement to Fig. 5).”

      (10) Despite major advances in the development of nanobodies as therapeutic molecules there are only a few nanobody-based drugs that have so far been approved for clinical use and all of them are nanobody fusions to immunoglobulin Fc fragment. It is dictated by the small size of the nanobody itself, 15 kDa molecule, that leads to rapid kidney clearance within hours post-injection, and also by the necessity of having antibody effector functions allowing for example killing of malignant cells. It is hard to predict how each individual nanobody will tolerate multimerization and if it will still retain binding ability as its size dramatically increases. It should be noted that IC50 for BA.4/5 is in the submicromolar range for the 5 nanobodies retaining neutralization of this strain. From a therapeutic perspective, this is quite a high IC50 that dictates a high dosage to achieve a therapeutic effect. Furthermore, it can be expected that additional mutations in the SARS-CoV-2 spike will further affect binding affinity and therefore reduce the neutralization ability of these nanobodies resulting in even higher doses required to achieve therapeutic effect. Therefore, authors should discuss the limitations of the nanobody approach as a therapeutic intervention more granularly.

      While Fc fusions are not strictly required for clinical use (for instance, Caplacizumab is not an Fc fusion, being a multimer containing an albumin-binding nanobody), we agree that reformulation would indeed be required to optimize pharmacokinetics for eventual clinical use. Increased valency through multimerization is in fact one of several strategies, which also include synergistic combinations, for significantly enhancing effective IC50. Preclinical nanobody engineering is not within the scope of this paper, but we acknowledge this challenge.

      Minor points:

      (1) Table S1 is missing.

      This is an .xlsx file uploaded as Supplementary File 3. Labeled now as “Figure 6–Source data 2. Neutralization data from synergy experiment”.

      (2) Because Table 1 summarizes all neutralization and binding data, it will be helpful to refer to it while describing data presented in Figure 1.

      This has been added to the revised manuscript.

      (3) Live SARS-CoV-2 PRNT is not described in Materials and Methods.

      This has been added to the revised manuscript.

    2. eLife assessment

      This study presents important insights on the impact of SARS-CoV-2 variants on the binding and neutralization of a small library of nanobodies. The authors should be applauded for their comprehensive in vitro and in silico analyses of nanobody targeting of SARS-CoV-2 variants. The evidence supporting the claims of the authors is now convincing. This work will be of great interest to researchers in the fields of antibody/nanobody engineering and SARS-CoV-2 therapeutics.

    3. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Ketaren, Mast, Fridy et al. assessed the ability of a previously generated llama nanobody library (Mast, Fridy et al. 2021) to bind and neutralize SARS-CoV-2 delta and omicron variants. The authors identified multiple nanobodies that retain neutralizing and/or binding capacity against delta, BA.1 and BA.4/5. Nanobody epitope mapping on spike proteins using structural modeling revealed possible mechanisms of immune evasion by viral variants as well as mechanisms of cross-variant neutralization by nanobodies. The authors additionally identified two nanobody pairs involving non-neutralizing nanobodies that exhibited synergy in neutralization against the delta variant. These results enabled the refinement of target epitopes of the nanobody repertoire and the discovery of several pan-variant nanobodies for further preclinical development.

      Strengths:

      Overall, this study is well executed and provides a valuable framework for assessing the impact of emerging SARS-CoV-2 variants on nanobodies using a combination of in vitro biochemical and cellular assays as well as computational approaches. There are interesting insights generated from the epitope mapping analyses, which offer possible explanations for how delta and omicron variants escape nanobody responses, as well as how some nanobodies exhibit cross-variant neutralization capacity. These analyses laid out a clear path forward for optimizing these promising next-gen therapeutics, particularly in the face of rapidly emerging SARS-CoV-2 variants. This work will be of interest to researchers in the fields of antibody/nanobody engineering, SARS-CoV-2 therapeutics, and host-virus interaction.

      Weaknesses:

      A main weakness of the study is that the efficacy statement is not thoroughly supported. While the authors comprehensively characterized the neutralizing ability of nanobodies in vitro, there is no animal data involving mice or hamsters to demonstrate the real protective efficacy in vivo. Yet, in the title and throughout the manuscript, the authors repeatedly used phrases like "retains efficacy" or "remains efficacious" to describe the nanobodies' neutralization or binding capacities. This claim is not well supported by the data and underestimates the impact of variants on the nanobodies, especially the omicron sublineages. For example, the authors showed that S1-RBD-15 had a ~100-fold reduction in neutralization titer against Omicron, with an IC50 at around 1 uM. This is much higher than the IC50 value of a typical anti-ancestral RBD nanobody reported in the previous study (Mast, Fridy et al. 2021). In fact, the authors themselves ascribe nanobodies with an IC50 above 1 uM as weak neutralizers. And there were many in the range of 0.1-1 uM. Furthermore, many nanobodies selected for affinity measurement against BA.4/5 had no detectable binding. Without providing in vivo protection data or including monoclonal antibodies that are known to be efficacious against variants in the in vitro assays as a benchmark, it is difficult to evaluate the efficacy just with the IC50 values.

      Comments post revision:

      The authors are to be commended for their comprehensive response to the referees' comments. In the revised manuscript, the authors made extensive changes throughout the text and added new figures that greatly improved its clarity. While the manuscript is still limited in solely relying on in vitro data for efficacy assessment, it nicely demonstrates how the combination of experimental and computational techniques could lead to the discovery of broadly neutralizing nanobody candidates for further lead optimization.

    4. Reviewer #2 (Public Review):

      Summary:

      Interest in using nanobodies for therapeutic interventions in infectious diseases is growing due to their ability to bind hidden or cryptic epitopes that are inaccessible to conventional immunoglobulins. In the presented study, the authors set out to characterize nanobodies derived from the library produced earlier with the Wuhan strain of SARS-CoV-2, map their epitopes on the SARS-CoV-2 spike protein, and demonstrate that some nanobodies retain binding and even neutralization against antigenically distant, newly emerging Variants of Concern (VOCs).

      Strengths:

      The authors demonstrate that some nanobodies, despite being raised against the ancestral virus strain, retain high-affinity binding to antigenically distant SARS-CoV-2 strains even though the majority of the repertoire loses binding. Although limited to only two nanobody combinations, the demonstration of synergy in virus neutralization between nanobodies targeting different epitopes is compelling. The ability of nanobodies to bind emerging virus strains has been demonstrated and the possible effect of mutations within epitopes has been thoroughly discussed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Review:

      Reviewer #1:

      Summary:

      The Roco proteins are a family of GTPases characterized by the conserved presence of an ROC-COR tandem domain. How GTP binding alters the structure and activity of Roco proteins remains unclear. In this study, Galicia C et al. took advantage of conformation-specific nanobodies to trap CtRoco, a bacterial Roco, in an active monomeric state and determined its high-resolution structure by cryo-EM. This study, in combination with the previous inactive dimeric CtRoco, revealed the molecular basis of CtRoco activation through GTP-binding and dimer-to-monomer transition.

      Strengths:

      The reviewer is impressed by the authors' deep understanding of the CtRoco protein. Capturing Roco proteins in a GTP-bound state is a major breakthrough in the mechanistic understanding of the activation mechanism of Roco proteins and shows similarity with the activation mechanism of LRRK2, a key molecule in Parkinson's disease. Furthermore, the methodology the authors used in this manuscript - using conformation-specific nanobodies to trap the active conformation, which is otherwise flexible and resistant to single-particle averaging - is highly valuable and inspiring.

      Weakness:

      Though written with good clarity, the paper will benefit from some clarifications.

      (1) The angular distribution of particles for the 3D reconstructions should be provided (Figure 1 - Sup. 1 & Sup. 2).

      Figure 1 – Figure supplements 1 and 2 now contain particle distribution plots.

      (2) The B-factors for protein and ligand of the model, Map sharpening factor, and molprobity score should be provided (Table 1).

      Table 1 now contains B-factors and molprobity scores.

      The map used to interpret the model was post-processed by density modification, and therefore no data concerning sharpening factors are provided in the output.

      (3) A supplemental Figure to Figure 2B, illustrating how α0-helix interacts with COR-A&LRR before and after GTP binding in atomic details, will be helpful for the readers to understand the critical role of α0-helix during CtRoco activation.

      This is now illustrated in the new Figure 2 – Figure Supplement 1.

      (4) For the following statement, "On the other hand, only relatively small changes are observed in the orientation of the Roc α3 helix. This helix, which was previously suggested to be an important element in the activation of LRRK2 (Kalogeropulou et al., 2022), is located at the interface of the Roc and CORB domains and harbors the residues H554 and Y558, orthologous to the LRRK2 PD mutation sites N1437 and R1441, respectively." It is not surprising the α3-helix of the ROC domain only has small changes when the ROC domain is aligned (Figure 2E). However, in the study by Zhu et al (DOI: 10.1126/science.adi9926), it was shown that α3-helix has a "see-saw" motion when the COR-B domain is aligned. Is this motion conserved in CtRoco from inactive to active state?

      We indeed describe the conformational changes from the perspective of the Roc domain. When using the COR-B domain for structural alignment, a rotational movement of Roc (including a “seesaw”-like movement of the α3-helix around His554) with respect to COR-B is correspondingly observed.

      This is now added to Figure 2E. Additionally, the text was adapted to:

      “Interestingly, this rotational movement of CORB seems to use the H554-Y558-Y804 triad on the interface of Roc and CORB as a pivot point (Figure 2E). Mutation of any of the corresponding residues in LRRK2 (N1437, R1441, Y1699, respectively) is associated with PD and leads to LRRK2 activation. Residues H554 and Y558 are located on the Roc α3 helix, which was previously suggested to be an important element in the activation of LRRK2 (Kalogeropulou et al., 2022). Indeed, while the orientation of the α3 helix with respect to the rest of the Roc domain only undergoes small changes upon GTPγS binding, it can be observed that this helix undergoes a “seesaw-like” movement with respect to the CORB domain. A similar rearrangement was previously also observed for Rab29-mediated activation of human LRRK2 (Störmer et al., 2023; Zhu et al., 2022).”

      (5) A supplemental figure showing the positions of and distances between NbRoco1 K91 and Roc K443, K583, and K611 would help the following statement. "Also multiple crosslinks between the Nbs and CtRoco, as well as between both nanobodies were found. ... NbRoco1-K69 also forms crosslinks with two lysines within the Roc domain (K583 and K611), and NbRoco1-K91 is crosslinked to K583".

      A figure displaying these crosslinks is now provided as Figure 4–figure supplement 1. However, in interpreting these crosslinks it should be taken into consideration that the additive length of the DSSO spacer and the lysine side chains leads to a theoretical upper limit of ∼26 Å for the distance between the α carbon atoms of cross-linked lysines (and even a cut-off distance of 35 Å when taking into account protein dynamics).
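
      The distance criterion described above can be sketched as a simple check. This is only an illustrative sketch, not the authors' analysis pipeline: the function names and the example coordinates are hypothetical, and only the two cutoffs (about 26 Å, and 35 Å when allowing for protein dynamics) are taken from the response.

```python
import math

# Cutoffs quoted in the response for DSSO crosslinks (Cα-Cα distances, in Å).
DSSO_MAX_STATIC = 26.0   # spacer plus two lysine side chains
DSSO_MAX_DYNAMIC = 35.0  # more permissive cutoff allowing for protein dynamics


def ca_distance(a, b):
    """Euclidean distance between two Cα positions given as (x, y, z) in Å."""
    return math.dist(a, b)


def classify_crosslink(a, b):
    """Label a lysine pair by how its Cα-Cα distance compares with the cutoffs."""
    d = ca_distance(a, b)
    if d <= DSSO_MAX_STATIC:
        return "satisfied"
    if d <= DSSO_MAX_DYNAMIC:
        return "satisfied only with dynamics"
    return "violated"


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only (not taken from the model).
    lys_a = (0.0, 0.0, 0.0)
    lys_b = (30.0, 0.0, 0.0)
    print(round(ca_distance(lys_a, lys_b), 1), classify_crosslink(lys_a, lys_b))
```

      In practice the coordinates would be read from the deposited model for each crosslinked lysine pair; the three-way label simply makes explicit which crosslinks are compatible with the static structure and which require some flexibility.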

      (6) It would be informative to show the position of CtRoco-L487 in the NF and GTP-bound state and comment on why this mutation favors GTP hydrolysis.

      L487 is located in Switch 1, which is a critical region for nucleotide binding and hydrolysis. Unfortunately, most probably due to flexibility, the Switch 1 region could not be entirely modeled (in either nucleotide state). Since L487 is located on the edge of the interpretable portion of the Switch 1 in both structures (see Author response image 1 below), any interpretation regarding the role of this residue would be highly speculative.

      Author response image 1.

      The following text was added to the Results section:

      “Also the Switch 1 loop could not be fully modeled in our structure, presumably indicating some flexibility in this region despite the presence of a GTP analogue. Interestingly, the Switch 1 loop harbors the site of the PD-analogous L487A mutation that leads to a stabilization of the CtRoco dimer with a concomitant decrease in GTPase activity (Deyaert et al., 2019). Unfortunately, an exact interpretation of this effect of the L487A mutation is hampered by the lack of a well resolved Switch 1 loop.”

      Reviewer #2:

      Summary

      The manuscript by Galicia et al describes the structure of the bacterial GTPyS-bound CtRoco protein in the presence of nanobodies. The major relevance of this study is in the fact that the CtRoco protein is a homolog of the human LRRK2 protein with mutations that are associated with Parkinson's disease. The structure and activation mechanisms of these proteins are very complex and not well understood. Especially lacking is a structure of the protein in the GTP-bound state. Previously the authors have shown that two conformational nanobodies can be used to bring/stabilize the protein in a monomer-GTPyS-bound state. In this manuscript, the authors use these nanobodies to obtain the GTPyS-bound structure and importantly discuss their results in the context of the mammalian LRRK2 activation mechanism and mutations leading to Parkinson's disease. The work is well performed and clearly described. In general, the conclusions on the structure are reasonable and well-discussed in the context of the LRRK2 activation mechanism.

      Strengths:

      The strong points are the innovative use of nanobodies to stabilize the otherwise flexible protein and the new GTPyS-bound structure that helps enormously in understanding the activation cycle of these proteins.

      Weakness:

      The strong point of the use of nanobodies is also a potential weak point; these nanobodies may have induced some conformational changes in a part of the protein that will not be present in a GTPyS-bound protein in the absence of nanobodies.

      Two major points need further attention.

      (1) Several parts of the protein are very flexible during the monomer-dimer activity cycle. This flexibility is crucial for protein function, but obviously hampers structure resolution. Forced experiments to reduce flexibility may allow better structure resolution, but at the same time may impede the activation cycle. Therefore, careful experiments and interpretation are very critical for this type of work. This especially relates to the influence of the nanobodies on the structure that may not occur during the "normal" monomer-dimer activation cycle in the absence of the nanobodies (see also point 2). So what is the evidence that the nanobody-bound GTPyS-bound state is biochemically a reliable representative of the "normal" GTP-bound state in the absence of nanobodies, and that the obtained structure can therefore be confidently used to interpret the activation mechanism as done in the manuscript?

      See below for an answer to remark 1 and 2.

      (2) The obtained structure with two nanobodies reveals that the nanobodies NbRoco1 and NbRoco2 bind to parts of the protein in a way that makes a dimer impossible: to the α0-helix of the linker between Roc-COR and LRR, and to the cavity of the LRR that in the dimer binds to the dimerizing domain CORB, respectively. It is likely the open monomer GTP-bound structure is recognized by the nanobodies in the camelid, suggesting that overall the open monomer structure is a true GTP-bound state. However, it is also likely that the binding energy of the nanobody is used to stabilize the monomer structure. It is not automatically obvious that in the details the obtained nanobody-Roco-GTPyS structure will be identical to the "normal" Roco-GTPyS structure. What is the influence of nanobody-binding on the conformation of the domains where they bind; the binding energy may be used to stabilize a conformation that is not present in the absence of the nanobody. For instance, NbRoco1 binds to the α0 helix of the linker; what is here the "normal" active state of the Roco protein, and is e.g. the angle between Roc-COR and LRR also rotated by 135 degrees? Furthermore, nanobody NbRoco2 in the LRR domain is expected to stabilize the LRR domain; it may allow a position of the LRR domain relative to the rest of the protein that is not present without nanobody in the LRR domain. I am convinced that the observed open structure is a correct representation of the active state, but many important details have to be supported by e.g. their CX-MS experiments, and in the end probably need confirmation by more structures of other active Roco proteins or confirmation by a more dynamic sampling of the active states by e.g. molecular dynamics or NMR.

      Recently, nanobodies have increasingly been used successfully to obtain structural insights into protein conformational states (reviewed in Uchański et al, Curr. Opin. Struc. Biol. 2020). As Reviewer #2 points out, the concern is sometimes raised that antibodies could distort a protein into non-native conformations. Here, it is important to note that the nanobodies were raised by immunizing a llama with the fully native CtRoco protein bound to a non-hydrolysable GTP analogue, after which the nanobodies were selected by phage display using the same fully native and functional form of the protein. As clearly explained in Manglik et al. Annu Rev Pharmacol Toxicol. 2017, the probability of an in vivo matured nanobody inducing a non-native conformation of the antigen is low, although it is possible that it selects a high-energy, low-population conformation of a dynamic protein. Immature B cells require engagement of displayed antibodies with antigen to proliferate and differentiate during clonal selection. Antibodies that induce non-native conformations of the antigen pay a substantial energetic penalty in this process, and B cell clones displaying such antibodies will have a significantly lower probability of proliferation and differentiation into mature antibody-secreting B lymphocytes. Hence, many recent experiments and observations give credence to the notion that nanobodies bind antigens primarily by conformational selection and not induced fit (e.g. Smirnova et al. PNAS 2015).

      Extrapolated to the case of CtRoco, which is clearly very flexible in its GTP-bound form, this means that the nanobodies are able to trap and stabilize one conformational state that is representative of the “active state” ensemble of the protein. In this respect, it is clear from our experiments (XL-MS, affinity and effect on GTPase activity) that the effects of NbRoco1 and NbRoco2 are additive (or even cooperative), meaning that both nanobodies recognize different features of the same CtRoco “active state”. Correspondingly, the monomeric, elongated “open” conformation is also observed in the structure of CtRoco bound to NbRoco1 only (Figure 1 – Figure supplement 2), although this structure still displays more flexibility. The monomerization and conformational changes that we observe and describe in the current paper at high resolution are also in very good agreement with earlier observations for CtRoco in the GTP-bound form in the absence of any nanobodies, including negative stain EM (Deyaert et al. Nature Commun, 2017), hydrogen-deuterium exchange experiments (Deyaert et al. Biochem. J. 2019) and native MS (Leemans et al. Biochem J. 2020).

      In the revised manuscript we added the following text to the discussion:

      “To decrease this flexibility, we have now used two previously developed conformation-specific nanobodies (NbRoco1 and NbRoco2) to stabilize the protein in the GTP-state (Leemans et al., 2020), allowing us to solve its structure using cryo-EM (Figure 1). Recently, Nbs have successfully been used to obtain structural insights into the conformational states of a number of highly dynamic proteins (Uchański et al, 2020). These studies established that Nbs bind antigens primarily by conformational selection rather than by induced fit (Manglik et al., 2017; Smirnova et al., 2015). Since NbRoco1 and NbRoco2 were generated by immunization with fully native CtRoco bound to a non-hydrolysable GTP analogue, and subsequently selected by phage display using the same functional protein, it is thus safe to assume that these Nbs bind to and stabilize a relevant conformation that is present within the “active” CtRoco conformational space (Leemans et al., 2020). Moreover, our current structures are also in very good agreement with previous biochemical studies and data from HDX-MS and negative stain EM (Deyaert et al., 2019; Deyaert, Wauters, et al., 2017).”

      Recommendations for the authors:

      Reviewer #1:

      (1) Figure 2C: please label the residues with meshes (switch 2).

      Labels have been added to figure 2C.

      (2) A supplemental figure for the following statement will be helpful "A remarkable feature of the CtRoco dimer structure was the dimer-stabilized orientation of the P-loop, which would hamper direct nucleotide binding on the dimer. Correspondingly, in the current structure, the P-loop changes orientation, allowing GTPγS to bind, although the EM map does not allow unambiguous placement of the entire P-loop. Surprisingly, also the Switch 1 loop could not be fully modeled, which could indicate some flexibility in this region despite the presence of a GTP analog".

      An additional Figure 2–figure supplement 2 has been added to illustrate this.

      (3) A supplemental figure for the following statement will be helpful "A final important observation in the Roc domain concerns the very C-terminal part of Switch 2 (residues 520 to 533), which could not be modeled in our GTP bound structure due to flexibility, while in the nucleotide-free dimer structure this region is structured and located at the interface of the Roc domain with the LRR-Roc linker and CORA. In this way, the conformational changes induced by GTPγS binding could be relayed via the Switch 2 toward the LRR and CORA domains, and vice versa."

      An additional Figure 2–figure supplement 2 has been added to illustrate this.

      (4) A structural comparison of each domain (LRR, ROC, COR) between NF and GTP-bound states will be greatly useful to understand statements in the manuscript, such as "In addition to the C-terminal dimerization part of CORB that becomes unstructured, also other large conformational changes are observed in the CORA and CORB domains of CtRoco upon GTPγS binding."

      We would like to clarify that with this statement we refer to changes in the relative orientation of the domains between the nucleotide-free and GTPγS-bound states, rather than to conformational changes within each domain. These changes in relative orientation are illustrated in Figure 2 and the associated Figure supplements.

      (5) The statement "to a lesser extent, also between CDR1 and the LRR-Roc linker" is not clearly illustrated in Figure 3B.

      The reviewer is correct, and we now also show CDR1 in Figure 3B.

      (6) Extra panels can be added in Figure 1 Sup. 4 to illustrate the following statement "In the density map NbRoco2 can easily be identified and placed on the concave side of the LRR domain... N-terminal and C-terminal β-strands interacting with the very C-terminal repeat of the LRR".

      We believe the density map corresponding to NbRoco2 is clearly shown in Figure 1 – supplement 4A. A reference to this figure panel is now added to the main text.

      (7) "In the presence of both Nbs, the hydrolysis rate was increased 4-fold compared to CtRoco-L487A alone and 2-fold compared to CtRoco-L487A in the presence of NbRoco1 only, again illustrating a collaboration between the Nbs (Figure 5C)" Here, is it 6-fold instead of 4-fold?

      The reviewer is correct. We changed this accordingly in the manuscript.

      Reviewer #2:

      (1) At many places in the manuscript the lack of structural details is explained by the assumed local flexibility of the protein. This may be true for many cases (such as linker regions), but is probably not always correct; several other explanations are possible to get no local structural details.

      See our answer to point 2, below.

      (2) At several other places in the manuscript the high flexibility is used to explain the lack of structural details (so the reasoning is reversed compared to point 1); this would require that a priori it is known that the region is flexible and therefore no structure can be expected. An example is found mid-page 8: "A final important observation in the Roc domain concerns the very C-terminal part of Switch 2 (residues 520 to 533), which could not be modeled in our GTP bound structure due to flexibility, while in the nucleotide-free dimer structure this region is structured and located at the interface of the Roc domain with the LRR-Roc linker and CORA." As written there must be a reference to experiments showing the "due to flexibility".

      The reviewer is correct that additional factors might affect the interpretability of the map, such as the small size of the regions used for the focused refinements (around 50 kDa each) or a preferential distribution of orientation of the particles in the grid. Particle distribution plots are now shown in Figure 1 – Figure supplements 1 and 2. However, due to the intrinsic flexible nature of the Switch 1 and Switch 2 regions, we assume this flexibility to be the major cause of lack of features in the EM maps, especially since some of the neighboring regions display well-resolved maps.

      Nevertheless, in the manuscript we reworded our statements to be more careful. For example, on page 8:

      “Also the Switch 1 loop could not be fully modeled in our structure, presumably indicating some flexibility in this region despite the presence of a GTP analogue.”

      “… potentially due to flexibility of this region in the new position of the Switch 2…”

    2. eLife assessment

      The fundamental study by Galicia C. et al. captured the GTP-bound active structure of CtRoco, a homolog of human LRRK2, using conformation-specific nanobodies. This convincing body of work reports the first structure of a GTP-bound ROCO protein, illustrating how GTP facilitates the dimer-to-monomer transition of CtRoco and functional activation.

    3. Reviewer #1 (Public Review):

      Summary:

      The Roco proteins are a family of GTPases characterized by the conserved presence of an ROC-COR tandem domain. How GTP binding alters the structure and activity of Roco proteins remains unclear. In this study, Galicia C et al. took advantage of conformation-specific nanobodies to trap CtRoco, a bacterial Roco, in an active monomeric state and determined its high-resolution structure by cryo-EM. This study, in combination with the previous inactive dimeric CtRoco, revealed the molecular basis of CtRoco activation through GTP-binding and dimer-to-monomer transition.

      Strengths:

      The reviewer is impressed by the authors' deep understanding of the CtRoco protein. Capturing Roco proteins in a GTP-bound state is a major breakthrough in the mechanistic understanding of the activation mechanism of Roco proteins and shows similarity with the activation mechanism of LRRK2, a key molecule in Parkinson's disease. Furthermore, the methodology the authors used in this manuscript - using conformation-specific nanobodies to trap the active conformation, which is otherwise flexible and resistant to single-particle averaging - is highly valuable and inspiring.

    4. Reviewer #2 (Public Review):

      Summary

      The manuscript by Galicia et al describes the structure of the bacterial GTPyS-bound CtRoco protein in the presence of nanobodies. The major relevance of this study is in the fact that the CtRoco protein is a homolog of the human LRRK2 protein with mutations that are associated with Parkinson's disease. The structure and activation mechanisms of these proteins are very complex and not well understood. Especially lacking is a structure of the protein in the GTP-bound state. Previously the authors have shown that two conformational nanobodies can be used to bring/stabilize the protein in a monomer-GTPyS-bound state. In this manuscript, the authors use these nanobodies to obtain the GTPyS-bound structure and importantly discuss their results in the context of the mammalian LRRK2 activation mechanism and mutations leading to Parkinson's disease. The work is well performed and clearly described. In general, the conclusions on the structure are reasonable and well-discussed in the context of the LRRK2 activation mechanism.

      Strengths:

      The strong points are the innovative use of nanobodies to stabilize the otherwise flexible protein and the new GTPyS-bound structure that helps enormously in understanding the activation cycle of these proteins.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The aim of the present work is to evaluate the role of BMP9 and BMP10 in liver by depleting Bmp9 and Bmp10 from the main liver cell types (endothelial cells (EC), hepatic stellate cells (HSC), Kupffer cells (KC) and hepatocytes (H)) using cell-specific cre recombinases. They show that HSCs are the main source of BMP9 and BMP10 in the liver. Using transgenic ALK1 reporter mice, they show that ALK1, the high affinity type 1 receptor for BMP9 and BMP10, is expressed on KC and EC. They have also performed bulk RNAseq analyses on whole liver, and cell-sorted EC and KC, and showed that loss of Bmp9 and Bmp10 decreased KC signature and that KC are replaced by monocyte-derived macrophages. EC derived from these Bmp9fl/flBmp10fl/flLratCre mice also lost their identity and transdifferentiated into continuous ECs. Liver iron metabolism and metabolic zonation were also affected in these mice. In conclusion, this work supports that BMP9 and BMP10 produced by HSC play a central role in mediating liver cell-cell crosstalk and liver homeostasis.

      We appreciate the comprehensive summary of reviewer 1.

      Strengths:

      This work further supports the role of BMP9 and BMP10 in liver homeostasis. Using a specific HSC-Cre recombinase, the authors show for the first time that it is the BMP9 and BMP10 produced by HSC that play a central role in mediating liver cell-cell crosstalk to maintain a healthy liver. Although the overall message of the key role of BMP9 in liver homeostasis has been described by several groups, the role of hepatic BMP10 has not been studied before. Thus, one of the novelties of this work is to have used liver cell specific Cre recombinase to delete hepatic Bmp9 and Bmp10. The second novelty is the demonstration of the role of BMP9 and BMP10 in KC differentiation/homeostasis which has already been slightly addressed by this group by knocking out ALK1, the high affinity receptor of BMP9 and BMP10 (Zhao et al. JCI, 2022).

      We appreciate the positive comment of reviewer 1.

      Weaknesses:

      This work remains rather descriptive and the molecular mechanisms are barely touched upon and could have been more explored. Some references should be added; in particular, a work that has already demonstrated, using a different approach (in situ hybridization RNAscope), that in the liver BMP9 and BMP10 are expressed by HSC (Tillet et al., J Biol Chem 2018). Another publication (Bouvard et al., Cardiovasc Res, 2021) has previously shown that deletion of Bmp9 and Bmp10 leads to liver fibrosis and could have thus been cited. There is also a reference that is not correctly cited. Ref 26 (Herrera et al., 2014) does not say that "BMP10 is mostly expressed in the heart, followed by the liver" or that "BMP9 and BMP10 also bind to ALK2" as cited in the manuscript.

      We agree with the comment of reviewer 1 that the molecular mechanisms were barely investigated in our work. Indeed, it has been reported that BMP9/10 induce the expression of ID1/3 in KCs and of GATA4 and Maf in liver ECs in in vitro culture systems. These master regulators play an important role in the differentiation of the two cell types. Thus, we think that the reduced expression of these master regulators can explain the phenotype in KCs and ECs observed in Bmp9fl/flBmp10fl/flLratCre mice. In addition, according to the reviewer’s suggestion, these references will be added or corrected in our revised manuscript.

      The gating strategies for cell sorting which are used for bulk RNAseq and FACS analyses should be better described in order to better follow the manuscript. This point is particularly important for KC gating as the authors show that Tim4 is very strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2c), yet, it seems that this marker is used for gating macrophages (Suppl fig4). Same question with F4/80 which is strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2d) and also used for gating. It is important to show the gating strategy for both Control and Bmp9fl/flBmp10fl/flLratCre mice.

      The authors should explain how they selected the genes shown on each heatmap and add references that can justify the choice of the genes.

      Thank you for your suggestion. In our study, we used CD45+ Ly6C- F4/80+ CD64+ cells to define liver macrophages. We will delete the Tim4 FACS plot from Suppl fig4 to avoid misunderstanding. Although F4/80 positive cells were reduced in the livers of Bmp9fl/flBmp10fl/flLratCre mice, double staining by anti-F4/80 and anti-CD64 fluorescence antibodies can still clearly distinguish liver macrophages based on the above gating strategy. The gating strategy for both control and Bmp9fl/flBmp10fl/flLratCre mice will be presented in our revised manuscript.

      Quantifications of Immunostaining and FACS data should be added as well as statistical analyses.

      Quantitative data will be added in our revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The authors characterized the contribution of BMP9/BMP10 expression/secretion from all different hepatic cell types and analysed their impact on the other cell types. They are able to show that HSC derived BMP9/BMP10 controls Kupffer cell and EC differentiation and functions.

      We appreciate the comprehensive summary of reviewer 2.

      Strengths:

      This is the first study to my knowledge to comprehensively analyze the contribution of BMP9/BMP10 expression in such a systematic fashion in vivo. This study therefore is a significant contribution to the field and further supports previous studies that have already implicated BMP9 and BMP10 in Kupffer cell and EC functions but did not unravel the intercellular crosstalk in such a detailed fashion.

      We appreciate the positive comment of reviewer 2.

      Weaknesses:

      Several findings, such as the impact of BMP9/10 on Kupffer cells and EC, were already known, so these findings are not innovative. However, I still believe that the elucidation of the cellular crosstalk makes this publication highly interesting to a broad scientific community.

      Overall, the authors achieved their aims and the results support the conclusions and discussion well.

      We appreciate the positive comment of reviewer 2. We agree with the comment of reviewer 2 that although some findings in our paper are somewhat expected, the detailed investigation of the crosstalk between different liver cell types is still needed and beneficial to this field.

    2. eLife assessment

      This valuable study delineates the cellular contributions of BMP signaling in liver development and function. The findings are convincing, and the study employs state-of-the-art molecular, genetic, and cellular approaches to demonstrate that hepatic stellate cells play a central role in liver health by mediating cell-to-cell crosstalk via the production of specific BMP proteins. This study will be of interest to scientists interested in developmental biology and organ physiology.

    3. Reviewer #1 (Public Review):

      Summary:

      The aim of the present work is to evaluate the role of BMP9 and BMP10 in liver by depleting Bmp9 and Bmp10 from the main liver cell types (endothelial cells (EC), hepatic stellate cells (HSC), Kupffer cells (KC) and hepatocytes (H)) using cell-specific cre recombinases. They show that HSCs are the main source of BMP9 and BMP10 in the liver. Using transgenic ALK1 reporter mice, they show that ALK1, the high affinity type 1 receptor for BMP9 and BMP10, is expressed on KC and EC. They have also performed bulk RNAseq analyses on whole liver, and cell-sorted EC and KC, and showed that loss of Bmp9 and Bmp10 decreased KC signature and that KC are replaced by monocyte-derived macrophages. EC derived from these Bmp9fl/flBmp10fl/flLratCre mice also lost their identity and transdifferentiated into continuous ECs. Liver iron metabolism and metabolic zonation were also affected in these mice. In conclusion, this work supports that BMP9 and BMP10 produced by HSC play a central role in mediating liver cell-cell crosstalk and liver homeostasis.

      Strengths:

      This work further supports the role of BMP9 and BMP10 in liver homeostasis. Using a specific HSC-Cre recombinase, the authors show for the first time that it is the BMP9 and BMP10 produced by HSC that play a central role in mediating liver cell-cell crosstalk to maintain a healthy liver. Although the overall message of the key role of BMP9 in liver homeostasis has been described by several groups, the role of hepatic BMP10 has not been studied before. Thus, one of the novelties of this work is to have used liver cell-specific Cre recombinases to delete hepatic Bmp9 and Bmp10. The second novelty is the demonstration of the role of BMP9 and BMP10 in KC differentiation/homeostasis, which has already been slightly addressed by this group by knocking out ALK1, the high affinity receptor of BMP9 and BMP10 (Zhao et al. JCI, 2022).

      Weaknesses:

      This work remains rather descriptive; the molecular mechanisms are barely touched upon and could have been explored further.

      Some references should be added, in particular a study that has already demonstrated, using a different approach (in situ hybridization RNAscope), that in the liver BMP9 and BMP10 are expressed by HSC (Tillet et al., J Biol Chem 2018). Another publication (Bouvard et al., Cardiovasc Res, 2021) has previously shown that deletion of Bmp9 and Bmp10 leads to liver fibrosis and could thus have been cited. There is also a reference that is not correctly cited. Ref 26 (Herrera et al., 2014) does not say that "BMP10 is mostly expressed in the heart, followed by the liver" or that "BMP9 and BMP10 also bind to ALK2" as cited in the manuscript.

      The gating strategies for cell sorting used for bulk RNAseq and FACS analyses should be better described to make the manuscript easier to follow. This point is particularly important for KC gating, as the authors show that Tim4 is very strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2c), yet it seems that this marker is used for gating macrophages (Suppl fig4). The same question applies to F4/80, which is strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2d) and also used for gating. It is important to show the gating strategy for both Control and Bmp9fl/flBmp10fl/flLratCre mice.

      The authors should explain how they selected the genes shown on each heatmap and add references that can justify the choice of the genes.

      Quantifications of Immunostaining and FACS data should be added as well as statistical analyses.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors characterized the contribution of BMP9/BMP10 expression/secretion from all different hepatic cell types and analysed their impact on the other cell types. They are able to show that HSC-derived BMP9/BMP10 controls Kupffer cell and EC differentiation and functions.

      Strengths:

      This is the first study to my knowledge to comprehensively analyze the contribution of BMP9/BMP10 expression in such a systematic fashion in vivo. This study therefore is a significant contribution to the field and further supports previous studies that have already implicated BMP9 and BMP10 in Kupffer cell and EC functions but did not unravel the intercellular crosstalk in such a detailed fashion.

      Weaknesses:

      Several findings, such as the impact of BMP9/10 on Kupffer cells and EC, were already known, so these findings are not innovative. However, I still believe that the elucidation of the cellular crosstalk makes this publication highly interesting to a broad scientific community.

      Overall, the authors achieved their aims and the results support the conclusions and discussion well.

    1. Reviewer #2 (Public Review):

      Summary:

      This work proposes a synaptic plasticity rule that explains the generation of learned stochastic dynamics during spontaneous activity. The proposed plasticity rule assumes that excitatory synapses seek to minimize the difference between the internal predicted activity and stimulus-evoked activity, and inhibitory synapses try to maintain the E-I balance by matching the excitatory activity. By implementing this plasticity rule in a spiking recurrent neural network, the authors show that the state-transition statistics of spontaneous excitatory activity agree with that of the learned stimulus patterns, which are reflected in the learned excitatory synaptic weights. The authors further demonstrate that inhibitory connections contribute to well-defined state transitions matching the transition patterns evoked by the stimulus. Finally, they show that this mechanism can be expanded to more complex state-transition structures including songbird neural data.

      Strengths:

      This study makes an important contribution to computational neuroscience, by proposing a possible synaptic plasticity mechanism underlying spontaneous generations of learned stochastic state-switching dynamics that are experimentally observed in the visual cortex and hippocampus. This work is also very clearly presented and well-written, and the authors conducted comprehensive simulations testing multiple hypotheses. Overall, I believe this is a well-conducted study providing interesting and novel aspects of the capacity of recurrent spiking neural networks with local synaptic plasticity.

      Weaknesses:

      This study is very well-thought-out and theoretically valuable to the neuroscience community, and I think the main weaknesses are in regard to how much biological realism is taken into account. For example, the proposed model assumes that only synapses targeting excitatory neurons are plastic, and uses an equal number of excitatory and inhibitory neurons.

      The model also assumes Markovian state dynamics while biological systems can depend more on history. This limitation, however, is acknowledged in the Discussion.

      Finally, to simulate spontaneous activity, the authors use a constant input of 0.3 throughout the study. Different amplitudes of constant input may correspond to different internal states, so it will be more convincing if the authors test the model with varying amplitudes of constant inputs.

    2. eLife assessment

      This is an important study that investigates how neural networks can learn to stochastically replay presented sequences of activity according to learned transition probabilities. The authors use error-based excitatory plasticity to minimize the difference between internally predicted activity and stimulus-driven activity, and inhibitory plasticity to maintain E-I balance. The approach is solid but the choice of learning rules and parameters is not always justified, lacking a formal derivation and concrete experimental predictions.

    3. Reviewer #1 (Public Review):

      In the presented manuscript, the authors investigate how neural networks can learn to replay presented sequences of activity. Their focus lies on the stochastic replay according to learned transition probabilities. They show that, based on error-based excitatory and balance-based inhibitory plasticity, networks can self-organize towards this goal. Finally, they demonstrate that these learning rules can recover experimental observations from songbird song learning experiments.

      Overall, the study appears well-executed and coherent, and the presentation is very clear and helpful. However, it remains somewhat vague regarding the novelty. The authors could elaborate on the experimental and theoretical impact of the study, and also discuss how their results relate to those of Kappel et al, and others (e.g., Kappel et al (doi.org/10.1371/journal.pcbi.1003511)). Overall, the work could benefit if there was either (A) a formal analysis or derivation of the plasticity rules involved and a formal justification of the usefulness of the resulting (learned) neural dynamics; and/or (B) a clear connection of the employed plasticity rules to biological plasticity and clear testable experimental predictions. Thus, overall, this is a good work with some room for improvement.

    4. Reviewer #3 (Public Review):

      Summary:

      Asabuki and Clopath study stochastic sequence learning in recurrent networks of Poisson spiking neurons that obey Dale's law. Inspired by previous modeling studies, they introduce two distinct learning rules, to adapt excitatory-to-excitatory and inhibitory-to-excitatory synaptic connections. Through a series of computer experiments, the authors demonstrate that their networks can learn to generate stochastic sequential patterns, where states correspond to non-overlapping sets of neurons (cell assemblies) and the state-transition conditional probabilities are first-order Markov, i.e., the transition to a given next state only depends on the current state. Finally, the authors use their model to reproduce certain experimental songbird data involving highly-predictable and highly-uncertain transitions between song syllables.
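      As a minimal illustration of the first-order Markov property described in this summary, the following sketch samples a state sequence from a transition table and then recovers the transition probabilities from bigram counts. It is not the authors' spiking network model; the three states, their probabilities, and the function names are hypothetical and chosen only for illustration.

```python
import random


def sample_markov_sequence(transitions, start, length, seed=0):
    """Sample states where each next state depends only on the current one."""
    rng = random.Random(seed)
    sequence = [start]
    for _ in range(length - 1):
        next_states, probs = zip(*transitions[sequence[-1]].items())
        sequence.append(rng.choices(next_states, weights=probs, k=1)[0])
    return sequence


def estimate_transitions(sequence):
    """Estimate P(next | current) from bigram counts in an observed sequence."""
    counts = {}
    for current, nxt in zip(sequence, sequence[1:]):
        row = counts.setdefault(current, {})
        row[nxt] = row.get(nxt, 0) + 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


# Hypothetical three-state example (illustrative values, not taken from the paper).
TRANSITIONS = {
    "A": {"B": 0.9, "C": 0.1},  # highly predictable transition
    "B": {"A": 0.5, "C": 0.5},  # highly uncertain transition
    "C": {"A": 1.0},            # deterministic transition
}

if __name__ == "__main__":
    seq = sample_markov_sequence(TRANSITIONS, start="A", length=20000)
    for state, row in sorted(estimate_transitions(seq).items()):
        print(state, {k: round(v, 3) for k, v in sorted(row.items())})
```

      With a long enough sequence, the estimated table should approach the generating one, which is the sense in which a network "reproduces the transition statistics" of its training sequences.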

      Strengths:

      This is an easy-to-follow, well-written paper, whose results are likely easy to reproduce. The experiments are clear and well-explained. The study of songbird experimental data is a good feature of this paper; finches are classical model animals for understanding sequence learning in the brain. I also liked the study of rapid task-switching, it's a good-to-know type of result that is not very common in sequence learning papers.

      Weaknesses:

      While the general subject of this paper is very interesting, I missed a clear main result. The paper focuses on a simple family of sequence learning problems that are well-understood, namely first-order Markov sequences and fully visible (no-hidden-neuron) networks, studied extensively in prior work, including with spiking neurons. Thus, because the main results can be roughly summarized as examples of success, it is not entirely clear what the main point of the authors is.

      Going into more detail, the first major weakness I see in this paper is the heuristic choice of learning rules. The paper studies Poisson spiking neurons (I return to this point below), for which learning rules can be derived from a statistical objective, typically maximum likelihood. For fully-visible networks, these rules take a simple form, similar in many ways to the E-to-E rule introduced by the authors. This more principled route provides quite a lot of additional understanding on what is to be expected from the learning process. For instance, should maximum likelihood learning succeed, it is not surprising that the statistics of the training sequence distribution are reproduced. Moreover, given that the networks are fully visible, I think that the maximum likelihood objective is a convex function of the weights, which then gives hope that the learning rule does succeed. And so on. This sort of learning rule has been studied in a series of papers by David Barber and colleagues [refs. 1, 2 below], who applied them to essentially the same problem of reproducing sequence statistics in recurrent fully-visible nets. It seems to me that one key difference is that the authors consider separate E and I populations, and find the need to introduce a balancing I-to-E learning rule.
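      As a hedged sketch of the maximum-likelihood route mentioned above, consider a simplified discrete-time, fully visible network of binary stochastic neurons with a sigmoidal firing probability. This setup is an assumption made here for concreteness, in the spirit of refs. 1 and 2 below, and is not the Poisson model or the learning rule used by the authors; the symbols $s_i(t)$, $w_{ij}$, $b_i$ and $\sigma$ are introduced only for this sketch.

```latex
\begin{align}
a_i(t) &= \sum_j w_{ij}\, s_j(t) + b_i, \qquad
p\bigl(s_i(t+1)=1 \mid \mathbf{s}(t)\bigr) = \sigma\bigl(a_i(t)\bigr) = \frac{1}{1+e^{-a_i(t)}}, \\
\mathcal{L}(W) &= \sum_t \sum_i \Bigl[\, s_i(t+1)\,\log \sigma\bigl(a_i(t)\bigr)
  + \bigl(1 - s_i(t+1)\bigr)\,\log\bigl(1 - \sigma\bigl(a_i(t)\bigr)\bigr) \Bigr], \\
\frac{\partial \mathcal{L}}{\partial w_{ij}} &= \sum_t \bigl[\, s_i(t+1) - \sigma\bigl(a_i(t)\bigr) \bigr]\, s_j(t).
\end{align}
```

      Under these assumptions, the log-likelihood is a sum of per-neuron logistic-regression objectives, so it is concave in the weights (equivalently, the negative log-likelihood is convex), which is consistent with the convexity remark above. The gradient is a postsynaptic prediction error multiplied by presynaptic activity, which is the sense in which such rules resemble an error-based E-to-E rule.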

      Because the rules here are heuristic, a number of questions come to mind. Why these rules and not others - especially, as the authors do not discuss in detail how they could be implemented through biophysical mechanisms? When does learning succeed or fail? What is the main point being conveyed, and what is the contribution on top of the work of e.g. Barber, Brea, et al. (2013), or Pfister et al. (2004)?

      The use of a Poisson spiking neuron model is the second major weakness of the study. A chief challenge in much of the cited work is to generate stochastic transitions from recurrent networks of deterministic neurons. The task the authors set out to do is much easier with stochastic neurons; it is reasonable that the network succeeds in reproducing Markovian sequences, given an appropriate learning rule. I believe that the main point comes from mapping abstract Markov states to assemblies of neurons. If I am right, I missed more analyses on this point, for instance on the impact that varying cell assembly size would have on the findings reported by the authors.

      Finally, it was not entirely clear to me what the main fundamental point in the HVC data section was. Can the findings be roughly explained as follows: if we map syllables to cell assemblies, for high-uncertainty syllable-to-syllable transitions, it becomes harder to predict future neural activity? In other words, is the main point that the HVC encodes syllables by cell assemblies?

      (1) Learning in Spiking Neural Assemblies, David Barber, 2002. URL: https://proceedings.neurips.cc/paper/2002/file/619205da514e83f869515c782a328d3c-Paper.pdf

      (2) Correlated sequence learning in a network of spiking neurons using maximum likelihood, David Barber, Felix Agakov, 2002. URL: http://web4.cs.ucl.ac.uk/staff/D.Barber/publications/barber-agakov-TR0149.pdf

    1. eLife assessment

      This useful study investigates the impact of disrupting the interaction of RAS with the PI3K subunit p110α in macrophage function in vitro and inflammatory responses in vivo. Solid data overall support a role for RAS-p110α signalling in regulating macrophage activity and so inflammation; however, for many of the readouts presented the magnitude of the phenotype is not particularly pronounced. Further analysis would be required to substantiate the claims that RAS-p110α signalling plays a key role in macrophage function. Of note, the molecular mechanisms by which p110α regulates macrophage functions have not yet been established.

    2. Reviewer #1 (Public Review):

      In this study, Alejandro Rosell et al. uncover the immunoregulatory functions of the RAS-p110α pathway in macrophages, including the extravasation of monocytes from the bloodstream and subsequent lysosomal digestion. Disrupting the RAS-p110α pathway, by mouse genetic tools or by pharmacological intervention, hampers the inflammatory response, leading to delayed resolution and more severe acute inflammatory reactions. The authors propose that activating p110α using small molecules could be a promising approach for treating chronic inflammation. This study provides insights into the roles and mechanisms of p110α in macrophage function and the inflammatory response, while some conclusions are still questionable because of several issues described below.

      (1) Fig. 1B showed that disruption of RAS-p110α causes a decrease in the activation of NF-κB, which is a crucial transcription factor that regulates the expression of proinflammatory genes. However, the authors observed that disruption of the RAS-p110α interaction results in an exacerbated inflammatory state in vivo, in both localized paw inflammation and systemic inflammatory mediator levels. Also, the authors stated in the introduction that "this disruption leads to a change in macrophage polarization, favouring a more proinflammatory M1 state", according to reference 12. The conclusions drawn from the signaling and the models seem contradictory and puzzling. Besides, it is not clear why the protein level of p65 was decreased at 10' and 30'. Was it attributed to the degradation of p65 or experimental variation?

      (2) In Fig 3, the authors used bone-marrow derived macrophages (BMDMs) instead of isolated monocytes to evaluate the ability of monocyte transendothelial migration, which is not sufficiently convincing. In Fig. 3B, the authors evaluated the migration in Pik3caWT/- BMDMs, and Pik3caWT/WT BMDMs treated with BYL-719. Given the dose effect of gene expression, the best control is Pik3caWT/- BMDMs treated with BYL-719.

      (3) In Fig. 4E-4G, the authors observed elevated levels of serine 3 phosphorylated Cofilin in Pik3caRBD/- BMDMs both in unstimulated and in proinflammatory conditions. Given that phosphorylation of Cofilin at Ser3 increases actin stabilization, it is not clear why disruption of RAS-p110α binding caused a decrease in the F-actin pool in unstimulated BMDMs.

    3. Reviewer #2 (Public Review):

      Summary:

      Cell intrinsic signaling pathways controlling the function of macrophages in inflammatory processes, including in response to infection, injury or in the resolution of inflammation are incompletely understood. In this study, Rosell et al. investigate the contribution of RAS-p110α signaling to macrophage activity. p110α is a ubiquitously expressed catalytic subunit of PI3K with previously described roles in multiple biological processes including in epithelial cell growth and survival, and carcinogenesis. While previous studies have already suggested a role for RAS-p110α signaling in macrophages function, the cell intrinsic impact of disrupting the interaction between RAS and p110α in this central myeloid cell subset is not known.

      Strengths:

      Exploiting a sound, previously described genetic mouse model that allows tamoxifen-inducible disruption of the RAS-p110α pathway and using different readouts of macrophage activity in vitro and in vivo, the authors provide data consistent with their conclusion that alteration in RAS-p110α signaling impairs the function of macrophages in a cell intrinsic manner. The study is well designed and clearly written, with overall high-quality figures.

      Weaknesses:

      My main concern is that for many of the readouts, the difference between wild-type and mutant macrophages in vitro or between wild-type and Pik3caRBD mice in vivo is rather modest, even if statistically significant (e.g. Figure 1A, 1C, 2A, 2F, 3B, 4B, 4C). In other cases, such as for the analysis of the H&E images (Figure 1D-E, S1E), the images are not quantified, and it is hard to appreciate what the phenotype in samples from Pik3caRBD mice is or whether this is consistently observed across different animals. Also, the authors claim there is a 'notable decrease' in Akt activation but 'no discernible change' in ERK activation based on the western blot data presented in Figure 1A. I do not think the data shown support this conclusion.

      To further substantiate the extent of macrophage function alteration upon disruption of RAS-p110α signaling, the manuscript would benefit from testing macrophage activity in vitro and in vivo across other key macrophage activities such as bacteria phagocytosis, cytokine/chemokine production in response to titrating amounts of different PAMPs, inflammasome function, etc. This would be generally important overall but also useful to determine whether the defects in monocyte motility or macrophage lysosomal function are selectively controlled downstream of RAS-p110α signaling.

      Furthermore, given the key role of other myeloid cells besides macrophages in inflammation and immunity it remains unclear whether the phenotype observed in vivo can be attributed to impaired macrophage function. Is the function of neutrophils, dendritic cells or other key innate immune cells not affected?

      Compelling proof of concept data that targeting RAS-p110α signalling constitutes indeed a putative approach for modulation of chronic inflammation is lacking. Addressing this further would increase the conceptual advance of the manuscript and provide extra support to the authors' suggestion that p110α inhibition or activation constitute promising approaches to manage inflammation.

      Finally, the analysis by FACS should also include information about the total number of cells, not just the percentage, which is affected by the relative change in other populations. On this point, Figure S2B shows a substantial, albeit not significant (with fewer mice analysed), increase in the percentage of CD3+ cells. Is there an increase in the absolute number of T cells or does this apparent relative increase reflect a reduction in myeloid cells?

    1. eLife assessment

      The study is useful by attempting to present a new approach of combining two measurements (pHLA binding and pHLA-TCR binding) in order to refine predictions of which patient mutations are likely presented to and recognized by the immune system, but the evidence is incomplete. Whereas the novel methodology proposed is compelling, this article lacks a detailed explanation of the chosen model. The experimental validation confirming the computational predictions with actual immune responses is limited due to sample constraints.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper reports a number of somewhat disparate findings on a set of colorectal tumours and infiltrating T-cells. The main finding is a combined machine-learning tool which combines two previous state-of-the-art tools, MHC prediction, and T-cell binding prediction to predict immunogenicity. This is then applied to a small set of neoantigens and there is a small-scale validation of the prediction at the end.

      Strengths:

      The prediction of immunogenic neoepitopes is an important and unresolved question.

      Weaknesses:

      The paper contains a lot of extraneous material not relevant to the main claim. Conversely, it lacks important detail on the major claim.

      (1) The analysis of T cell repertoire in Figure 2 seems irrelevant to the rest of the paper. As far as I could ascertain, this data is not used further.

      (2) The key claim of the paper rests on the performance of the ML algorithm combining NETMHC and pmtNET. In turn, this depends on the selection of peptides for training. I am unclear about how the negative peptides were selected. Are they peptides from the same databases as immunogenic peptides but randomised for MHC? It seems as though there will be a lot of overlap between the peptides used for testing the combined algorithm, and the peptides used for training MHCNet and pmtMHC. If this is so, and depending on the choice of negative peptides, it is surely expected that the tools perform better on immunogenic than on non-immunogenic peptides in Figure 3. I don't fully understand panel G, but there seems very little difference between the TCR ranking and the combined. Why does including the TCR ranking have such a deleterious effect on sensitivity?

      (3) The key validation of the model is Figure 5. In 4 patients, the authors report that 6 out of 21 neo-antigen peptides give interferon responses > 2 fold above background. Using NETMHC alone (I presume the tool was used to rank peptides according to binding to the respective HLAs in each individual, but this is not clear) identified 2; using the combined tool identified 4. I don't think this is significant by any measure. I don't understand the score shown in panel E but I don't think it alters the underlying statistic.

      In conclusion, the paper demonstrates that combining MHCNET and pmtMHC results in a modest increase in the ability to discriminate 'immunogenic' from 'non-immunogenic' peptide; however, the strength of this claim is difficult to evaluate without more knowledge about the negative peptides. The experimental validation of this approach in the context of CRC is not convincing.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper introduces a novel approach for improving personalized cancer immunotherapy by integrating TCR profiling with traditional pHLA binding predictions, addressing the need for more precise neoantigen prediction in CRC patients. By analyzing TCR repertoires from tumor-infiltrating lymphocytes and applying machine learning algorithms, the authors developed a predictive model that outperforms conventional methods in specificity and sensitivity. The validation of the model through ELISpot assays confirmed its potential in identifying more effective neoantigens, highlighting the significance of combining TCR and pHLA data for advancing personalized immunotherapy strategies.

      Strengths:

      (1) Comprehensive Patient Data Collection: The study meticulously collected and analyzed clinical data from 27 CRC patients, ensuring a robust foundation for research findings. The detailed documentation of patient demographics, cancer stages, and pathology information enhances the study's credibility and potential applicability to broader patient populations.

      (2) The use of machine learning classifiers (RF, LR, XGB) and the combination of pHLA and pHLA-TCR binding predictions significantly enhance the model's accuracy in identifying immunogenic neoantigens, as evidenced by the high AUC values and improved sensitivity, NPV, and PPV.

      (3) The use of experimental validation through ELISpot assays adds a practical dimension to the study, confirming the computational predictions with actual immune responses. The calculation of ranking coverage scores and the comparative analysis between the combined model and the conventional NetMHCpan method demonstrate the superior performance of the combined approach in accurately ranking immunogenic neoantigens.

      (4) The use of experimental validation through ELISpot assays adds a practical dimension to the study, confirming the computational predictions with actual immune responses.

      Weaknesses:

      (1) While multiple advanced tools and algorithms are used, the study could benefit from a more detailed explanation of the rationale behind algorithm choice and parameter settings, ensuring reproducibility and transparency.

      (2) While pHLA-TCR binding displayed higher specificity, its lower sensitivity compared to pHLA binding suggests a trade-off between the two measures. Optimizing the balance between sensitivity and specificity could be crucial for the practical application of these predictions in clinical settings.

      (3) The experimental validation was performed on a limited number of patients (four), which might affect the generalizability of the findings. Increasing the number of patients for validation could provide a more comprehensive assessment of the model's performance.

    4. Reviewer #3 (Public Review):

      Summary:

      This study presents a new approach of combining two measurements (pHLA binding and pHLA-TCR binding) in order to refine predictions of which patient mutations are likely presented to and recognized by the immune system. Improving such predictions would play an important role in making personalized anti-cancer vaccinations more effective.

      Strengths:

      The study combines data from pre-existing tools pVACseq and pMTNet and applies them to a CRC patient population, which the authors show may improve the chance of identifying immunogenic, cancer-derived neoepitopes. Making the datasets collected publicly available would expand beyond the current datasets that typically describe Caucasian patients.

      Weaknesses:

      It is unclear whether the pNetMHCpan and pMTNet tools used by the authors are entirely independent, as they appear to have been trained on overlapping datasets, which may explain their similar scores. The pHLA-TCR score seems to be driving the effects, but this is not discussed in detail.

      Due to sample constraints, the authors were only able to do a limited amount of experimental validation to support their model; this raises questions as to how generalisable the presented results are. It would be desirable to use statistical thresholds to justify cutoffs in ELISPOT data.

      Some of the TCR repertoire metrics presented in Figure 2 are incorrectly described as independent variables and do not meaningfully contribute to the paper. The TCR repertoires may have benefitted from deeper sequencing coverage, as many TCRs appear to be supported only by a single read.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Valk and Engert et al. examined the potential relations between three different mental training modules, hippocampal structure and functional connectivity, and cortisol levels over a 9-month period. They found that among the three types of mental training, the Presence (attention and introspective awareness), Affect (socio-emotional - compassion and prosocial motivation), and Perspective (socio-cognitive - metacognition and perspective taking) modules, Affect training most consistently related to changes in hippocampal structure and function, specifically in the CA1-3 subfields of the hippocampus. Moreover, decreases in diurnal cortisol correlated with bilateral increases in volume, and decreases in diurnal and chronic cortisol related to left CA1-3 functional connectivity. Chronic cortisol levels were also related to right CA4/DG volume and left subiculum function. The authors demonstrate that mindfulness training programs impact the hippocampus and are a potential avenue for stress interventions, and thus a potential avenue to improve health. The data contribute to the literature on plasticity of hippocampal subfields during adulthood, the impact of mental training interventions on the brain, and the link between CA1-3 and both short- and long-term stress changes. Additional clarification and extension of the methods is needed to strengthen the authors' conclusions.

      We thank the Reviewer for their positive evaluation and summary of our findings and work. We made additional changes as suggested by the Reviewer and hope this clarified any open points.

      (1) The authors thoughtfully approached the study of hippocampal subfields, utilizing a method designed for T1w images that outperformed Freesurfer 5.3 and that produced comparable results to an earlier version of ASHS. However, given the use of normalized T1-weighted images to delineate hippocampal subfield volume, some caution may be warranted (Wisse et al. 2020). While the authors note the assessment of quality control processes, the difficulty in ensuring valid measurement is an ongoing conversation in the literature. This also extends to the impact of functional co-registration using segmentations. I appreciate the inclusion of Table 5 in documenting reasons for missing data across subjects. Providing additional details on the distribution of quality ratings across subfields would help contextualize the results and ensure there is equal quality of segmentations across subfields.

      We thank the Reviewer for bringing up this point. In the current work, we assessed the overall segmentation of all six subfields per individual. Thus, unfortunately, we have no data of quality of segmentation of individual subfields beyond our holistic assessment. Indeed, registration of hippocampal subfields remains a challenge and we have further highlighted this limitation in the Discussion of the current work.

      “It is of note that the current work relies on a segmentation approach of hippocampal subfields including projection to MNI template space, an implicit correction for total brain volume through the use of a stereotaxic reference frame. Some caution for this method may be warranted, as complex hippocampal anatomy can in some cases lead to over- as well as underestimation of subfield volumes, and subfield boundaries may not always be clearly demarcated (1). Future work, studying the hippocampal surface at higher granularity, for example through unfolding the hippocampal sheet (2-5), may further help with both alignment and identification of not only subfield-specific change but also alterations as a function of the hippocampal long axis, a key dimension of hippocampal structural and functional variation that was not assessed in the current work (6, 7).”

      (2) Given the consistent pattern of finding results with CA1-3, in contrast to other subfields, it would help to know if the effects of the different training modules on subfields differed from each other statistically (i.e., not just that one is significant, and one is not) to provide an additional context of the strength of results focused on Affect training and CA1-3 (for example, those shown in Figure 3).

      Our work investigated whether the effects of the individual Training Modules differed from each other statistically. We found that the Affect Training Module showed increases in CA1-3 volume, and that these increases remained when testing effects relative to changes in this subfield following Perspective training and in retest controls. Moreover, in CA1-3 we found changes in functional connectivity when comparing the Affect to Perspective training Module. These changes were only present in this contrast, but not significant in each of the Training Modules per se. To test for specificity, we additionally evaluated whether subfield-specific changes were present above and beyond changes in the other ipsilateral hippocampal subfields. Relative to other subfields, right CA1-3 showed increases in the Affect vs Perspective contrast (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015). No other subfield showed significant changes. We now include this statement in the revised Results and Supplementary Tables.

      “Moreover, associations between CA1-3 and Affect, relative to Perspective, seemed to go largely above and beyond changes in the other subfields (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015, see further Supplementary File 1h).”

      Author response table 1.

      Subfield-specific changes following the Training Modules, controlling for the other two ipsilateral subfields

      Reviewer #1 (Recommendations For The Authors):

      (1) In Figure 1, using different colors for subfields versus the modules (yellow, red, green) would help as it could lead the reader to try to draw connections between the two when it is namely a depiction of the delineations.

      As suggested, we updated Figure 1 accordingly and present the subfields in different shades of purple for clarity. Please find the updated figure below.

      Author response image 1.

      (2) In the Results, it was at times hard to follow when Affect or Perspective were the focus of the results. Perhaps the authors could restructure or add additional context for clarity.

      We are happy to clarify. For the first analysis on Module-specific changes in hippocampal subfield volume, we compared effects across Training Modules. Here, main contrasts were run between subjects: Presence vs active control and within subjects: Affect versus Perspective. In additional secondary contrasts, we studied training effects vs retest control. After observing consistent increases in bilateral CA1-3 following Affect, in the following analysis, we evaluated 1) intrinsic functional networks in main and supplementary contrasts and 2) diurnal cortisol measures within the Training modules only and all three Training Modules combined, and also adopted 3) a multivariate approach (PLS) (see comments Reviewer 2). We now also report effects of cortisol change on structural and functional subfield change in Presence and Perspective, for additional completeness and clarity.

      “To study whether there was any training module-specific change in hippocampal subfield volumes following mental training, we compared training effects between all three Training Modules (Presence, Affect, and Perspective). Main contrasts were: Presence vs Active control (between subjects) and Affect vs Perspective (within subjects). Supplementary comparisons were made vs retest controls and within training groups.”

      “Overall, for all hippocampal subfields, findings associated with volume increases in CA1-3 following the Affect training were most consistent across timepoints and contrasts (Supplementary File 1a-f).”

      “Subsequently, we studied whether hippocampal CA1-3 would show corresponding changes in intrinsic function following the Affect mental training.”

      “In particular, the moderately consistent CA1-3 volume increases following Affect training were complemented with differential functional connectivity alterations of this subfield when comparing Affect to Perspective training”

      “Last, we probed whether group-level changes in hippocampal subfield CA1-3 volume would correlate with individual-level changes in diurnal cortisol indices (Presence: n= 86; Affect: n=92; Perspective: n=81), given that the hippocampal formation is a nexus of the HPA-axis (8). We took a two-step approach. First, we studied associations between cortisol and subfield change, particularly focusing on the Affect module and CA1-3 volume based on increases in CA1-3 volume identified in our group-level analysis.”

      “We observed that increases in bilateral CA1-3 following Affect showed a negative association with change in total diurnal cortisol output […]”

      “We did not observe alterations in CA1-3 volume in relation to change in cortisol markers in Presence or Perspective. Yet, for Presence, we observed an association between slope and LCA4/DG change (t=-2.89, p=0.005, q=0.03) (Supplementary File 1uv).”

      “In case of intrinsic function, we also did not observe alterations in CA1-3 in relation to change in cortisol markers in Presence or Perspective, nor in other subfields (Supplementary File 1wx).”

      Author response table 2.

      Correlating change in subfield volume and diurnal cortisol indices in Presence. Main focus was on CA1-3 based on volumetric observations and are highlighted in bold.

      Author response table 3.

      Correlating change in subfield volume and diurnal cortisol indices in Perspective. Main focus was on CA1-3 based on volumetric observations and are highlighted in bold.

      Author response table 4.

      Association between stress-markers and within functional network sub-regions in Affect and Perspective.

      Author response table 5.

      Correlating change in subfield function and diurnal cortisol indices in Presence. Main focus was on CA1-3 based on volumetric observations and are highlighted in bold. For these multiple comparisons (FDRq, corrected for two subfields) values are reported if uncorrected p values are below p<.05.

      Author response table 6.

      Correlating change in subfield function and diurnal cortisol indices in Perspective. Main focus was on CA1-3 based on volumetric observations and are highlighted in bold. For these multiple comparisons (FDRq, corrected for two subfields) values are reported if uncorrected p values are below p<.05.

      (3) In the Methods, the authors note that corrections for multiple comparisons were used where needed; however, throughout the manuscript there is some switching between corrected and uncorrected p-values. At times, this made it difficult to follow when these corrections were needed.

      For clarity, we added explicit multiple comparisons information a) in main and supplementary results, and b) wherever extra information was needed. Also, we only included main contrasts in Table 1-3 to avoid confusion and moved the information on changes in SUB and CA4/DG to the Supplementary tables.
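
      As an illustration of how corrected q-values relate to uncorrected p-values, the following is a minimal sketch of a false discovery rate adjustment. It assumes the Benjamini-Hochberg procedure, which the response does not name explicitly, and the function name and input values are hypothetical rather than taken from the study.

```python
import numpy as np

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted q-values (assumed FDR method)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices sorting p ascending
    ranked = p[order] * m / np.arange(1, m + 1)    # p_(i) * m / i
    # Enforce monotonicity, working down from the largest p-value
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(ranked, 0.0, 1.0)           # back to the input order
    return q

# Hypothetical uncorrected p-values for four comparisons
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

      Reporting the q-value next to each uncorrected p-value, as in the revised tables, makes it explicit which comparisons were part of a corrected family.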

      (4) Typically, when correcting for intracranial volume the purpose is to ensure that sexual dimorphism in the size of the brain is accounted for. I would recommend the authors assess whether sex differences are accounted for by the MNI normalization approach taken. In the reading of the original Methods paper for the patch-based algorithm used, ICV was used to transform to MNI152 space. It would help to have additional information on how the normalization was done in the current study in order to draw comparisons to other findings in the literature.

      We are happy to further clarify. In the current work, we used the same approach as in the original paper. Volumes were linearly registered to the MNI template using FSL flirt. We have now provided this additional information in the revised methods.

      “Hippocampal volumes were estimated based on T1w data that were linearly registered to MNI152 using FSL flirt (http://www.fmrib.ox.ac.uk/fsl/), such that intracranial volume was implicitly controlled for.”

      We agree with the Reviewer that sex differences may still be present, and investigated this. At baseline, sex differences were found in all subfields in the left hemisphere, and right CA4/DG (FDRq<0.05). Regressing out ICV resolved remaining sex differences. We then evaluated whether main results of volumetric subfield change were impacted by ICV differences. Differences between Affect and Perspective remained stable. We have now added this additional analysis in the Supplementary Materials.

      “Although stereotaxic normalization to MNI space would in theory account for global sex differences in intra-cranial volume, we still observed sex differences in various subfield volumes at baseline. Yet, accounting for ICV did not impact our main results suggesting changes in CA1-3 following Affect were robust to sex differences in overall brain volume (Supplementary File1j).”
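
      The step of regressing out ICV can be sketched as an ordinary least-squares residualization. This is only an illustrative sketch: the function name, the synthetic data, and the use of plain OLS are assumptions, and the authors' actual model may include further terms.

```python
import numpy as np

def regress_out(y, covariates):
    """Return residuals of y after OLS regression on covariates plus an intercept."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta

# Synthetic example: subfield volume that scales with ICV (hypothetical values)
rng = np.random.default_rng(0)
icv = rng.normal(1500.0, 120.0, size=200)
volume = 0.5 * icv + rng.normal(0.0, 20.0, size=200)
volume_adj = regress_out(volume, icv)
```

      After this step the adjusted volumes are uncorrelated with ICV by construction, so any remaining group difference cannot be attributed to a linear effect of overall head size.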

      Author response table 7.

      Sex differences (female versus male) in hippocampal subfield volumes.

      Reviewer #2 (Public Review):

      In this study, Valk, Engert et al. investigated effects of stress-reducing behavioral intervention on hippocampal structure and function across different conditions of mental training and in relation to diurnal and chronic cortisol levels. The authors provide convincing multimodal evidence of a link between hippocampal integrity and stress regulation, showing changes in both volume and intrinsic functional connectivity, as measured by resting-state fMRI, in hippocampal subfield CA1-3 after socio-affective training as compared to training in a socio-cognitive module. In particular, increased CA1-3 volume following socio-affective training overlapped with increased functional connectivity to medial prefrontal cortex, and reductions in cortisol. The conclusions of this paper are well supported by the data, although some aspects of the data analysis would benefit from being clarified and extended.

      A main strength of the study is the rigorous design of the behavioral intervention, including test-retest cohorts, an active control group, and a previously established training paradigm, contributing to an overall high quality of included data. Similarly, systematic quality checking of hippocampal subfield segmentations contributes to a reliable foundation for structural and functional investigations.

      We thank the Reviewer for the thoughtful summary and appreciation of our work, as well as requests for further clarification and analyses. We addressed each of them in a point by point fashion below.

      Another strength of the study is the multimodal data, including both structural and functional markers of hippocampal integrity as well as both diurnal and chronic estimates of cortisol levels.

      (1) However, the included analyses are not optimally suited for elucidating multivariate interrelationships between these measures. Instead, effects of training on structure and function, and their links to cortisol, are largely characterized separately from each other. This results in the overall interpretation of results, and conclusions, being dependent on a large number of separate associations. Adopting multivariate approaches would better target the question of whether there is cortisol-related structural and functional plasticity in the hippocampus after mental training aimed at reducing stress.

      We thank the Reviewer for this suggestion. Indeed, our project combined different univariate analyses to uncover the association between hippocampal subfield structure, function, and cortisol markers. While systematic, a downside of this approach is indeed that interpretation of our results depends on a large number of analyses. To further explore the question whether there is cortisol-related structural and functional plasticity in the hippocampus, we followed the Reviewer’s suggestion and additionally adopted a multivariate partial least squares (PLS) model. We ran two complementary models: one focusing on the bilateral CA1-3, as this region showed increases in volume following Affect training and differential change between Affect and Perspective training in our resting state analyses, and one including all subfields. Both models included all stress markers. We found that both models could significantly relate stress markers to brain measures, and that in particular Affect showed strong associations with the significant latent markers. Both analyses showed inverse effects of structure and function in relation to stress markers and both slope and AUC changes showed strongest loadings. We now include these analyses in the revised manuscript.

      Abstract

      “Of note, using a multivariate approach we found that other subfields, showing no group-level changes, also contributed to alterations in cortisol levels, suggesting circuit-level alterations within the hippocampal formation.”

      Methods

      “Partial least squares analysis

      To assess potential relationships between cortisol change and hippocampal subfield volume and functional change, we performed a partial least squares analysis (PLS) (9, 10). PLS is a multivariate associative model that optimizes the covariance between two matrices, by generating latent components (LCs), which are optimal linear combinations of the original matrices (9, 10). In our study, we utilized PLS to analyze the relationships between change in volume and intrinsic function of hippocampal subfields and diurnal cortisol measures. Here we included all Training Modules and regressed out effects of age, sex, and random effects of subject on the brain measures before conducting the PLS analysis. The PLS process involves data normalization within training groups, cross-covariance, and singular value decomposition. Subsequently, subfield and behavioral scores are computed, and permutation testing (1000 iterations) is conducted to evaluate the significance of each latent factor solution (FDR corrected). We then report the correlation of the individual hippocampal and cortisol markers with the latent factors. To estimate confidence intervals for these correlations, we applied a bootstrapping procedure that generated 100 samples with replacement from subjects’ RSFC and behavioral data.”
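
      The core of the procedure described above (normalization, cross-covariance, singular value decomposition, permutation testing) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it normalizes across all subjects rather than within training groups, omits the covariate regression, bootstrapping, and FDR steps, and all function and variable names are hypothetical.

```python
import numpy as np

def zscore(a):
    """Column-wise z-scoring (assumes no constant columns)."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def pls_svd(X, Y):
    """SVD of the cross-covariance between brain (X) and cortisol (Y) measures."""
    Xz, Yz = zscore(X), zscore(Y)
    R = Yz.T @ Xz / (X.shape[0] - 1)            # shape: (n_y_features, n_x_features)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U, s, Vt.T                           # Y weights, singular values, X weights

def permutation_pvalues(X, Y, n_perm=1000, seed=0):
    """Permutation p-value per latent component, shuffling the rows of X."""
    rng = np.random.default_rng(seed)
    _, s_obs, _ = pls_svd(X, Y)
    exceed = np.zeros_like(s_obs)
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        _, s_perm, _ = pls_svd(X[perm], Y)
        exceed += s_perm >= s_obs               # count null values at least as large
    return (exceed + 1) / (n_perm + 1)

# Synthetic data with one shared latent factor (hypothetical, for illustration only)
rng = np.random.default_rng(1)
latent = rng.normal(size=(100, 1))
X = latent @ np.array([[1.0, -1.0, 0.5, 0.0]]) + 0.5 * rng.normal(size=(100, 4))
Y = latent @ np.array([[1.0, 0.8, -0.6]]) + 0.5 * rng.normal(size=(100, 3))
U, s, V = pls_svd(X, Y)
p = permutation_pvalues(X, Y, n_perm=200, seed=0)
```

      Subject-level scores for each latent component would then be obtained by projecting the normalized data onto the weights, for example `zscore(X) @ V` for the brain measures and `zscore(Y) @ U` for the cortisol measures.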

      Results

      “Last, to further explore the question whether there is concordant cortisol-related structural and functional plasticity in the hippocampus we adopted a multivariate partial least squares approach, with 1000 permutations to account for stability (9, 10) and bootstrapping (100 times) with replacement. We ran two complementary models including all Training Modules whilst regressing out age, sex and random effects of subject. First, we focused on the bilateral CA1-3, as this region showed increases in volume following Affect training and differential change between Affect and Perspective training in our resting state analyses. The second model included structural and functional data of all subfields. Both models included all stress markers. We found that both models could identify significant associations between cortisol stress markers and hippocampal plasticity (FDRq<0.05), and that in particular Affect showed strongest associations with the latent markers for CA1-3 (Table 5). Both analyses showed inverse effects of subfield structure and function in relation to stress markers and both slope and AUC changes showed strongest associations with the latent factor.”

      Author response table 8.

      Multivariate PLS analyses linking cortisol markers to hippocampal subfield volume and function.

      Discussion

      “Last, performing multivariate analysis, we again observed associations between CA1-3 volume and function plasticity and stress change, strongest in Affect. Yet combining all subfields in a single model indicated that other subfields also link to stress alterations, indicating that ultimately circuit-level alterations within the hippocampal formation relate to latent changes in diurnal stress markers across Training Modules.”

      “This interpretation is also supported by our multivariate observations.”

      “In line with our observations in univariate analysis, we found multivariate associations between hippocampal subfield volume, intrinsic function and cortisol markers. Again, the contribution of volume and intrinsic function was inverse. This may possibly relate to the averaging procedure of the functional networks. Combined, outcomes of our univariate and multivariate analyses point to an association between change in hippocampal subfields and stress markers, and that these changes, at the level of the individual, ultimately reflect complex interactions within and across hippocampal subfields and may capture different aspects of diurnal stress. Future work may more comprehensively study the plasticity of the hippocampal structure, and link this to intrinsic functional change and cortisol to gain full insights in the specificity and system-level interplay across subfields, for example using more detailed hippocampal models (3). Incorporating further multivariate, computational, models is needed to further unpack and investigate the complex and nuanced association between hippocampal structure and function, in particular in relation to subfield plasticity and short and long-term stress markers.”

      “…based on univariate analysis. Our multivariate analysis further nuanced this observation, but again pointed to an overall association between hippocampal subfield changes and cortisol changes, but this time more at a systems level.”

      “Lastly, our multivariate analyses also point to a circuit level understanding of latent diurnal stress scores.”

      Author response image 2.

      Multivariate associations between changes in structure and function of hippocampal subfield volume and markers of stress change in Affect. A) Multivariate associations between bilateral CA1-3 volume and intrinsic function and stress markers. Left: Scatter of loadings, colored by Training Module; Right upper: individual correlations of stress markers; Right lower: individual correlation of subfields; B). Multivariate associations between all subfields’ volume and intrinsic function and stress markers. Left: Scatter of loadings, colored by Training Module; Right upper: individual correlations of stress markers; Right lower: individual correlation of subfields.

      (2) The authors emphasize a link between hippocampal subfield CA1-3 and stress regulation, and indeed, multiple lines of evidence converge to highlight a most consistent role of CA1-3. There are, however, some aspects of the results that limit the robustness of this conclusion. First, formal comparisons between subfields are incomplete, making it difficult to judge whether the CA1-3, to a greater degree than other subfields, display effects of training.

      We thank the Reviewer for this comment. To further test for specificity, we additionally evaluated subfield-specific changes relative to other subfields for our main contrasts (Presence versus Active Control and Affect versus Perspective). Relative to other subfields, right CA1-3 showed increases in the Affect vs Perspective contrast (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015); no other subfield showed significant changes. We now include this statement in Results and Supplementary Tables.

      “Moreover, associations between CA1-3 and Affect, relative to Perspective, seemed to go largely above and beyond changes in the other subfields (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015, see further Supplementary File 1h).”

      Author response table 9.

      Subfield-specific changes following the Training Modules, controlling for the other two ipsilateral subfields

      (3) Relatedly, it would be of interest to assess whether changes in CA1-3 make a significant contribution to explaining the link between hippocampal integrity and cortisol, as compared to structure and functional connectivity of the whole hippocampus.

      We thank the Reviewer for this comment. Please see the PLS analysis performed above (R2Q1). Indeed, not only CA1-3 but also other subfields seem to show a relationship with cortisol, in line with circuit level accounts on stress regulation and hippocampal circuit alterations (8, 11-15).

      (4) Second, both structural and functional effects (although functional to a greater degree), were most pronounced in the specific comparison of "Affect" and "Perspective" training conditions, possibly limiting the study's ability to inform general principles of hippocampal stress-regulation.

      We agree with the Reviewer that the association between stress and hippocampal plasticity, on the one hand, and mental training and hippocampal plasticity, on the other hand, make it not very straightforward to inform general principles on hippocampal stress regulation. However, as underscored in the discussion, in previous work we could also link mental training to stress reductions (16-18). We hope that the additional analyses and explanations further explain the multilevel insights of the current work, on the one hand using group-level analysis to investigate and illustrate the association between mental training and hippocampal subfield volume and intrinsic function, and on the other hand using individual level analysis to unpack the association between cortisol change and hippocampal subfield change.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the Results, the description of how the hippocampal subfields' functional networks were defined would benefit from some clarification. It is also somewhat unclear what is meant by (on page 10): "Evaluating functional connectivity changes, we found that connectivity of the right CA1-3 functional network showed differential changes when comparing Affect training to Perspective training (2.420, p=0.016, FDRq=0.032, Cohens D =0.289), but not versus retest control (Table 1 and Supplementary Table 8-14)." Were there significant changes in CA1-3 FC following both training conditions (but these differed from each other)? A description of what this difference reflected would increase the reader's understanding.

      We are happy to clarify. We included information of change of individual modules in the Supplementary materials, Supplementary Table 1 and 2, 9 and 10. Changes for functional connectivity were largely due to the differences in Modules, but did not show strong effects in one Module alone. We now include information on Affect and Perspective un-contrasted change in the main results text:

      “… which could be attributed to decreases in right CA1-3 mean FC following Perspective (t=-2.012, p=0.045, M:-0.024, std: 0.081, CI [-0.041 -0.006]), but not Affect (t=1.691, p=0.092, M: 0.010, std: 0.098, CI [-0.01 0.031]); changes were not present when comparing Affect training versus retest control (Table 1 and Supplementary File 1k-q).”

      (2) As described in the Public Review, the lack of multivariate assessments may risk selling the data short. Including analyses of concomitant functional and structural changes, in relation to cortisol, seems like an approach better adapted to characterize meaningful interrelationships between these measures.

      We thank the Reviewer for suggesting multivariate assessments. To understand the interrelation between behavioral intervention, hippocampal plasticity, and cortisol changes, the current work first evaluates a simpler operationalization of the relationship between hippocampal subfield structure and volume, and cortisol as a function of mental training. Thus, given the complex nature of the study, we initially opted for a model where we assess structural and functional changes independently, with structural changes as the basis of our investigations. Now we have also included a multivariate approach (PLS) to further test the association between hippocampal subfields and cortisol markers, please see our additions to the manuscript above. We now highlighted multivariate associations in the Discussion as well, and suggest this as an important next step for more detailed, future investigations.

      “Incorporating further multivariate, computational, models is needed to further unpack and investigate the complex and nuanced association between hippocampal structure and function, in particular in relation to subfield plasticity and short and long-term stress markers.”

      (3) A minor comment regards the Figures. Some main effects should be visualized in a clearer manner. For instance, the scatterplots in Figure 1, panel D. Also, some of the current headings within the figures could be made more intuitive to the reader.

      We thank the Reviewer for this comment. To improve clarity, we updated figure headings. For Figure 1D, the challenge is that the data are quite scattered and we aimed to visualize our observations in a naturalistic way. Therefore, we added additional y-axis information to further clarify the figures. Creating more overlap or differentiation would make other elements of the figure less clear, hence we retained the current set-up detailing the intra- and inter-individual alterations of the current model.

      (1) Wisse LEM, Chetelat G, Daugherty AM, de Flores R, la Joie R, Mueller SG, et al. (2021): Hippocampal subfield volumetry from structural isotropic 1 mm(3) MRI scans: A note of caution. Hum Brain Mapp. 42:539-550.

      (2) DeKraker J, Kohler S, Khan AR (2021): Surface-based hippocampal subfield segmentation. Trends Neurosci. 44:856-863.

      (3) DeKraker J, Haast RAM, Yousif MD, Karat B, Lau JC, Kohler S, et al. (2022): Automated hippocampal unfolding for morphometry and subfield segmentation with HippUnfold. Elife. 11.

      (4) Vos de Wael R, Lariviere S, Caldairou B, Hong SJ, Margulies DS, Jefferies E, et al. (2018): Anatomical and microstructural determinants of hippocampal subfield functional connectome embedding. Proc Natl Acad Sci U S A. 115:10154-10159.

      (5) Bernhardt BC, Bernasconi A, Liu M, Hong SJ, Caldairou B, Goubran M, et al. (2016): The spectrum of structural and functional imaging abnormalities in temporal lobe epilepsy. Ann Neurol. 80:142-153.

      (6) Vogel JW, La Joie R, Grothe MJ, Diaz-Papkovich A, Doyle A, Vachon-Presseau E, et al. (2020): A molecular gradient along the longitudinal axis of the human hippocampus informs large-scale behavioral systems. Nat Commun. 11:960.

      (7) Genon S, Bernhardt BC, La Joie R, Amunts K, Eickhoff SB (2021): The many dimensions of human hippocampal organization and (dys)function. Trends Neurosci. 44:977-989.

      (8) McEwen BS (1999): Stress and hippocampal plasticity. Annu Rev Neurosci. 22:105-122.

      (9) Kebets V, Holmes AJ, Orban C, Tang S, Li J, Sun N, et al. (2019): Somatosensory-Motor Dysconnectivity Spans Multiple Transdiagnostic Dimensions of Psychopathology. Biol Psychiatry. 86:779-791.

      (10) McIntosh AR, Lobaugh NJ (2004): Partial least squares analysis of neuroimaging data: applications and advances. Neuroimage. 23 Suppl 1:S250-263.

      (11) Paquola C, Benkarim O, DeKraker J, Lariviere S, Frassle S, Royer J, et al. (2020): Convergence of cortical types and functional motifs in the human mesiotemporal lobe. Elife. 9.

      (12) DeKraker J, Ferko KM, Lau JC, Kohler S, Khan AR (2018): Unfolding the hippocampus: An intrinsic coordinate system for subfield segmentations and quantitative mapping. Neuroimage. 167:408-418.

      (13) McEwen BS, Nasca C, Gray JD (2016): Stress Effects on Neuronal Structure: Hippocampus, Amygdala, and Prefrontal Cortex. Neuropsychopharmacology. 41:3-23.

      (14) Sapolsky RM (2000): Glucocorticoids and hippocampal atrophy in neuropsychiatric disorders. Arch Gen Psychiatry. 57:925-935.

      (15) Jacobson L, Sapolsky R (1991): The role of the hippocampus in feedback regulation of the hypothalamic-pituitary-adrenocortical axis. Endocr Rev. 12:118-134.

      (16) Engert V, Hoehne K, Singer T (2023): Specific reduction in the cortisol awakening response after socio-affective mental training. Mindfulness.

      (17) Puhlmann LMC, Vrticka P, Linz R, Stalder T, Kirschbaum C, Engert V, et al. (2021): Contemplative Mental Training Reduces Hair Glucocorticoid Levels in a Randomized Clinical Trial. Psychosom Med. 83:894-905.

      (18) Engert V, Kok BE, Papassotiriou I, Chrousos GP, Singer T (2017): Specific reduction in cortisol stress reactivity after social but not attention-based mental training. Sci Adv. 3:e1700495.

    2. eLife assessment

      This important work examines the potential utility of socio-emotional and socio-cognitive mental training on hippocampal subfield structure and function, and cortisol levels. The authors provide convincing evidence that CA1-3 volume is sensitive to socio-emotional training, with changes related to function plasticity and cortisol levels. Further, the authors provide evidence of change across all subfields and training modules related to stress.

    3. Reviewer #1 (Public Review):

      Valk and Engert et al. examined the potential relations between three different mental training modules, hippocampal structure and functional connectivity, and cortisol levels (stress) over a 9-month period. They found that among the three types of mental training: Presence (attention and introspective awareness), Affect (socio-emotional - compassion and prosocial motivation), and Perspective (socio-cognitive - metacognition and perspective taking) modules; Affect training most robustly related to changes in hippocampal structure and function - specifically, CA1-3 subfields of the hippocampus. Moreover, change in intrinsic functional connectivity related to changes in diurnal cortisol release and long-term cortisol exposure. These changes are proposed to result from a combination of factors, which is supported by multivariate analyses showing changes across subfields and training content relate to cortisol changes.

      The authors demonstrate that mindfulness training programs are a potential avenue for stress interventions that impact hippocampal structure and cortisol, providing a promising approach to improve health. The data contribute to the literature on plasticity of hippocampal subfields during adulthood, the impact of mental training interventions on the brain, and the link between CA1-3 and both short- and long-term stress changes.

      The authors thoughtfully approached the study of hippocampal subfields, utilizing a method designed for T1w images that outperformed Freesurfer 5.3 and that produced comparable results to an earlier version of ASHS. The authors note the limitations of their approaches and provide detailed information on the data used and analyses conducted. The results provide a strong basis from which future studies can expand using computational approaches or more fine-grained investigations of the impact of mindfulness training on cortisol levels and the hippocampus.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Responses to Reviewer 1:

      It wouldn't be very surprising to identify the association between PhenoAgeAccel and cancer risk, since the PhenoAgeAccel was constructed as a predictor for mortality which attributed a lot to cancer. Although cancer is an essential mediator for the association, sensitivity analyses using cancer-free mortality may provide an additional angle.

      As suggested, we retrained PhenoAge in cancer-free participants based on mortality and recalculated PhenoAgeAccel in the UK Biobank. As expected, the recalculated PhenoAgeAccel was still significantly associated with an increased risk of overall cancer in both men and women. The relevant results have been added to Appendix 1-table 6.

      It would be interesting to see, to what extent, PhenoAgeAccel could be reversed by environmental or lifestyle factors. G by E for PhenoAgeAccel might be worth a try.

      As suggested, we performed interaction analysis between genetic and lifestyle factors on PhenoAgeAccel, and added the methods and results in the revision as follows:

      “55 independent PhenoAgeAccel-associated SNPs (P < 5 × 10⁻⁸) and corresponding effect sizes were derived from a large-scale PhenoAgeAccel GWAS including 107,460 individuals of European ancestry (Kuo, Pilling, Liu, Atkins, & Levine, 2021). A PhenoAgeAccel PRS was created using an additive model as previously described (Dai et al., 2019). In short, the genotype dosage of each risk allele for each individual was summed after multiplying by its respective effect size of PhenoAgeAccel.” (Page 6)

      “We performed additive interaction analysis between genetic risk (defined by CPRS) and PhenoAgeAccel on overall cancer risk, as well as genetic risk (defined by PhenoAgeAccel PRS) and lifestyle on PhenoAgeAccel using two indexes: the relative excess risk due to interaction (RERI) and the attributable proportion due to interaction (AP).” (Page 9)

      “However, we did not observe any interaction between genetic risk and lifestyle on PhenoAgeAccel in both men and women (Appendix 1-table 11).” (Page 13)
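
      The additive PRS described in the first quoted passage (risk-allele dosage multiplied by effect size, summed over SNPs) can be sketched as follows. This is an illustration only, not the authors' code: the function name `polygenic_risk_score`, the three dosages and the three effect sizes are hypothetical values chosen for the example.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Additive PRS: sum over SNPs of (risk-allele dosage * effect size).

    dosages      -- risk-allele dosages for one individual (0..2 per SNP)
    effect_sizes -- per-SNP effect sizes taken from a GWAS
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("one effect size is needed per SNP")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical individual with three SNPs (values are made up).
dosages = [0, 1, 2]
effect_sizes = [0.10, 0.20, 0.05]
score = polygenic_risk_score(dosages, effect_sizes)
print(round(score, 3))
```

      In practice the sum would run over all 55 SNPs, and imputed dosages may be fractional; the sketch handles both without change.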

      Responses to Reviewer 2:

      Since the UK biobank has a large sample size, it should have enough power to split the dataset into discovery and validation sets. Why did the authors use 10-fold cross-validation instead of splitting the dataset?

      There may have been a misunderstanding of the Methods: 10-fold cross-validation was applied to select biomarkers when PhenoAge was calculated in the previous manuscript (Levine et al., 2018). In this study, we analyzed the association between PhenoAgeAccel and incident cancer risk by dividing participants into ten groups based on the deciles of PhenoAgeAccel and assessed the association for each group relative to the lowest decile. To avoid any confusion, we have removed the description of 10-fold cross-validation from the Methods section (Page 5).
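
      To make the distinction from cross-validation concrete, the decile grouping described above can be sketched as below. This is a minimal, rank-based illustration under stated assumptions: the helper `decile_groups`, its tie handling (ties keep their input order) and the 20 example values are hypothetical, and the response does not say how the authors computed the decile boundaries.

```python
def decile_groups(values):
    """Assign each value to a decile group 1..10 by rank (1 = lowest tenth)."""
    n = len(values)
    # Indices of the values, ordered from smallest to largest.
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        groups[idx] = rank * 10 // n + 1
    return groups


# Twenty hypothetical PhenoAgeAccel values, two per decile.
accel = [float(v) for v in range(-10, 10)]
groups = decile_groups(accel)
reference = [a for a, g in zip(accel, groups) if g == 1]  # lowest decile
print(groups)
```

      Each of groups 2 to 10 would then be compared with the reference group (group 1) in the risk model; no data are held out, which is what separates this from a discovery/validation split.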

      Recommendations for the authors:

      In addition, there is extant literature on the role of Phenotypic Age Acceleration in cancer risk and mortality that should be reviewed. Please also address possible overlap with previous work that used the UK Biobank cohort study (PMCID: PMC9958377).

      As suggested, we have reviewed the association of Phenotypic Age Acceleration with cancer risk, and added it into the Discussion section as follows:

      “Recently, several studies have confirmed the associations between PhenoAgeAccel and cancer risk. Mak et al. explored three measures of biological age, including PhenoAge, and assessed their associations with the incidence of overall cancer and five common cancers (breast, prostate, lung, colorectal, and melanoma) (Mak et al., 2023). In our previous study, we investigated the association between PhenoAgeAccel and lung cancer risk and analyzed the joint and interactive effects of PhenoAgeAccel and genetic factors on the risk of lung cancer (Ma et al., 2023). In comparison to these studies, our analysis expanded the range of cancers to 20 types and further explored the associations in different genetic and lifestyle contexts. Moreover, we also evaluated the potential implications of PhenoAge in population-level cancer screening.” (Page 15).

      Other minor comments:

      Line 216, "-4.35 to -1.25" or "-4.35, -1.25" may be better.

      As suggested, we have adjusted text accordingly.

      Line 260, please clarify the PRS used for G by E interaction testing. It could be site-specific PRS or CPRS.

      We used CPRS for G by E interaction testing, and we have changed the description of our methods as follows:

      “We performed additive interaction analysis between genetic risk (defined by CPRS) and PhenoAgeAccel on overall cancer risk, as well as genetic risk (defined by PhenoAgeAccel PRS) and lifestyle on PhenoAgeAccel using two indexes: the relative excess risk due to interaction (RERI) and the attributable proportion due to interaction (AP).” (Page 9)
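
      For readers unfamiliar with the two indexes, their standard definitions can be sketched as below. With RR11 the relative risk when both exposures are present, and RR10 and RR01 the relative risks for each exposure alone, RERI = RR11 - RR10 - RR01 + 1 and AP = RERI / RR11. The function name and the three input values are hypothetical; the response does not state which risk estimates (for example hazard ratios) the authors plugged in.

```python
def additive_interaction(rr_both, rr_a_only, rr_b_only):
    """Return (RERI, AP) from relative risks against the doubly unexposed group.

    rr_both   -- relative risk with both exposures present (RR11)
    rr_a_only -- relative risk with only the first exposure (RR10)
    rr_b_only -- relative risk with only the second exposure (RR01)
    """
    reri = rr_both - rr_a_only - rr_b_only + 1.0
    ap = reri / rr_both
    return reri, ap


# Made-up relative risks, for illustration only.
reri, ap = additive_interaction(rr_both=3.0, rr_a_only=1.5, rr_b_only=2.0)
print(round(reri, 3), round(ap, 3))
```

      RERI = 0 (and hence AP = 0) corresponds to no interaction on the additive scale; positive values indicate that the joint effect exceeds the sum of the separate effects.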

      Line 223, The discussion/interpretation for "while negatively associated with risk of prostate cancer" is lacking.

      As suggested, we have discussed this as follows:

      “In addition, we observed a negative association between PhenoAgeAccel and prostate cancer risk. The unexpected association may have been confounded by diabetes and altered glucose metabolism, both of which are closely linked to aging. When we removed HbA1c and serum glucose from the biological age algorithms, the association became non-statistically significant. Similar findings were also reported by Mak et al. (Mak et al., 2023) and Dugue et al. (Dugue et al., 2021).” (Page 15).

      It is not clear how to define "biologically older" and "biologically younger". Whether the individuals fall in the "middle area" will impact the results.

      We defined "biologically older" and "biologically younger" based on Phenotypic Age Acceleration (PhenoAgeAccel), which was defined as the residual obtained from a linear model when regressing Phenotypic Age on chronological age. We categorized individuals with PhenoAgeAccel > 0 as biologically older and those with PhenoAgeAccel < 0 as biologically younger.
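
      The definition above can be sketched in a few lines. This is an illustration only: the ordinary least-squares fit is written out by hand, the function names are hypothetical, and the four age pairs are made-up values rather than UK Biobank data.

```python
def regression_residuals(x, y):
    """Residuals of y after simple least-squares regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def classify(accel):
    """Label an individual by the sign of PhenoAgeAccel."""
    if accel > 0:
        return "biologically older"
    if accel < 0:
        return "biologically younger"
    return "neither"


# Hypothetical chronological and phenotypic ages (years).
chronological_age = [40, 50, 60, 70]
phenotypic_age = [38, 53, 58, 73]
pheno_age_accel = regression_residuals(chronological_age, phenotypic_age)
labels = [classify(a) for a in pheno_age_accel]
print(labels)
```

      Because the residuals of a least-squares fit with an intercept sum to zero, the cohort is necessarily split between positive and negative PhenoAgeAccel values.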

      Compared with individuals at low accelerated aging (the bottom quintile of PhenoAgeAccel), we found that those in the "middle area" (quintiles 2 to 4) and at high accelerated aging (the top quintile) had a significantly higher risk of overall cancer (Table 2). Individuals falling in the "middle area" also had a moderate risk of overall cancer when accelerated aging levels were reclassified according to quartiles or tertiles of PhenoAgeAccel (Appendix 1-table 2).

      Do men and women have distinct biological ages, so they were analyzed separately?

      We found that men (median PhenoAgeAccel: 0.34, IQR: -2.42 to 3.53) have higher biological ages than women (median PhenoAgeAccel: -1.38, IQR: -4.26 to 1.96) (P < 0.0001). In addition, men and women have different cancer incidence patterns (Rubin, 2022). Therefore, we conducted separate analyses to investigate the associations of PhenoAgeAccel with cancer risk in men and women.

      Dai, J., Lv, J., Zhu, M., Wang, Y., Qin, N., Ma, H., . . . Shen, H. (2019). Identification of risk loci and a polygenic risk score for lung cancer: a large-scale prospective cohort study in Chinese populations. Lancet Respir Med, 7(10), 881-891. doi: 10.1016/S2213-2600(19)30144-4

      Dugue, P. A., Bassett, J. K., Wong, E. M., Joo, J. E., Li, S., Yu, C., . . . Milne, R. L. (2021). Biological Aging Measures Based on Blood DNA Methylation and Risk of Cancer: A Prospective Study. JNCI Cancer Spectr, 5(1). doi: 10.1093/jncics/pkaa109

      Kuo, C. L., Pilling, L. C., Liu, Z., Atkins, J. L., & Levine, M. E. (2021). Genetic associations for two biological age measures point to distinct aging phenotypes. Aging Cell, 20(6), e13376. doi: 10.1111/acel.13376

      Levine, M. E., Lu, A. T., Quach, A., Chen, B. H., Assimes, T. L., Bandinelli, S., . . . Horvath, S. (2018). An epigenetic biomarker of aging for lifespan and healthspan. Aging (Albany NY), 10(4), 573-591. doi: 10.18632/aging.101414

      Ma, Z., Zhu, C., Wang, H., Ji, M., Huang, Y., Wei, X., . . . Shen, H. (2023). Association between biological aging and lung cancer risk: Cohort study and Mendelian randomization analysis. iScience, 26(3), 106018. doi: 10.1016/j.isci.2023.106018

      Mak, J. K. L., McMurran, C. E., Kuja-Halkola, R., Hall, P., Czene, K., Jylhava, J., & Hagg, S. (2023). Clinical biomarker-based biological aging and risk of cancer in the UK Biobank. Br J Cancer, 129(1), 94-103. doi: 10.1038/s41416-023-02288-w

      Rubin, J. B. (2022). The spectrum of sex differences in cancer. Trends Cancer, 8(4), 303-315. doi: 10.1016/j.trecan.2022.01.013

    2. eLife assessment

      This study presents fundamental findings that advance our understanding of the role of phenotypic aging in cancer risk. This article presents compelling results that show Phenotypic Age Acceleration (PhenoAgeAccel) can predict cancer incidence of different types and could be used with genetic risk to facilitate the identification of cancer-susceptible individuals. These results will be of broad interest to the research community and clinicians.

    3. Reviewer #1 (Public Review):

      Bian et al. showed that biomarker-informed PhenoAgeAccel was consistently related to an increased risk of site-specific cancer and overall cancer within and across genetic risk groups. The results showed that PhenoAgeAccel and genetic liability to a range of cancers serve as productive tools to facilitate the identification of cancer-susceptible individuals under an additive model. People with a high genetic risk for cancer may benefit from PhenoAgeAccel-informed interventions.

      As the authors pointed out, the large sample size, the prospective design of the UK Biobank study, and the effective application of PhenoAgeAccel in predicting the risk of overall cancer are the major strengths of the study. Meanwhile, the CPRS seems to be a solid and comprehensive score based on incidence-weighted site-specific polygenic risk scores across 20 well-powered GWAS for cancers.

    4. Reviewer #2 (Public Review):

      Bian et al. calculated Phenotypic Age Acceleration (PhenoAgeAccel) via a linear model regressing Phenotypic Age on chronological age. They examined the associations between PhenoAgeAccel and cancer incidence using 374,463 individuals from the UK Biobank and found that older PhenoAge was consistently related to an increased risk of incident cancer, even among each risk group defined by genetics.

      The study is well-designed, and uses a large sample size from the UK biobank.

      Comments on revised version:

      The authors have addressed all my concerns.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      We wish to thank the Reviewers for their critical analysis of the article and for their suggestions and comments.

      In addition to the point-by-point answers to the Reviewers, we wish to emphasize three essential points that have been raised. First, we never intended (nor pretended) to address the impact of the two EHT cell emergence processes on downstream fate, after release from the aortic floor (see for example the last paragraph of our initially submitted manuscript). We only wished to bring evidence on the cell biological heterogeneity of the HE, particularly relying on cell polarity control and polarity reestablishment/reinforcement in the case of EHT pol+ cells, thus leading to emergence morphodynamic complexity. In the general context of cell extrusion, in which all polarity features are generally downregulated, these are remarkable features.

      Second, we inform the Reviewers that we have performed a major revision of the work on the Pard3 proteins issue the outcome of which, hopefully, substantiates significantly the idea of a tuning of cell polarity features in the HE and all along the EHT time-window, for supporting EHT pol- and EHT pol+ types of emergence. To achieve this, we entirely revised the experimental strategy to increase specificity and sensitivity of detection of Pard3 protein isoforms expressed in the vascular system, based on endothelial FACS-sorting, qRT-PCR and single-molecule whole mount in situ hybridization using RNAscope. Importantly, we wish to stress that, by addressing Pard3 proteins, we initially aimed at substantiating our observations on the localization of our podxl2 construct (del-podxl2) used to label apical membranes. Hence, we sought to bring correlative evidence on the variation of expression of polarity proteins at early and later time points of the EHT time-window (suggesting tightly regulated expression control of polarity determinants, possibly at the mRNA level). This was clearly written and justified in the text, lines 227 or 303 of the initial manuscript. Also, this may have led to identify (a) specific isoform(s), including splicing variants as initially addressed.

      As the Reviewers will see, while performing the revision of our work, we have now been able to point at a specific isoform of Pard3, namely Pard3ba, whose mRNA expression level, in aortic cells and at single-cell resolution, is uniquely and specifically enhanced in cells contacting emergence ‘hot spots’. Using our Runx1 mutant fish line (dt-Runx1), we also show that expression of Pard3ba mRNAs, in these specific aortic regions, is sensitive to interference with Runx1 activity (i.e., dt-Runx1 increases Pard3ba expression). Altogether, our new results strongly support our initially proposed idea on the regulation of polarity features during EHT; they indicate intercellular coordination, through cooperative cross-talk between aortic and HE/EHT cells. This is compatible with the idea of a ‘tuning’ of apico-basal polarity during the entire EHT time-window (including maturation of the HE to become competent for emergence and the emergence process per se, whose morphodynamic complexity relies on regulating apico-basal polarity associated functions (e.g., for controlling the specific junctional recycling modes of EHT pol+ and EHT pol- cells, as we suggest using JAM proteins that we have chosen owing to their function in the recruitment of Pard3 proteins for apico-basal polarity establishment)). This complements our work nicely and highlights the relevance of studying the interplay between aortic and HE/EHT cells (which we have started to dissect in the second part of our manuscript). Further work is obviously required to address local, dynamic variations of mRNAs encoding this specific isoform of Pard3, as well as specific interference with its functions at the spatial and temporal levels (hence on live tissues), which is far beyond the scope of our currently submitted work.

      Finally, this emphasizes the importance of the aortic context, at the mesoscopic level, in the regulation of the EHT.

      Third, based on these major points and Reviewers suggestions, we propose to take into account the fact that the heterogeneity in emergence morphodynamics was not highlighted and propose the following title:

      ‘Tuning apicobasal polarity and junctional recycling in the hemogenic endothelium orchestrates the morphodynamic complexity of emerging pre-hematopoietic stem cells’

      Regarding Results and Figures, the previous Figures 3 and 4 have been entirely revised, with the support of Supplement Figures (3 and 4 supplement figures, respectively, as well as a supplement video to Figure 3). Supplement Figures have also been included in the revised version for nearly all results that appeared as data not shown (Figure 1 – figure supplement 2: illustrating the maintenance of EHT pol+ and EHT pol- cells after division; Figure 1 – figure supplement 3: illustrating the expression of the hematopoietic marker CD41 by EHT pol+ and EHT pol- cells). Also, a new supplemental figure, Figure 7 – figure supplement 7, has been added to substantiate the impact of interfering with ArhGEF11/PDZ-RhoGEF alternative splicing on hematopoiesis. Finally, a Figure for the Reviewers is added at the end of this file that shows that virtually 100% of aortic floor cells that we consider as hemogenic cells are positive for the hematopoietic marker Gata2b, which is upstream of Runx1 (using RNAscope, which allows achieving cellular resolution unambiguously).

      Reviewer #1 (Public Review):

      Summary:

      In this research article, the authors utilized the zebrafish embryo to explore the idea that two different cell types emerge with different morphodynamics from the floor of the dorsal aorta based on their apicobasal polarity establishment. The hypothesis that the apical-luminal polarity of the membrane could be maintained after EHT and confer different functionality to the cell is exciting, however, this could not be established. There is a general lack of data supporting several of the main statements and conclusions. In addition, the manuscript is difficult to follow and needs refinement. We present below some questions and suggestions with the goal of guiding the authors to improve the manuscript and solidify their findings.

      Here, we wish to emphasize that we do not make the hypothesis that ‘…the apical-luminal polarity of the membrane could be maintained after EHT …’ but that the apico-basal polarity establishment/maintenance controls the type of emergence and their associated cell biological features (EHT pol+ and EHT pol- cellular morphodynamics, establishment of membrane domains). Hence, our work suggests that these emergence modes, as a consequence of their intrinsic characteristics and differences, might have an impact on cellular behavior after the release (to place the work in the broader context of hematopoietic cell fate and differentiation). More specifically, the difference in the biological features of the luminal versus abluminal membrane for the two EHT types (ex: membrane signaling territories, membrane pools devoted to specific functions), might endow the cells with specific functional properties, after the release. What happens to those cells thereafter, except for illustrating the evolution of the luminal membrane for pol+ EHT cells, is beyond the scope of this paper. Here, we analyze and characterize some of the cell biological features of the EHT process per se (the emergence from the aortic floor), including the dynamic interface with adjoining endothelial cells.

      Strengths:

      New transgenic zebrafish lines developed. Challenging imaging.

      Weaknesses:

      (1) The authors conclude that the truncated version of Podxl2 fused to a fluorophore is enriched within the apical site of the cell. However, based on the images provided, an alternative interpretation is that the portion of the membrane within the apical side is less stretched than in the luminal side, and therefore the fluorophore is more concentrated and easier to identify by confocal. This alternative interpretation is also supported by data presented later in the paper where the authors demonstrate that the early HE is not polarized (membranes are not under tension and stretched yet). Could the authors confirm their interpretation with a different technique/marker like TEM?

      The argument of the apparent enrichment, or exclusion, of a marker depending on membrane stretching (and hence molecular packing) would be valid for any type of molecule embedded in these membranes, including of course endogenous ones (this is one of the general biophysical principles leading to the establishment of membrane domains, structurally and functionally speaking); hence, using another marker would not solve the issue because it would depend on its behavior in regard to packing (in particular lipid packing), which is difficult to anticipate and is a topic of its own (especially in this system that has been poorly investigated in regard to its biophysical and biochemical properties in vivo (including its exposure to the hemodynamics)).

      If we follow the logic of the Reviewer, it appears that it is not consistent with our results on the maturing HE. Indeed, in our dt-Runx1 mutants, mKate2-podxl2 is enriched at the luminal membrane of HE cells (HE cells are elongated, and the two membrane domains have a relative equal surface and bending); in comparison, HE cells have the same morphology in control animals than in mutants but, in controls, eGFP-podxl2 and mKate2-podxl2 are equally partitioned between the luminal and abluminal membranes (see Figure 3 – figure supplement 2 (for mKate2-podxl2) and Figure 2 – figure supplement 1 and 2 (for eGFP-podxl2)). In addition, we took care while designing the eGFP and mKate2 fusions to keep the natural podxl2 sequence containing critical cysteine residues to maintain assembly properties and distance from the transmembrane segment (hence the fluorescent protein per se is not directly exposed to membrane stretching).

      Finally, electron microscopy is not the appropriate approach for this issue because it requires tissue fixation, which always carries the risk of significantly modifying membrane properties. Along this line, when we fix embryos (and hence membranes, see our new Figure 4 and its Supplemental Figures), we do not appear to maintain obvious EHT pol+ and pol- cell shapes. In addition, to be conclusive, the work would require not TEM but immuno-EM to be able to visualize the marker(s), which is another challenge with this system.

      (2) Could the authors confirm that the engulfed membranes are vacuoles as they claimed, using, for example, TEM? Why is it concluded that "these vacuoles appear to emanate from the abluminal membrane (facing the sub-aortic space) and not from the lumen?" This is not clear from the data presented.

      The same argument regarding electron microscopy mentioned in the previous point is valid here (in addition, were it technically feasible, it would require serial sectioning to make sure not to miss the very tiny connection that may only suggest ultimate narrowing down of the facing adjacent bilayers, which is quite challenging). The term vacuole, which we use with caution (in fact, more often, we use the term pseudo-vacuoles in the initial manuscript, lines 140, 146, 1467 (legend to Figure 1 – figure supplemental 1), or apparent vacuole-like in the same legend, lines 1465 and 1476), is legitimate here because we cannot say that they are portions of the invaginated luminal membrane, as we could be accused of not showing that these membranes are still connected to the luminal surface; we are here at the limit of the resolution that in vivo imaging allows for the moment with this system, and we draw the attention of the Reviewer to the fact that we are reaching here a sub-cellular level, which is already a challenge by itself.

      In addition, if vacuoles (or pseudo-vacuoles), i.e. membrane-bounded organelles, were not formed at some point in this system, it would be difficult to conceive how, after release of the cell, the fluid inherited from the aortic lumen would efficiently be chased from these membranes/organelles (see also our model Figure 1 – figure Supplement 1B).

      Why is it concluded that "these vacuoles appear to emanate from the abluminal membrane (facing the sub-aortic space) and not from the lumen?" This is not clear from the data presented.

      This is not referring to our data but to the Sato et al 2023 work. For EHT undergoing cells leading to aortic clusters in mammals and avians, vacuolar structures indeed appear to emanate from the ab-luminal side facing the sub-aortic space (we cannot call it basal because we do not know the polarity status of these cells). In the Revised version of the manuscript, we have moved this paragraph referring to the Sato et al work to the Discussion, which gives the possibility to expand a bit on this issue, for more clarity (see the second paragraph of our new Discussion).

      (3) It is unclear why the authors conclude that "their dynamics appears to depend on the activity of aquaporins and it is very possible that aquaporins are active in zebrafish too, although rather in EHT cells late in their emergence and/or in post-EHT cells, for water chase and vacuolar regression as proposed in our model (Figure 1 - figure supplement 1B)." In our opinion, these figures do not confirm this statement.

      This part of the text has been upgraded and moved to the Discussion (see our answer to point 2), to address the Reviewers' concern about the clarity of the Results section and to allow us to elaborate a bit more on this issue. We only wished to draw attention to the presence of intracellular vacuolar structures recently described in the Sato et al. 2023 paper, which shows EHT cell vacuoles that are proposed to contribute to cellular deformation during the emergence. We take this example to rationalize the regression of the vacuolar structures described in Figure 1 - figure supplement 1B, which is why we have written ‘… it is very possible that aquaporins are active in zebrafish too’; the first part of the sentence refers to the Sato et al. 2023 paper.

      (4) Could the authors prove and show data for their conclusions "We observed that both EHT pol+ and EHT pol- cells divide during the emergence"; "both EHT pol+ and EHT pol- cells express reporters driven by the hematopoietic marker CD41 (data not shown), which indicates that they are both endowed with hematopoietic potential"; and "the full recovery of their respective morphodynamic characteristics (not shown)?".

      To the new version of our manuscript, we have added new Supplemental information to Figure 1 (two new Supplemental Figures):

      • Figure 1 - figure Supplement 2 that illustrates that both EHT pol+ and EHT pol- cells divide during the emergence as well as the maintenance of morphology for both EHT cell types. We wish also to add here that the maintenance of the EHT pol+ morphology is the most critical point, showing that dividing cells in this system do not necessarily lead to EHT pol- cells.

      • Figure 1 - figure Supplement 3 that shows that both EHT cell types express CD41.

      (5) The authors do not demonstrate the conclusion traced from Fig. 2B. Is there a fusion of the vacuoles to the apical side in the EHT pol+ cells? Do the cells inheriting less vacuoles result in pol- EHT? It looks like the legend for Fig. 2-fig supp is missing.

      As said previously, showing fusion here is not technically possible, but indeed, this is the idea, which fits with the images corresponding to timing points 0-90 minutes (Figure 2A), showing (in particular for the right cell) a large pseudo-vacuole whose membrane is heavily enriched with the polarity marker podxl2 (based on fluorescence signal in a membrane-bounded organelle that, based on its curvature radius, should be more under tension than the more convoluted EHT pol+ cell luminal membrane). Also, EHT pol- cells may be born from HE cells that inherit fewer intracellular vesicles after division, or may derive from HE cells that are less (or not) exposed to polarity-dependent signaling (see our data presented in the new Figure 4 and the new version of the Discussion, paragraphs ‘Characteristics of the HE and complexity of pre-hematopoietic stem cell emergence’ and ‘Spatially restricted control of Pard3ba mRNAs by Runx1’).

      Finally, the cartoon in Figure 2B is a hypothetical model, consistent with our data, that is meant to help the reader understand the idea extrapolated from images that may not be so easy to interpret for people not working on this system. In the legend of Figure 2 that describes this issue in the first version of our manuscript (lines 1241-1243), we were cautious and wrote, in parentheses: ‘note that exocytosis of the large vacuolar structure may have contributed to increase the surface of the apical/luminal membrane (the green asterisk labels the lumen of the EHT pol + cell)’.

      The legend to Figure 2 – figure supplement 1 is not missing (see lines 1492 – 1499 of the first manuscript). The images of this supplement are not extracted from a time-lapse sequence and show that as early as 30hpf (shortly after the beginning of the EHT time-window – around 28hpf), cells on the aortic floor already exhibit podxl2-containing pseudo-vacuolar structures (which we propose is a prerequisite for HE cell maturation into EHT competent cells; see also Figure 2 – figure supplement 2).

      (6) The title of the paper "Tuning apico-basal polarity and junctional recycling in the hemogenic endothelium orchestrates pre-hematopoietic stem cell emergence complexity" could be interpreted as functional heterogeneity within the HSCs, which is not demonstrated in this work. A more conservative title denoting that there are two types of EHT from the DA could avoid misinterpretations and be more appropriate.

      There was no ambiguity, throughout our initial manuscript, on what we meant when using the word ‘emergence’; it refers only to the extrusion process from the aortic floor.

      Reducing our title to only the 2 types of EHT cells would be very reductionist in regard to our work, which also addresses essential aspects of the interplay between hemogenic cells, cells undergoing extrusion (EHT pol+ and pol- cells), and their endothelial neighbors (not to mention what we show in terms of the cell biology of the maturing HE and the regulation of its interface with endothelial cells (evidence for vesicular trafficking, specific regulation of HE-endothelial cell intercalation required for EHT progression, etc.)). However, to take this specific comment into account, we propose a slightly changed title indicating that emergences differ in their morphodynamic characteristics:

      ‘Tuning apicobasal polarity and junctional recycling in the hemogenic endothelium orchestrates the morphodynamic complexity of emerging pre-hematopoietic stem cells’

      (7) There are several conclusions not supported by data: "Finally, we have estimated that the ratio between EHT pol+ and EHT pol- cells is of approximately 2/1". "We observed that both EHT pol+ and EHT pol- cells divide during the emergence and remain with their respective morphological characteristics". "We also observed that both EHT pol+ and EHT pol- cells express reporters driven by the hematopoietic marker CD41 (data not shown), which indicates that they are both endowed with hematopoietic potential." These conclusions are key in the paper, and therefore they should be supported by data.

      Most of the Reviewer's requests in this point were already raised in point 4 and have been addressed in the revised version.

      Regarding the EHT pol+/pol- ratio, we will keep the ratio at approximately 2/1. The Reviewer should be aware that quantification of EHT cells is a tricky issue and a source of important variability, as can be assessed by the quantifications that we have been performing (see for example figures in which we compare the dt-Runx1 phenotype with Ctrl). This is inherent to this system, more specifically because the EHT process is asynchronous, ranging from approx. 28 hpf to 3 days post fertilization (we have even observed EHT at 5 dpf). We systematically observed heterogeneity in EHT numbers and EHT types between animals and also between experiments (some days we observe EHTs at 48 hpf, others more around 55 hpf or even later). In addition, emergence also proceeds on the lateral side of the aorta and, while it is relatively easy to identify EHT pol+ cells because of their highly characteristic morphology, it is more difficult for EHT pol- cells, which can be mistaken for round HE cells preparing for division. In the current revision of our work, we provide additional facts and potential explanations on the mechanisms that control this asynchrony and the apparent stochasticity of the EHT process (see results of new Figures 3 and 4).

      Reviewer #2 (Public Review):

      In this study, Torcq and colleagues make careful observations of the cellular morphology of haemogenic endothelium undergoing endothelial to haematopoietic transition (EHT) to become stem cells, using the zebrafish model. To achieve this, they used an extensive array of transgenic lines driving fluorescent markers, markers of apico-basal polarity (podocalyxin-FP fusions), or tight junction markers (jamb-FP fusions). The use of the runx truncation to block native Runx1 only in endothelial cells is an elegant tool to achieve something akin to tissue-specific deletion of Runx1. Overall, the imaging data is of excellent quality. They demonstrate that differences in apico-basal polarity are strongly associated with different cellular morphologies of cells undergoing EHT from HE (EHT pol- and EHT pol+) which raises the exciting possibility that these morphological differences reflect the heterogeneity of HE (and therefore HSCs) at a very early stage. They then overexpress a truncated form of Runx1 (just the runt domain) to block Runx1 function and show that more HE cells abort EHT and remain associated with the embryonic dorsal aorta. They identify pard3aa and pard3ab as potential regulators of cell polarity. However, despite showing that loss of runx1 function leads to (late) decreases in the expression of these genes, no evidence for their role in EHT is presented. The FRAP experiments and the 2d-cartography, albeit very elegant, are difficult to interpret and not very clearly described throughout the text, making interpretation difficult for someone less familiar with the techniques. Finally, while it is clear that ArhGEF11 is playing an important role in defining cell shapes and junctions between cells during EHT, there is very little statistical evidence to support the limited data presented in the (very beautiful) images.

      As mentioned in the response to reviewer 1, we revised our whole strategy for the analysis of the role of Pard3 proteins in regulating the emergence of hematopoietic precursors. Our new data, obtained using refined gene expression analysis by qRT-PCR on FACS-sorted populations and by in situ gene expression analysis at single-cell resolution using RNAscope, show first that a unique Pard3 isoform (Pard3ba) is sensitive to runx1 activity, and that its expression is specifically localized in aortic cells contacting hemogenic (HE)/EHT cells. We show a clear correlation between the densification of Pard3ba mRNAs and the presence of contacting HE/EHT cells, suggesting a key role for Pard3ba in a cross talk between aortic and hemogenic cells. Furthermore, we show that our dt-runx1 mutant impacts the maturation of HE cells; when this mutant is expressed, we observe, in comparison to control, an accumulation of HE cells that are abnormally polarized as well as unusually high numbers of EHT pol+ cells. This strongly suggests that the polarity status of HE cells controls the mode of emergence. Overall, our work shows that regulation of apico-basal polarity features is essential for the maturation of the HE and the proper proceeding of the EHT.

      We made efforts to explain more clearly the FRAP experiments as well as the analysis of 2D-cartographies throughout the text to facilitate readers' comprehension. 2D-cartographies are an invaluable tool to precisely discriminate between endothelial and hemogenic cells, and their usage was essential during the FRAP sessions, to point at specific junctional complexes accurately. Performing FRAP at cellular junctions during aortic development was technically extremely challenging and the outcome was subject to quite significant variability (which often leads to quantitative results at the limit of statistical significance, which is why we speak of tendencies in our Results section reporting on this type of experiment). Apart from the constant movement and drifting of the embryos, which are sources of variability, the EHT process per se evolves over time and does so at a heterogeneous pace (for example, the apical closure of EHT pol+ cells is characterized by a succession of contraction and stabilization phases, see Lancino et al. 2018), which is an additional source of variability in the measurements. Despite all this, our data collectively and consistently suggest a differential regime of junctional dynamics between EHT cell types and support the critical function of ArhGEF11/PDZ-RhoGEF in the control of junctional turnover at the interface between HE and aortic cells as well as between HE cells to regulate cell-cell intercalation.

      There is a sense that this work is both overwhelming in terms of the sheer amount of imaging data, and the work behind it to generate all the lines they required, and at the same time that there is very little evidence supporting the assertion that pard3 (and even ArhGEF11) are important mediators of cell morphology and cell fate in the context of EHT. For instance, the pard3 expression data, and levels after blocking runx1 (part of Figure 3 and Figure 4) don't particularly add to the manuscript beyond indicating that the pard3 genes are regulated by Runx1.

      We thank the reviewer for the comment on the Pard3 data particularly because it led us to reconsider our strategy to address with more precision and at the cellular resolution the potential function of this protein family during the time-window of the EHT. As summarized in the header of the Public Review, we identified one specific isoform of Pard3 in the zebrafish - Pard3ba – whose sensitivity to runx1 interference and spatial restriction in expression reinforce the idea of a fine control of apico-basal polarity features and associated functions while EHT is proceeding. Our new data also reinforce the interplay between HE/EHT cells and their direct endothelial neighbors.

      Weaknesses

      The writing style is quite convoluted and could be simplified for clarity. For example, there is plenty of discussion and speculation throughout the presentation of the results. A clearer separation of the results from this speculation/discussion would help with understanding. Figures are frequently presented out of order in the text; modifying the figures to accommodate the flow of the text (or the other way around) - would make it much easier to follow the narrative. While the evidence for the different cellular morphologies of cells undergoing EHT is strong, the main claim (or at least the title of the manuscript) that tuning apico-basal polarity and junctional recycling orchestrate stem cell emergence complexity is not well supported by the data.

      We refined our text when necessary, in particular taking care of transferring and substantiating the arguments that appeared in the Results section, to the Discussion. We also made efforts, on several occasions and for clarity, to describe more precisely the results presented in the different panels of the Figures.

      As mentioned in the header of the text of the Public Review and the response to the 6th point of the Public Review of Reviewer 1, we slightly modified the title to avoid ambiguity. In addition, we added a new paragraph to the beginning of our discussion that summarizes the impact of our findings and, we believe, justifies our title.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Embryonic stages should be indicated in all images presented for clarification.

      We thank the reviewer for this point; we added stages where they were missing on the figures (Figure 1, Figure 1 - Figure supplement 1, Figure 2, Figure 2 - Figure supplement 1, Figure 5, Figure 6, Figure 6 - Figure supplement 1, Figure 7 - Figure supplement 3, Figure 7 - Figure supplement 5, Figure 7 - Figure supplement 6).

      (2) In which anatomical site/s were images from Fig 1C and D taken? The surrounding environment looks different, for example, cells in Fig1D seem to be surrounded by other cells, resembling the endothelial plexus at the CHT, while the cells in Fig. 1C seem to be in the dorsal aorta. Is there a spatial difference depending on where cells are budding off? The authors state that there are no differences, but no quantification or data demonstrating that statement is provided.

      As mentioned in the figure legend (lines 1206-1209 of the original manuscript), images for Figure 1C and 1D were both taken at the boundary between the end of the AGM and the entry into the caudal hematopoietic tissue. As the images were acquired from different embryos, the labelling of the underlying vein differs between the two panels, with venous tissues being more sparsely labelled in panel C than in panel D. These images were chosen to illustrate the clearly opposite morphology between the two EHT types that we describe. However, for the rest of the paper, all images and all analyses were exclusively acquired / performed in the dorsal aorta in the AGM, in a region spanning approximately 10-12 inter-segmentary vessels, starting from the end of the elongated yolk up to the start of the balled yolk. In light of the work from the lab of Zilong Wen showing that only cells emerging anteriorly exhibit long-term replenishment potential (Tian et al. 2017), we specifically chose to limit our comparative analysis to the AGM region and did not quantitatively investigate emergences occurring in the caudal region of the aorta. Additionally, although we routinely observe both types of emergences occurring in the caudal region of the dorsal aorta, we did not quantify the frequency of either type of EHT event in this region.

      Finally, the images of EHT pol+ cells that we show in Figure 1C are of the highest quality we have ever obtained; one reason is that these two cells emerge at the entry of the CHT, a region that is much easier to image at high resolution than the trunk because the sample is less thick and because we are less perturbed by heart beats.

      (3) Which figure shows "EHT pol- cells were observed in all other Tg fish lines that we are routinely imaging, including the Tg(Kdrl:Gal4;UAS:RFP) parental line that was used for transgenesis, thus excluding the possibility that these cells result from an artefact due to the expression of a deleted form of Podxl2 and/or to its overexpression."? It would be informative to include this figure.

      Other examples of EHT pol- cells were shown in Figure 5C as well as Figure 6B using the Tg(kdrl:Jam3b-eGFP; kdrl:nls-mKate2) fish line, which was routinely used for junctional dynamic analyses by FRAP. Furthermore, we now add a new figure (new Figure 1 – figure supplement 3) to illustrate the presence of EHT pol- cells using the Tg(CD41:eGFP) transgenic background, additionally illustrating that EHT pol- cells are CD41 positive.

      (4) Are the spinning disk confocal images a single plane? Or maximum projections? Sometimes this is not specified.

      We made sure to take this remark into account and went through all figure legends to specify the type of images presented (Figure 1 – figure supplement 1, Figure 2, Figure 2 – figure supplement 1, Figure 2 – figure supplement 2, Figure 7 – figure supplement 3); when relevant, we also added this information directly to the figure panels (Figure 6A – 6B).

      (5) Could the expression data by RT-qPCR for the Pard3 isoforms be shown? Additionally, it would be appreciated if this expression data could be complemented using Daniocell (https://daniocell.nichd.nih.gov/).

      As mentioned in the first paragraph of our response to Public Reviews, and based on reviewers’ comments, we revised our strategy for the investigation of pard3 proteins expression in the vascular system, for their potential role in EHT and sensitivity to runx1. First, we used FACS sorting as well as tissue dissection to enrich in aortic endothelial cells and perform our qPCR analyses (see the new Figure 4 – figure supplement 1A and Figure 4 – figure supplement 3A for the strategy). As asked by the reviewers and for more transparency, we show the expression relative to the housekeeping gene ef1a in our different control samples (new Figure 4 – figure supplement 1C). Furthermore, we used single-molecule FISH to precisely characterise in situ the expression of several of the Pard3 isoforms (Pard3aa, Pard3ab and Pard3ba, which, based on qPCR, were the most relevant for our investigation in the vascular system) (see lines 386 to 412 in text relative to Figure 4 – figure supplement 2). This new addition nicely shows the different pattern of expression of 3 of the Pard3 zebrafish isoforms in the trunk of 2dpf embryos, outlining interesting specificities of each isoform expression in different tissues.

      We thank the reviewer for this suggestion to complement our data with the published Daniocell dataset. However, and potentially due to the poor annotation of the different pard3 genes on public databases, gene expression information was absent for two of our isoforms of interest (pard3aa and pard3ba), which we ultimately show to be the most enriched in the vascular system in the trunk. Daniocell gene expression data for the Pard3ab isoform show expression in the pronephric duct at 48-58hpf, as well as in intestine progenitors and neuronal progenitors, which is consistent with our in situ observations using RNAscope. However, pard3ab is poorly detected within the hematopoietic and vascular clusters. This observation is coherent with our data, which do not show any enrichment of this isoform in vascular tissues compared to other structures. On the other hand, pard3bb does not seem to be particularly enriched in vascular/hematopoietic clusters at 48-58hpf in the Daniocell dataset, in accordance with what we observe with our qPCR. Finally, in the Daniocell dataset, all of the pard3 variants (pard3ab, pard3bb, PARD3 and PARD3 (1 of many)) seem to be either scarcely or not detected in the hematopoietic/vascular system. In our case, for all the isoforms we studied in the control condition (pard3aa, pard3ab and pard3ba), and although the technique is only semi-quantitative due to the presence of an amplification step, RNAscope assays seem to indicate a very low expression in aortic cells (with sometimes as little as one mRNA copy per cell); this explains the low detection in single-cell RNAseq datasets and is coherent with the Daniocell dataset.

      (6) It would be informative to add in the introduction some information on apico-basal polarity, tight junctions, JAMs (ArhGEF11/PDZ-RhoGEF).

      We modified the introduction to add relevant information on Pard3 proteins, their link with our JAMs reporters in the context of polarity establishment, as well as the role of ArhGEF11/PDZ-RhoGEF and its alternative splicing variants in regulating junctional integrity in the context of epithelial-to-mesenchymal transition (lines 99 to 127). This modification of the introduction also allowed us to lighten some parts of the Results section (lines 222 to 224, 345 to 349 and 454 to 456 of the original manuscript).

      Reviewer #2 (Recommendations For The Authors):

      (1) There is lots of data (and lots of work) in this paper; I feel that the pard3 data doesn't substantially add to the paper, and at the same time there is data missing (see point 10, point 11 below for an example).

      To add clarity and substantiate our findings on Pard3, we entirely revised our investigation strategy, as mentioned in previous paragraphs. We refined the characterization of Pard3 isoform expression in the vascular tissue, using both cell enrichment by FACS for gene expression analysis and single-molecule FISH (RNAscope) to access spatial information on the expression of pard3 isoforms, reaching sub-cellular resolution.

      This new strategy allowed us to show the unexpected localization of Pard3ba mRNAs in mRNA-enriched regions in the vicinity of HE/EHT cells (new Figure 4, and paragraph ‘Interfering with Runx1 activity unravels its function in the control of Pard3ba expression and highlights heterogeneous spatial distribution of Pard3ba mRNAs along the aortic axis’, see the new manuscript). Overall, the new spatial analysis we performed allowed us to substantiate our findings on Pard3ba and suggests a direct interplay between hemogenic cells and their endothelial aortic neighbors; this interplay supposedly relies on apico-basal polarity features that are at least in part regulated by runx1 in the context of HE maturation and EHT.

      (2) Labelling of the figures could be substantially improved. In many instances, the text refers to a figure (e.g. Fig 6A), but it has several panels that are not well annotated (in the case of Fig 6A, four panels) or labelled sparsely in a way that makes it easy to follow the text and identify the correct panel in the figure. Even supplementary figures are sparsely labelled. Labelling to include embryonic stages, which transgenic is being used, etc should be added to the panels to improve clarity for the reader.

      We revised the figures to add relevant information, including stages, types of images and annotations to facilitate comprehension, including Figure 6A – 6B and Figure 5B – 5C (see response to Reviewer 1, first comment, for a more complete list of all revised figures, transgenic fish lines and embryonic stage annotations). Furthermore, we revised the entire manuscript so that the text fits the figures as closely as possible, and added some annotations to link the text more easily to the figures and panels.

      (3) The current numbering of supplementary figures is quite confusing to follow.

      We revised the manuscript so as to make sure all principal and supplementary figures were called in the right order and that supplementary figures appearance was coherent with the unfolding of the text. For Figure 7 only, the majority of the supplemental figures are called before the principal figure, as they relate to our experimental strategy that we comment on before describing the results.

      (4) Graphs in Fig 4, Fig 7 supplement 1 and some of the supplementary figures miss statistical info for some comparison (I assume when non-significant), and sometimes present a p-value of a statistical test being done between samples across stages - but these are not dealt with in the text. Throughout all graphs, the font size used in graphs for annotation (labelling of samples, x-axis, and in some cases the p values) is very small and difficult to read.

      For Figure 7 - figure supplement 1, non-significant p-values of statistical tests were not displayed (as mentioned in the Figure legend, line 1614 of the original manuscript). For the new Figure 4, all p-values are displayed. For new Figure 4 - figure Supplement 1, statistical tests were only performed to compare RFP+ and RFP- cells in the trunk condition (3 biological replicates) and not in the whole embryo condition, for which we did not perform enough replicates for statistical analysis (biological duplicates).

      (5) The results are generally very difficult to follow, with a fair amount of discussion included but then very little detail of the experiments per se.

      We thank the reviewers for these comments that helped us improve the clarity of the manuscript.

      The Results section was revised to move some of the paragraphs to the introduction (see response to Reviewer 1, 6th comment), and some of them to the Discussion (such as lines 149 to 156 or 410 to 416 in the first version of the manuscript referring to vacuolar structures or to the recycling modes of JAMs in EHT pol+ and EHT pol- cells).

      (6) The truncated version of runx1 is introduced but its expected effect is not explained until the discussion. Related to this, is it expected that blocking runx1 with this construct (leading to accumulation of cells in the aorta before they undergo EHT) then leads to increased numbers of T-cell progenitors in the thymus? Abe et al (2005, J Immunol) have used the same strategy to overexpress the runt domain in thymocytes and found a decrease in these cells, rather than an increase. Can you explain this apparent discrepancy?

      We thank the reviewer for this interesting point on the effect of runx1 interference. This phenotype (increased number of thymic cells) seems to be in agreement with the phenotype that was described in zebrafish using homozygous runx1 mutants (Sood et al. 2010 PMID: 20154212), in which the authors show an increase of lymphoid progenitors in the kidney marrow of adult runx1W84X/W84X mutants compared to controls as well as a similar number of intra-thymic lck:eGFP cells in mutants and controls. Notably, the T-lymphoid lineage seems to be the only lineage spared by the mutation of runx1. This could suggest that in this case either the T-lymphoid lineage can develop independently of runx1 or that a compensation phenomenon (for example by another protein of the runx family) occurs to rescue the generation of T-lymphocytes.

      Although our data show an impact on T-lymphopoiesis, we do not elucidate the exact mechanism leading to an increased number of thymic cells. In our case, we do not know the half-life of our dt-runx1 protein in newly generated hematopoietic cells when our transgene, expressed under the control of the kdrl vascular promoter, ceases to be produced after emergence. The effect we observe could be direct, due to the presence of our mutant protein after 3 days in thymic cells, or indirect, due to the impact of our mutant on the HE, which could lead to the preferential generation of lymphoid-biased progenitors. Similarly, we do not know whether the cells we observe at this stage in the thymus are generated from long-term HSCs or short-term progenitors. Indeed, cell tracing analysis from the lab of Zilong Wen (Tian et al. 2017, see our Ref list) shows the simultaneous presence of short-term PBI-derived and long-term AGM-derived thymic cells at 5dpf. Based on this, we can imagine for example that the supernumerary cells we observe in the thymus are transient populations that could multiply faster in the absence of definitive populations. Conversely, based on our observation of an accumulation of EHT pol+ events, we can imagine that the EHT pol+ and EHT pol- cells are indeed differentially fated and that EHT pol+ cells may be biased toward a lymphoid lineage. We also know that at the stage we observe (5dpf), the RNAscope assay for runx1 shows that a vast majority of thymic cells do not express runx1 (our preliminary data), suggesting that the effect we observe would be an indirect one caused by upstream events rather than by direct interference with the endogenous expression of runx1 in thymic cells.

      The article referred to by the reviewer (Sato et al. 2005, PMID: 16177090) investigates the role of runx1 during TCR selection for thymic cell maturation and shows that runx1 signaling lowers the apoptotic sensitivity of double-positive thymocytes when artificially activated, leading to a reduced number of single-positive thymic cells. Furthermore, this paper references another study from the same lab (Hayashi et al. 2000, PMID: 11120804) that used the same strategy to study the role of runx1 in the positive and negative selection steps of T lymphocyte maturation. This paper, although showing that runx1 is important for later stages of T lymphocyte differentiation (the maturation from the double-positive to the single-positive stage), also shows a relative increase in the amount of double-negative and double-positive thymocytes, which could be coherent with our observations. Indeed, in our case, although we show an increased number of thymic cells, we do not know the relative proportion of the different thymocyte subsets. We could explain the increased number of thymic cells by an increased number of DN/DP thymocytes, which would not preclude a decrease in single-positive thymocytes. Finally, the cells we observe in the thymus of our dt-runx1 mutants may also be different lymphoid populations, namely ILCs, that would react differently to runx1 interference.

      (7) Lines 154-155 refer to aquaporins but are missing a reference. This is a bit of speculation right in the results section and I struggled to understand what the point of it was.

      To clarify the argument and ease the flow of the text, as suggested by the reviewers, we transferred this paragraph (lines 149 to 156 of the initial manuscript) to the Discussion section (lines 763-789). We additionally made sure to add the missing reference (Sato et al. 2023, see our Ref list).

      (8) Lines 173-175, indicating that both EHTpol+ and pol- express the CD41 transgenic marker - would be useful to show this data.

      We provide a new supplement Figure (Figure 1 – figure supplement 3), where, using an outcross of the CD41:eGFP and kdrl:mKate2-podxl2 transgenic lines, we show unambiguously and for multiple cells that both polarized EHT pol+ cells and non-polarized EHT pol- cells are CD41 positive. In addition, but not commented on in the main text, we can also see that an HE cell, characterized by its elongated morphology (in the middle of the field), its thickened nucleus and its position on the aortic floor, is also CD41 positive.

      (9) Lines 181-201 - it's not clear how HE cells were identified in the first place - was it just morphology? Or were they identified retrospectively?

      HE cells were identified solely on morphology and spatial criteria (as mentioned in the Methods section, lines 1073-1082 and 1108-1111 of the first manuscript). Furthermore, a recent investigation by the lab of Zilong Wen (Zhao et al. 2022, see our Ref list) questioning the common origin of HE cells and of endothelial cells as well as their respective capacity to extrude from the aorta to generate hematopoietic cells showed, by single-cell tracing, that 96% of floor cells are indeed hemogenic endothelial cells. Furthermore, as mentioned in the response to the 8th point, we show in Figure 1 – figure supplement 3 that all floor cells express CD41. Finally, we also used an alternative method to validate the true hemogenic identity of aortic floor cells and show, using RNAscope, that virtually 100% of floor cells that we consider as typical HE cells indeed express a hematopoietic transcription factor upstream of Runx1, namely Gata2b (see Author response image 1).

      Author response image 1.

      All cells from the aortic floor, at 48hpf, express the hematopoietic marker Gata2b. 48 hpf Tg(Kdrl:eGFP) fixed embryos were used for RNAscope using a probe designed to detect Gata2b mRNAs. Subsequently, images were taken using spinning disk confocal microscopy. The image in the top panel is a z-projection of the entire aortic volume of one embryo and shows the full portion of the dorsal aorta from the anterior part (left side, at the limit of the balled yolk) down to the urogenital orifice (UGO, right side). The 4 boxes (1 - 4) delineate regions that have been magnified beneath (2X). The 2X images corresponding to each box are z-projections (top views) or z-sections (bottom views). The bottom views make it possible to visualize the aortic floor and to mark its position on the top views. Pink arrows point at HE cells (elongated in the anteroposterior direction) and at EHT cells (ovoid/round cells; EHT pol+ cell morphology is not preserved after fixation and RNAscope; thus, it cannot be distinguished from ovoid/round EHT pol- cells). Pink dots = RNAscope spots of various sizes. The green cells in the subaortic space that are marked by RNAscope spots are newly born hematopoietic stem and progenitor cells (see for example box 1). This embryo is representative of n = 5 embryos treated and imaged.

      (10) Line 276 - the difference between the egfp-podxl2 and mKate-podxl2 - could that be due to the fluorophore used? Also, it would be good to label Fig 3 supplement 2 better and to see a control alongside the runt overexpression.

      Line 276 does not point at a difference in control conditions between eGFP-podxl2 and mKate-podxl2 (see in new Figure 1 – figure supplement 3, Figure 2 or in new Figure 3 - figure supplement 2 several examples of non-polarized HE cells in control conditions using both fluorophores) but between control and dt-runx1 conditions, both expressing the mKate2-podxl2 transgene. Similarly, the new example that we now provide in the CD41 figure (Figure 1 – figure supplement 3) clearly shows that mKate-podxl2 is enriched at the apical/luminal membrane of EHT pol+ cells while no such enrichment is observed for EHT pol- cells. The Reviewer should be informed that EHT cells are not always the most typical in shape, in particular because cells can be squeezed by underlying tissues, for example the vein, or from the luminal side by flow and tensions on the aortic wall because of heart beat (the further up the trunk we image, the more difficult the imaging and the less stable the cell shape during long time-lapse sequences). To also take the reviewer's comments into account, we added a control condition next to the dt-runx1 condition in the new Figure 3 – figure supplement 2A.

      (11) There is no quantitation data on the number of excess EHT pol+ cells in the DA, or in the thymus data (Figs 3 Supp1 and Fig 3 Supp 3). Can you quantify this data? This would better support the claim that tuning apico-basal polarity alters the morphology of the emerging HE cells.

      We added quantifications relative to both the emergence process itself, showing the accumulation of HE and EHT pol+ cells (new Figure 3B), and hematopoiesis per se (new Figure 3 – figure supplement 1). Indeed, we show a decrease in the number of newly generated cmyb+ cells in the sub-aortic space. Furthermore, we improved our quantification of the later phenotype in the thymus (new Figure 3 – figure supplement 3), using improved segmentation methods, which indeed validates the increased number of thymic cells that we described.

      (3) The observed changes in pard3 isoforms are just reading out changes in their expression in the runx1 transgenics, rather than demonstrating a role in apico-basal polarity.

      We entirely revised our strategy regarding Pard3 expression analyses (see also the text at the beginning of this file, for the Public Review). However, we wish to stress that we did not initially intend to show directly a role of Pard3 proteins in controlling apico-basal polarity in the system; we only intended to provide correlative evidence supporting our observations with the polarity marker podxl2 (interfering with their function, as written in the text, would have impaired apico-basal polarity, which is essential for aortic lumenization and maintenance, and would thereby have blurred interpretations).

      During the revision, we obtained the unexpected finding, using RNAscope, that one Pard3 isoform, namely Pard3ba, is the one Pard3 that is expressed non-homogeneously along the aortic axis and, in the vast majority, by aortic cells and in the direct vicinity of emergence domains of the aortic floor (see the new Figure 4 and Figure 4 – figure supplements 2, 3).

      This correlative relation between expression of Pard3ba in aortic endothelial cells neighbouring HE/EHT cells suggests, as we propose, that a cross talk occurs between hemogenic and aortic cells, and that this cross talk relies, at least in part, on the expression of key components of apico-basal polarity and their associated functional features. In addition, we show that junctional recycling differs between both EHT types, based on our observations on the different dynamics in the turnover of JAM molecules, in the two EHT types. As JAM molecules are also required for the recruitment of Pard3, which initiates the establishment of apico-basal polarity, these different dynamics suggest that the control of apico-basal polarity is involved in supporting the morphodynamic complexity of EHT cell types.

      (4) There is a Fig 5, Supp 2 that is neither mentioned nor described anywhere in the manuscript.

      Figure 5 - figure Supplement 2 is mentioned in lines 366-370 of the original manuscript, to describe the initial validation that was performed for our eGFP-JAM constructs in multiple cell types using a ubiquitous heat-shock promoter. We developed our description of this supplemental figure in the new manuscript (lines 504 to 514).

      (5) Lines 445-456 - these read like a bit of discussion, not results. There are other similar parts of the results section that also read like a discussion (e.g. 526-533)

      Although we decided to keep this paragraph in the Results section, as it justifies the rationale behind the choice of ArhGEF11/PDZ-RhoGEF, we took the reviewer's comment into account and, as mentioned in the response to Reviewer 1's 6th comment, lightened the Results section by transferring some of the paragraphs to the Introduction or Discussion sections.

      (6) The description of Fig 7A (from line 505) is missing the stages at which the experiments were performed (also not labelled on the figure).

      The stages at which the experiments were performed are stated in the figure legend (line 1366) as well as in the Methods section of the original manuscript (line 1033). We added the information on top of panels A and B for more clarity.

      (7) Some figures have multiple panels (e.g. Fig 7Aa'), so when referred to in the text, it remains unclear which panel is being referred to.

      We modified the text so as to refer more clearly to the different panels when mentioned in the text, particularly with regards to Figure 7 and 8 but also for all the other figures.

    2. eLife assessment

      This important study presents a detailed characterization of two distinct cellular morphologies of haematopoietic stem cells undergoing endothelial to haematopoietic transition in zebrafish. It brings new information on how regulation of apico-basal polarity influences cellular behaviour, shape, and interaction with neighbouring cells. The evidence supporting the existence of these two distinct morphologies is convincing, using state-of-the-art confocal microscopy and image analysis of 2D-cartography.

    3. Reviewer #2 (Public Review):

      In this study, Torcq and colleagues make careful observations of the cellular morphology of haemogenic endothelium undergoing endothelial to haematopoietic transition (EHT) to become stem cells, using the zebrafish model. To achieve this, they used an extensive array of transgenic lines driving fluorescent markers, markers of apico-basal polarity (podocalyxin-FP fusions) or tight junction markers (jamb-FP fusions). The use of the runx truncation to block native Runx1 only in endothelial cells is an elegant tool to achieve something akin to tissue-specific deletion of Runx1. Overall, the imaging data is of excellent quality. They demonstrate that differences in apico-basal polarity are strongly associated with different cellular morphologies of cells undergoing EHT from HE (EHT pol- and EHT pol+), which raises the exciting possibility that these morphological differences reflect heterogeneity of HE (and potentially HSCs, but this is not addressed in this manuscript) at a very early stage. They then overexpress a truncated form of Runx1 (just the runt domain) to block Runx1 function and show that more HE cells abort EHT and remain associated with the embryonic dorsal aorta. The revised version identifies pard3ab as differentially distributed in dtRunx mutants and correlates that distribution with a potential regulatory role on cell polarity. No direct evidence for their role in EHT is presented.

      The manuscript has now been streamlined and reference to figures made much clearer. It provides for a clearer reading, and clearly a well thought out discussion of HE, polarity and the regulation of the EHT process. The evidence for the different cellular morphologies of cells undergoing EHT is strong, and the main claim that tuning apico-basal polarity and junctional recycling underlie morphological complexity of EHT (rather than of HSCs) is well supported by the data.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study presents valuable data on the antigenic properties of neuraminidase proteins of human A/H3N2 influenza viruses sampled between 2009 and 2017. The antigenic properties are found to be generally concordant with genetic groups. Additional analyses have strengthened the revised manuscript, and the evidence supporting the claims is solid.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      The authors investigated the antigenic diversity of recent (2009-2017) A/H3N2 influenza neuraminidases (NAs), the second major antigenic protein after haemagglutinin. They used 27 viruses and 43 ferret sera and performed NA inhibition. This work was supported by a subset of mouse sera. Clustering analysis determined 4 antigenic clusters, mostly in concordance with the genetic groupings. Association analysis was used to estimate important amino acid positions, which were shown to be more likely close to the catalytic site. Antigenic distances were calculated and a random forest model used to determine potential important sites.

      This revision has addressed many of my concerns of inconsistencies in the methods, results and presentation. There are still some remaining weaknesses in the computational work.

      Strengths

      (1) The data cover recent NA evolution and a substantial number (43) of ferret (and mouse) sera were generated and titrated against 27 viruses. This is laborious experimental work and is the largest publicly available neuraminidase inhibition dataset that I am aware of. As such, it will prove a useful resource for the influenza community.

      (2) A variety of computational methods were used to analyse the data, which give a rounded picture of the antigenic and genetic relationships and link between sequence, structure and phenotype.

      (3) Issues raised in the previous review have been thoroughly addressed.

      Weaknesses

      (1) Some inconsistencies and missing data in experimental methods

      Two ferret sera were boosted with H1N2, while the others were boosted with recombinant NA protein. This, and the underlying reason, are clearly explained in the manuscript. The authors note that boosting with live virus did not increase titres. Additionally, one homologous serum (A/Kansas/14/2017) was not generated, although this would not necessarily have impacted the results.

      We agree with the reviewer and this point was addressed in the previous rebuttal.

      (2) Inconsistency in experimental results

      Clustering of the NA inhibition results identifies three viruses which do not cluster with their phylogenetic group. Again this is clearly pointed out in the paper and is consistent with the two replicate ferret sera. Additionally, A/Kansas/14/2017 is in a different cluster based on the antigenic cartography vs the clustering of the titres

      We agree with the reviewer and this point was addressed in the previous rebuttal.

      (3) Antigenic cartography plot would benefit from documentation of the parameters and supporting analyses

      a. The number of optimisations used

      We used 500 optimizations. This information is now included in the Methods section.

      b. The final stress and the difference between the stress of the lowest few (e.g. 5) optimisations, or alternatively a graph of the stress of all the optimisations. Information on the stress per titre and per point, and whether any of these were outliers

      The stress was obtained from 1, 5, 500, or even 5000 optimizations (resulting in stress values of 1366.47, 1366.47, 2908.60, and 3031.41, respectively). Besides limited variation or non-convergence of the stress values after optimization, the obtained maps were consistent across multiple runs. The final map was obtained by keeping the best optimization (stress value 1366.47, selected using the keepBestOptimization() function).
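      The idea of running several optimizations and keeping the lowest-stress map can be illustrated with a minimal sketch. It assumes a simplified stress (the sum of squared differences between target distances and map distances) and plain gradient descent from random starts; it is not the cartography software used by the authors, and the function names (`stress`, `optimise_once`, `best_of`) and the toy target distances are hypothetical.

```python
import math
import random

def stress(coords, targets):
    """Sum of squared differences between target and map distances."""
    return sum((d - math.dist(coords[i], coords[j])) ** 2
               for (i, j), d in targets.items())

def optimise_once(n_points, targets, rng, steps=500, lr=0.01):
    """One optimization: random 2D start, then plain gradient descent."""
    coords = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n_points)]
    for _ in range(steps):
        grads = [[0.0, 0.0] for _ in range(n_points)]
        for (i, j), d in targets.items():
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            g = 2.0 * (dist - d) / dist        # gradient factor for this pair
            grads[i][0] += g * dx
            grads[i][1] += g * dy
            grads[j][0] -= g * dx
            grads[j][1] -= g * dy
        for point, grad in zip(coords, grads):
            point[0] -= lr * grad[0]
            point[1] -= lr * grad[1]
    return coords

def best_of(n_runs, n_points, targets, seed=0):
    """Run several optimizations; return (best map, stress of every run)."""
    rng = random.Random(seed)
    maps = [optimise_once(n_points, targets, rng) for _ in range(n_runs)]
    stresses = [stress(m, targets) for m in maps]
    best_index = min(range(n_runs), key=stresses.__getitem__)
    return maps[best_index], stresses

# Toy target distances: four points on a unit square (hypothetical data).
targets = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (0, 3): 1.0,
           (0, 2): math.sqrt(2), (1, 3): math.sqrt(2)}
best_map, all_stresses = best_of(5, 4, targets)
print("stress per run:", [round(s, 4) for s in all_stresses])
print("best stress:", round(min(all_stresses), 4))
```

      Listing the stress of every run, as the reviewer requested, shows directly whether the lowest few optimizations agree. Real antigenic maps are fitted to titre-derived distances with additional handling of thresholded titres, which this sketch leaves out.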

      Author response image 1.

      The stress per point is presented in the heat map below.

      The heat map indicates stress per serum (x-axis) and strain (y-axis) in blue to red scale.

      c. A measure of uncertainty in position (e.g. from bootstrapping)

      Bootstrap was performed using 1000 repeats and 100 optimizations per repeat. The uncertainty is represented in the blob plot below.

      Author response image 2.

      (4) Random forest

      The full dataset was used for the random forest model, including tuning the hyperparameters. It is more robust to have a training and test set to be able to evaluate overfitting (there are 25 features to classify 43 sera).

      Explicit cross-validation is not necessary for random forests, as the out-of-bag process with multiple trees implicitly covers cross-validation. In the random forest function in R this is done by setting the mtry argument (number of variables randomly sampled as candidates at each split). R samples variables with replacement (the same variable can be sampled multiple times) of the candidates from the training set. RF will then automatically take the data that is not selected as candidates as test set. Overfitting may happen when all data are used for training, but the RF method implicitly uses a test set and does not use all data for training.

      Code:

      rf <- randomForest(X,y=Y,ntree=1500,mtry=25,keep.forest=TRUE,importance=TRUE)
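      To make the out-of-bag (OOB) idea concrete, here is a minimal standard-library sketch of bagging with OOB scoring. It assumes the standard bagging scheme, in which each tree is fit to a bootstrap sample of the rows and the rows left out of that sample serve as its held-out set; the number of variables tried at each split is a separate setting. One-split regression stumps stand in for full trees, and the names (`fit_stump`, `oob_mse`) and toy data are hypothetical. It does not reproduce the authors' R analysis.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression stump on a single feature (least squares)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # all x values identical: predict the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm

def oob_mse(xs, ys, n_trees=200, seed=0):
    """Out-of-bag mean squared error of a bagged ensemble of stumps."""
    rng = random.Random(seed)
    n = len(xs)
    sums = [0.0] * n
    counts = [0] * n
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]  # bootstrap rows with replacement
        model = fit_stump([xs[i] for i in in_bag], [ys[i] for i in in_bag])
        for i in set(range(n)) - set(in_bag):  # rows this tree never saw
            sums[i] += model(xs[i])
            counts[i] += 1
    scored = [i for i in range(n) if counts[i] > 0]
    return sum((ys[i] - sums[i] / counts[i]) ** 2 for i in scored) / len(scored)

# Toy data: 43 rows, one feature, a step-shaped response (hypothetical).
xs = [i / 42 for i in range(43)]
ys = [1.0 if x > 0.5 else 0.0 for x in xs]
print("OOB MSE:", round(oob_mse(xs, ys), 4))
```

      Each row is scored only by the trees whose bootstrap sample did not contain it, which is why the OOB error behaves like a held-out estimate. It does not, however, account for correlation between rows (for example, closely related strains), so a separate grouped hold-out can still be informative.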

      Reviewer #2 (Public Review):

      Summary:

      The authors characterized the antigenicity of the N2 protein of 43 selected A(H3N2) influenza A viruses isolated from 2009-2017 using ferret and mouse immune sera. Four antigenic groups were identified, which the authors claimed to be correlated with their respective phylogenetic/genetic groups. Among the 102 amino acid positions that differ across the 44 selected N2 proteins, the authors identified residues that differentiate the antigenicity of the four groups and constructed a machine-learning model that provides antigenic distance estimation. Three recent A(H3N2) vaccine strains were tested in the model but there was no experimental data to confirm the model prediction results.

      Strengths:

      This study used N2 protein of 44 selected A(H3N2) influenza A viruses isolated from 2009-2017 and generated corresponding panels of ferret and mouse sera to react with the selected strains. The amount of experimental data for N2 antigenicity characterization is large enough for model building.

      Weaknesses:

      The main weakness is that the strategy of selecting 43 A(H3N2) viruses from 2009-2017 was not explained. It is not clear if they represent the overall genetic diversity of human A(H3N2) viruses circulating during this time. In response to the reviewer's comment, the authors have provided an N2 phylogenetic tree using 180 randomly selected N2 sequences from human A(H3N2) viruses from 2009-2017. While the 43 strains seem to scatter across the N2 tree, the four antigenic groups described by the authors did not correlate with their respective phylogenetic/genetic groups as shown in Fig. 2. The authors should show the N2 phylogenetic tree together with Fig. 2 and discuss the discrepancy observed.

      The discrepancies between the provided N2 phylogenetic tree using 180 selected N2 sequences and the tree in Figure 2 were primarily due to visualization. In the tree presented in Figure 2, the phylogeny was ordered by decreasing branch length. Further, the tree presented in the rebuttal was built with PhyML 3.0 using the JTT substitution model, while the tree in Figure 2 was built in CLC Workbench 21.0.5 using the Bishop-Friday substitution model. The tree below was built using the same methodology as Figure 2, including branch size ordering. No discrepancies are observed.

      Phylogenetic tree representing relatedness of N2 head domain. N2 NA sequences were ordered according to the branch length and phylogenetic clusters are colored as follows: G1: orange, G2: green, G3: blue, and G4: purple. NA sequences that were retained in the breadth panel are named according to the corresponding H3N2 influenza viruses. The other NA sequences are coded.

      Author response image 3.

      The second weakness is the use of double-immune ferret sera (post-infection plus immunization with recombinant NA protein) or mouse sera (immunized twice with recombinant NA protein) to characterize the antigenicity of the selected A(H3N2) viruses. Conventionally, NA antigenicity is characterized using ferret sera after a single infection. Repeated influenza exposure in ferrets has been shown to enhance antibody binding affinity and may affect the cross-reactivity to heterologous strains (PMID: 29672713). The increased cross-reactivity is supported by the NAI titers shown in Table S3, as many of the double immune ferret sera showed the highest reactivity not against its own homologous virus but to heterologous strains. In response to the reviewer's comment, the authors agreed the use of double-immune ferret sera may be a limitation of the study. It would be helpful if the authors can discuss the potential effect on the use of double-immune ferret sera in antigenicity characterization in the manuscript.

      Our study was designed to understand the breadth of the anti-NA response after the incorporation of NA as a vaccine antigen. Our data do not allow us to conclude whether increased breadth of protection is merely due to increased antibody titers or whether an NA boost immunization was able to induce antibody responses against epitopes that were not previously recognized by the primary response to infection. However, we now mention this possibility in the discussion and cite Kosikova et al. CID 2018 in this context.

      Another weakness is that the authors used the newly constructed model to predict the antigenic distance of three recent A(H3N2) viruses but there is no experimental data to validate their prediction (e.g. whether these viruses are indeed antigenically deviating from group 2 strains as concluded by the authors). In response to the comment, the authors have taken two strains out of the dataset and used them for validation. The results are shown in Fig. R7. However, it may be useful to include this in the main manuscript to support the validity of the model.

      The removal of 2 strains was performed to illustrate the predictive performance of the RF modeling. However, Random Forest does not require cross-validation. The reason is that RF modeling already uses an out-of-bag evaluation which, in short, consists of using only a fraction of the data for the creation of the decision trees (2/3 of the data), obviating the need to set aside a test set:

      “…In each bootstrap training set, about one-third of the instances are left out. Therefore, the out-of-bag estimates are based on combining only about one-third as many classifiers as in the ongoing main combination. Since the error rate decreases as the number of combinations increases, the out-of-bag estimates will tend to overestimate the current error rate. To get unbiased out-of-bag estimates, it is necessary to run past the point where the test set error converges. But unlike cross-validation, where bias is present but its extent unknown, the out-of-bag estimates are unbiased…” from https://www.stat.berkeley.edu/%7Ebreiman/randomforest2001.pdf
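      The "about one-third" figure in the quotation follows from bootstrap sampling: when n rows are drawn with replacement from n rows, a given row is missed by every draw with probability (1 - 1/n)^n, which approaches 1/e (about 0.368) as n grows. The short sketch below evaluates this, assuming n = 43 rows (the number of sera mentioned by Reviewer #1); the function name is hypothetical.

```python
import math

def expected_oob_fraction(n):
    """Probability that a given row is absent from a bootstrap sample of size n."""
    return (1.0 - 1.0 / n) ** n

print("n = 43:", round(expected_oob_fraction(43), 3))
print("limit 1/e:", round(math.exp(-1), 3))
```

      With a dataset of this size, each tree is therefore expected to be evaluated on roughly 15 or 16 left-out rows.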

      Reviewer #3 (Public Review):

      Summary:

      This paper by Portela Catani et al examines the antigenic relationships (measured using monotypic ferret and mouse sera) across a panel of N2 genes from the past 14 years, along with the underlying sequence differences and phylogenetic relationships. This is a highly significant topic given the recent increased appreciation of the importance of NA as a vaccine target, and the relative lack of information about NA antigenic evolution compared with what is known about HA. Thus, these data will be of interest to those studying the antigenic evolution of influenza viruses. The methods used are generally quite sound, though there are a few addressable concerns that limit the confidence with which conclusions can be drawn from the data/analyses.

      Strengths:

      • The significance of the work, and the (general) soundness of the methods.

      • Explicit comparison of results obtained with mouse and ferret sera

      Weaknesses:

      • Approach for assessing influence of individual polymorphisms on antigenicity does not account for potential effects of epistasis (this point is acknowledged by the authors).

      We agree with the reviewer and this point was addressed in the previous rebuttal.

      • Machine learning analyses neither experimentally validated nor shown to be better than simple, phylogenetic-based inference.

      We respectfully disagree with the reviewer. This point was addressed in the previous rebuttal as follows.

      This is a valid remark and indeed we have found a clear correlation between NAI cross-reactivity and phylogenetic relatedness. However, besides achieving good prediction of the experimental data (as shown in Figure 5 and in Figure R7), machine learning analysis has the potential to rank or indicate major antigenic divergences based on available sequences before they have consolidated as a new clade. ML can also support the selection and design of more broadly reactive antigens.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      (1) Discuss the discrepancy between Fig. 2 and the newly constructed N2 phylogenetic tree with 180 randomly selected N2 sequences of A(H3N2) viruses from 2009-2017. Specifically, please explain why the antigenic vs. phylogenetic relationship observed in Fig. 2 was not observed in the large N2 phylogenetic tree.

      Discrepancies were due to different method and visualization. A new tree was provided.

      (2) Include a sentence to discuss the potential effect on the use of double-immune ferret sera in antigenic characterization.

      We prefer not to speculate on this.

      (3) Include the results of the exercise run (with the use of Swe17 and HK17) in the manuscript as a way to validate the model.

      The exercise was performed to illustrate predictive potential of the RF modeling to the reviewer. However, cross-validation is not a usual requirement for random forest, since it uses out-of-bag calculations. We prefer to not include the exercise runs within the main manuscript.

    2. eLife assessment

      This study presents valuable data on the antigenic properties of neuraminidase proteins of human A/H3N2 influenza viruses sampled between 2009 and 2017. The antigenic properties are found to be generally concordant with genetic groups. Compared to a previous version, additional analyses have strengthened the work, with solid evidence supporting the claims of the authors.

    3. Reviewer #1 (Public Review):

      Summary

      The authors investigated the antigenic diversity of recent (2009-2017) A/H3N2 influenza neuraminidases (NAs), the second major antigenic protein after haemagglutinin. They used 27 viruses and 43 ferret sera and performed NA inhibition. This work was supported by a subset of mouse sera. Clustering analysis determined 4 antigenic clusters, mostly in concordance with the genetic groupings. Association analysis was used to estimate important amino acid positions, which were shown to be more likely close to the catalytic site. Antigenic distances were calculated and a random forest model used to determine potential important sites.

      This revision has addressed many of my concerns of inconsistencies in the methods, results and presentation. There are still some remaining weaknesses in the computational work.

      Strengths

      (1) The data cover recent NA evolution and a substantial number (43) of ferret (and mouse) sera were generated and titrated against 27 viruses. This is laborious experimental work and is the largest publicly available neuraminidase inhibition dataset that I am aware of. As such, it will prove a useful resource for the influenza community.

      (2) A variety of computational methods were used to analyse the data, which give a rounded picture of the antigenic and genetic relationships and link between sequence, structure and phenotype.

      (3) Issues raised in the previous review have been thoroughly addressed.

      Weaknesses:

      Some concerns regarding the robustness of the machine learning model and potential overfitting remain.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors characterized the antigenicity of the N2 protein of 43 selected A(H3N2) influenza A viruses isolated from 2009-2017 using ferret and mouse immune sera. Four antigenic groups were identified, which the authors claimed to be correlated with their respective phylogenetic/genetic groups. Among the 102 amino acid positions that differ across the 44 selected N2 proteins, the authors identified residues that differentiate the antigenicity of the four groups and constructed a machine-learning model that provides antigenic distance estimation. Three recent A(H3N2) vaccine strains were tested in the model but there was no experimental data to confirm the model prediction results.

      Strengths:

      This study used N2 protein of 44 selected A(H3N2) influenza A viruses isolated from 2009-2017 and generated corresponding panels of ferret and mouse sera to react with the selected strains. The amount of experimental data for N2 antigenicity characterization is large enough for model building.

      Weaknesses:

      One weakness is the use of double-immune ferret sera (post-infection plus immunization with recombinant NA protein) or mouse sera (immunized twice with recombinant NA protein) to characterize the antigenicity of the selected A(H3N2) viruses. Conventionally, NA antigenicity is characterized using ferret sera after a single infection. Repeated influenza exposure in ferrets has been shown to enhance antibody binding affinity and may affect the cross-reactivity to heterologous strains (PMID: 29672713). The increased cross-reactivity is supported by the NAI titers shown in Table S3, as many of the double immune ferret sera showed the highest reactivity not against its own homologous virus but to heterologous strains. In response to the reviewer's comment, the authors agreed the use of double-immune ferret sera may be a limitation of the study.

      Another weakness is that the authors used the newly constructed model to predict the antigenic distance of three recent A(H3N2) viruses but there is no experimental data to validate their prediction (e.g. whether these viruses are indeed antigenically deviating from group 2 strains as concluded by the authors). Leaving out data from some strains for testing is a useful check, but due to phylogenetic correlations in the data the generalizability of the machine learning model is not guaranteed.

    5. Reviewer #3 (Public Review):

      Summary:

      This paper by Portela Catani et al examines the antigenic relationships (measured using monotypic ferret and mouse sera) across a panel of N2 genes from the past 14 years, along with the underlying sequence differences and phylogenetic relationships. This is a highly significant topic given the recent increased appreciation of the importance of NA as a vaccine target, and the relative lack of information about NA antigenic evolution compared with what is known about HA. Thus, these data will be of interest to those studying the antigenic evolution of influenza viruses. The methods used are generally quite sound, though there are a few addressable concerns that limit the confidence with which conclusions can be drawn from the data/analyses.

      Strengths:

      - The significance of the work, and the (general) soundness of the methods.

      - Explicit comparison of results obtained with mouse and ferret sera

      Weaknesses:

      - Machine learning analyses neither experimentally validated nor shown to be better than simple, phylogenetic-based inference.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript titled "Disease modeling and pharmacological rescue of autosomal dominant Retinitis Pigmentosa associated with RHO copy number variation" the authors describe the use of patient iPSC-derived retinal organoids to evaluate the pathobiology of a RHO-CNV in a family with dominant retinitis pigmentosa (RP). They find significantly increased expression of rhodopsin, especially within the photoreceptor cell body, and defects in photoreceptor cell outer segment formation/maturation. In addition, they demonstrate how an inhibitor of NR2E3 (a rod transcription factor required for inducing rhodopsin expression), can be used to rescue the disease phenotype.

      Strengths:

      The manuscript is very well written, the illustrations and data presented are compelling, and the authors' interpretation/discussion of their findings is logical.

      Weaknesses:

      A weakness, which the authors have addressed in the discussion section, is the lack of an isogenic control, which would allow for direct analysis of the RHO-CNV in the absence of the other genetic sequence contained within the duplicated region. As the authors suggest, CRISPR correction of a large CNV in the absence of inducing unwanted on-target editing events in patient iPSCs is often very challenging. Given that they have used a no-disease iPSC line obtained from a family member, controlled for organoid differentiation kinetics/maturation state, and that no other complete disease-causing gene is contained within the duplicated region, it is unlikely that the addition of an isogenic control would yield significantly different results.

      Aims and conclusions:

      This reviewer is of the opinion that the authors have achieved their aims and that their results support their conclusions.

      Discussion:

      The authors have provided adequate discussion on the utility of the methods and data as well as the impact of their work on the field.

      We thank the reviewer for their insightful and encouraging review of our work, which has taken several years to reach its current stage.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kandoi et al. describes a new 3D retinal organoid model of a mono-allelic copy number variant of the rhodopsin gene that was previously shown to induce autosomal dominant retinitis pigmentosa via a dominant negative mechanism in patients. With advancements in the low-cost genomics application to detect copy number variations, this is a timely article that highlights a potential disease mechanism that goes beyond the retina field. The evidence is relatively strong that the rod photoreceptor phenotype observed in an adult patient with RP in vivo is similar to that phenotype observed in human stem cell-derived retinal organoids. Increases in RHO expression detected by qPCR, RNA-seq, and IHC support this phenotype. Importantly, the amelioration of photoreceptor rhodopsin mislocalization and related defects using the small molecule drug photoregulin demonstrates an important potential clinical application.

      Overall, the authors succeeded in providing solid evidence that copy number variation via a genomic RHO duplication leads to abnormalities in rod photoreceptors that can be partially blocked by photoregulin. However, there are several points that should be addressed that will enhance this paper.

      Strengths:

      • The use of patient-derived organoids from patients that have visual defects is a major strength of this work and adds relevance to the disease phenotype.

      • The rod phenotype assessed by qPCR, RNA-seq, and IHC supports a phenotype that shares similarities with the patient.

      • The use of a small molecule drug that selectively targets rod photoreceptors, as opposed to cones, is a noteworthy strength.

      We thank the reviewers for highlighting the key strengths of the paper.

      Weaknesses:

      (1) The chromosomal segment that was duplicated had 3 copies of RHO in addition to three copies of each of the flanking genes (IFT122, HIF100, PLXND1). Discussion of the involvement of these genes would be helpful. Would duplication of any of these genes alone cause or contribute to adRP? As an example, a missense mutation in IFT122 was previously implicated in photoreceptor loss (PMID: 33606121 PMCID: PMC8519925).

      Thank you for your comment. The contribution of the other duplicated genes is an interesting question. Of these, IFT122 is particularly interesting, as pointed out. We did a thorough survey of the literature and of the database of our genetic testing partner, BluePrint Genetics. We did not find any human retinal degeneration cases with variants in IFT122. IFT122 has been shown to cause a recessive phenotype in dogs and in a complete knockout zebrafish model, but dominant inheritance or overexpression has not been shown to produce a phenotype. Interestingly, recessive biallelic IFT122 mutations can cause Cranioectodermal Dysplasia (Sensenbrenner syndrome, PMID: 24689072) and none of these patients exhibited retinal dystrophy. HIF100 is an epigenetic modifier gene while PLXND1 is expressed in endothelial cells. We will include a discussion on this in the revised manuscript.

      (2) Related to #1, have the authors considered inserting extra copies of RHO (and/or the flanking genes) of these at a genomic safe harbor site? Although not required, this would allow one to study cells with isogenic-matched genetic backgrounds and would partially address the technical challenge of repairing a 188kb duplication, which as the authors note would be difficult to do. Demonstrating that excess copy numbers in different genetic backgrounds would be a huge contribution to the field. At a minimum, a discussion of the role of the nearby genes should be included. 


      Thank you for your suggestion. In our future mechanistic studies, we plan to test the relative role of 1-3 extra copies of RHO driven by an NRL promoter, so that expression is restricted to rods. We will include a discussion on the potential role of the other genes in the revised manuscript.

      (3) In the patient, the central foveal region was spared suggesting that cones were normal. Was there a similar assessment that cones are unaffected in retinal organoids? 


      We will include this data in our revised manuscript but overall did not see a cone defect in RHO CNV organoids. Additionally, although it is true that the central foveal region was relatively spared in this patient, the cones are definitely not normal. The macular cones that remain have been damaged by chronic edema, and photoreceptor and RPE atrophy has progressed into the macula, sparing only the foveal cones.

      (4) Pathway analysis indicated that glycosylation was perturbed and this was proposed as an explanation as to why rhodopsin was mislocalized. Have the authors verified that there is an actual decrease in glycosylation? 


      These studies are ongoing. We are currently looking into the details of cellular pathophysiology in RHO-CNV, focusing on RHO trafficking, including the role of defects in glycosylation and other post-translational modifications.

      (5) Line 182: by what criteria are the authors able to state that " there were no clear visible anatomical changes in apical-basal retinal cell type distribution during the early differentiation timeframe (data not shown)." Was this based on histological staining with antibodies, nuclear counter-staining, or some other evaluation?


      This was based on both IHC for various cell type markers and nuclear (DAPI) staining.

      (6) Figure 2C - the appearance of the inner segments in RC and RM looks very different from one another. Have the authors ruled out the possibility that the RC organoid cell isn't a cone? In addition, the RM structure has what appears to be a well-defined OLM which would suggest well-formed Muller glia. Do these structures also exist in RC organoids? Typically the OLM does form in older organoids. In addition, was this representative in numerous EM preparations?


      For clarification on EM data, we will include additional images in the revision as supplementary data. We have not carefully compared OLM between the patient and control organoids but do observe them in both conditions in the older organoids. The EM preparations were made from multiple organoids from two different batches with consistent results.

      (7) What criteria were used to assess cell loss? Has any TUNEL labeling been performed to confirm cell loss? From the existing data, it seems that rod outer segments appear to be affected in organoids. However, it's not clear if the photoreceptors themselves actually die in this model.

      TUNEL was used to assess cell loss and it was not significantly different between the control and patient organoids at the timepoints examined. We did not expect a change as the disease in the patient developed over decades.

      (8) Figure 5B. The RHO staining in the vehicle-treated sample is perturbed relative to the PR3 treatments as indicated in the text. In the vehicle-treated sample, the number of DAPI-positive cells that are completely negative proximal to the inner segments suggests that there might be non-rod cells there. Have the authors confirmed whether these are cones? Labels would be helpful in the left vehicle panel as the morphology looks very different than the treated samples.


      Thank you very much for the various suggestions and these will be included in the revised manuscript version. A number of the cells in the negative regions are OTX2+/NRL- and likely to be cones (Figure 4 A and B). Unfortunately, we do not have a very good cone nuclear marker as RXRγ does not consistently stain mature cones.

      (9) It is interesting that in addition to increases in RHO, and photo-transduction, there are also increases in PTPRT which is related to synaptic adhesion. Is there evidence of ectopic neurites that result from PTPRT over-expression?

      You are absolutely correct that the PTPRT data are very interesting. PTPRT requires PTMs similar to those of RHO in photoreceptors for its synaptic localization. We did not specifically look at ectopic neurites and will test that in the revision. It will be interesting to follow up on its expression pattern to see whether it gets processed and localized normally, if we can find a working antibody. It is also possible that the gene-expression increase is due to feedback upregulation secondary to improper protein processing.

      Reviewer #3 (Public Review):

      This manuscript reports a novel pedigree with four intact copies of RHO on a single chromosome which appears to lead to overexpression of rhodopsin and a corresponding autosomal dominant form of RP. The authors generate retinal organoids from patient- and control-derived cells, characterize the phenotypes of the organoids, and then attempt to 'treat' aberrant rhodopsin expression/mislocalization in the patient organoids using a small molecule called photoregulin 3 (PR3). While this novel genetic mechanism for adRP is interesting, the organoid work is not compelling. There are multiple problems related to the technical approaches, the presentation of the results, and the interpretations of the data. I will present my concerns roughly in the order in which they appear in the manuscript.

      Major concerns:

      (1) Individual human retinal organoids in culture can show a wide range of differentiation phenotypes with respect to the expression of specific markers, percentages of given cell types, etc. For this reason, it can be very difficult to make rigorous, quantitative comparisons between 'wild-type' and 'mutant' organoids. Despite this difficulty, the author of the present manuscript frequently presents results in an impressionistic manner without quantitation. Furthermore, there is no indication that the investigator who performed the phenotypic analyses was blind with respect to the genotype. In my opinion, such blinding is essential for the analysis of phenotypes in retinal organoids. To give an example, in lines 193-194 the authors write "we observed that while the patient organoids developing connecting cilium and the inner segments similar to control organoids, they failed to extend outer segments". Outer segments almost never form normally in human retinal organoids, even when derived from 'wild-type' cells. Thus, I consider it wholly inadequate to simply state that outer segment formation 'failed' without a rigorous, quantitative, and blinded comparison of patient and control organoids.

      We agree it is challenging to generate outer segments in retinal organoids, but we are not the first to show this. It has been demonstrated by multiple independent labs (Mayerl et al., PMID: 36206764; Wahlin et al., PMID: 28396597; West et al., PMID: 35334217), including ours (Chirco et al., PMID: 34653402). To clarify, we did not observe any OS-like tissue in the patient organoids across multiple EM preps of a number of organoids from two independent 300+ day experiments, which matched the phase microscopy data presented in Fig. 2B.

      (2) The presentation of qPCR results in Figure 3A is very confusing. First, the authors normalize expression to that of CRX, but they don't really explain why. In lines 210-211, they write "CRX, a ubiquitously expressing photoreceptor gene maintained from development to adulthood." Several parts of this sentence are misleading or incomplete. First, CRX is not 'ubiquitously expressed' (which usually means 'in all cell types') nor is it photoreceptor-specific: CRX is expressed in rods, cones, and bipolar cells. Furthermore, CRX expression levels are not constant in photoreceptors throughout development/adulthood. So, for these reasons alone, CRX is a poor choice for the normalization of photoreceptor gene expression.

      As you are aware, all housekeeping genes have shortcomings when used for normalizing PCR data. We went with CRX because, within the timepoints chosen, it is not expected to change much and thus represents a good equalizer for relative photoreceptor numbers between the organoids and conditions. While we agree that CRX is weakly expressed in bipolar cells (Yamamoto et al 2020), it is not expected to bias the data too much, as neither we nor others have reported a large relative difference in bipolar cell number in organoids. We also confirm this by showing equivalent expression of OTX2, RCVRN and NRL between all conditions.

      Second, the authors' interpretation of the qPCR results (lines 216-218) is very confusing. The authors appear to be saying that there is a statistically significant increase in RHO levels between D120 and D300. However, the same change is observed in both control and patient organoids and is not unexpected, since the organoids are more mature at D300. The key comparison is between control and patient organoids at D300. At this time point, there appears to be no difference between control and patient. The authors don't even point this out in the main text.

      Thank you for the comment and we apologize if this confused you. However, as can be seen in the graph in Figure 3A, we do compare expression of genes including RHO between control and patient organoids at two different time points. There are four conditions: D120-RC, D120-RM, D300-RC and D300-RM, with individual data points and error bars for each condition. There is a statistically significant increase at both time points upon comparing the control and patient organoids for RHO. We compared RHO expression between patient organoids at the two time points and it was not statistically different.

      Third, the variability in the number of photoreceptor cells in individual organoids makes a whole-organoid comparison by qPCR fraught with difficulty. It seems to me that what is needed here is a comparison of RHO transcript levels in isolated rod photoreceptors.

      We agree that this makes it challenging. This was the exact reasoning for using CRX for normalization since it is predominantly present in photoreceptors. This was validated by the data showing no difference in expression of photoreceptor markers OTX2, RCVRN or NRL between the organoids.
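
      To make the normalization under discussion concrete, the following is a minimal sketch of reference-gene normalization, assuming the standard 2^-ΔΔCt calculation. Neither the review nor the response states which calculation was actually used, and the function name and all Ct values below are hypothetical, chosen only for illustration. The second call shows the reviewer's concern: if the reference gene (here CRX) shifts by one cycle between conditions, the apparent RHO fold change doubles even though the RHO Ct is unchanged.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus control by the 2^-ddCt method.

    Each Ct is a qPCR cycle threshold; a lower Ct means more transcript.
    The target is first normalized to the reference gene within each
    sample (dCt), then the control dCt is subtracted (ddCt).
    """
    dct_sample = ct_target - ct_ref
    dct_control = ct_target_ctrl - ct_ref_ctrl
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values (not from the manuscript).
# Patient: RHO Ct 20, CRX Ct 24. Control: RHO Ct 23, CRX Ct 24.
print(relative_expression(20, 24, 23, 24))  # 8.0, i.e. log2FC = 3

# Same RHO Ct values, but CRX is one cycle later in the patient sample.
print(relative_expression(20, 25, 23, 24))  # 16.0, i.e. log2FC = 4
```

      Under these assumed numbers, a one-cycle difference in the reference gene is indistinguishable from a twofold change in the target, which is why the stability of the reference across conditions matters for the interpretation.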

      (3) I cannot understand what the authors are comparing in the bulk RNA-seq analysis presented in the paragraph starting with line 222 and in the paragraph starting with line 306. They write "we performed bulk-RNA sequencing on 300-days-old retinal organoids (n=3 independent biological replicates). Patient retinal organoids demonstrated upregulated transcriptomic levels of RHO... comparable to the qRT-PCR data." From the wording, it suggests that they are comparing bulk RNA-seq of patients and control organoids at D300. However, this is not stated anywhere in the main text, the figure legend, or the Methods. Yet, the subsequent line "comparable to the qRT-PCR data" makes no sense, because the qPCR comparison was between patient samples at two different time points, D120 and D300, not between patient and control. Thus, the reader is left with no clear idea of what is even being compared by RNA-seq analysis.

      We apologize if the conditions were not obvious and will clarify this in the revised version. The conditions compared are control and patient organoids at D300. Regarding comparison to RT-PCR, as stated above, the comparison shown is between patient and control organoids at two different timepoints.

      Remarkably, the exact same lack of clarity as to what is being compared is found in the second RNA-seq analysis presented in the paragraph starting with line 306. Here the authors write "We further carried out bulk RNA-sequencing analysis to comprehensively characterize three different groups of organoids, 0.25 μM PR3-treated and vehicle-treated patient organoids and control (RC) organoids from three independent differentiation experiments. Consistent with the qRT-PCR gene expression analysis, the results showed a significant downregulation in RHO and other rod phototransduction genes." Here, the authors make it clear that they have performed RNA-seq on three types of samples: PR3-treated patient organoids, vehicle-treated patient organoids, and control organoids (presumably not treated). Yet, in the next sentence, they state "the results showed a significant downregulation in RHO", but they don't state what two of the three conditions are being compared! Although I can assume that the comparison presented in Fig. 6A is between patient vehicle-treated and PR3-treated organoids, this is nowhere explicitly stated in the manuscript.

      Thank you for the comment and we will explicitly state various comparisons in the revised version.

      (4) There are multiple flaws in the analysis and interpretation of the PR3 treatment results. The authors wrote (lines 289-294) "We treated long-term cultured 300-days-old, RHO-CNV patient retinal organoids with varying concentrations of PR3 (0.1, 0.25 and 0.5 μM) for one week and assessed the effects on RHO mRNA expression and protein localization. Immunofluorescence staining of PR3-treated organoids displayed a partial rescue of RHO localization with optimal trafficking observed in the 0.25 μM PR3-treated organoids (Figure 5B). None of the organoids showed any evidence of toxicity post-treatment."

      There are multiple problems here. First, the results are impressionistic and not quantitative. Second, it's not clear that the investigator was blinded with respect to the treatment condition. Third, in the sections presented, the organoids look much more disorganized in the PR3-treated conditions than in the control. In particular, the ONL looks much more poorly formed. Overall, I'd say the organoids looked considerably worse in the 0.25 and 0.5 microM conditions than in the control, but I don't know whether or not the images are representative. Without rigorously quantitative and blinded analysis, it is impossible to draw solid conclusions here. Lastly, the authors state that "none of the organoids showed any evidence of toxicity post-treatment," but do not explain what criteria were used to determine that there was no toxicity.

      Thank you for your critical insight. The RHO localization data is qualitative, as it is very difficult to accurately quantify rhodopsin trafficking within the cell in the organoid. Thus, for quantitative comparison, we have provided expression level changes. Regarding toxicity, we analyzed the organoids by morphology and TUNEL and did not observe a significant difference between the conditions. This closely mimics mouse data on PR3, which suppressed rod function in mice following IP injection without any obvious toxicity.

      (5) qPCR-based quantitation of rod gene expression changes in response to PR3 treatment is not well-designed. In lines 294-297 the authors wrote "PR3 drove a significant downregulation of RHO in a dose-dependent manner. Following qRT-PCR analysis, we observed a 2-to-5 log2FC decrease in RHO expression, along with smaller decreases in other rod-specific genes including NR2E3, GNAT1 and PDE6B." I assume these analyses were performed on cDNA derived from whole organoids. There are two problems with this analysis/interpretation. First, a decrease in rod gene expression can be caused by a decrease in the number of rods in the treated organoids (e.g., by cell death) or by a decrease in the expression of rod genes within individual rods. The authors do not distinguish between these two possibilities. Second, as stated above, the percentage of cells that are rods in a given organoid can vary from organoid to organoid. So, to determine whether there is downregulation of rod gene expression, one should ideally perform the qPCR analysis on purified rods.

      The reviewer is correct in pointing out the potential reasons for the reduction in RHO levels following PR3 treatment. Thus, we have provided NRL expression levels in the graph to show that this key rod-specific gene does not change, suggesting an equivalent number of rod photoreceptor cells. The suggestion of using purified rods is not practical here, as we do not have any way to sort human rods due to the lack of a rod-specific cell surface marker.
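
      The arithmetic behind this exchange can be sketched as follows. This is an illustrative toy model only, not an analysis from the manuscript: it assumes that a whole-organoid signal is simply the rod count multiplied by per-rod expression, and the function names and all numbers are hypothetical. It shows what a "2-to-5 log2FC decrease" means on a linear scale, and why a bulk measurement alone cannot separate rod loss from per-rod downregulation.

```python
def log2fc_to_fold(log2fc):
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2fc


def bulk_signal(n_rods, per_rod_expression):
    """Toy whole-organoid signal: rod count times per-rod expression."""
    return n_rods * per_rod_expression


# A 2-to-5 log2FC decrease corresponds to a 4- to 32-fold reduction.
print(log2fc_to_fold(2), log2fc_to_fold(5))  # 4.0 32.0

# Hypothetical organoid: 1000 rods, 8 arbitrary units of RHO per rod.
baseline = bulk_signal(1000, 8.0)
fewer_rods = bulk_signal(250, 8.0)     # rod loss, per-rod level unchanged
less_per_rod = bulk_signal(1000, 2.0)  # rod number unchanged, lower level

# Both scenarios produce the same 4-fold drop in the bulk measurement.
print(baseline / fewer_rods, baseline / less_per_rod)  # 4.0 4.0
```

      In this toy model, an independent readout of rod number is what breaks the tie, provided that readout is not itself altered by the treatment.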

      (6) In Figure 4B 'RM' panels, the authors show RHO staining around the somata of 'rods' but the inset images suggest that several of these cells lack both NRL and OTX2 staining in their nuclei. All rods should be positive for NRL. Conversely, the same image shows a layer of cells scleral to the cells with putative RHO somal staining which do not show somal staining, and yet they do appear to be positive for NRL and OTX2. What is going on here? The authors need to provide interpretations for these findings.

      Since RHO is a cytoplasmic marker and photoreceptors are tightly packed, it is difficult to make a 1:1 comparison between the NRL/OTX2 nuclear markers and RHO. Additionally, as the RHO+ cytoplasm extends towards the scleral surface, it is expected to pass adjacent to other nuclei. A few of the rods do still have normal rhodopsin trafficking, and it is likely that these will not have somal RHO, similar to control conditions. We do rarely observe these cells, as highlighted by the occasional RHO in the IS/OS of RM organoids in the figure. We agree that the NRL staining in Figure 4B (>D250) is not extremely crisp, and we will include an updated figure in the revised version.

    2. eLife assessment

      This study presents an important finding that implicates a rhodopsin gene duplication in the progression of autosomal dominant retinitis pigmentosa in patients. The authors utilize a retinal organoid model to demonstrate a similar disease phenotype and suggest defects can be ameliorated by using photoregulin. The data supporting the conclusions are solid, but there are some concerns regarding the strength of the phenotype in retinal organoids. This work will be of broad interest to vision researchers.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kandoi et al. describes a new 3D retinal organoid model of a mono-allelic copy number variant of the rhodopsin gene that in a patient led to autosomal dominant retinitis pigmentosa. The evidence provided here is relatively strong that the rod photoreceptor phenotype observed in an adult patient with RP in vivo is similar to the phenotype observed in human stem cell-derived retinal organoids. Increases in RHO expression detected by qPCR, RNA-seq, and IHC support this phenotype. Importantly, the amelioration of photoreceptor rhodopsin mislocalization and related defects using the small molecule drug photoregulin demonstrates an important potential clinical application.

      Strengths:

      - Retinal organoids derived from patient with adRP.

      - RHO mislocalization could explain the phenotype in patients.

      Weaknesses:

      - Organoids at 300 days do not show PR loss.

      Additional minor weaknesses

      - Bulk RNAseq methods require greater detail, particularly with respect to how total or mRNA was purified, how was it quantified for concentration and integrity (i.e. Nanodrop, Tape station, Bioanalyzer), what reagents were used for library preparation and how many reads were analyzed per sample.

      - Fig. 4. The levels of RHO visualized in tissue sections (panels A-C) do not seem to match the general levels shown in the western blots (panel D), which appear to be far higher in RM western blot samples than in the IHC images. Please clarify why there is such a difference.

      - Line 186: by what criteria are the authors able to state that " there were no clear visible anatomical changes in apical-basal retinal cell type distribution (data not shown)". Was this based on histological staining with antibodies, nuclear counter-staining or some other evaluation?

    4. Reviewer #3 (Public Review):

      This manuscript reports a novel pedigree with four intact copies of RHO on a single chromosome which appears to lead to overexpression of rhodopsin and a corresponding autosomal dominant form of RP. The authors generate retinal organoids from patient- and control-derived cells, characterize the phenotypes of the organoids, and then attempt to 'treat' aberrant rhodopsin expression/mislocalization in the patient organoids using a small molecule called photoregulin 3 (PR3). While this novel genetic mechanism for adRP is interesting, the organoid work is not compelling. There are multiple problems related to the technical approaches, the presentation of the results, and the interpretations of the data. I will present my concerns roughly in the order in which they appear in the manuscript and will separate them into 'major' and 'minor' categories:

      Major concerns:

      (1) Individual human retinal organoids in culture can show a wide range of differentiation phenotypes with respect to the expression of specific markers, percentages of given cell types, etc. For this reason, it can be very difficult to make rigorous, quantitative comparisons between 'wild-type' and 'mutant' organoids. Despite this difficulty, the authors of the present manuscript frequently present results in an impressionistic manner without quantitation. Furthermore, there is no indication that the investigator who performed the phenotypic analyses was blind with respect to the genotype. In my opinion, such blinding is essential for the analysis of phenotypes in retinal organoids.

      To give an example, in lines 193-194 the authors write "we observed that while the patient organoids developing connecting cilium and the inner segments similar to control organoids, they failed to extend outer segments". Outer segments almost never form normally in human retinal organoids, even when derived from 'wild-type' cells. Thus, I consider it wholly inadequate to simply state that outer segment formation 'failed' without a rigorous, quantitative, and blinded comparison of patient and control organoids.

      (2) The presentation of qPCR results in Fig. 3A is very confusing. First, the authors normalize expression to that of CRX, but they don't really explain why. In lines 210-211 they write "CRX, a ubiquitously expressing photoreceptor gene maintained from development to adulthood." Several parts of this sentence are misleading or incomplete. First, CRX is not 'ubiquitously expressed' (which usually means 'in all cell types') nor is it photoreceptor-specific: CRX is expressed in rods, cones, and bipolar cells. Furthermore, CRX expression levels are not constant in photoreceptors throughout development/adulthood. So, for these reasons alone, CRX is a poor choice for normalization of photoreceptor gene expression.

      Second, the authors' interpretation of the qPCR results (lines 216-218) is very confusing. The authors appear to be saying that there is a statistically significant increase in RHO levels between D120 and D300. However, the same change is observed in both control and patient organoids and is not unexpected, since the organoids are more mature at D300. The key comparison is between control and patient organoids at D300. At this time point, there appears to be no difference between control and patient. The authors don't even point this out in the main text.

      Third, the variability in number of photoreceptor cells in individual organoids makes a whole-organoid comparison by qPCR fraught with difficulty. It seems to me that what is needed here is a comparison of RHO transcript levels in isolated rod photoreceptors.

      (3) I cannot understand what the authors are comparing in the bulk RNA-seq analysis presented in the paragraph starting with line 222 and in the paragraph starting with line 306. They write "we performed bulk-RNA sequencing on 300-days-old retinal organoids (n=3 independent biological replicates). Patient retinal organoids demonstrated upregulated transcriptomic levels of RHO... comparable to the qRT-PCR data." From the wording, it suggests that they are comparing bulk RNA-seq of patient and control organoids at D300. However, this is not stated anywhere in the main text, the figure legend, or the Methods. Yet, the subsequent line "comparable to the qRT-PCR data" makes no sense, because the qPCR comparison was between patient samples at two different time points, D120 and D300, not between patient and control. Thus, the reader is left with no clear idea of what is even being compared by RNA-seq analysis.

      Remarkably, the exact same lack of clarity as to what is being compared plagues the second RNA-seq analysis presented in the paragraph starting with line 306. Here the authors write "We further carried out bulk RNA-sequencing analysis to comprehensively characterize three different groups of organoids, 0.25 μM PR3-treated and vehicle-treated patient organoids and control (RC) organoids from three independent differentiation experiments. Consistent with the qRT-PCR gene expression analysis, the results showed a significant downregulation in RHO and other rod phototransduction genes." Here, the authors make it clear that they have performed RNA-seq on three types of sample: PR3-treated patient organoids, vehicle-treated patient organoids, and control organoids (presumably not treated). Yet, in the next sentence they state "the results showed a significant downregulation in RHO", but they don't state what two of the three conditions are being compared! Although I can assume that the comparison presented in Fig. 6A is between patient vehicle-treated and PR3-treated organoids, this is nowhere explicitly stated in the manuscript.

      (4) There are multiple flaws in the analysis and interpretation of the PR3 treatment results. The authors wrote (lines 289-294) "We treated long-term cultured 300-days-old, RHO-CNV patient retinal organoids with varying concentrations of PR3 (0.1, 0.25 and 0.5 μM) for one week and assessed the effects on RHO mRNA expression and protein localization. Immunofluorescence staining of PR3-treated organoids displayed a partial rescue of RHO localization with optimal trafficking observed in the 0.25 μM PR3-treated organoids (Figure 5B). None of the organoids showed any evidence of toxicity post-treatment."

      There are multiple problems. First, the results are impressionistic and not quantitative. Second, it's not clear that the investigator was blinded with respect to treatment condition. Third, in the sections presented, the organoids look much more disorganized in the PR3-treated conditions than in the control. In particular, the ONL looks much more poorly formed. Overall, I'd say the organoids looked considerably worse in the 0.25 and 0.5 microM conditions than in the control, but I don't know whether or not the images are representative. Without rigorously quantitative and blinded analysis, it is impossible to draw solid conclusions here. Lastly, the authors state that "none of the organoids showed any evidence of toxicity post-treatment," but do not explain what criteria were used to determine that there was no toxicity.

      (5) qPCR-based quantitation of rod gene expression changes in response to PR3 treatment is not well-designed. In lines 294-297 the authors wrote "PR3 drove a significant downregulation of RHO in a dose-dependent manner. Following qRT-PCR analysis, we observed a 2-to-5 log2FC decrease in RHO expression, along with smaller decreases in other rod-specific genes including NR2E3, GNAT1 and PDE6B." I assume these analyses were performed on cDNA derived from whole organoids. There are two problems with this analysis/interpretation. First, a decrease in rod gene expression can be caused by a decrease in the number of rods in the treated organoids (e.g., by cell death) or by a decrease in the expression of rod genes within individual rods. The authors do not distinguish between these two possibilities. Second, as stated above, the percentage of cells that are rods in a given organoid can vary from organoid to organoid. So, to determine whether there is downregulation of rod gene expression, one should ideally perform the qPCR analysis on purified rods.

      (6) In Fig. 4B 'RM' panels, the authors show RHO staining around the somata of 'rods' but the inset images suggest that several of these cells lack both NRL and OTX2 staining in their nuclei. All rods should be positive for NRL. Conversely, the same image shows a layer of cells sclerad to the cells with putative RHO somal staining which do not show somal staining, and yet they do appear to be positive for NRL and OTX2. What is going on here? The authors need to provide interpretations for these findings.

      Minor concerns:

      (1) The writing is poor in many places. Problems include: poor word choice (e.g., 'semi-occasional' is used three times where 'occasional' or 'infrequent' would be better); superfluous use of the definite article in many places (e.g., lines 189-190 "by the light microscopy" should be "by light microscopy"); awkward sentence structures (e.g., lines 208-209: "To equilibrate the data to equivalent the number of photoreceptors in organoids"), opaque expressions (e.g., line 217 "there was a significant ~3 log2 fold change (log2FC)"; why not just say "an ~8-fold change"?); poor proof-reading (Abstract says that 40% of adRP cases are due to mutation in RHO, then the Introduction says the figure is 25%) etc.

      (2) The figures are not numbered, which makes it painful for the reviewer to correlate main text call-outs, figure legends, and actual figures. I had to repeatedly count down the list of figures to determine which figure I should be looking at.

      (3) In the abstract, the authors suggest that the patient's disease "develops from a dominant negative gain of function" mechanism. I don't agree with this interpretation. Typically 'dominant-negative' refers to an aberrant protein which directly interferes with the function of the normal protein, for example by forming non-functional heterodimers. In the present patient, the disease can be explained by a simple overexpression mechanism, as it has been previously demonstrated in mice that even minimal overexpression of rhodopsin (e.g., ~25% more than normal levels) can lead to progressive rod degeneration: PMID: 11222515.

      (4) In line 85 the word 'Morphologically' is superfluous and can be deleted.

      (5) In the Introduction the authors should more clearly articulate the rationale for using PR3 to treat this patient: because it leads to downregulation of multiple rod genes including RHO. This isn't clearly explained until the Discussion.

      (6) The authors mention in several places that PR3 may act via inhibition of NR2E3. Although this was the conclusion of the original publication, the evidence that PR3 acts via Nr2e3 in mice is not solid. The original study (PMID: 29148976) showed that the main effect of PR3 application on mouse retinas is downregulation of numerous rod genes. However, knockout of Nr2e3 in mouse has been shown to have very little effect on rod gene expression, and Nr2e3 mutant rods have largely preserved rod function as demonstrated by scotopic ERGs (PMIDs: 15634773, 16110338, 15689355). The primary gene expression defect in Nr2e3 mutant mouse rods is upregulation of a subset of cone genes, a change not observed upon application of PR3 to mouse retinas. For these reasons, I am skeptical that PR3 acts via inhibition of Nr2e3 activity, and I would suggest that the present authors qualify that interpretation.

      (7) The mechanistic speculation presented in lines 274-278 is not warranted. Ectopic localization of opsin to the cytoplasmic membrane occurs in a wide range of genetic forms of rod degeneration.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a new and valuable theoretical account of spatial representational drift in the hippocampus. The evidence supporting the claims is convincing, with a clear and accessible explanation of the phenomenon. Overall, this study will likely attract researchers exploring learning and representation in both biological and artificial neural networks.

      We would like to ask the reviewers to consider elevating the assessment due to the following arguments. As noted in the original review, the study bridges two different fields (machine learning and neuroscience), and does not only touch a single subfield (representational drift in neuroscience). In the revision, we also analysed data from four different labs, strengthening the evidence and the generality of the conclusions.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors start from the premise that neural circuits exhibit "representational drift" -- i.e., slow and spontaneous changes in neural tuning despite constant network performance. While the extent to which biological systems exhibit drift is an active area of study and debate (as the authors acknowledge), there is enough interest in this topic to justify the development of theoretical models of drift.

      The contribution of this paper is to claim that drift can reflect a mixture of "directed random motion" as well as "steady state null drift." Thus far, most work within the computational neuroscience literature has focused on the latter. That is, drift is often viewed to be a harmless byproduct of continual learning under noise. In this view, drift does not affect the performance of the circuit nor does it change the nature of the network's solution or representation of the environment. The authors aim to challenge the latter viewpoint by showing that the statistics of neural representations can change (e.g. increase in sparsity) during early stages of drift. Further, they interpret this directed form of drift as "implicit regularization" on the network.

      The evidence presented in favor of these claims is concise. Nevertheless, on balance, I find their evidence persuasive on a theoretical level -- i.e., I am convinced that implicit regularization of noisy learning rules is a feature of most artificial network models. This paper does not seem to make strong claims about real biological systems. The authors do cite circumstantial experimental evidence in line with the expectations of their model (Khatib et al. 2022), but those experimental data are not carefully and quantitatively related to the authors' model.

      We thank the reviewer for pushing us to present stronger experimental evidence. We now analysed data from four different labs. Two of those are novel analyses of existing data (Karlsson et al, Jercog et al). All datasets show the same trend - increasing sparsity and increasing information per cell. We think that the results, presented in the new figure 3, allow us to make a stronger claim on real biological systems.

      To establish the possibility of implicit regularization in artificial networks, the authors cite convincing work from the machine-learning community (Blanc et al. 2020, Li et al., 2021). Here the authors make an important contribution by translating these findings into more biologically plausible models and showing that their core assumptions remain plausible. The authors also develop helpful intuition in Figure 4 by showing a minimal model that captures the essence of their result.

      We are glad that these translation efforts are appreciated.
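
      As a hedged illustration of the implicit regularization discussed above, the following toy sketch may help. It is not the authors' network or the minimal model of their Figure 4. It assumes a hypothetical two-parameter model f = a*b trained with plain SGD on the loss 0.5 * (a*b - y)^2, where the target is 1 plus Gaussian label noise. All names (`run_sgd`, `hessian_trace`) and parameter values are illustrative assumptions. Starting from a zero-loss point, noise-free training does not move, whereas label noise slowly pushes the parameters along the zero-loss curve a*b = 1 toward the flattest point, a = b = 1.

```python
import numpy as np

def run_sgd(a, b, lr, noise_std, steps, seed=0):
    """SGD on 0.5 * (a*b - y)**2 with target y = 1 + Gaussian label noise."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        y = 1.0 + noise_std * rng.standard_normal()
        r = a * b - y                            # residual for this noisy label
        a, b = a - lr * r * b, b - lr * r * a    # simultaneous gradient step
    return a, b

def hessian_trace(a, b):
    # At a zero-loss point the Hessian equals grad(f) grad(f)^T with f = a*b,
    # so its trace is a**2 + b**2 (smaller means a flatter solution).
    return a**2 + b**2

# Start on the zero-loss curve a*b = 1, far from the flattest point a = b = 1.
a0, b0 = 2.0, 0.5
a_clean, b_clean = run_sgd(a0, b0, lr=0.05, noise_std=0.0, steps=20000)
a_noisy, b_noisy = run_sgd(a0, b0, lr=0.05, noise_std=0.5, steps=20000)

print("no noise   :", a_clean, b_clean, hessian_trace(a_clean, b_clean))
print("label noise:", a_noisy, b_noisy, hessian_trace(a_noisy, b_noisy))
```

      In this toy, each update multiplies the imbalance a^2 - b^2 by (1 - lr^2 * r^2), so the imbalance only shrinks while the residual r stays nonzero. Label noise keeps r nonzero after convergence, which gives a slow, directed drift on a timescale much longer than that of the initial loss reduction.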

      In Figure 2, the authors show a convincing example of the gradual sparsification of tuning curves during the early stages of drift in a model of 1D navigation. However, the evidence presented in Figure 3 could be improved. In particular, 3A shows a histogram displaying the fraction of active units over 1117 simulations. Although there is a spike near zero, a sizeable portion of simulations have greater than 60% active units at the end of the training, and critically the authors do not characterize the time course of the active fraction for every network, so it is difficult to evaluate their claim that "all [networks] demonstrated... [a] phase of directed random motion with the low-loss space." It would be useful to revise the manuscript to unpack these results more carefully. For example, a histogram of log(tau) computed in panel B on a subset of simulations may be more informative than the current histogram in panel A.

      The previous figure 3A was indeed confusing. In particular, it lumped together many simulations without proper curation. We redid this figure (now Figure 4), and added supplementary figures (Figures S1, S2) to better explain our results. It is now clear that the simulations with a large number of active units were either due to non-convergence, slow timescale of sparsification or simulations featuring label noise in which the fraction of active units is less affected. Regarding the log(tau) calculation, while it could indeed be an informative plot, it could not be calculated in a simple manner for all simulations. This is because learning curves are not always exponential, but sometimes feature initial plateaus (see also Saxe et al 2013, Schuessler et al 2020). We added a more detailed explanation of this limitation in the methods section, and we believe the current figure exemplifies the effect in a satisfactory manner.
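
      To make the limitation concrete, here is a minimal sketch of one simple way a timescale could be estimated. It assumes, hypothetically, that a curve decays as floor + A * exp(-t / tau). The function name `fit_log_tau`, the synthetic curve, and the known `floor` value are illustrative assumptions and not the procedure used in the manuscript. The log-linear fit recovers tau for a clean exponential, but the same fit is not meaningful for curves with an initial plateau, which is the case described above.

```python
import numpy as np

def fit_log_tau(t, curve, floor):
    """Estimate log(tau), assuming curve - floor = A * exp(-t / tau)."""
    t = np.asarray(t, dtype=float)
    excess = np.asarray(curve, dtype=float) - floor
    mask = excess > 0                      # the log needs a positive excess
    slope, _intercept = np.polyfit(t[mask], np.log(excess[mask]), 1)
    return np.log(-1.0 / slope)            # slope = -1 / tau for a pure exponential

# Synthetic, purely exponential curve with tau = 50 (illustrative values only).
t = np.arange(500)
curve = 0.1 + 0.9 * np.exp(-t / 50.0)
estimate = fit_log_tau(t, curve, floor=0.1)
print(estimate, np.log(50.0))
```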

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Representational drift as a result of implicit regularization" the authors study the phenomenon of representational drift (RD) in the context of an artificial network that is trained in a predictive coding framework. When trained on a task for spatial navigation on a linear track, they found that a stochastic gradient descent algorithm led to a fast initial convergence to spatially tuned units, but then to a second very slow, yet directed drift which sparsified the representation while increasing the spatial information. They finally show that this separation of timescales is a robust phenomenon and occurs for a number of distinct learning rules.

      Strengths:

      This is a very clearly written and insightful paper, and I think people in the community will benefit from understanding how RD can emerge in such artificial networks. The mechanism underlying RD in these models is clearly laid out and the explanation given is convincing.

      We thank the reviewer for the support.

      Weaknesses:

      It is unclear how this mechanism may account for the learning of multiple environments.

      There are two facets to the topic of multiple environments. First, are the results of the current paper relevant when there are multiple environments? Second, what is the interaction between brain mechanisms of dealing with multiple environments and the results of the current paper?

      We believe the answer to the first question is positive. The near-orthogonality of representations between environments implies that changes in one can happen without changes in the other. This is evident, for instance, in Khatib et al and Geva et al - in both cases, drift seems to happen independently in two environments, even though they are visited intermittently and are visually similar.

      The second question is a fascinating one, and we are planning to pursue it in future work. While the exact way in which the brain achieves this near-independence is an open question, remapping is one possible window into this process.

      We extended the discussion to make these points clear.

      The process of RD through this mechanism also appears highly non-stationary, in contrast to what is seen in familiar environments in the hippocampus, for example.

      The non-stationarity noted by the reviewer is indeed a major feature of our observations, and is indeed linked to familiarity. We divide learning into three phases (now more clearly stated in Table 1 and Figure 4C). The first, rapid phase, consists of improvement of performance - corresponding to initial familiarity with the environment. The third phase, often reported in the literature of representational drift, is indeed stationary and obtained after prolonged familiarity. Our work focuses on the second phase, which is not as immediate as the first one, and can take several days. We note in the discussion that experiments which include a long familiarization process can miss this phase (see also Table 3). Furthermore, we speculate that real life is less stationary than a lab environment, and this second phase might actually be more relevant there.

      Reviewer #3 (Public Review):

      Summary:

      Single-unit neural activity tuned to environmental or behavioral variables gradually changes over time. This phenomenon, called representational drift, occurs even when all external variables remain constant, and challenges the idea that stable neural activity supports the performance of well-learned behaviors. While a number of studies have described representational drift across multiple brain regions, our understanding of the underlying mechanism driving drift is limited. Ratzon et al. propose that implicit regularization - which occurs when machine learning networks continue to reconfigure after reaching an optimal solution - could provide insights into why and how drift occurs in neurons. To test this theory, Ratzon et al. trained a Feedforward Network to perform the oft-utilized linear track behavioral paradigm and compared the changes in hidden layer units to those observed in hippocampal place cells recorded in awake, behaving animals.

      Ratzon et al. clearly demonstrate that hidden layer units in their model undergo consistent changes even after the task is well-learned, mirroring representational drift observed in real hippocampal neurons. They show that the drift occurs across three separate measures: the active proportion of units (referred to as sparsification), spatial information of units, and correlation of spatial activity. They continue to address the conditions and parameters under which drift occurs in their model to assess the generalizability of their findings.

      However, the generalizability results are presented primarily in written form: additional figures are warranted to aid in reproducibility.

      We added figures, and a Github with all the code to allow full reproducibility.

      Last, they investigate the mechanism through which sparsification occurs, showing that the flatness of the manifold near the solution can influence how the network reconfigures. The authors suggest that their findings indicate a three-stage learning process: 1) fast initial learning followed by 2) directed motion along a manifold which transitions to 3) undirected motion along a manifold.

      Overall, the authors' results support the main conclusion that implicit regularization in machine learning networks mirrors representational drift observed in hippocampal place cells.

      We thank the reviewer for this summary.

      However, additional figures/analyses are needed to clearly demonstrate how different parameters used in their model qualitatively and quantitatively influence drift.

      We now provide additional figures regarding parameters (Figures S1, S2).

      Finally, the authors need to clearly identify how their data supports the three-stage learning model they suggest.

      Their findings promise to open new fields of inquiry into the connection between machine learning and representational drift and generate testable predictions for neural data.

      Strengths:

      (1) Ratzon et al. make an insightful connection between well-known phenomena in two separate fields: implicit regularization in machine learning and representational drift in the brain. They demonstrate that changes in a recurrent neural network mirror those observed in the brain, which opens a number of interesting questions for future investigation.

      (2) The authors do an admirable job of writing to a large audience and make efforts to provide examples to make machine learning ideas accessible to a neuroscience audience and vice versa. This is no small feat and aids in broadening the impact of their work.

      (3) This paper promises to generate testable hypotheses to examine in real neural data, e.g., that drift rate should plateau over long timescales (now testable with the ability to track single-unit neural activity across long time scales with calcium imaging and flexible silicon probes). Additionally, it provides another set of tools for the neuroscience community at large to use when analyzing the increasingly high-dimensional data sets collected today.

      We thank the reviewer for these comments. Regarding the hypotheses, these are partially confirmed in the new analyses we provide of data from multiple labs (new Figure 3 and Table 3) - indicating that prolonged exposure to the environment leads to more stationarity.

      Weaknesses:

      (1) Neural representational drift and directed/undirected random walks along a manifold in ML are well described. However, outside of the first section of the main text, the analysis focuses primarily on the connection between manifold exploration and sparsification without addressing the other two drift metrics: spatial information and place field correlations. It is therefore unclear if the results from Figures 3 and 4 are specific to sparseness or extend to the other two metrics. For example, are these other metrics of drift also insensitive to most of the Feedforward Network parameters as shown in Figure 3 and the related text? These concerns could be addressed with panels analogous to Figures 3a-c and 4b for the other metrics and will increase the reproducibility of this work.

      We note that the results from figures 3 and 4 (original manuscript) are based on abstract tasks, while in figure 2 there is a contextual notion of spatial position. Spatial position metrics are not applicable to the abstract tasks as they are simple random mappings of inputs, and there isn’t necessarily an underlying latent variable such as position. This transition between task types is better explained in the text now. In essence the spatial information and place field correlation changes are simply signatures of the movements in parameter space. In the abstract tasks their change becomes trivial, as the spatial information becomes strongly correlated with sparsity and place fields are simply the activity vectors of units. These are guaranteed to change as long as there are changes in the activity statistics. We present here the calculation of these metrics averaged over simulations for completeness.

      Author response image 1.

      (A) PV correlation between training time points averaged over 362 simulations. (B) Mean SI of units normalized to first time step, averaged over 362 simulations. Red line shows the average time point of loss convergence, the shaded area represents one standard deviation.
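
      For readers less familiar with the two metrics in this image, the following is a hedged sketch of how they are commonly computed. It is not taken from the authors' code. It assumes the standard Skaggs-style definition of spatial information (bits per spike) and takes PV correlation to be the Pearson correlation between population vectors at matching positions, averaged over positions. The function names and the (units x positions) array layout are assumptions made for illustration.

```python
import numpy as np

def spatial_information(rate_map, occupancy=None):
    """Skaggs-style spatial information (bits per spike) for each unit.

    rate_map: array of shape (n_units, n_positions), non-negative activity.
    occupancy: optional time spent per position; uniform if omitted.
    """
    rate_map = np.asarray(rate_map, dtype=float)
    n_units, n_pos = rate_map.shape
    if occupancy is None:
        p = np.full(n_pos, 1.0 / n_pos)
    else:
        p = np.asarray(occupancy, dtype=float) / np.sum(occupancy)
    mean_rate = rate_map @ p
    si = np.zeros(n_units)
    for u in range(n_units):
        if mean_rate[u] > 0:                     # silent units carry no information
            ratio = rate_map[u] / mean_rate[u]
            nz = ratio > 0                       # 0 * log(0) is taken as 0
            si[u] = np.sum(p[nz] * ratio[nz] * np.log2(ratio[nz]))
    return si

def pv_correlation(maps_a, maps_b):
    """Mean Pearson correlation between population vectors at matching positions."""
    corrs = []
    for pos in range(maps_a.shape[1]):
        va, vb = maps_a[:, pos], maps_b[:, pos]
        if va.std() > 0 and vb.std() > 0:        # skip constant vectors
            corrs.append(np.corrcoef(va, vb)[0, 1])
    return float(np.mean(corrs)) if corrs else float("nan")

# A unit active in one of four positions versus a uniformly active unit.
example = np.array([[1.0, 0.0, 0.0, 0.0],
                    [1.0, 1.0, 1.0, 1.0]])
print(spatial_information(example))
```

      In this example the sharply tuned unit carries more information per spike than the untuned one, which is the sense in which sparser tuning and higher spatial information go together.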

      (2) Many caveats/exceptions to the generality of findings are mentioned only in the main text without any supporting figures, e.g., "For label noise, the dynamics were qualitatively different, the fraction of active units did not reduce, but the activity of the units did sparsify" (lines 116-117). Supporting figures are warranted to illustrate which findings are "qualitatively different" from the main model, which are not different from the main model, and which of the many parameters mentioned are important for reproducing the findings.

      We now added figures (S1, S2) that show this exactly. We also added a github to allow full reproduction.

      (3) Key details of the model used by the authors are not listed in the methods. While they are mentioned in reference 30 (Recanatesi et al., 2021), they need to be explicitly defined in the methods section to ensure future reproducibility.

      The details of the simulation are given in the Methods section. We also added a github to allow full reproducibility.

      (4) How different states of drift correspond to the three learning stages outlined by the authors is unclear. Specifically, it is not clear where the second stage ends, and the third stage begins, either in real neural data or in the figures. This is compounded by the fact that the third stage - of undirected, random manifold exploration - is only discussed in relation to the introductory Figure 1 and is never connected to the neural network data or actual brain data presented by the authors. Are both stages meant to represent drift? Or is only the second stage meant to mirror drift, while undirected random motion along a manifold is a prediction that could be tested in real neural data? Identifying where each stage occurs in Figures 2C and E, for example, would clearly illustrate which attributes of drift in hidden layer neurons and real hippocampal neurons correspond to each stage.

      Thanks for this comment, which urged us to better explain these concepts.

      The different processes (reduction in loss, reduction in Hessian) happen in parallel with different timescales. Thus, there are no sharp transitions between the phases. This is now explained in the text in relation to figure 4C, where the approximate boundaries are depicted.

      The term drift is often used to denote a change in representation without a change in behavior. In this sense, both the second and third phases correspond to drift. Only the third stage is stationary. This is now emphasized in the text and in the new Table 1. Regarding experimental data, apart from the new figure 3 with four datasets, we also summarize in Table 3 the relation between duration of familiarity and stationarity of the data.

      Recommendations for the authors:

      The reviewers have raised several concerns. They concur that the authors should address the specific points below to enhance the manuscript.

      (1) The three different phases of learning should be clearly delineated, along with how they are determined. It remains unclear in which exact phase the drift is observed.

      This is now clearly explained in the new Table 1 and Figure 4C. Note that the different processes (reduction in loss, reduction in Hessian) happen in parallel with different timescales. Thus, there are no sharp transitions between the phases. This is now explained in the text in relation to figure 4C, where the approximate boundaries are depicted.

      The term drift is often used to denote a change in representation without a change in behavior. In this sense, both the second and third phases correspond to drift. Only the third stage is stationary. This is now emphasized in the text and in the new Table 1. Regarding experimental data, apart from the new figure 3 with four datasets, we also summarize in Table 3 the relation between duration of familiarity and stationarity of the data.

      (2) The term "sparsification" of unit activity is not fully clear. Its meaning should be more explicitly explained, especially since, in the simulations, a significant number of units appear to remain active (Fig. 3A).

      We now define precisely the two measures we use - Active Fraction, and Fraction Active Units. There is a new section with an accompanying figure in the Methods section. As Figure S2 shows, the noise statistics (label noise vs. update noise) differentially affect these two measures.
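
      The distinction between the two measures can be sketched as follows. The definitions used here are assumptions made for illustration, since the precise definitions are in the manuscript's Methods and are not reproduced in this response. "Fraction Active Units" is taken to be the fraction of units that respond to at least one input, and "Active Fraction" the fraction of (unit, input) pairs with above-threshold activity. Function names and the threshold are likewise hypothetical.

```python
import numpy as np

def fraction_active_units(activity, threshold=0.0):
    """Fraction of units active for at least one input (assumed definition)."""
    active = np.asarray(activity) > threshold
    return float(np.mean(active.any(axis=1)))

def active_fraction(activity, threshold=0.0):
    """Fraction of (unit, input) pairs that are active (assumed definition)."""
    return float(np.mean(np.asarray(activity) > threshold))

# activity has shape (n_units, n_inputs); here 4 units and 5 inputs.
dense = np.ones((4, 5))                              # every unit responds to every input
sparse_activity = np.zeros((4, 5))
sparse_activity[np.arange(4), np.arange(4)] = 1.0    # each unit responds to one input
silent_units = np.zeros((4, 5))
silent_units[0, :] = 1.0                             # one unit responds, three are silent

for name, act in [("dense", dense),
                  ("sparse activity", sparse_activity),
                  ("silent units", silent_units)]:
    print(name, fraction_active_units(act), active_fraction(act))
```

      Under these assumed definitions, the "sparse activity" case keeps every unit active while the overall activity becomes sparse, whereas the "silent units" case reduces both measures. The two measures can therefore dissociate, as in the label-noise observation quoted by the reviewers.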

      (3) While the study primarily focuses on one aspect of representational drift-the proportion of active units-it should also explore other features traditionally associated with representational drift, such as spatial information and the correlation between place fields.

      This absence of features is related to the abstract nature of some of the tasks simulated in our paper. In our original submission the transition from a predictive coding task to more abstract tasks was not clearly explained, creating some confusion regarding the measured metrics. We now clarified the motivation for this transition.

      Both the initial simulation and the new experimental data analysis include spatial information (Figures 2,3). The following simulations (Figure 4) with many parameter choices use more abstract tasks, for which the notion of correlation between place cells and spatial information loses its meaning as there is no spatial ordering of the inputs, and every input is encountered only once. Spatial information becomes strongly correlated with the inverse of the active fraction metric. The correlation between place cells is also directly linked to increase in sparseness for these tasks.

      (4) There should be a clearer illustration of how labeling noise influences learning dynamics and sparsification.

      This was indeed confusing in the original submission. We removed the simulations with label noise from Figure 4, and added a supplementary figure (S2) illustrating the different effects of label noise.

      (5) The representational drift observed in this study's simulations appears to be nonstationary, which differs from in vivo reports. The reasons for this discrepancy should be clarified.

      We added experimental results from three additional labs demonstrating a change in activity statistics (i.e. increase in spatial information and increase in sparseness) over a long period of time. We suggest that such a change long after the environment is already familiar is an indication of the second phase, and stress that this change seems to saturate at some point, and that most drift papers start collecting data after this saturation, hence this effect was missed in previous in vivo reports. Furthermore, these effects are becoming more abundant with the advent of new calcium imaging methods, as the older electrophysiological recording methods did not usually allow recording of large numbers of cells for long periods of time. The new Table 3 surveys several experimental papers, emphasizing the degree of familiarity with the environment.

      (6) A distinctive feature of the hippocampus is its ability to learn different spatial representations for various environments. The study does not test representational drift in this context, a topic of significant interest to the community. Whether the authors choose to delve into this is up to them, but it should at least be discussed more comprehensively, as it's only briefly touched upon in the current manuscript version.

      There are two facets to the topic of multiple environments. First, are the results of the current paper relevant when there are multiple environments? Second, what is the interaction between brain mechanisms of dealing with multiple environments and the results of the current paper?

      We believe the answer to the first question is positive. The near-orthogonality of representations between environments implies that changes in one can happen without changes in the other. This is evident, for instance, in Khatib et al and Geva et al - in both cases, drift seems to happen independently in two environments, even though they are visited intermittently and are visually similar.

      The second question is a fascinating one, and we are planning to pursue it in future work. While the exact way in which the brain achieves this near-independence is an open question, remapping is one possible window into this process.

      We extended the discussion to make these points clear.

      (7) The methods section should offer more details about the neural nets employed in the study. The manuscript should be explicit about the terms "hidden layer", "units", and "neurons", ensuring they are defined clearly and not used interchangeably.

      We changed the usage of these terms to be more coherent and made our code publicly available. Specifically, “units” refer to artificial networks and “neurons” to biological ones.

      In addition, each reviewer has raised both major and minor concerns. These are listed below and should be addressed where possible.

      Reviewer #1 (Recommendations For The Authors):

      I recommend that the authors edit the text to soften their claims. For example:

      In the abstract "To uncover the underlying mechanism, we..." could be changed to "To investigate, we..."

      Agree. Done

      On line 21, "Specifically, recent studies showed that..." could be changed to "Specifically, recent studies suggest that..."

      Agree. Done

      On line 100, "All cases" should probably be softened to "Most cases" or more details should be added to Figure 3 to support the claim that every simulation truly had a phase of directed random motion.

      The text was changed in accordance with the reviewer’s suggestion. In addition, the figure was changed and only includes simulations in which we expected unit sparsity to arise (without label noise). We also added explanations and supplementary figures for label noise.

      Unless I missed something obvious, there is no new experimental data analysis reported in the paper. Thus, line 159 of the discussion, "a phenomenon we also observed in experimental data" should be changed to "a phenomenon that was recently reported in experimental data."

      We thank the reviewer for drawing our attention to this. We now analyzed data from three other labs, two of which are novel analyses on existing data. All four datasets show the same trends of sparseness with increasing spatial information. The new Figure 3 and text now describe this.

      On line 179 of the Discussion, "a family of network configurations that have identical performance..." could be softened to "nearly identical performance." It would be possible for networks to have minuscule differences in performance that are not detected due to stochastic batch effects or limits on machine precision.

      The text was changed in accordance with the reviewer’s suggestion.

      Other minor comments:

      Citation 44 is missing the conference venue, please check all citations are formatted properly.

      Corrected.

      In the discussion on line 184, the connection to remapping was confusing to me, particularly because the cited reference (Sanders et al. 2020) is more of a conceptual model than an artificial network model that could be adapted to the setting of noisy learning considered in this paper. How would an RNN model of remapping (e.g. Low et al. 2023; Remapping in a recurrent neural network model of navigation and context inference) be expected to behave during the sparsifying portion of drift?

      We now clarified this section. The conceptual model of Sanders et al includes a specific prediction (Figure 7 there) which is very similar to ours - a systematic change in robustness depending on duration of training. Regarding the Low et al model, using such mechanistic models is an exciting avenue for future research.

      Reviewer #2 (Recommendations For The Authors):

      I only have two major questions.

      (1) Learning multiple representations: Memory systems in the brain typically must store many distinct memories. Certainly, the hippocampus, where RD is prominent, is involved in the ongoing storage of episodic memories. But even in the idealized case of just two spatial memories, for example, two distinct linear tracks, how would this learning process look? Would there be any interference between the two learning processes or would they be largely independent? Is the separation of time scales robust to the number of representations stored? I understand that to answer this question fully probably requires a research effort that goes well beyond the current study, but perhaps an example could be shown with two environments. At the very least the authors could express their thoughts on the matter.

      There are two facets to the topic of multiple environments. First, are the results of the current paper relevant when there are multiple environments? Second, what is the interaction between brain mechanisms of dealing with multiple environments and the results of the current paper?

      We believe the answer to the first question is positive. The near-orthogonality of representations between environments implies that changes in one can happen without changes in the other. This is evident, for instance, in Khatib et al and Geva et al - in both cases, drift seems to happen independently in two environments, even though they are visited intermittently and are visually similar.

      The second question is a fascinating one, and we are planning to pursue it in future work. While the exact way in which the brain achieves this near-independence is an open question, remapping is one possible window into this process.

      We extended the discussion to make these points clear.

      (2) Directed drift versus stationarity: I could not help but notice that the RD illustrated in Fig.2D is not stationary in nature, i.e. the upper right and lower left panels are quite different. This appears to contrast with findings in the hippocampus, for example, Fig.3e-g in (Ziv et al, 2013). Perhaps it is obvious that a directed process will not be stationary, but the authors note that there is a third phase of steady-state null drift. Is the RD seen there stationary? Basically, I wonder if the process the authors are studying is relevant only as a novel environment becomes familiar, or if it is also applicable to RD in an already familiar environment. Please discuss the issue of stationarity in this context.

      The non-stationarity noted by the reviewer is indeed a major feature of our observations, and is indeed linked to familiarity. We divide learning into three phases (now more clearly stated in Table 1 and Figure 4C). The first, rapid, phase consists of improvement of performance - corresponding to initial familiarity with the environment. The third phase, often reported in the literature of representational drift, is indeed stationary and obtained after prolonged familiarity. Our work focuses on the second phase, which is not as immediate as the first one, and can take several days. We note in the discussion that experiments which include a long familiarization process can miss this phase (see also Table 3). Furthermore, we speculate that real life is less stationary than a lab environment, and this second phase might actually be more relevant there.

      Reviewer #3 (Recommendations For The Authors):

      Most of my general recommendations are outlined in the public review. A large portion of my comments regards increasing clarity and explicitly defining many of the terms used which may require generating more figures (to better illustrate the generality of findings) or modifying existing figures (e.g., to show how/where the three stages of learning map onto the authors' data).

      Sparsification is not clearly defined in the main text. As I read it, sparsification is meant to refer to the activity of neurons, but this needs to be clearly defined. For example, lines 262-263 in the methods define "sparseness" by the number of active units, but lines 116-117 state: "For label noise, the dynamics were qualitatively different, the fraction of active units did not reduce, but the activity of the units did sparsify." If the fraction of active units (defined as "sparseness") did not change, what does it mean that the activity of the units "sparsified"? If the authors mean that the spatial activity patterns of hidden units became more sharply tuned, this should be clearly stated.

      We now defined precisely the two measures we use - Active Fraction, and Fraction Active Units. There is a new section with an accompanying figure in the Methods section. As Figure S2 shows, the noise statistics (label noise vs. update noise) differentially affect these two measures.

      Likewise, it is unclear which of the features the authors outlined - spatial information, active proportion of units, and spatial correlation - are meant to represent drift. The authors should clearly delineate which of these three metrics they mean to delineate drift in the main text rather than leave it to the reader to infer. While all three are mentioned early on in the text (Figure 2), the authors focus more on sparseness in the last half of the text, making it unclear if it is just sparseness that the authors mean to represent drift or the other metrics as well.

      The main focus of our paper is on the non-stationarity of drift, namely that features (such as these three) systematically change in a directed manner as part of the drift process. The new analyses of experimental data show sparseness and spatial information.

      The focus on sparseness in the second half of the paper arises because we move to more abstract tasks, in which these measures are also easy to study. In our original submission, the transition from a predictive coding task to more abstract tasks was not clearly explained, creating some confusion regarding the measured metrics. We have now clarified the motivation for this transition.

      It is not clear if a change in the number of active units alone constitutes "drift", especially since Geva et al. (2023) recently showed that both changes in firing rate AND place field location drive drift, and that the passage of time drives changes in activity rate (or # cells active).

      Our work did not deal with purely time-dependent drift, but rather focused on experience-dependence. Furthermore, Geva et al. study the stationary phase of drift, where we do not expect a systematic change in the total number of active cells. They report changes in the average firing rate of active cells in this phase, as a function of time - which does not contradict our findings.

      "hidden layer", "units", and "neurons" seem to be used interchangeably in the text (e.g., line 81-85). However, this is confusing in several places, in particular in lines 83-85 where "neurons" is used twice. The first usage appears to refer to the rate maps of the hidden layer units simulated by the authors, while the second "neurons" appears to refer to real data from Ziv 2013 (ref 5). The authors should make it explicit whether they are referring to hidden layer units or actual neurons to avoid reader confusion.

      We changed the usage of these terms to be more coherent. Specifically, “units” refers to artificial networks and “neurons” to biological ones.

      The authors should clearly illustrate which parts of their findings support their three-phase learning theory. For example, does 2E illustrate these phases, with the first tenth of training time points illustrating the early phase, time 0.1-0.4 illustrating the intermediate phase, and 0.4-1 illustrating the last phase? Additionally, they should clarify whether the second and third stages are meant to represent drift, or is it only the second stage of directed manifold exploration that is considered to represent drift? This is unclear from the main text.

      The different processes (reduction in loss, reduction in Hessian) happen in parallel with different timescales. Thus, there are no sharp transitions between the phases. This is now explained in the text in relation to Figure 4C, where the approximate boundaries are depicted.

      The term drift is often used to denote a change in representation without a change in behavior. In this sense, both the second and third phases correspond to drift. Only the third stage is stationary. This is now emphasized in the text and in the new Table 1. Regarding experimental data, apart from the new Figure 3 with four datasets, we also summarize in Table 3 the relation between duration of familiarity and stationarity of the data.

      Line 45 - It appears that the acronym ML is not defined above here anywhere.

      Added.

      Line 71: the ReLU function should be defined in the text, e.g., sigma(x) = x if x > 0 else 0.

      Added.
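
      For reference, the definition the reviewer suggests can be written as a minimal sketch. The function name `relu` is hypothetical, and the snippet only restates the formula from the comment above, sigma(x) = x if x > 0 else 0; it is not taken from the manuscript's code.

```python
def relu(x: float) -> float:
    """Rectified linear unit: returns x when x is positive and 0 otherwise."""
    return x if x > 0 else 0.0


# Negative and zero inputs are clipped to 0; positive inputs pass through unchanged.
print([relu(v) for v in (-2.0, 0.0, 1.5)])  # prints [0.0, 0.0, 1.5]
```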

      106-107: Figures (or supplemental figures) to demonstrate how most parameters do not influence sparsification dynamics are warranted. As written, it is unclear what "most parameters" mean - all but noise scale. What about the learning rule? Are there any interactions between parameters?

      We now removed the label noise from Figure 4, and added two supplementary figures to clearly explain the effect of parameters. Figure 4 itself was also redone to clarify this issue.

      2F middle: should "change" be omitted for SI?

      The panel was replaced by a new one in Figure 3.

      116-119: A figure showing how results differ for label noise is warranted.

      This is now done in Figure S1, S2.

      124: typo, The -> the

      Corrected.

      127-129: This conclusion statement is the first place in the text where the three stages are explicitly outlined. There does not appear to be any support or further explanation of these stages in the text above.

      We now explain this earlier at the end of the Introduction section, along with the new Table 1 and marking on Figure 4C.

      132-133 seems to be more of a statement and less of a prediction or conclusion - do the authors mean "the flatness of the loss landscape in the vicinity of the solution predicts the rate of sparsification?"

      We thank the reviewer for this observation. The sentence was rephrased:

      Old: As illustrated in Fig. 1, different solutions in the zero-loss manifold might vary in some of their properties. The specific property suggested from theory is the flatness of the loss landscape in the vicinity of the solution.

      New: As illustrated in Fig. 1, solutions in the zero-loss manifold have identical loss, but might vary in some of their properties. The authors of [26] suggest that noisy learning will slowly increase the flatness of the loss landscape in the vicinity of the solution.

      135: typo, it's -> its

      Corrected.

      Line 135-136 "Crucially, the loss on the entire manifold is exactly zero..." This appears to contradict the Figure 4A legend - the loss appears to be very high near the top and bottom edges of the manifold in 4A. Do the authors mean that the loss along the horizontal axis of the manifold is zero?

      The reviewer is correct. The manifold mentioned in the sentence is indeed the horizontal axis. We changed the text and the figure to make it clearer.

      Equation 6: This does not appear to agree with equation 2 - should there be an E_t term for an expectation function?

      Corrected.

      Line 262-263: "Sparseness means that a unit has become inactive for all inputs." This should also be stated explicitly as the definition of sparseness/sparsification in the main text.

      We now define precisely the two measures we use - Active Fraction and Fraction Active Units. There is a new section with an accompanying figure in the Methods section. As Figure S2 shows, the noise statistics (label noise vs. update noise) differentially affect these two measures.

    2. Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Representational drift as a result of implicit regularization" the authors study the phenomenon of representational drift (RD) in the context of an artificial network which is trained in a predictive coding framework. When trained on a task for spatial navigation on a linear track, they found that a stochastic gradient descent algorithm led to a fast initial convergence to spatially tuned units, but then to a second very slow, yet directed drift which sparsified the representation while increasing the spatial information. They finally show that this separation of time-scales is a robust phenomenon and occurs for a number of distinct learning rules.

      This is a very clearly written and insightful paper, and I think people in the community will benefit from understanding how RD can emerge in such artificial networks. The mechanism underlying RD in these models is clearly laid out and the explanation given is convincing.

      It still remains unclear how this mechanism may account for the learning of multiple environments, although this is perhaps a topic for future study. The non-stationarity of the drift in this framework would seem, at first blush, to contrast with what one sees experimentally, but the authors provide compelling evidence that there are continuous changes in network properties during learning and that stationarity may be the hallmark of overfamiliarized environments. Future experimental work may further shed light on differences in RD between novel and familiar environments.

    3. eLife assessment

      This study presents a new and important theoretical account of spatial representational drift in the hippocampus. The evidence supporting the claims is convincing, with a clear and accessible explanation of the phenomenon. Overall, this study will likely attract researchers exploring learning and representation in both biological and artificial neural networks.

    4. Reviewer #1 (Public Review):

      The authors start from the premise that neural circuits exhibit "representational drift" -- i.e., slow and spontaneous changes in neural tuning despite constant network performance. While the extent to which biological systems exhibit drift is an active area of study and debate (as the authors acknowledge), there is enough interest in this topic to justify the development of theoretical models of drift.

      The contribution of this paper is to claim that drift can reflect a mixture of "directed random motion" as well as "steady state null drift." Thus far, most work within the computational neuroscience literature has focused on the latter. That is, drift is often viewed to be a harmless byproduct of continual learning under noise. In this view, drift does not affect the performance of the circuit nor does it change the nature of the network's solution or representation of the environment. The authors aim to challenge the latter viewpoint by showing that the statistics of neural representations can change (e.g. increase in sparsity) during early stages of drift. Further, they interpret this directed form of drift as "implicit regularization" on the network.

      The evidence presented in favor of these claims is concise, but on balance I find their evidence persuasive, at least in artificial network models. This paper includes a brief analysis of four independent experiments in Figure 3, which corroborates the main claims of the paper. Future work should dig deeper into the experimental data to provide a finer grained characterization. For example, in addition to quantifying the overall number of active units, it would be interesting to track changes in the signal-to-noise ratio of each place field, the widths of the place fields, et cetera.

      To establish the possibility of implicit regularization in artificial networks, the authors cite convincing work from the machine learning community (Blanc et al. 2020, Li et al., 2021). Here the authors make an important contribution by translating these findings into more biologically plausible models and showing that their core assumptions remain plausible. The authors also develop helpful intuition in Figure 5 by showing a minimal model that captures the essence of their result.

    5. Reviewer #3 (Public Review):

      Summary:

      Single unit neural activity tuned to environmental or behavioral variables gradually changes over time. This phenomenon, called representational drift, occurs even when all external variables remain constant, and challenges the idea that stable neural activity supports the performance of well-learned behaviors. While a number of studies have described representational drift across multiple brain regions, our understanding of the underlying mechanism driving drift is limited. Ratzon et al. propose that implicit regularization - which occurs when machine learning networks continue to reconfigure after reaching an optimal solution - could provide insights into why and how drift occurs in neurons. To test this theory, Ratzon et al. trained a recurrent neural network (RNN) to perform the oft-utilized linear track behavioral paradigm and compared the changes in hidden layer units to those observed in hippocampal place cells recorded in awake, behaving animals.

      Ratzon et al. clearly demonstrate that hidden layer units in their model undergo consistent changes even after the task is well-learned, mirroring representational drift observed in real hippocampal neurons. They show that the drift occurs across three separate measures: the active proportion of units (referred to as sparsification), spatial information of units, and correlation of spatial activity. They continue to address the conditions and parameters under which drift occurs in their model to assess the generalizability of their findings to non-spatial tasks. Last, they investigate the mechanism through which sparsification occurs, showing that flatness of the manifold near the solution can influence how the network reconfigures. The authors suggest that their findings indicate a three stage learning process: 1) fast initial learning followed by 2) directed motion along a manifold which transitions to 3) undirected motion along a manifold.

      Overall, the authors' results support the main conclusion that implicit regularization in machine learning networks mirrors representational drift observed in hippocampal place cells. Their findings promise to open new fields of inquiry into the connection between machine learning and representational drift in other, non-spatial learning paradigms, and to generate testable predictions for neural data.

      Strengths:

      (1) Ratzon et al. make an insightful connection between well-known phenomena in two separate fields: implicit regularization in machine learning and representational drift in the brain. They demonstrate that changes in a recurrent neural network mirror those observed in the brain, which opens a number of interesting questions for future investigation.

      (2) The authors do an admirable job of writing to a large audience and make efforts to provide examples to make machine learning ideas accessible to a neuroscience audience and vice versa. This is no small feat and aids in broadening the impact of their work.

      (3) This paper promises to generate testable hypotheses to examine in real neural data, e.g., that drift rate should plateau over long timescales (now testable with the ability to track single-unit neural activity across long time scales with calcium imaging and flexible silicon probes). Additionally, it provides another set of tools for the neuroscience community at large to use when analyzing the increasingly high-dimensional data sets collected today.

      Weaknesses:

      The revised manuscript addresses all the weaknesses outlined in my initial review. However, there is one remaining (minor) weakness regarding how "sparseness" is used and defined.

      Sparseness can mean different things to different fields. For example, for engram studies, sparseness could be measured at the population level by the proportion of active cells, whereas for a physiology study, sparseness might be measured at the neuron level by the change in peak firing rate of each cell as an animal enters that cell's place field. In this manuscript, the idea of "sparseness" is introduced indirectly in the last paragraph of the introduction as "...changes in activity statistics (sparseness)...", but it is unclear from the preceding text if the referenced "activity statistics" used to define sparseness are the "fraction of active units," or their "tuning specificity," or both. While sparseness is clearly defined in the Methods section for the RNN, there is no mention of how it is defined for neural data, and spatial information is not mentioned at all. For clarity, I suggest explicitly defining sparseness for both the RNN and real neural data early in the main text, e.g. "Here, we measure sparseness in neural data by A and B, and by the analogous metric(s) of X and Y in our RNN..." This is a small but important nuance that will enhance the ease of reading for a broad neuroscience audience.

    1. Author response:

      The following is the authors’ response to the original reviews.

      General comments

      All three experts have raised excellent ideas and made important suggestions to extend the scope of our study and provide additional information. While we fully acknowledge that these points are valid and would provide exciting new knowledge, we also should not lose track of the fact that a single study cannot cover all bases. Sulfated steroids, for example, are clearly essential components of mouse urine. Unfortunately, however, all chemical analysis approaches are limited and the one we opted for is not suitable for analysis of such signaling molecules. Future studies should certainly focus on these aspects. The same holds true for the fact that we do not know which of the identified compounds are actually VSN ligands. These are inherent limitations of the approach, and we are not claiming otherwise.

      Reviewer #1 (Public Review):

      (1) In this manuscript, Nagel et al. sought to comprehensively characterize the composition of urinary compounds, some of which are putative chemosignals. They used urines from adult males and females in three different strains, including one wild-derived strain. By performing mass spectrometry of two classes of compounds: volatile organic compounds and proteins, they found that urines from inbred strains are qualitatively similar to those of a wild strain. This finding is significant because there is a high degree of genetic diversity in wild mice, with chemosensory receptor genes harboring many polymorphisms.

      We agree and thank the Reviewer for his / her positive assessment.

      (2) In the second part of this work, the authors used calcium imaging to monitor the pattern of vomeronasal neuron responses to these urines. By performing pairwise comparisons, the authors found a large degree of strain-specific response and a relatively minor response to sex-specific urinary stimuli. This is a finding generally in agreement with previous calcium imaging work by Ron Yu and colleagues in 2008. The authors extend the previous work by using urines from wild mice. They further report that the concentration diversity of urinary compounds in different urine batches is largely uncorrelated with the activity profiles of these urines. In addition, the authors found that the patterns of vomeronasal neuron response to urinary cues are not identical when measured using different recipient strains. This fascinating finding, however, requires an additional control to exclude the possibility that this is not due to sampling error.

      We thank Reviewer 1 for pointing this out. We agree that this is truly a “fascinating finding.” Reviewer 1 emphasizes that we need to add an “additional control to exclude […] that this is not due to sampling error”, and he / she elaborates on the required control in his / her Recommendations For The Authors (see below). Reviewer 1 states that “for Fig. 5, in order to conclude that the same urine activates a different population of VSNs in two different strains, a critical control is needed to demonstrate that this is not due to the sampling variability - as compositions of V1Rs and V2Rs could vary between different slices, one preferred control is to use VNO slices from the same strain and compare the selectivity used here across the A-P axis.” Importantly, we believe that this is already controlled for. In fact, for each experiment, we routinely prepare VNO slices along the organ’s entire anterior-to-posterior axis (not including the most anterior tip, where the VNO lumen tapers into the vomeronasal duct, and the most posterior part, where the lumen “twists” toward the ventral aspect and its volume decreases (see Figs. 7 & S7 in Hamacher et al., 2024, Current Biology)). This usually yields ~7 slices per individual experiment / session. Therefore, we routinely sample and average across the entire VNO anterior-to-posterior axis for each experiment. In Fig. 5, in which we analyzed whether the “same urine activates a different population of VSNs in two different strains”, individual independent experiments from each strain (C57BL/6 versus BALB/c) amounted to (a) n = 6 versus n = 8; (b) n = 10 versus n = 10; (c) n = 7 versus n = 9; (d) n = 9 versus n = 10; (e) n = 10 versus n = 9; and (f) n = 12 versus n = 10. Together, we conclude that it is very unlikely that the considerably different response profiles measured in different recipient strains result from a “sampling error.”

      To clarify this point in the revised manuscript, we now explain our sampling routine in more detail in the Materials and Methods. Moreover, we now also refer to this point in the Results.

      (3) There are several weaknesses in this manuscript, including the lack of analysis of the compositions of sulfated steroids and other steroids, which have been proposed to be the major constituents of vomeronasal ligands in urines and the indirect (correlational) nature of their mass spectrometry data and activity data.

      Reviewer 1 is correct to point out that our chemical profiling approach omits (sulfated) steroids. We are aware of this weakness. We deliberately decided to omit steroids as well as other nonvolatile small organic molecules for three main reasons: (i) as the reviewer points out, (sulfated) steroid composition has been the focus of analysis in several previous studies and there is ample published information available on their role as VSN stimuli; (ii) the analytical tools available to us do not allow comprehensive profiling of non-volatile small organic molecules; employing two-dimensional head-space GC-MS as well as LC-MS/MS is not suitable for steroid detection; and (iii) the relatively small sample volumes forced us to prioritize and focus on specific chemical classes (in our case, VOCs and proteins). We made an effort to use the exact same stimuli as previously employed to investigate sensory representations in the accessory olfactory bulb (AOB) (Bansal et al., 2021), a feature that we consider a strength of the current study. However, this entailed that we had to effectively split our samples, further reducing the available sample volume.

      We acknowledge that we did not sufficiently describe our rationale for focusing on VOCs and proteins in the previous version of the manuscript (nor did we discuss the known role of (sulfated) steroids in VSN signaling in adequate detail). We have now made an effort to address these shortcomings in the revised manuscript. Specifically, we have added new text to the Introduction (“Prominent molecularly identified VSN stimuli include various sulfated steroids (Celsi et al., 2012; Fu et al., 2015; Haga-Yamanaka et al., 2015, 2014; Isogai et al., 2011; Nodari et al., 2008; Turaga and Holy, 2012), which could reflect the dynamic endocrine state of an individual.”) and the Discussion (“Notably, our chemical profiling approach omits (sulfated) steroids and other non-volatile small organic molecules, which have previously been identified in mouse urine as VSN stimuli (Nodari et al., 2008). Caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone.” & “In line with the notion of highly selective vomeronasal sampling is our observation that the concentration differences between compounds shared among strains, which are often substantial, are not reflected by similarly pronounced differences in response strength among generalist VSNs. There are several, not necessarily mutually exclusive explanations for this finding: First, concentration could simply not be a read-out parameter for VSNs, which would support previous ideas of concentration-invariant VSN activity (Leinders-Zufall et al., 2000). Second, the concentrations in freshly released urine could just exceed the dynamic tuning range of VSNs since, particularly for VOCs, natural signals (e.g., in scent marks) must be accessible to a recipient for a prolonged amount of time (sometimes days). 
A similar rationale could explain the increased protein concentrations in male urine, since male mice use scent marking to establish and maintain their territories and urinary lipocalins serve as long-lasting reservoirs of VOCs (Hurst et al., 1998). Third, generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations. In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all. Fourth, to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”).

      (4) Overall, the major contribution of this work is the identification of specific molecules in mouse urines. This work is likely to be of significant interest to researchers in chemosensory signaling in mammals and provides a systematic avenue to exhaustively identify vomeronasal ligands in the future.

      We thank the Reviewer for his / her generally positive assessment.

      Reviewer #2 (Public Review):

      (1) This manuscript by Nagel et al provides a comprehensive examination of the chemical composition of mouse urine (an important source of semiochemicals) across strain and sex, and correlates these differences with functional responses of vomeronasal sensory neurons (an important sensory population for detecting chemical social cues). The strength of the work lies in the careful and comprehensive imaging and chemical analyses, the rigor of quantification of functional responses, and the insight into the relevance of olfactory work on lab-derived vs wild-derived mice.

      We thank the Reviewer for his / her generally positive assessment.

      (2) With regards to the chemical analysis, the reader should keep in mind that a difference in the concentration of a chemical across strain or sex does not necessarily mean that that chemical is used for chemical communication. In the most extreme case, the animals may be completely insensitive to the chemical. Thus, the fact that the repertoire of proteins and volatiles could potentially allow sex and/or strain discrimination, it is unclear to what degree both are used in different situations.

      Reviewer 2 is correct to point out that sex- and/or strain-dependent differences in urine molecular composition do not automatically attribute a signaling function to those molecules. We concur and, in fact, stress this point many times throughout the manuscript. In the Results, for example, we point out (i) that “in female urine, BALB/c-specific proteins are substantially underrepresented, a fact not reflected by VSN response profiles”, (ii) that “as observed in C57BL/6 neurons, the skewed distributions of protein concentration indices were not reflected by BALB/c generalist VSN profiles”, and (iii) that “VSN population response profiles do not reflect the global molecular content of urine, suggesting that the VNO functions as a rather selective molecular detector.” Moreover, in the Discussion, we state (i) that “caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone”; (ii) that, for several sex- and/or strain-specific molecules, none “has previously been attributed a chemosensory function. Challenging the mouse VNO with purified recombinant protein(s) will help elucidate whether such functions exist”; (iii) that “generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations”; and (iv) that “to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”

      In the revised manuscript, we now aim to even more strongly emphasize the point made by Reviewer 2. In the Discussion, we have deleted a sentence that read: “Sex- and strain-specific chemical profiles give rise to unique VSN activity patterns.” Moreover, we have added the following statement: “In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all.”

      Reviewer #3 (Public Review):

      (1) One of the primary objectives in this study is to ascertain the extent to which the response profiles of VSNs are specific to sex and strain. The design of these Ca2+ imaging experiments uses a simple stimulus design, using two interleaved bouts of stimulation with pairs of urine (e.g. male versus female C57BL/6, male C57BL/6 versus male BALB/c) at a single dilution factor (1:100). This introduces two significant limitations: (1) the "generalist" versus "specialist" descriptors pertain only to the specific pairwise comparisons made and (2) there is no information about the sensitivity/concentration-dependence of the responses.

      Reviewer 3 points to two limitations of our VSN activity assay. He / she is correct to mention that characterizing a VSN as generalist or specialist based on a “pairwise comparison” should not be the basis of attributing such a “generalist” or “specialist” label in general (i.e., regarding the global stimulus space). We acknowledge this point, but we do not regard this as a limitation of our study since we are not investigating rather broad (i.e., multidimensional) questions of selectivity. All we are asking in the context of this study is whether VSNs - when being challenged with pairs of sex- or strain-specific urine samples - act as rather selective semiochemical detectors. Of course, one can always think of a study design that provides more information. However, we here opted for an assay that - in our hands - is robust, “low noise” (i.e., displays low intrinsic signal variability as evident from reliability index calculations), ensures recovery from VSN adaptation (Wong et al., 2018), and, importantly, answers the specific question we are asking.

      Regarding the second point (“there is no information about the sensitivity/concentration-dependence of the responses”), we would like to emphasize that this was not a focus of our study either. In fact, concentration-dependence of VSN activity has been a major focus of several previous studies referenced in our manuscript (e.g., Leinders-Zufall et al., 2000; He et al., 2008), albeit with contradictory results. In our study, we ask whether a pair of stimuli that we have shown to display, in part, strikingly different chemical composition (both absolute and relative) preferentially activates the same or different VSNs. With this question in mind, we believe that our assay (and its results) are highly informative.

      (2) The functional measurements of VSN tuning to various pairs of urine stimuli are consistently presented alongside mass spectrometry-based comparisons. Although it is clear from the manuscript text that the mass spectrometry-based analysis was separated from the VSN tuning experiments/analysis, the juxtaposition of VSN tuning measurements with independent molecular diversity measurements gives the appearance to readers that these experiments were integrated (i.e., that the diversity of ligands was underlying the diversity of physiological responses). This is a hypothesis raised by the parallel studies, not a supported conclusion of the work. This data presentation style risks confusing readers.

      As Reviewer 3 points out correctly “it is clear from the manuscript text that the mass spectrometry-based analysis was separated from the VSN tuning experiments/analysis.” In the figures, we try to make the distinction between VSN response statistics and chemical profiling more obvious by gray shadows that link the plots depicting VSN response characteristics to the general pie charts.

      We now also made an extra effort to avoid “confusing readers” by stating in the Discussion (i) that “caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone”; (ii) that, for several sex- and/or strain-specific molecules, none “has previously been attributed a chemosensory function. Challenging the mouse VNO with purified recombinant protein(s) will help elucidate whether such functions exist”; (iii) that “generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations”; and (iv) that “to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.” Moreover, we have deleted a sentence that read: “sex- and strain-specific chemical profiles give rise to unique VSN activity patterns”, and we have added the following statement: “In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all.”

      However, we believe that there is value in presenting “VSN tuning measurements” next to “independent molecular diversity measurements.” While these are independent measurements, their similarity or, quite frequently, lack thereof are informative. We are sure that by taking the above “precautions” we have now mitigated the risk of “confusing readers.”

      (3) The impact of mass spectrometry findings is limited by the fact that none of these molecules (in bulk, fractions, or monomolecular candidate ligands) were tested on VSNs. It is possible that only a very small number of these ligands activate the VNO. The list of variably expressed proteins - especially several proteins that are preferentially found in female urine - is compelling, but, again, there is no evidence presented that indicates whether or not these candidate ligands drive VSN activity. It is noteworthy that the largest class of known natural ligands for VSNs are small nonvolatiles that are found at high levels in mouse urine. These molecules were almost certainly involved in driving VSN activity in the physiology assays (both "generalist" and "specialist"), but they are absent from the molecular analysis.

      Reviewer 3 is right, of course, that at this point we have not tested the identified molecules on VSNs. This is clearly beyond the scope of the present study. We believe that the data we present will be the basis of (several full-length) future studies that aim to identify specific ligands and - best case scenario - receptor-ligand pairs. We find it hard to concur that our study, which provides the necessary basis for those future endeavors, is regarded as “incomplete”. By design, all studies are somewhat incomplete, i.e., there are always remaining questions and we are not contesting that.

      It is true, of course, that a class of “known natural ligands for VSNs are small nonvolatiles.” As we replied above, our chemical profiling approach omits (sulfated) steroids. We are aware of this weakness. We deliberately decided to omit steroids as well as other non-volatile small organic molecules for three main reasons: (i) steroid composition has been the focus of analysis in several previous studies and there is ample published information available on their role as VSN stimuli; (ii) the analytical tools available to us do not allow comprehensive profiling of non-volatile small organic molecules; employing two-dimensional head-space GC-MS as well as LC-MS/MS is not suitable for steroid detection; and (iii) the relatively small sample volumes forced us to prioritize and focus on specific chemical classes (in our case, VOCs and proteins). We made an effort to use the exact same stimuli as previously employed to investigate sensory representations in the accessory olfactory bulb (AOB) (Bansal et al., 2021), a fact that we consider a key strength of our current study. However, this entailed that we had to effectively split our samples, further reducing the available sample volume.

      We acknowledge that we did not sufficiently describe our rationale for focusing on VOCs and proteins in the previous version of the manuscript (nor did we discuss the known role of (sulfated) steroids in VSN signaling in adequate detail). We have now made an effort to address these shortcomings in the revised manuscript. Specifically, we have added new text to the Introduction (“Prominent molecularly identified VSN stimuli include various sulfated steroids (Celsi et al., 2012; Fu et al., 2015; Haga-Yamanaka et al., 2015, 2014; Isogai et al., 2011; Nodari et al., 2008; Turaga and Holy, 2012), which could reflect the dynamic endocrine state of an individual.”) and the Discussion (“Notably, our chemical profiling approach omits (sulfated) steroids other non-volatile small organic molecules, which have previously been identified in mouse urine as VSN stimuli (Nodari et al., 2008). Caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone.” & “In line with the notion of highly selective vomeronasal sampling is our observation that the concentration differences between compounds shared among strains, which are often substantial, are not reflected by similarly pronounced differences in response strength among generalist VSNs. There are several, not necessarily mutually exclusive explanations for this finding: First, concentration could simply not be a read-out parameter for VSNs, which would support previous ideas of concentration-invariant VSN activity (Leinders-Zufall et al., 2000). Second, the concentrations in freshly released urine could just exceed the dynamic tuning range of VSNs since, particularly for VOCs, natural signals (e.g., in scent marks) must be accessible to a recipient for a prolonged amount of time (sometimes days). 
A similar rationale could explain the increased protein concentrations in male urine, since male mice use scent marking to establish and maintain their territories and urinary lipocalins serve as long-lasting reservoirs of VOCs (Hurst et al., 1998). Third, generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations. In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all. Fourth, to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”).

      Reviewer #1 (Recommendations For The Authors):

      (1) I find that the study is highly valuable for researchers in this field. With the finding that wild mouse urines do not elicit significantly more variable responses than urines from inbred strains, researchers can now be reassured to use inbred strains to gain general insights on pheromone signaling.

      A major omission of this study is non-volatile small organic molecules such as steroids. These compounds are the only molecular class in urine that have been identified to stimulate specific vomeronasal receptors to date. It is unclear to me that the specificity of VOC and proteins can alone fully explain the response specificity of the VSNs that have been monitored in this study. The discussion of this topic is highly beneficial for the readers.

      Reviewer 1 is correct to point out that our chemical profiling approach omits (sulfated) steroids. We are aware of this weakness. We deliberately decided to omit steroids as well as other non-volatile small organic molecules for three main reasons: (i) as the reviewer points out, (sulfated) steroid composition has been the focus of analysis in several previous studies and there is ample published information available on their role as VSN stimuli; (ii) the analytical tools available to us do not allow comprehensive profiling of non-volatile small organic molecules; employing two-dimensional head-space GC-MS as well as LC-MS/MS is not suitable for steroid detection; and (iii) the relatively small sample volumes forced us to prioritize and focus on specific chemical classes (in our case, VOCs and proteins). We made an effort to use the exact same stimuli as previously employed to investigate sensory representations in the accessory olfactory bulb (AOB) (Bansal et al., 2021), a fact that we consider a key strength of our current study. However, this entailed that we had to effectively split our samples, further reducing the available sample volume.

      We acknowledge that we did not sufficiently describe our rationale for focusing on VOCs and proteins in the previous version of the manuscript (nor did we discuss the known role of (sulfated) steroids in VSN signaling in adequate detail). We have now made an effort to address these shortcomings in the revised manuscript. Specifically, we have added new text to the Introduction (“Prominent molecularly identified VSN stimuli include various sulfated steroids (Celsi et al., 2012; Fu et al., 2015; Haga-Yamanaka et al., 2015, 2014; Isogai et al., 2011; Nodari et al., 2008; Turaga and Holy, 2012), which could reflect the dynamic endocrine state of an individual.”) and the Discussion (“Notably, our chemical profiling approach omits (sulfated) steroids other non-volatile small organic molecules, which have previously been identified in mouse urine as VSN stimuli (Nodari et al., 2008). Caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone.” & “In line with the notion of highly selective vomeronasal sampling is our observation that the concentration differences between compounds shared among strains, which are often substantial, are not reflected by similarly pronounced differences in response strength among generalist VSNs. There are several, not necessarily mutually exclusive explanations for this finding: First, concentration could simply not be a read-out parameter for VSNs, which would support previous ideas of concentration-invariant VSN activity (Leinders-Zufall et al., 2000). Second, the concentrations in freshly released urine could just exceed the dynamic tuning range of VSNs since, particularly for VOCs, natural signals (e.g., in scent marks) must be accessible to a recipient for a prolonged amount of time (sometimes days). 
A similar rationale could explain the increased protein concentrations in male urine, since male mice use scent marking to establish and maintain their territories and urinary lipocalins serve as long-lasting reservoirs of VOCs (Hurst et al., 1998). Third, generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations. Fourth, to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”).

      (2) How many different wild mouse urines were tested in this study? Is this sufficient to capture the diversity of wild M. musculus in local (Prague) habitats?

      We thank the reviewer for pointing this out. For the present study, 20 male (M) and 27 female (F) wild mice were caught at six different sites in the broader Prague area (i.e., Bohnice (50.13415N, 14.41421E; 2M+4F), Dolni Brezany (49.96321N, 14.4585E; 3M+4F), Hodkovice (49.97227N, 14.48039E; 5M+6F), Písnice (49.98988N, 14.46625E; 3M+6F), Lhota (49.95369N, 14.43087E; 1M+2F), and Zalepy (49.9532N, 14.40829E; 6M+5F)). 18 of the 27 wild females were caught pregnant. The remaining 9 females were mated with males caught at the same site and produced offspring within a month. When selecting 10 male and 10 female individuals from first-generation offspring for urine collection, we ensured that all six capture sites were represented and that age-matched animals displayed similar weight (~17g). We believe that this capture / breeding strategy sufficiently represents “the diversity of wild M. musculus in local (Prague) habitats.” In the revised manuscript, we have now included these details in the Materials and Methods.

      (3) I found Figure 1e and figures in a similar format confusing - one panel describes the response statistics of VSNs, and other panels show the number of compounds found in different MS profiling, which is not immediately obvious from the figures. Is the y-axis legend correct (%)?

      We now try to make the distinction between VSN “response statistics” and chemical profiling more obvious by gray shadows that link the plots depicting VSN response characteristics to the general pie charts. Moreover, we thank the Reviewer for pointing out the mislabeling of the y-axis. Accordingly, we have deleted “%” in all corresponding figures.

      (4) For Figure 5, in order to conclude that the same urine activates a different population of VSNs in two different strains, a critical control is needed to demonstrate that this is not due to the sampling variability - as compositions of V1Rs and V2Rs could vary between different slices, one preferred control is to use VNO slices from the same strain and compare the selectivity used here across the A-P axis.

      We thank Reviewer 1 for pointing this out. Importantly, we believe that this is already controlled for (see our response to the Public Review). In fact, for each experiment, we routinely prepare VNO slices along the entire anterior-to-posterior axis (not including the most anterior tip, where the VNO lumen tapers into the vomeronasal duct, and the most posterior part, where the lumen ‘‘twists’’ toward the ventral aspect and its volume decreases (see Figs. 7 & S7 in Hamacher et al., 2024, Current Biology)). This usually yields ~7 slices per individual experiment / session. Therefore, we routinely sample and average across the entire VNO anterior-to-posterior axis for each experiment. In Fig. 5, individual independent experiments from each strain (C57BL/6 versus BALB/c) amounted to (a) n = 6 versus n = 8; (b) n = 10 versus n = 10; (c) n = 7 versus n = 9; (d) n = 9 versus n = 10; (e) n = 10 versus n = 9; and (f) n = 12 versus n = 10. Together, we can thus exclude that the considerably different response profiles that we measured using different recipient strains result from a “sampling error.”

      To clarify this point in the revised manuscript, we now explain our sampling routine in more detail in the Materials and Methods. Moreover, we now also mention this point in the Results.

      Reviewer #2 (Recommendations For The Authors):

      (1) Pg 5 Lines 3-16: This summary paragraph contains too much detail given that the reader has not read the paper yet, which makes it bewildering. This should be condensed.

      We agree and have substantially condensed this paragraph.

      (2) Pg 6 Line 5-8: This summary of the experimental design is obtuse and should be edited for clarity.

      We have edited the relevant passage for clarity.

      (3) Pg 6 Line 11: "VSNs were categorized..." Specialist vs generalist is defined as responding to one or both stimuli. This definition is placed right after saying that the cells were also tested with KCl. The reader might think that specialist vs generalist was defined in relation to KCl.

      We have edited this sentence, which now reads: “Dependent on their individual urine response profiles, VSNs were categorized as either specialists (selective response to one stimulus) or generalists (responsive to both stimuli).”

      (4) Pg 6 Line 13: "we recorded urine-dependent Ca2+ signals from a total of 16,715 VSNs". Is a "signal" a response? Did all 16,715 VSNs respond to urine? What was the total of KCl responsive cells recorded?

      We edited the corresponding passage for clarification. The text now reads: “Overall, we recorded >43,000 K+-sensitive neurons, of which a total of 16,715 VSNs (38.4%) responded to urine stimulation. Of these urine-sensitive neurons, 61.4% displayed generalist profiles, whereas 38.6% were categorized as specialists (Figure 1c,d).”
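
      The categorization described in this response can be illustrated with a minimal sketch. It assumes each K+-sensitive neuron is summarized by two booleans (whether it responded to each stimulus of a pair); the function names, the "non-responsive" label, and the toy data are hypothetical and are not taken from the authors' analysis code.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def categorize_vsn(responds_a: bool, responds_b: bool) -> str:
    """Label one K+-sensitive neuron from its responses to a stimulus pair."""
    if responds_a and responds_b:
        return "generalist"      # responsive to both stimuli
    if responds_a or responds_b:
        return "specialist"      # selective response to one stimulus
    return "non-responsive"      # hypothetical label for urine-insensitive cells


def summarize(pairs: Iterable[Tuple[bool, bool]]) -> Dict[str, float]:
    """Return percentages analogous to those quoted in the revised text."""
    counts = Counter(categorize_vsn(a, b) for a, b in pairs)
    total = sum(counts.values())
    responsive = counts["generalist"] + counts["specialist"]
    return {
        # share of all K+-sensitive neurons that respond to urine
        "urine_responsive_pct": 100 * responsive / total if total else 0.0,
        # share of urine-responsive neurons with a generalist profile
        "generalist_pct": 100 * counts["generalist"] / responsive if responsive else 0.0,
        # share of urine-responsive neurons with a specialist profile
        "specialist_pct": 100 * counts["specialist"] / responsive if responsive else 0.0,
    }


# Toy data only: four neurons, three of which respond to urine.
toy = [(True, True), (True, False), (False, False), (False, True)]
print(summarize(toy))
```

      Note that the two denominators differ: the urine-responsive share is taken over all K+-sensitive neurons, whereas the generalist and specialist shares in this sketch are taken over urine-responsive neurons only.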

      (5) Pg 7 Line 6: The repeated use of the word "pooled" is confusing as it suggests a variation in the experiment. The authors should establish once in the Methods and maybe in the Results that stimuli were pooled across animals. Then they should just refer to the stimulus as male or female or BALB/c rather than "pooled" male etc.

      We acknowledge the reviewer’s argument. Accordingly, we now introduce the experimental use of pooled urine once in the Methods and in the introductory paragraph of the Results. All other references to “pooled” urine in the Results and Captions have been deleted.

      (6) Pg 7 Line 10: "...detected in >=3 out of 10 male..." For the chemical analysis, were these samples not pooled?

      Correct. We deliberately did not pool samples for chemical analysis, but instead analyzed all individual samples separately (i.e., 60 samples were subjected to both proteomic and metabolomic analyses). Thus, the criterion that a VOC or protein must be detected in at least 3 of the 10 individual samples from a given sex/strain combination for a ‘present’ call (and in at least 6 of the 10 samples to be called ‘enriched’) ensures that the molecular signatures we identify are not “contaminated” by unusual aberrations within single samples. For clarification, we now explicitly outline this procedure in the Methods (Experimental Design and Statistical Analysis – Proteomics and metabolomics).
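
      The detection criterion stated in this response can be written down as a short sketch. It assumes per-sample detection flags for one compound within one sex/strain group of 10 individuals; the function name, the constant names, and the "absent" label are hypothetical, and only the 3-of-10 and 6-of-10 thresholds come from the response itself.

```python
from typing import List

PRESENT_MIN = 3   # 'present': detected in at least 3 of the 10 individual samples
ENRICHED_MIN = 6  # 'enriched': detected in at least 6 of the 10 individual samples


def call_compound(detected: List[bool]) -> str:
    """Return the strongest call supported by the per-sample detection flags."""
    n_detected = sum(detected)
    if n_detected >= ENRICHED_MIN:
        return "enriched"   # also satisfies the 'present' criterion
    if n_detected >= PRESENT_MIN:
        return "present"
    return "absent"         # hypothetical label for compounds below threshold


# Toy example: a compound detected in 4 of 10 individual urine samples.
flags = [True, True, False, True, False, False, True, False, False, False]
print(call_compound(flags))
```

      Because the function returns only the strongest label, a compound called "enriched" here would also count as "present" under the criterion as stated.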

      (7) Pg 7 Line 23: In line 7, the specialist rate was defined as 5% in reference to the total KCl responsive cells. Here the specialist rate is defined from responsive cells. This is confusing.

      We apologize for the confusion. In both cases, the numbers (%) refer to all K+-sensitive neurons. We have added this information to both relevant sentences (l. 7 as well as ll. 23-24). Note that the rate in ll. 23-24 refers to generalists.

      (8) Pg 7 Line 25: Concentration index should be defined before its use here.

      We have revised the corresponding sentence, which now reads: “By contrast, analogously calculated concentration indices (see Materials and Methods) that can reflect potential disparities are distributed more broadly and non-normally (Figure 1h).”

      (9) Pg 7 Line 29: change "trivially" to "simply".

      Done

      (10) Pg 7 Line 30: What is meant by a "generalist" ligand? The neurons are generalists. Probably should read "common ligands"

      We have changed the text accordingly.

      (11) Pg 7 Line 31: What is meant by "global observed concentration disparities" ?

      We have changed the text to “…represented by the observed general concentration disparities.”

      (12) Pg 8 Lines 7-11: This section needs to be edited for clarity as it is very difficult to follow. For example, the definition of "enriched" is buried in a parenthetical. Also, it is very difficult to figure out what a "sample" is in this paper. Is it a pooled stimulus, or is it urine from an individual animal?

      We apologize for the confusion. Throughout the paper a “sample” is a pooled stimulus (from all 10 individuals of a given sex/strain combination) for all physiological experiments. For chemical analysis a “sample” refers to urine from an individual animal.

      (13)Pg 8 Line 11: "abundant proteins" Does this mean absolute concentration or enriched in one sample vs another?

      We changed the term “abundant” to “enriched” as this descriptor has been defined (present in ≥6 of 10 individual samples) in the previous sentence.

      (14) Pg 8 Line 18: "While 32.9% of all..." Please edit for clarity. What is the point?

      The main point here is that, for VOCs, the vast majority of compounds (91.3%) are either generic mouse urinary molecules or are sex/strain-specific.

      (15) Pg 10 Line 18: "Increased VSN selectivity..." This title is misleading as it suggests a change in sensitivity with animal exposure. I think the authors are trying to say "VSNs are more selective for strain than for sex". The authors should avoid the term "exposure to" when they mean "stimulation with" as the former suggests chronic exposure prior to testing.

      We thank the reviewer for the advice and have changed the title accordingly. We also edited the text to avoid the term "exposure to" throughout the manuscript.

      (16) Pg 12 Line 10: "we recorded hardly any..." Hardly any in comparison to what? BALB/c?

      We apologize for the confusion. We have edited the text for clarity, which now reads: “In fact, (i) compared to an average specialist rate of 11.2% ± 6.6% (mean ± SD) calculated over all 13 binary stimulus pairs (n = 26 specialist types), we observed only few specialist responses upon stimulation with urine from wild females (2% and 3%, respectively), and…”

      Reviewer #3 (Recommendations For The Authors):

      (1) Related to the pairwise stimulus-response experimental design and analysis: there is precedent in the field for studies that explore the same topic (sex- and strain-selectivity), but measure VSN sensitivity across many urine stimuli, not just two at a time. This has been done both in the VNO (He et al, Science, 2008; Fu, et al, Cell, 2015) and in the AOB (Tolokh, et al, Journal of Neuroscience, 2013). The current manuscript does not cite these studies.

      Reviewer 3 is correct and we apologize for this oversight. We now cite the two VSN-related studies by He et al. and Fu et al. in the Introduction.

      (2) The findings of the mass spectrometry-based profiling of mouse urine - especially for volatiles - are only accessible through repositories, making it difficult for readers to understand which molecules were found to be highly divergent between sexes/strains. There is value in the list of ligands to further investigate, but this information should be made more accessible to readers without having to comb through the repositories.

      We agree that there “is value in the list of ligands to further investigate” and, accordingly, we now provide a table (Table 1) that lists the top-5 VOCs that – according to sPLS-DA – display the most discriminative power to classify samples by sex (related to Figure 2c) or strain (related to Figure 2d). For ease of identification, all entries list internal mass spectrometry identifiers, identifiers extracted from MS analysis database, the sex or strain that drives separation, which two-dimensional component / x-variate represents the most discriminative variable, PubChem chemical formula, PubChem common or alternative names, Chemical Entities of Biological Interest or PubChem Compound Identification, and the VOC’s putative origin.

      (3) There is a long precedent for integrating molecular assessments and physiological recordings to identify specific ligands for the vomeronasal system:

      • nonvolatiles (e.g., Leinders-Zufall, et al., Nature, 2000)
      • peptides (e.g., Kimoto et al., Nature, 2005; Leinders-Zufall et al. Science, 2004; Riviere et al., Nature, 2009; Liberles, et al., PNAS, 2009)
      • proteins (e.g., Chamero et al., Nature, 2007; Roberts et al., BMC Biology, 2010)
      • excreted steroids and bile acids (Nodari et al., Journal of Neuroscience, 2008; Fu et al., Cell, 2015; Doyle, et al., Nature Communications, 2016)

      The Leinders-Zufall (2000), Roberts, and Nodari papers are referenced, but the broader efforts by the community to find specific drivers of vomeronasal activity are not fully represented in the manuscript. The focus of this paper is fully related to this broader effort, and it would be appropriate for this work to be placed in this context in the introduction and discussion.

      We now refer to all of the studies mentioned in the Introduction (except the article published by Liberles et al. in 2009, since the authors of that study do not identify vomeronasal ligands).

      (4) Throughout the manuscript (starting in Fig. 1h) the figure panels and captions use the term "response index" whereas the methods define a "preference index." It seems to be the case that these two terms are synonymous. If so, a single term should be consistently used. If not, this needs to be clarified.

      We now consistently use the term “response index” throughout the manuscript.

      (5) It would be useful to provide a table associated with Figure 2 - figure supplement 1 that lists the common names and/or chemical formulas for the volatiles that were found to be of high importance.

      We agree and, accordingly, we now provide a table (Table 2) that lists the VOCs that – according to Random Forest classification and resulting Gini importance scores – display the most discriminative power to classify samples by sex (related to Figure 2 - figure supplement 1a) or strain (related to Figure 2 - figure supplement 1b). Notably, it is generally reassuring that several VOCs are listed in both Table 1 and Table 2, emphasizing that two different supervised machine learning algorithms (i.e., sPLS-DA (Table 1) and Random Forest (Table 2)) yield largely congruent results.

      (6) The use of the term "comprehensive" for the molecular analysis is a little bit misleading, as volatiles and proteins are just two of the many categories of molecules present in mouse urine.

      We have now deleted most mentions of the term "comprehensive" when referring to the molecular analysis.

      (7) Page 11, lines 24-27: The sentences starting "We conclude..." and ending in "semiochemical concentrations." These two sentences do not make sense. It is not known how many of the identified proteins are actual VSN ligands. Moreover, there is abundant evidence from other studies that individual VSN activity provides information about distinct semiochemical concentrations.

      We have substantially edited and rephrased this paragraph to better reflect that different scenarios / interpretations are possible. The relevant text now reads: “We conclude that VSN population response strength might not be so strongly affected by strain-dependent concentration differences among common urinary proteins. In that case, it would appear somewhat unlikely that individual VSN activity provides fine-tuned information about distinct semiochemical concentrations. Alternatively, as some (or even many) of the identified proteins could not serve as vomeronasal ligands at all, generalist VSNs might sample information from only a subset of compounds which, in fact, are secreted at roughly similar concentrations.”

      (8) The explanation of stimulus timing is mentioned several times but not defined clearly in methods. Page 19, lines 14-19 have information about the stimulus delivery device, but it would be helpful to have stimulus timing explicitly stated.

      In addition to the relevant captions, we now explicitly state stimulus timing (i.e., 10 s stimulations at 180 s inter-stimulus intervals) in the Results.

      (9) Typos: Page 10, line 7: "male biased" → "male-biased" for clarity

      Wilcoxon "signed-rank" test is often misspelled "Wilcoxon singed ranked test" or "Wilcoxon signed ranked test"

      In the Fig. 3 legend, the asterisk meaning is unspecified.

      "(im)balances" → imbalances (page 27, line 24; page 37, line 16; page 38, line 16)

      Figure 2 - figure supplement 1 and in Figure 2 - figure supplement 2, in the box-and-whisker plots the units are not specified in the graph or legend.

      We have made all required corrections.

    2. eLife assessment

      This carefully executed study provides a comparison of the chemical composition of mouse urine across strain and sex with the responses of vomeronasal sensory neurons, which are responsible for detecting chemical social cues. While the authors did not examine all molecular classes found in mouse urine or directly test whether the urinary volatile chemicals that vary with sex and strain are effective vomeronasal neuron ligands, solid data are provided that will be of significant interest to those studying chemical communication in rodents. This work should provide a valuable foundation for future research that will determine which molecules drive sex- and strain-specific vomeronasal responses.

    3. Reviewer #1 (Public Review):

      In this manuscript, Nagel et al. sought to characterize the composition of urinary compounds, some of which are putative chemosignals. They used urines from adult males and females in three different strains, including one wild-derived strain. By performing mass spectrometry of two classes of compounds: volatile organic compounds and proteins, they found that urines from inbred strains are qualitatively similar to those of a wild strain. This finding is significant because there is a high degree of diversity in different inbred strains and wild mice, with respect to the polymorphisms of chemosensory receptor genes and expression of vomeronasal ligands previously identified. Notably, their study did not characterize steroids, which represent a major class of urinary chemosignals activating vomeronasal neurons. Therefore, important future studies should address the strain dependence of steroid composition in urines.

      In the second part of this work, the authors used calcium imaging to monitor the pattern of vomeronasal neuron responses to these urines. By performing pairwise comparisons, the authors found a large degree of strain-specific response and a relatively minor response to sex-specific urinary stimuli. This is a finding generally in agreement with previous calcium imaging work by Ron Yu and colleagues in 2008. The authors extend the previous work by using urines from wild mice. They further report that the concentration diversity of urinary compounds in different urine batches is largely uncorrelated with the activity profiles of these urines. In addition, the authors found that the patterns of vomeronasal neuron response to urinary cues are not identical when measured using different recipient strains.

      The pitfalls of this study are the omission of steroids for the mass spectrometry experiments and the indirect (correlational) nature of their mass spectrometry data and activity data. Whether the urinary compounds identified in this study activate vomeronasal neurons was not tested.

      Nevertheless, the major contribution of this work is the identification of specific molecules in mouse urines. This work is likely to be of significant interest to researchers in chemosensory signaling in mammals and could provide a systematic avenue to exhaustively identify additional pheromones in mice.

    4. Reviewer #2 (Public Review):

      This manuscript by Nagel et al provides a comprehensive examination of the chemical composition of mouse urine (an important source of semiochemicals) across strain and sex, and correlates these differences with functional responses of vomeronasal sensory neurons (an important sensory population for detecting chemical social cues). The strength of the work lies in the careful and comprehensive imaging and chemical analyses, the rigor of quantification of functional responses, and the insight into the relevance of olfactory work on lab-derived vs wild-derived mice.

      With regard to the chemical analysis, the reader should keep in mind (and the authors acknowledge) that a difference in the concentration of a chemical across strain or sex does not necessarily mean that that chemical is used for chemical communication. In the most extreme case, the animals may be completely insensitive to the chemical. Thus, although the repertoire of proteins and volatiles could potentially allow sex and/or strain discrimination, it is unclear to what degree both are used in different situations.

    5. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Nagel, et al. describes studies of mouse vomeronasal sensory neuron (VSN) tuning to mouse urine samples across different sexes and strains, including wild mice, alongside mass spectrometry analysis of the same samples. The authors performed live Ca2+ imaging (CAL520 dye) of VSNs in acute vomeronasal organ (VNO) slices to determine how VSNs are tuned to pairs of stimuli that differ in their origin (e.g. male C57BL/6 versus male BALB/c urine, male C57BL/6 versus female C57BL/6, etc.). For each pair of tested odorants, the results measure the proportion of VSNs that respond to both stimuli ("generalists") or just one of the two ("specialists"), as well as metrics of tuning preference and response reliability. The authors find in most cases that generalists make up a larger proportion of responsive VSNs than specialists, but several pairwise comparisons showed a high degree of strain selectivity. Notably, the authors evaluated VSN tuning in both male C57BL/6 and male BALB/c VNOs, finding strain-dependent differences in the representation of mouse urine. Alongside these measurements of VSN tuning, the authors report results of mass spectrometry analyses of volatiles and proteins in the same urine samples. These analyses indicated a number of molecules in each category that vary across sex and strain, and therefore represent candidate vomeronasal ligands. However, this study did not directly test whether any of these candidate molecules drives VSN activity. Overall, this work provides solid information related to mouse vomeronasal chemosensation.
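
      The generalist/specialist bookkeeping described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' analysis code: the function names, the boolean encoding of responses, and the example data are hypothetical, and the criterion for deciding whether a neuron "responds" is not specified here.

```python
from collections import Counter

def classify_neuron(responds_a: bool, responds_b: bool) -> str:
    """Label one neuron by its responses to the two stimuli of a pair."""
    if responds_a and responds_b:
        return "generalist"
    if responds_a:
        return "specialist_A"
    if responds_b:
        return "specialist_B"
    return "unresponsive"

def tuning_proportions(responses):
    """Return the fraction of responsive neurons in each class.

    `responses` is a list of (responds_a, responds_b) booleans, one per neuron.
    Unresponsive neurons are excluded from the denominator (an assumption).
    """
    labels = [classify_neuron(a, b) for a, b in responses]
    counts = Counter(label for label in labels if label != "unresponsive")
    n_responsive = sum(counts.values())
    if n_responsive == 0:
        return {}
    return {label: count / n_responsive for label, count in counts.items()}

# Hypothetical data: five neurons tested with one stimulus pair.
example = [(True, True), (True, True), (True, False), (False, True), (False, False)]
proportions = tuning_proportions(example)
```

      Because the labels depend on which two stimuli were paired, a neuron classed as a specialist in one comparison could be a generalist in another.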

      Strengths:

      A strength of the current study is its focus on characterizing the neural responses of the VNO to urine derived from wild mice. The majority of existing vomeronasal system research has relied on the use of inbred strains for both neural response recordings and investigations of candidate vomeronasal system ligands. Inbreeding in laboratory environments may alter the chemical composition of bodily secretions, thereby potentially changing the information they contain. Moreover, the more homogeneous nature of inbred strains could be critical when studying AOS-mediated social behaviors. If there exist noticeable differences in the chemical composition of secretions from wild animals compared to inbred strains, this would suggest that future research must consider natural sources of candidate ligands outside of inbred strains. This work identifies some intriguing differences, worthy of further exploration, between the urine composition of wild mice versus inbred mice, as well as disparities in how the VNO responds to urine from these different sources. However, the molecular composition of wild mouse urine and the VNO responsiveness to it were found to be highly overlapping with those of inbred mouse urine, supporting the continued investigation of candidate ligands found in inbred mouse urine.

      Another positive aspect of this work is its use of the same set of stimuli as a previous study by the same authors (Bansal et al., 2021) in the downstream accessory olfactory bulb. The consistency in stimulus selection facilitates a comparison of information processing of sex and strain information from the sensory periphery to the brain. Although comparisons between the two connected regions are not a focus of this work, and methodological differences (e.g., Ca2+ imaging versus electrophysiology) may introduce caveats into comparisons, the support of "apples to apples" comparisons across connected circuits is critical to progress in the field.

      Finally, this study directly measured VSN tuning in both male C57BL/6 and male BALB/c VNOs, finding subtle but important differences in the representation of mouse urine in these two recipient strains. Given that there is a long history of behavioral research into strain-specific differences in social behavior, this research paves the way for future studies into how different mouse strains detect and process social chemosignals.

      Weaknesses:

      One of the primary objectives in this study is to ascertain the extent to which the response profiles of VSNs are specific to sex and strain. The design of these Ca2+ imaging experiments uses a simple stimulus design, using two interleaved bouts of stimulation with pairs of urine (e.g., male versus female C57BL/6, male C57BL/6 versus male BALB/c) at a single dilution factor (1:100). This introduces two significant limitations: (1) the "generalist" versus "specialist" descriptors pertain only to the specific pairwise comparisons made and (2) there is no information about the sensitivity/concentration-dependence of the responses.

      The functional measurements of VSN tuning to various pairs of urine stimuli are presented alongside mass spectrometry-based comparisons. However, the mass spectrometry-based analysis was performed separately from VSN tuning experiments/analysis. The juxtaposition of these measurements may give some readers the impression that VSN tuning measurements were integrated with molecular profiling (i.e., that the molecular diversity was causally related to physiological responses). This is a hypothesis raised by the parallel studies, but not a supported conclusion of the current work.

      The authors acknowledge that the mass spectrometry findings are limited to volatile organic compounds and proteins/peptides, and that it is possible that few of these candidate molecules are active in the VNO. Moreover, it remains possible that the VSN responses are driven mostly by small nonvolatiles (e.g., polar steroids), a class of strong VSN ligands that were excluded from molecular analysis.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study utilizes a virus-mediated short hairpin RNA (shRNA) approach to investigate in a novel way the role of the wild-type PHOX2B transcription factor in critical chemosensory neurons in the brainstem retrotrapezoid nucleus (RTN) region for maintaining normal CO2 chemoreflex control of breathing in adult rats. The solid results presented show blunted ventilation during elevated inhaled CO2 (hypercapnia) with knockdown of PHOX2B, accompanied by a reduction in expression of Gpr4 and Task2 mRNA for the proposed RTN neuron proton sensor proteins GPR4 and TASK2. These results suggest that maintained expression of wild-type PHOX2B affects respiratory control in adult animals, which complements previous studies showing that PHOX2B-expressing RTN neurons may be critical for chemosensory control throughout the lifespan and with implications for neurological disorders involving the RTN. When some methodological, data interpretation, and prior literature reference issues further highlighting novelty are adequately addressed, this study will be of interest to neuroscientists studying respiratory neurobiology as well as the neurodevelopmental control of motor behavior.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This important study investigated the role of the PHOX2B transcription factor in neurons in the key brainstem chemosensory structure, the retrotrapezoid nucleus (RTN), for maintaining proper CO2 chemoreflex responses of breathing in the adult rat in vivo. PHOX2B has an important transcriptional role in neuronal survival and/or function, and mutations of PHOX2B severely impair the development and function of the autonomic nervous system and RTN, resulting in the developmental genetic disease congenital central hypoventilation syndrome (CCHS) in neonates, where the RTN may not form and is functionally impaired. The function of the wild-type PHOX2B protein in adult RTN neurons that continue to express PHOX2B is not fully understood. By utilizing a viral PHOX2B-shRNA approach for knockdown of PHOX2B specifically in RTN neurons, the authors' solid results show impaired ventilatory responses to elevated inspired CO2, measured by whole-body plethysmography in freely behaving adult rats, that develop progressively over a four-week period in vivo, indicating effects on RTN neuron transcriptional activity and associated blunting of the CO2 ventilatory response. The RTN neuronal mRNA expression data presented suggests the impaired hypercapnic ventilatory response is possibly due to the decreased expression of key proton sensors in the RTN. This study will be of interest to neuroscientists studying respiratory neurobiology as well as the neurodevelopmental control of motor behavior.

      Strengths:

      (1) The authors used a shRNA viral approach to progressively knock down the PHOX2B protein, specifically in RTN neurons to determine whether PHOX2B is necessary for the survival and/or chemosensory function of adult RTN neurons in vivo.

      (2) To determine the extent of PHOX2B knockdown in RTN neurons, the authors combined RNAScope® and immunohistochemistry assays to quantify the subpopulation of RTN neurons expressing PHOX2B and neuromedin B (Nmb), which have been proposed to be key chemosensory neurons in the RTN.

      (3) The authors demonstrate that knockdown efficiency is time-dependent, with a progressive decrease in the number of Nmb-expressing RTN neurons that co-express PHOX2B over a four-week period.

      (4) Their results convincingly show hypoventilation particularly in 7.2% CO2 only for PHOX2B-shRNA RTN-injected rats after four weeks as compared to naïve and non-PHOX2B-shRNA targeted (NT-shRNA) RTN injected rats, suggesting a specific impairment of chemosensitive properties in RTN neurons with PHOX2B knockdown.

      (5) Analysis of the association between PHOX2B knockdown in RTN neurons and the attenuation of the hypercapnic ventilatory response (HCVR), by evaluating the correlation between the number of Nmb+/PHOX2B+ or Nmb+/PHOX2B- cells in the RTN and the resulting HCVR, showed a significant correlation between HCVR and number of Nmb+/PHOX2B+ and Nmb+/PHOX2B- cells, suggesting that the number of PHOX2B-expressing cells in the RTN is a predictor of the chemoreflex response and the reduction of PHOX2B protein impairs the CO2-chemoreflex.

      (6) The data presented indicate that PHOX2B knockdown not only causes a reduction in the HCVR but also a reduction in the expression of Gpr4 and Task2 mRNAs, suggesting that PHOX2B knockdown affects the transcriptional activity of RTN neurons and decreases the CO2 response, possibly by reducing the expression of key proton sensors in the RTN.

      (7) Results of this study show that independent of the role of PHOX2B during development, PHOX2B is still required to maintain proper CO2 chemoreflex responses in the adult brain, and its reduction in CCHS may contribute to the respiratory impairment in this disorder.

      Weaknesses:

      (1) The authors found a significant decrease in the total number of Nmb+ RTN neurons (i.e., Nmb+/PHOX2B+ plus Nmb+/ PHOX2B-) in NT-shRNA rats at two weeks post viral injection, and also at the four-week period where the impairment of the chemosensory function of the RTN became significant, suggesting some inherent cell death possibly due to off-target toxic effects associated with shRNA procedures that may affect the experimental results.

      (2) The tissue sampling procedures for quantifying numbers of cells expressing proteins/mRNAs throughout the extended RTN region bilaterally have not been completely validated to accurately represent the full expression patterns in the RTN under experimental conditions.

      (3) The inferences about RTN neuronal expression of NMB, GPR4, or TASK2 are based on changes in mRNA levels, so it remains speculation that the observed reduction in Gpr4 and Task2 mRNA translates to a reduction in the protein levels and associated reduction of RTN neuronal chemosensitive properties.

      Thank you for sharing the excitement for our study showing novel findings on the contribution of PHOX2B to the chemoreflex response and activity of adult RTN neurons. We believe that the results on cell death following shRNA viral injections, potentially due to some off-target effects, are important to share with the scientific community to help plan similar experiments in various fields of neuroscience.

      Thanks for pointing out your concerns about cell quantification. We have edited the methods and results sections to add clarity about our analytical procedure.

      As we discussed in the manuscript, we were only able to assess mRNA levels of Nmb, Gpr4, and Task2, as currently available antibodies for the 3 targets are still unreliable. Future studies will benefit from the analysis of changes at the protein level and possibly electrophysiological recordings to verify that chemosensitive properties of RTN neurons are impaired due to reduction of PHOX2B expression. We discuss these limitations in the discussion.

      Reviewer #2 (Public Review):

      Summary:

      The authors used a short hairpin RNA (shRNA) strategy to elucidate the functional activity of neurons in the retrotrapezoid nucleus (RTN), a critical brainstem region for central chemoreception. Dysfunction in this area is associated with the neuropathology of congenital central hypoventilation syndrome (CCHS). The subsequent examination of the treated rats aimed to shed light on the intricate aspects of the RTN and its implications for central chemoreception and disorders like CCHS in adults. They found that shRNA targeting Phox2b mRNA reduced Phox2b expression in Nmb neurons. In addition, Phox2b knockdown did not affect breathing in room air or under hypoxia, but the hypercapnic ventilatory response was significantly impaired. They concluded that Phox2b in the adult brain has an important role in CO2 chemoreception and suggested that their findings provide new evidence for mechanisms related to CCHS neuropathology. The conclusions of this paper are well supported by data, but careful discussion seems to be required for comparison with the results of various previous studies that used different genetic strategies to target RTN neurons.

      Strengths:

      The most exciting aspect of this work is the modelling of the Phox2b knockdown in one element of the central neuronal circuit mediating respiratory reflexes, that is in the RTN. To date, mutations in the PHOX2B gene are commonly associated with most patients diagnosed with CCHS, a disease characterized by hypoventilation and absence of chemoreflexes, in the neonatal period, which in severe cases can lead to respiratory arrest during sleep. In the present study, the authors demonstrated that the role of Phox2b extends beyond the developmental period, and its reduction in CCHS may contribute to the respiratory impairment observed in this disorder.

      Weaknesses:

      Whereas the most exciting part of this work is the knockdown of the Phox2b in the RTN in adult rodents, the weakness of this study is the lack of a clear physiological, developmental, and anatomical distinction between this approach and similar studies already reported elsewhere (Ruffault et al., 2015, DOI: 10.7554/eLife.07051; Ramanantsoa et al., 2011, DOI: 10.1523/JNEUROSCI.1721-11.2011; Huang et al., 2017, DOI: 10.1016/j.neuron.2012.06.027; Hernandez-Miranda et al., 2018, DOI: 10.1073/pnas.1813520115; Ferreira et al., 2022 DOI: 10.7554/eLife.73130; Takakura et al., 2008 DOI: 10.1113/jphysiol.2008.153163; Basting et al., 2015 DOI: 10.1523/JNEUROSCI.2923-14.2015; Marina et al., 2010 DOI: 10.1523/JNEUROSCI.3141-10.2010). In addition, several conclusions presented in this work are not directly supported by the provided data.

      Thanks for the feedback on our manuscript. We have further highlighted in our discussion the previous developmental work aimed at determining the role of PHOX2B in embryonic development. Our study was triggered by the fascinating observation that, despite its important role in development of the central and peripheral nervous system, PHOX2B is still present in the adult brain and its function in adult neurons is unknown. We therefore aimed to investigate its role in the adult RTN by knocking down its expression with a shRNA approach. In our model, knockdown of PHOX2B does not affect development of the RTN. Previous studies (mentioned by the reviewer, as well as cited in the manuscript) have focused on investigating 1) the role of PHOX2B in the developmental period, 2) the physiological changes associated with the transgenic expression of mutant forms of PHOX2B in relation to CCHS, 3) the killing or the acute silencing/excitation of neuronal activity of PHOX2B+ RTN neurons. Our study had a different aim: to test whether the transcription factor PHOX2B had a physiologically relevant role in adult RTN neurons. In this experimental approach PHOX2B is not altered throughout embryonic or postnatal development. By knocking down PHOX2B in the Nmb+ cells of the RTN, our results show a reduction in chemoreflex response and mRNA expression of proton sensors. Hence, we conclude that PHOX2B alters the function of Nmb+ RTN neurons, possibly through transcriptional changes including the reduction in Gpr4 and Task2 mRNA expression.

      Reviewer #3 (Public Review):

      A brain region called the retrotrapezoid nucleus (RTN) regulates breathing in response to changes in CO2/H+, a process termed central chemoreception. A transcription factor called PHOX2B is important for RTN development and mutations in the PHOX2B gene result in a severe type of sleep apnea called Congenital Central Hypoventilation Syndrome. PHOX2B is also expressed throughout life, but its postmitotic functions remain unknown. This study shows that knockdown of PHOX2B in the RTN region in adult rats decreased expression of Task2 and Gpr4 in Nmb-expressing RTN chemoreceptors and this corresponded with a diminished ventilatory response to CO2 but did not impact baseline breathing or the hypoxic ventilatory response. These results provide novel insight regarding the postmitotic functions of PHOX2B in RTN neurons.

      Main issues:

      (1) The experimental approach was not targeted to Nmb+ neurons and since other cells in the area also express Phox2b, conclusions should be tempered to focus on Phox2b expressing parafacial neurons NOT specifically RTN neurons.

      (2) It is not clear whether PHOX2B is important for the transcription of pH sensing machinery, cell health, or both. If PHOX2B knockdown results in loss of RTN neurons, this is also expected to decrease Task2 and Gpr4 levels, albeit by a transcription-independent mechanism.

      Although we did not specifically target Nmb+ neurons, we performed viral injections within the area where neurons expressing PHOX2B and Nmb are localized (i.e., the RTN region). We carefully quantified the impact of PHOX2B knockdown on Nmb expressing neurons, as well as the effects on the adjacent TH expressing C1 population and FN neurons (figure 5). As reported in the results section, significant changes in the numbers of PHOX2B expressing neurons were only observed at the site of injection in PHOX2B+/Nmb+ neurons. We did not observe changes in the total number of C1 cells (TH+/PHOX2B+), in the number of TH cells coexpressing PHOX2B, or in the hypoxic ventilatory response (which is dependent on the health status of C1 neurons). We have updated figure 5 to show representative expression of PHOX2B in TH+ neurons in the ventral medulla to complement our cell count analysis. To address potential effects on other cell populations we have edited our discussion as follows:

      “PHOX2B knockdown was also restricted to RTN neurons, as adjacent C1 TH+ neurons did not show any change in number of TH+/PHOX2B+ expressing cells, although we cannot exclude that some C1 cells may have been infected and their relative PHOX2B expression levels were reduced. To support the lack of significant alterations associated with the possible loss of C1 function was the absence of significant changes in the hypoxic response that has been shown to be dependent on C1 neurons (Malheiros-Lima et al., 2017).”

      Where appropriate, we have substituted “RTN” with “Nmb expressing neurons of the RTN” throughout the manuscript.

      We have clarified in the methods and results section how we quantified Task2 and Gpr4 mRNA expression. The quantification was performed on a pool of single cells (200-250/rat) expressing Nmb. Hence, the overall reduction is not a result of general fluorescence loss in the RTN region, but specifically assessed in single cells expressing Nmb. This is therefore independent of the reduction of the total number of Nmb cells.

      We propose that cell death is not a direct effect of PHOX2B knockdown, but rather it is associated with the injection of the viral constructs that have been already reported to promote some off-target effects (as reported in the manuscript). While modest cell death is observed only in the first two weeks post-infection, it does not increase further between 2 and 4 weeks post infection, when the reduction in PHOX2B (not associated with a further reduction in Nmb+ cells, hence no further cell death in RTN) is evident and the respiratory chemoreflex is impaired. These results suggest that 1) reduction of PHOX2B is not responsible for cell death; 2) it is the reduction of PHOX2B levels that promotes chemoreflex impairment. Given the observation that Nmb cells with no detectable PHOX2B protein show reduced expression of Task2 and Gpr4 mRNA, we propose that one of the possible mechanisms of chemoreflex impairment in PHOX2B shRNA rats is the reduction of Task2 and Gpr4. In the discussion we also suggest possible additional mechanisms that can be investigated in further studies.

      Recommendations for the authors:

      In revising this manuscript, the authors should carefully address the issues raised by the reviewers to substantially improve the manuscript and solidify the reviewers' general assessment of the potential importance of this work.

      Reviewer #1 (Recommendations For The Authors):

      Major concerns:

      (1) The cell counts for Nmb+/PHOX2B+ and Nmb+/PHOX2B- RTN neurons are a critical component of the study, and it is unclear how the tissue sampling procedures (eight sections per animal) for quantifying numbers of cells expressing proteins/mRNAs throughout the extended RTN region bilaterally have been validated to accurately represent the full expression patterns in the RTN under the experimental conditions. It is possible that the sampling/quantification procedures used may be adequate, but validation is important. Also, quantification of the CTCF signal for Nmb, Gpr4, and Task2 mRNA is an important component of this study, but only four sections/rat were used.

      Thank you for pointing out your concern on our quantification method. We have clarified in the methods section the procedure for cell counting and quantification of the CTCF signal. We have sampled the area of the RTN in order to identify Nmb cells of RTN.

      We have edited the methods section as follows:

      “To quantify Nmb+/PHOX2B- and Nmb+/PHOX2B+ neurons within the RTN region, we analysed one in every seven sections (210 µm interval; 8 sections/rat in total) along the rostrocaudal distribution of the RTN on the ventral surface of the brainstem and compared total bilateral cell counts of PHOX2B-shRNA rats with non-target control (NT-shRNA) and naïve rats. Cells that expressed Nmb and Phox2b mRNAs but did not show co-localization with PHOX2B protein were considered Nmb+/PHOX2B-.

      The Corrected Total Cell Fluorescence (CTCF) signal for Nmb, Gpr4 and Task2 mRNAs was quantified as previously described (Cardani et al., 2022; McCloy et al., 2014). Briefly, a Leica TCS SP5 (B-120G) Laser Scanning Confocal microscope was used to acquire images of the tissue. Exposure time and acquisition parameters were set for the naïve group and kept unchanged for the entire dataset acquisition. The collected images were then analysed by selecting a single cell at a time and measuring the area, integrated density and mean grey value (McCloy et al., 2014). For each image, three background areas were used to normalize against autofluorescence. We used 4 sections/rat (210 µm interval) to count Nmb, Gpr4 and Task2 mRNA CTCF in the core of the RTN area where several Nmb cells could be identified. For each section two images were acquired with a 20× objective, so that at least fifty cells per tissue sample were obtained for the mRNA quantification analysis. To evaluate changes in Nmb mRNA expression levels following PHOX2B knockdown at the level of the RTN, we compared the fluorescence intensity of each RTN Nmb+ cell (223.2 ± 37.1 cells/animal) with the average fluorescent signal of Nmb+ cells located dorsally in the NTS (4.3 ± 1.2 cells/animal) (Nmb CTCF ratio RTN/NTS), as we reasoned that the latter would not be affected by the shRNA infection and knockdown.

      To quantify Gpr4 and Task2 mRNA expression in Nmb cells of the RTN, we first quantified single cell CTCF for either Gpr4 (200.7 ± 13.2 cells/animal) or Task2 (169.6 ± 10.3 cells/animal) mRNA in Nmb+ RTN neurons in the 3 experimental groups (naïve, NT shRNA and PHOX2B shRNA) independent of their PHOX2B expression. We then compared CTCF values of Gpr4 and Task2 mRNA between Nmb+/PHOX2B+ and Nmb+/PHOX2B- RTN neurons in PHOX2B-shRNA rats to address changes in their mRNA expression induced by PHOX2B knockdown.”
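
      For readers unfamiliar with the measure, the per-cell CTCF calculation and the RTN/NTS ratio described in the quoted methods can be sketched as follows. This is a minimal illustration, assuming the commonly used definition CTCF = integrated density − (cell area × mean background fluorescence) and averaging the three background readings per image; the function names and the numbers are hypothetical and are not taken from the study.

```python
from statistics import mean

def ctcf(integrated_density: float, cell_area: float, background_means) -> float:
    """Corrected Total Cell Fluorescence for one cell.

    integrated_density: summed pixel intensity inside the cell outline
    cell_area:          area of the cell outline (same units as used for the density)
    background_means:   mean grey values of the background regions in the same image
    """
    return integrated_density - cell_area * mean(background_means)

def rtn_nts_ratio(rtn_cell_ctcf: float, nts_ctcf_values) -> float:
    """Normalize one RTN cell's CTCF to the average CTCF of reference NTS cells."""
    return rtn_cell_ctcf / mean(nts_ctcf_values)

# Hypothetical measurements for a single Nmb+ RTN cell.
cell_value = ctcf(integrated_density=5000.0, cell_area=100.0,
                  background_means=[10.0, 12.0, 14.0])
# Hypothetical CTCF values for reference Nmb+ cells in the NTS.
ratio = rtn_nts_ratio(cell_value, [1900.0, 1900.0])
```

      Subtracting an area-scaled background keeps larger cells from appearing brighter simply because they cover more pixels, which is why the correction uses the cell's own area.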

      (2) Furthermore, to evaluate changes in Nmb mRNA expression following PHOX2B knockdown at the level of the RTN, it is stated in Materials and Methods "we compared, on the same tissue section, the fluorescence intensity of RTN Nmb+ cells with the signal of Nmb+ cells in the NTS (Nmb CTCF ratio RTN/NTS)". How this was accomplished is unclear, considering the non-overlapping locations of the RTN and rostral NTS. Providing images would be helpful.

      The first sections containing Nmb cells in the ventral medulla also contain a few Nmb cells in the dorsal medulla. We used those cells as a reference for fluorescence levels since they would not be affected by the viral infection. Similar cells are also present in the brains of mice and reported in the Allen Brain atlas (https://mouse.brain-map.org/experiment/show/71836874). We have clarified our procedure in the methods section (see above) and included a sample image of Nmb in both ventral and dorsal regions in Figure 5.

      (3) The staining for tyrosine hydroxylase (TH) to identify and quantify C1 cells (TH+/PHOX2B+) following shRNA injection provides important information, and it would be useful to show images of histological examples to accompany Fig. 5A.

      We included in figure 5A a sample image of C1 neurons used for our TH quantification.

      Minor:

      (1) Provide animal ns in the text of the Results section for the four weeks of PHOX2B knockdown.

      They have been included.

      (2) Please state in the legends for Figures 2 & 3, which images are superimposition images.

      We have included information about merged images in the figures.

      Reviewer #2 (Recommendations For The Authors):

      This manuscript by Cardani and colleagues attempts to address whether a reduction in Phox2b expression in chemosensitive neuromedin-B (NMB)-expressing neurons in the RTN alters respiratory function. The authors used a short hairpin RNA technique to silence RTN chemosensor neurons. The present study is very interesting, but there are several major concerns that need to be addressed, including the main hypothesis.

      Major

      (1) Page 6, lines 119-121: I did not grasp the mechanistic property described by the authors in this passage, nor did I understand the experiments they conducted to establish a mechanistic link between Phox2b and the chemosensitive property. Could the authors provide further clarification on these points?

      We believe the reviewer refers to this paragraph: “In order to have a better understanding of the role of PHOX2B in the CO2 homeostatic processes we used a non-replicating lentivirus vector of two short-hairpin RNA (shRNA) clones targeting selectively Phox2b mRNA to knockdown the expression of PHOX2B in the RTN of adult rats and tested ventilation and chemoreflex responses. In parallel, we also determined whether knockdown of PHOX2B in adult RTN neurons negatively affected cell survival. Finally, we sought to provide a mechanistic link between PHOX2B expression and the chemosensitive properties of RTN neurons, which have been attributed to two proton sensors, the proton-activated G protein-coupled receptor (GPR4) and the proton-modulated potassium channel (TASK-2).”

      The rationale for running these experiments is based on the fact that it is well known in the literature that PHOX2B is an important transcription factor for the development of several neuronal populations. PHOX2B knockout mice die before birth and heterozygous mice have some anatomical defects, but respiration is only impaired in the early post-natal period. While many developmental transcription factors are generally downregulated in the post-natal period, PHOX2B is still expressed in some neurons into adulthood. What is the function of PHOX2B in these fully developed neurons? We do not know, as we do not yet know the entire set of target genes that PHOX2B regulates in the adult brain. Hence, we decided to test what would happen if we knocked down the PHOX2B protein in the Nmb neurons of the RTN, an area that is critical for central chemoreception and involved in the presentation of CCHS. Our results show that reduction of PHOX2B blunts the CO2 chemoreflex response and reduces mRNA expression of Task2 and Gpr4, two pH sensors that have been shown to be key for RTN chemosensitive properties. We also show that the Nmb mRNA and cell survival are not affected by PHOX2B knockdown and we propose that the reduced CO2 chemoreflex may be attributed to a reduction of chemosensory function of Nmb neurons of the RTN due to partial loss of Gpr4 and Task2.

      (2) It is imperative for the authors to enhance the description of their hypothesis, as, from my perspective, the contribution of the data to the field is not clearly articulated. Numerous more selectively designed experiments were conducted to investigate the role of Phox2b-expressing neurons at the RTN level and their involvement in respiratory activity. In summary, the current study appears to lack novelty.

      We respectfully disagree with this statement. We believe we have adequately summarized previous work, although we realize we can’t reference every single publication on this topic. As described above, the developmental role of PHOX2B has been elegantly investigated in mouse embryonic studies (extensively cited in the manuscript). Furthermore, very interesting studies have shown that when the CCHS defining mutant PHOX2B protein (+7Ala PHOX2B) and other mutations linked to CCHS have been transgenically expressed in mice through development, severe anatomical defects are observed and respiratory function is impaired (extensively cited in the manuscript). We have also cited papers relevant to this study that describe the role of PHOX2B/Nmb RTN neurons and the pH protein sensors in the CO2 chemoreflex. If we missed some papers that the reviewer deems essential in the context of this study we will be happy to include them.

      We are not aware of other studies that have investigated the specific role of the PHOX2B protein in the adult RTN in the absence of confounding developmental pathogenesis (i.e. in an otherwise ‘healthy’ animal), and of no other studies that looked at the effects on the RTN proton sensors and Nmb expression following PHOX2B knockdown. Hence we believe that our results are novel and, in our opinion, very interesting.

      (3) On pages 13 and 14 (Results section), I am seeking clarity on the novelty of the findings. Doug Bayliss's prior work has already demonstrated the role of Gpr4 and Task2 on Phox2b neurons in regulating ventilation in conscious rodents.

      Bayliss’ group has elegantly demonstrated that Gpr4 and Task2 are the two proton sensors in the PHOX2B/Nmb neurons of the RTN that have a key role in chemoreception (cited in the manuscript). The novelty of our findings is that we show that a reduction in PHOX2B protein is associated with a reduction of mRNA levels of Gpr4 and Task2. This is a novel finding. Currently, we do not know what transcriptional activity PHOX2B has in adult RTN neurons (i.e., what gene targets PHOX2B has in this cell population and many others) and here we propose that Nmb is not a gene target of PHOX2B while Gpr4 and Task2 are.

      (4) The authors assert that the transcription factor Phox2b remains not fully understood. While I concur, the present study falls short of fully investigating the actual contribution of Phox2b to breathing regulation. In other words, the knockdown of Phox2b neurons did not add much to the knowledge of the field.

      We respectfully disagree with the reviewer. With the exception of very few target genes, the transcriptional role of PHOX2B beyond the embryonic development is poorly understood. No mechanistic connection has been made before between the transcriptional activity of PHOX2B with the expression of proton sensors in the RTN. Other groups have investigated the role of stimulating or depressing the neuronal activity of PHOX2B/NMB neurons in the RTN showing a key role of RTN on respiratory control, but these prior studies did not test whether changing the expression of the PHOX2B protein in these neurons had a role on respiratory control and the central chemoreflex. No other study has investigated the role of the PHOX2B protein within the RTN cells, with the exception of PHOX2B knockout mice or transgenic expression of the mutated PHOX2B that are relevant for CCHS. Again, these previous studies were done on a background of developmental impairment and to the best of our knowledge did not seek to show any association between PHOX2B expression and expression of Gpr4 or Task2.

      (5) I recommend removing the entire section entitled "The role of Phox2b in development and in the adult brain." The authors merely describe Phox2b expression without contextualizing it within the obtained data.

      Because reviewers raised the issue about not including important information about the role of PHOX2B in development and respiratory control we prefer to keep the section.

      (6) Are the authors aware of whether the shRNA in Phox2b/Nmb neurons truly induced cell death or solely depleted the expression of the transcription factor protein? Do the chemosensitive neurons persist?

      This is an excellent question that we tried to address with our study. As we report in Figures 2 and 3, we propose that some cell death occurs within the first 2 weeks post-infection, likely due to off-target action of the shRNA approach rather than to the reduction of PHOX2B expression (discussed in the manuscript). This is further evidenced by our Fig. S1 data, in which higher concentrations of shRNA led to more cell death, indicative of off-target effects. We do not believe it is a consequence of our surgical procedure, as we do not see similar cell loss when injecting vehicle or other control solutions (unpublished work; Janes et al., 2024).

      During the first 2 weeks post-surgery the proportion of Nmb+/PHOX2B- cells does not change compared to control rats or non-target shRNA (knockdown is not yet visible at protein level). Four weeks post-injection, there is no further cell death (assessed by the total number of NMB cells), whereas the fraction of NMB cells that express PHOX2B is reduced (and the fraction of NMB not expressing PHOX2B is increased), suggesting that the reduction of PHOX2B protein in Nmb cells is not correlated with cell loss/survival whereas the impairment that we observe in terms of central chemoreception is possibly due to the progressive decrease of PHOX2B expression in these neurons.

      (7) In Figures 2 and 3, it is noteworthy that the authors observe peak expression at a very caudal level. In rats, the RTN initiates at the caudal end of the facial, approximately 11.6 mm, and should exhibit a rostral direction of about 2 mm.

      In our experience the Nmb cells on the ventral surface of the medulla peak in number around the caudal tip of the facial nucleus in adult SD rats (Janes et al., 2024). To add clarity to the figure we reported cell count distribution data in relation to the distance from caudal tip of the facial.

      Minor

      (1) I would like to suggest that the authors correct the recurring statement throughout the manuscript that Phox2b is essential only for the development of the autonomic nervous system. In my view, it also plays a crucial role in certain sensory and respiratory systems.

      We have addressed this in the manuscript.

      (2) Page 4, lines 59-60: Out of curiosity, do the data include information from different countries?

      This data refers to information from France and Japan. Currently it is estimated that there are 1000-2000 CCHS patients worldwide.

      (3) Page 7, lines 129-131: In my understanding, the sentence is quite clear; if we knock down the PHOX2B gene, we are expected to reduce or even eliminate the expression of Gpr4 or Task2. Am I right?

      This is what we propose from the results of this study. We would like to point out that the transcriptional activity of PHOX2B (i.e., what genes PHOX2B regulates) in adult neurons has not yet been fully investigated. With the exception of a few target genes (e.g., TH, DBH), the transcriptional activity of PHOX2B in neurons is not yet known. Here we report novel findings that suggest that Gpr4 and Task2 are potential target genes of PHOX2B in RTN neurons.

      (4) The authors mentioned that NT-shRNA also impacts CO2 chemosensitivity. Could this effect be attributed to mechanical damage of the tissue resulting from the injection?

      Just to clarify, we observe some impairment in chemosensitivity when NT-shRNA was injected in “larger” (2x 200ul/side) volume. No impairment was observed in NT-shRNA when we injected smaller volumes (2x 100ul/side). Physical damage could be a possibility although in our experience (unpublished work; Janes et al, 2024, Acta Physiologica) injections of similar volume of solution performed by the same investigator in the same brain area and experimental settings did not produce a physical lesion associated with respiratory impairment. Hence we attribute the unexpected results with larger volumes to toxic effects associated with the shRNA viral constructs.

      (5) In the reference section, the authors should review and correct some entries. For instance, Janes, T. A., Cardani, S., Saini, J. K., & Pagliardini, S. (2024). Title: "Etonogestrel Promotes Respiratory Recovery in an In Vivo Rat Model of Central Chemoreflex Impairment." Running title: "Chemoreflex Recovery by Etonogestrel." Some references contain the journal, pages, and volume, while others lack this information entirely.

      We have updated references. Janes et al., 2024 has now been published in Acta Physiologica.

      (6) Why does the baseline have distribution points, whereas the other boxplots do not?

      We have clarified in the figure legend that, to be fair to the presentation of our results, the data points shown in some of the boxplot graphs do not refer to entire baseline data but only the ones that are outliers.

      In our box-and-whisker plots, whiskers represent the 10th and 90th percentiles, showing the range of values for the middle 80% of the data. Individual data values that fall outside the 10th/90th percentile range are represented as single points (outliers).
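
      The plotting convention described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the linear-interpolation percentile rule, and the use of the 25th/50th/75th percentiles for the box are assumptions made for illustration; only the 10th/90th percentile whiskers and the treatment of values outside them as outliers come from the response.

```python
def percentile(sorted_vals, pct):
    """Percentile of an already sorted list, using linear interpolation.

    The interpolation rule is an assumption; plotting packages differ
    slightly in how they define percentiles.
    """
    if not sorted_vals:
        raise ValueError("need at least one value")
    rank = (len(sorted_vals) - 1) * pct / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = rank - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def whisker_summary(values, lower_pct=10.0, upper_pct=90.0):
    """Summarize data for a box plot with percentile-based whiskers.

    Whiskers sit at the 10th and 90th percentiles (as in the response);
    values outside that range are returned as individual outliers.
    The box limits (25th/50th/75th percentiles) are assumed, not stated.
    """
    data = sorted(float(v) for v in values)
    low = percentile(data, lower_pct)
    high = percentile(data, upper_pct)
    return {
        "box": (percentile(data, 25), percentile(data, 50), percentile(data, 75)),
        "whiskers": (low, high),
        "outliers": [v for v in data if v < low or v > high],
    }


# Hypothetical example: eleven made-up measurements, 1 through 11.
summary = whisker_summary(range(1, 12))
```

      With this convention only the values outside the whiskers are drawn as dots, which is why a group with a narrow spread may show few or no individual points.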

      Reviewer #3 (Recommendations For The Authors):

      • What is the rationale behind dedicating the first paragraph of results to discussing an artifact?

      We think that it is important to report off-target effects of shRNA viral constructs because the concentrations and volumes of virus injected vary considerably across studies. Other investigators may attempt to use larger volumes of virus to obtain a greater or faster knockdown, and could reach erroneous conclusions if appropriate tests are not performed.

      Furthermore, because some readers could question whether we injected enough virus to knockdown the expression of PHOX2B, and may wonder if with a larger amount of virus we would increase knockdown efficiency, we wanted to show that, in our opinion, we used the maximum amount of virus to knockdown PHOX2B without causing toxic effects or physiological changes that are not dependent on PHOX2B knockdown.

      • All individual data points should be visible in floating bar graphs in Figures 1 and 4. For example, I don't see any dots for naïve animals in any of the panels in Figure 1.

      We have clarified in the figure legend that, to be fair to the presentation of our results, the data points shown in some of the boxplot graphs do not refer to entire baseline data but only the ones that are outliers.

      In our box-and-whisker plots, whiskers represent the 10th and 90th percentiles, showing the range of values for the middle 80% of the data. Individual data values that fall outside the 10th/90th percentile range are represented as single points (outliers).

      • Please include specific F and T values along with DF.

      We have included a table with all the specific values in the supplementary section as Table 1.

      • The C1 and facial partly overlap with the RTN at this level of the medulla and these cells should appear as Phox2b+/Nmb- cells so it is not clear to me why these cells are not evident in the control tissue in Figures 2B and 3B. Also, some of the bregma levels shown in Figure 5A overlap with Figures 2-3 so again it is not clear to me how this non-cell type specific viral approach was targeted to Nmb cells but not nearby TH+ cells. Please clarify.

      In our experience, C1 TH cells are located slightly medial to the Nmb cells and they spread much more caudally than Nmb cells of the RTN. We focused our small volume injection in the core of the RTN to target Nmb cells but we also assessed PHOX2B knockdown in TH C1 cells by counting the PHOX2B/TH cells across treatment groups. Although we can’t exclude subtle changes in the C1 population, we did not observe changes in the total number of C1 cells (TH+/PHOX2B+), in the number of TH cells expressing PHOX2B, or in the hypoxic ventilatory response (which is dependent on the health status of C1 neurons). We have updated Figure 5 to show representative expression of PHOX2B in TH+ neurons in the ventral medulla to complement our cell count analysis. To address potential effects on other cell populations we have edited our discussion as follows:

      “PHOX2B knockdown was also restricted to RTN neurons, as adjacent C1 TH+ neurons did not show any change in number of TH+/PHOX2B+ expressing cells, although we cannot exclude that some C1 cells may have been infected and their relative PHOX2B expression levels were reduced. To support the lack of significant alterations associated with the possible loss of C1 function was the absence of significant changes in the hypoxic response that has been shown to be dependent on C1 neurons (Malheiros-Lima et al., 2017).”

      • To confirm, Nmb is not expressed in the NTS, and this region was chosen as a background, right?

      In order to systematically analyze Nmb mRNA expression we decided to use measurement of fluorescence relative to Nmb neurons present in the dorsal brainstem. Here cells are sparse but we used them as reference fluorescence since they would not be affected by the ventral shRNA injection. Similar cells are also present in the brains of mice and reported by the Allen Brain atlas (https://mouse.brain-map.org/experiment/show/71836874). We have clarified our procedure in the methods section (see above) and included a sample image of Nmb in both ventral and dorsal in Figure 5.

      • How do you get a loss of Nmb+ neurons (Figs 2-3) with no change in Nmb fluorescence (Fig. 5B)? In the absence of representative images these results are not compelling and should be substantiated by more readily quantifiable approaches like qPCR.

      We have clarified in the methods and results sections our analytical procedure to assess PHOX2B and Nmb expression. Figures 2 and 3 display the results of counting numbers of Nmb+ cells in the RTN. Figure 5B reports the average of total cell fluorescence measured inside Nmb+ cells, not an average fluorescence measurement of the area of the ventral medulla. In short, our results show that we have fewer Nmb cells that express PHOX2B, but the overall Nmb mRNA fluorescence (expression) in Nmb cells relative to Nmb fluorescence in cells of the dorsal brainstem is the same.

      We have edited the methods as follows:

      “The Corrected Total Cell Fluorescence (CTCF) signal for Nmb, Gpr4 and Task2 mRNAs was quantified as previously described (Cardani et al., 2022; McCloy et al., 2014). Briefly, a Leica TCS SP5 (B-120G) Laser Scanning Confocal microscope was used to acquire images of the tissue. Exposure time and acquisition parameters were set for the naïve group and kept unchanged for the entire dataset acquisition. The collected images were then analysed by selecting a single cell at a time and measuring the area, integrated density and mean grey value (McCloy et al., 2014). For each image, three background areas were used to normalize against autofluorescence. We used 4 sections/rat (210 µm interval) to count Nmb, Gpr4 and Task2 mRNA CTCF in the core of the RTN area where several Nmb cells could be identified. For each section two images were acquired with a 20× objective, so that at least fifty cells per tissue sample were obtained for the mRNA quantification analysis. To evaluate changes in Nmb mRNA expression levels following PHOX2B knockdown at the level of the RTN, we compared the fluorescence intensity of each RTN Nmb+ cell (223.2 ± 37.1 cells/animal) with the average fluorescent signal of Nmb+ cells located dorsally in the NTS (4.3 ± 1.2 cells/animal) (Nmb CTCF ratio RTN/NTS) as we reasoned that the latter would not be affected by the shRNA infection and knockdown.”

      A single cell qPCR analysis would be definitely ideal but a qPCR from dissected tissue would not help us determine whether within a cell there was a reduction in Nmb mRNA levels.
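
      The per-cell quantification quoted above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it uses the commonly cited form of the CTCF calculation (integrated density minus cell area times mean background fluorescence), and the function names and all numeric values are hypothetical.

```python
from statistics import mean


def ctcf(integrated_density, area, background_means):
    """Corrected Total Cell Fluorescence for one selected cell.

    Assumed formula: integrated density minus (cell area x mean of the
    background readings). The quoted methods use three background areas
    per image; any non-empty list is accepted here.
    """
    return integrated_density - area * mean(background_means)


def rtn_to_nts_ratios(rtn_ctcf, nts_ctcf):
    """Express each RTN cell's CTCF relative to the mean NTS cell CTCF.

    Mirrors the 'Nmb CTCF ratio RTN/NTS' described in the quoted methods:
    every RTN Nmb+ cell is compared with the average signal of the dorsal
    (NTS) Nmb+ cells from the same animal.
    """
    reference = mean(nts_ctcf)
    if reference == 0:
        raise ValueError("mean NTS CTCF is zero; ratio is undefined")
    return [value / reference for value in rtn_ctcf]


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
cell_a = ctcf(integrated_density=1000.0, area=50.0, background_means=[2.0, 3.0, 4.0])
cell_b = ctcf(integrated_density=575.0, area=50.0, background_means=[2.0, 3.0, 4.0])
ratios = rtn_to_nts_ratios([cell_a, cell_b], nts_ctcf=[400.0, 450.0])
```

      Because the measurement is taken inside each identified Nmb+ cell, the ratio reflects signal per surviving cell and is independent of how many cells were counted, which is the distinction drawn between Figures 2-3 and Figure 5B.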

      • The boxed RTN region in these examples is all over the place. The RTN should be consistently placed along the ventral surface under the facial and approx. equal distance from the trigeminal and pyramids.

      We have updated the figures to consistently present the areas of interest where Nmb cells are located and images are taken.

      • Fluorescent in situ typically appears as discrete puncta so it is not clear to me why that is not the case here.

      Our images are taken at low magnification (20X), where it is difficult to distinguish single mRNA molecules. However, it is possible to appreciate the differences between the grainy fluorescent signal in the in situ hybridization assay (RNAScope) and the smoother signal of protein detection in the immunofluorescence assay.

      • Can TUNEL staining be done to confirm loss of Nmb neurons is due to death and not re-localization?

      Does the reviewer mean “cell migration” by re-localization? We do not expect that this would occur in our experiments. Although TUNEL in the first week post-infection could be useful to determine cell death in our tissue, we do not expect migration of neurons within the brain, as our viral shRNA injections are performed in adult rats when developmental processes are already concluded.

    2. Reviewer #2 (Public Review):

      Summary:

      This significant research explored how the PHOX2B transcription factor functions within neurons located in the retrotrapezoid nucleus (RTN), a crucial brainstem chemosensory area, to sustain appropriate CO2 chemoreflex reactions related to breathing in adult rats when observed in a living state. By applying a viral shRNA technique to selectively suppress PHOX2B in RTN neurons, the authors present compelling evidence of deteriorating ventilatory reactions to increased CO2 levels. This impairment progresses over a four-week period in vivo, hinting at disruptions in RTN neuron transcriptional processes and a consequent dulling of CO2-induced breathing responses. The data on RTN neuronal mRNA expression indicates that the weakened hypercapnic ventilatory response may stem from reduced levels of crucial proton sensors within the RTN. This research holds relevance for neuroscientists focused on the neurobiology of respiration and the neurodevelopmental regulation of motor functions.

      Strengths:

      The authors employed a shRNA viral strategy to systematically reduce PHOX2B protein levels, targeting RTN neurons specifically, to assess the importance of PHOX2B for the survival and chemosensory capabilities of adult RTN neurons in a living organism. The findings of this research underscore that beyond its developmental role, PHOX2B remains essential for sustaining accurate CO2 chemoreflex reactions in the adult brain. Furthermore, its diminished presence in Congenital Central Hypoventilation Syndrome (CCHS) could be a factor in the respiratory deficiencies observed in the condition. This study highlights the critical ongoing function of PHOX2B in adult physiology and its potential impact on respiratory health, offering valuable insights for the scientific and medical communities involved in treating and understanding respiratory disorders.

      Weaknesses:

      N/A

    3. eLife assessment

      This important study utilizes a viral-mediated short hairpin RNA (shRNA) approach to investigate in a novel way the role of the wild-type PHOX2B transcription factor expressed in critical chemosensory neurons in the brainstem retrotrapezoid nucleus (RTN) region for maintaining normal CO2 chemoreflex control of breathing in adult rats. The convincing results show blunted ventilation during elevated inhaled CO2 (hypercapnia) with knockdown of PHOX2B, accompanied by a reduced expression of Gpr4 and Task2 mRNA for the proposed RTN neuron proton sensor proteins GPR4 and TASK2. These results indicate that maintained expression of wild-type PHOX2B affects respiratory control in adult animals, complementing previous studies showing that PHOX2B-expressing RTN neurons may be critical for chemosensory control throughout the lifespan, and with implications for neurological disorders involving the RTN, which will be of interest to neuroscientists studying respiratory neurobiology as well as the neurodevelopmental control of motor behavior.

    4. Reviewer #1 (Public Review):

      Summary:

      This important study investigated the role of the PHOX2B transcription factor in neurons in the key brainstem chemosensory structure, the retrotrapezoid nucleus (RTN), for maintaining proper CO2 chemoreflex responses of breathing in the adult rat in vivo. PHOX2B has an important transcriptional role in neuronal survival and/or function, and mutations of PHOX2B severely impair the development and function of the autonomic nervous system and RTN, resulting in the developmental genetic disease congenital central hypoventilation syndrome (CCHS) in neonates, where the RTN may not form and is functionally impaired. The function of the wild-type PHOX2B protein in adult RTN neurons that continue to express PHOX2B is unknown. By utilizing a viral PHOX2B-shRNA approach for the knockdown of PHOX2B specifically in RTN neurons, the authors' solid results show impaired ventilatory responses to elevated inspired CO2, measured by whole-body plethysmography in freely behaving adult rats, that develop progressively over a four-week period in vivo, indicating effects on RTN neuron transcriptional activity and associated blunting of the CO2 ventilatory response. The RTN neuronal mRNA expression data presented suggests the impaired hypercapnic ventilatory response is possibly due to the decreased expression of key proton sensors in the RTN. This study will be of interest to neuroscientists studying respiratory neurobiology as well as the neurodevelopmental control of motor behavior.

      Strengths:

      (1) The authors used a shRNA viral approach to progressively knock down the PHOX2B protein, specifically in RTN neurons, to determine whether PHOX2B is necessary for the survival and/or chemosensory function of adult RTN neurons in vivo.

      (2) To determine the extent of PHOX2B knockdown in RTN neurons, the authors combined RNAScope® and immunohistochemistry assays to quantify the subpopulation of RTN neurons expressing PHOX2B and Neuromedin B (Nmb), which has been proposed to be key chemosensory neurons in the RTN.

      (3) The authors demonstrate that knockdown efficiency is time-dependent, with a progressive decrease in the number of Nmb-expressing RTN neurons that co-express PHOX2B over a four-week period.

      (4) Their results convincingly show hypoventilation, particularly in 7.2% CO2 only, for PHOX2B-shRNA RTN-injected rats after four weeks compared to naïve and non-PHOX2B-shRNA targeted (NT-shRNA) RTN-injected rats, suggesting a specific impairment of chemosensitive properties in RTN neurons with PHOX2B knockdown.

      (5) Analysis of the association between PHOX2B knockdown in RTN neurons and the attenuation of the hypercapnic ventilatory response (HCVR), by evaluating the correlation between the number of Nmb+/PHOX2B+ or Nmb+/PHOX2B- cells in the RTN and the resulting HCVR, showed a significant correlation between HCVR and number of Nmb+/PHOX2B+ and Nmb+/PHOX2B- cells, suggesting that the number of PHOX2B-expressing cells in the RTN is a predictor of the chemoreflex response and the reduction of PHOX2B protein impairs the CO2-chemoreflex.

      (6) The data presented indicate that PHOX2B knockdown reduces the HCVR and the expression of Gpr4 and Task2 mRNAs. This suggests that PHOX2B knockdown affects RTN neurons' transcriptional activity and decreases the CO2 response, possibly by reducing the expression of key proton sensors in the RTN.

      (7) This study's results show that independent of its role during development, PHOX2B is still required to maintain proper CO2 chemoreflex responses in the adult brain, and its reduction in CCHS may contribute to the respiratory impairment in this disorder.

      Weaknesses:

      (1) The authors found a significant decrease in the total number of Nmb+ RTN neurons (i.e., Nmb+/PHOX2B+ plus Nmb+/ PHOX2B-) in NT-shRNA rats at two weeks post viral injection, and also at the four-week period where the impairment of the chemosensory function of the RTN became significant, suggesting some inherent cell death possibly due to off-target toxic effects associated with shRNA procedures.

      (2) The tissue sampling procedures for quantifying numbers of cells expressing proteins/mRNAs throughout the extended RTN region bilaterally have not been completely validated to accurately represent the full expression patterns in the RTN under the experimental conditions.

      (3) The inferences about RTN neuronal expression of NMB, GPR4, or TASK2 are based on changes in mRNA levels, so it remains speculation that the observed reduction in Gpr4 and Task2 mRNA translates to a reduction in the protein levels and associated reduction of RTN neuronal chemosensitive properties.

    5. Reviewer #3 (Public Review):

      A brain region called the retrotrapezoid nucleus (RTN) regulates breathing in response to changes in CO2/H+, a process termed central chemoreception. A transcription factor called PHOX2B is important for RTN development and mutations in the PHOX2B gene result in a severe type of sleep apnea called Congenital Central Hypoventilation Syndrome. PHOX2B is also expressed throughout life, but its postmitotic functions remain unknown. This study shows that knockdown of PHOX2B in the RTN region in adult rats decreased expression of Task2 and Gpr4 in Nmb-expressing RTN chemoreceptors and this corresponded with a diminished ventilatory response to CO2 but did not impact baseline breathing or the hypoxic ventilatory response. These results provide novel insight regarding postmitotic functions of PHOX2B in RTN neurons.

      I have two main concerns and several points of clarification.

      Main issues:

      (1) The experimental approach was not targeted to Nmb+ neurons and since other cells in the area also express Phox2b, conclusions should be tempered to focus on Phox2b expressing parafacial neurons NOT specifically RTN neurons.

      (2) It's not clear whether PHOX2B is important for transcription of pH sensing machinery, cell health or both. If knockdown of PHOX2B results in loss of RTN neurons, this is also expected to decrease Task2 and Gpr4 levels, albeit by a transcription-independent mechanism.

      Other points:

      (3) All individual data points should be visible in floating bar graphs in Figs 1 and 4. For example, I don't see any dots for naïve animals in any of the panels in Fig. 1.

      (4) The C1 and facial partly overlap with the RTN at this level of the medulla and these cells should appear as Phox2b+/Nmb- cells so it is not clear to me why these cells are not evident in the control tissue in Figs 2B and 3B. Also, some of the bregma levels shown in Fig. 5A overlap with Figs 2-3 so again it's not clear to me how this non-cell type specific viral approach was targeted to Nmb cells but not nearby TH+ cells. Please clarify.

      (5) How do you get a loss of Nmb+ neurons (Figs 2-3) with no change in Nmb fluorescence (Fig. 5B)? In the absence of representative images these results are not compelling and should be substantiated by more readily quantifiable approaches like qPCR.

    1. Author response:

      We sincerely thank the editors and reviewers for the rigorous evaluation of our work and the precious time invested. The positive comments resonate with our endeavor to explore the intrinsic role of astrocyte aquaporin in brain water homeostasis. We also greatly appreciate the reviewers’ constructive suggestions to consolidate this study. Here is the provisional response, which briefly outlines our acknowledgement of the reviewers’ suggestions:

      To Reviewer #1:

      • Imaging data will be examined and collected to determine whether AQP4 inhibition has differential effects on astrocyte calcium signals in terms of cellular locations.

      • New analysis will be performed for CSD swelling data to provide additional kinetic information.

      • The mentioned original papers are important, and will be included in the revision.

      To Reviewer #2:

      We agree, a careful revision will improve and better position the study.

      • Echoing Reviewer #1, the introduction and discussion will be strengthened with current scientific contexts, while paying attention to the important advances in glymphatic system. The limits of the study mentioned in the reviews will be stated.

      • The use of TGN-020 was based on its validation by wide range of ex vivo and in vivo studies. AER-270(271) was nicely introduced by Farr et al., 2019 (PMID: 30738082). Its validation in vivo in AQP4 KO mice, and the comparison to TGN-020, is reported in a very recent study (Giannetto et al., 2024 - PMID: 38363040) that provides valuable insights.

      • The description of specific methodologies, including the DW-MRI, will be reinforced. The presentation of experiments and statistical analysis will be refined.

      To Reviewer #3:

      • Solenov et al., 2004 (PMID: 14576087) used the calcein quenching assay and KO mice to show convincingly that AQP4 is a functional water channel in cultured astrocytes. AQP4 deletion reduced both astrocyte water permeability and the absolute amplitude of swelling over comparable time, and also slowed down cell shrinking, which overall parallels our results from acute AQP4 blocking. Yet in Solenov’s study, the time to swelling plateau was prolonged in AQP4 KO astrocytes, differing from our data from acute blocking. This difference may be due to compensatory mechanisms in chronic AQP4 KO, or may reflect volume responses in cultured astrocytes that differ from brain slice/in vivo results, as noted previously (e.g., Risher et al., 2009 - PMID: 18720409). As suggested, methods for volume recordings will be examined.

      • It is an important point that TGN-020 partially blocks AQP4, implying the actual functional impact of AQP4 per se might be stronger than what we observed. TGN-020 provides a means to acutely probe AQP4 function in situ; still, we agree that its limitation needs to be acknowledged.

      • As also pointed out by Reviewer #2, the description and interpretation of DW-MRI data will be improved.

    2. eLife assessment

      Using in vitro and in vivo experiments, the authors show that astrocytes swell after inhibition of aquaporin 4 (AQP4) with TGN-020, which is indicative of tonic water efflux from these cells under physiological conditions. Though potentially valuable, the study is currently incomplete due to possible off-target effects of TGN-020, limited mechanistic information underlying the detected effects, and potential limitations of some of the adopted experimental techniques. These findings can be especially relevant for cortical spreading depression in ischemic stroke or seizure and to get a comprehensive understanding of neuron-astrocyte interactions.

    3. Reviewer #1 (Public Review):

      Summary:

      Pham and colleagues provide an illuminating investigation of aquaporin-4 water flux in the brain utilizing ex vivo and in vivo techniques. The authors first show in acute brain slices, and in vivo with fiber photometry, SRB-loaded astrocytes swell after inhibition of AQP4 with TGN-020, indicative of tonic water efflux from astrocytes in physiological conditions. Excitingly, they find that TGN-020 increases the ADC in DW-MRI in a region-specific manner, potentially due to AQP4 density. The resolution of the DW-MRI cannot distinguish between intracellular or extracellular compartments, but the data point to an overall accumulation of water in the brain with AQP4 inhibition. These results provide further clarity on water movement through AQP4 in health and disease.

      Overall, the data support the main conclusions of the article, with some room for more detailed treatment of the data to extend the findings.

      Strengths:

      The authors have a thorough investigation of AQP4 inhibition in acute brain slices. The demonstration of tonic water efflux through AQP4 at baseline is novel and important in and of itself. Their further testing of TGN-020 in hyper- and hypo-osmotic solutions shows the expected reduction of swelling/shrinking with AQP4 blockade.

      Their experiment with cortical spreading depression further highlights the importance of water efflux from astrocytes via AQP4 and transient water fluxes as a result of osmotic gradients. Inhibition of AQP4 increases the speed of tissue swelling, pointing to a role in the efflux of water from the brain.

      The use of DW-MRI provides a non-invasive measure of water flux after TGN-020 treatment.

      Weaknesses:

      The authors specifically use GCaMP6 and light sheet microscopy to image their brain sections in order to identify astrocytic microdomains. However, their presentation of the data neglects a more detailed treatment of the calcium signaling. It would be quite interesting to see whether these calcium events are differentially affected by AQP4 inhibition based on their cellular localization (i.e., processes vs. soma vs. vascular endfeet, which all have different AQP4 expression levels).

      The authors show that inhibition of AQP4 with TGN-020 shortens the onset time of the swelling associated with cortical spreading depression in brain slices. However, they do not show quantification for many of the other features of CSD swelling (i.e., the duration of swelling, speed of swelling, and recovery from swelling).

      Significance:

      AQP4 is a bidirectional water channel that is constitutively open, thus water flux through it is always regulated by local osmotic gradients. Still, characterizing this water flux has been challenging, as the AQP4 channel is incredibly water-selective. The authors here present important data showing that the application of TGN-020 alone causes astrocytic swelling, indicating that there is constant efflux of water from astrocytes via AQP4 in basal conditions. This has been suggested before, as the authors rightfully highlight in their discussion, but the evidence had previously come from electron microscopy data from genetic knockout mice.

      AQP4 expression has been linked with the glymphatic circulation of cerebrospinal fluid through perivascular spaces since its rediscovery in 2012 [1]. Further studies of aging[2], genetic models[3], and physiological circadian variation[4] have revealed it is not simply AQP4 expression but AQP4 polarization to astrocytic vascular endfeet that is imperative for facilitating glymphatic flow. Still, a lingering question in the field is how AQP4 facilitates fluid circulation. This study represents an important step in our understanding of AQP4's function, as the basal efflux of water via AQP4 might promote clearance of interstitial fluid to allow an influx of cerebrospinal fluid into the brain. Beyond glymphatic fluid circulation, clearly, AQP4-dependent volume changes will differentially alter astrocytic calcium signaling and, in turn, neuronal activity.

      (1) Iliff, J.J., et al., A Paravascular Pathway Facilitates CSF Flow Through the Brain Parenchyma and the Clearance of Interstitial Solutes, Including Amyloid β. Sci Transl Med, 2012. 4(147): p. 147ra111.
      (2) Kress, B.T., et al., Impairment of paravascular clearance pathways in the aging brain. Ann Neurol, 2014. 76(6): p. 845-61.
      (3) Mestre, H., et al., Aquaporin-4-dependent Glymphatic Solute Transport in the Rodent Brain. eLife, 2018. 7.
      (4) Hablitz, L., et al., Circadian control of brain glymphatic and lymphatic fluid flow. Nature Communications, 2020. 11(1).

    4. Reviewer #2 (Public Review):

      Summary:

      The paper investigates the role of the astrocyte-specific aquaporin-4 (AQP4) water channel in mediating water transport within the mouse brain and the impact of the channel on astrocyte and neuron signaling. Through various experiments, including epifluorescence and light sheet microscopy in mouse brain slices and fiber photometry or diffusion-weighted MRI in vivo, the researchers observe that acute inhibition of AQP4 leads to intracellular water accumulation and swelling in astrocytes. This swelling alters astrocyte calcium signaling and affects neighboring neuron populations. Furthermore, the study demonstrates that AQP4 regulates astrocyte volume, influencing mainly the dynamics of water efflux in response to osmotic challenges or associated with cortical spreading depolarization. The findings suggest that AQP4-mediated water efflux plays a crucial role in maintaining brain homeostasis and indicate a main role for AQP4 in this mechanism. However, although the authors highlight that the report sheds light on the mechanisms by which astrocyte aquaporin contributes to the water environment in the brain parenchyma, the mechanism underlying these effects remains unclear and was not investigated. The manuscript requires revision.

      Strengths:

      The paper elucidates the role of the astrocytic aquaporin-4 (AQP4) channel in brain water transport, its impact on water homeostasis, and signaling in the brain parenchyma. In concept, the paper follows a set of complementary experiments combining various ex vivo and in vivo techniques, from microscopy to magnetic resonance imaging. The research is valuable, confirms previous findings, and provides novel insights into the effect of acute blockage of the AQP4 channel using TGN-020.

      Weaknesses:

      Despite the interdisciplinary approach, the quality of the manuscript raises doubts about the significance of the findings and undermines the novelty claimed by the authors. The paper lacks a comprehensive exploration or mention of the underlying molecular mechanisms driving the observed effects of astrocytic aquaporin-4 (AQP4) channel inhibition on brain water transport and brain signaling dynamics. The scientific background is not well developed in the introduction and discussion sections. Important or recent reports from the field are missing, incompletely cited, or misinterpreted. Several citations to original works are missing that would clarify certain conclusions. This especially applies to the basis of the glymphatic system concept and to recently published reports of similar content. The use of TGN-020 instead of, e.g., the available AQP4 blocker AER-270(271) is not explained. While employing various experimental techniques adds depth to the findings, some of the reasoning behind the chosen techniques, especially regarding MRI, is unclear or seemingly inaccurate. In most cases the number of subjects examined is missing or mentioned only roughly in the figure captions, and statistical tests are missing or wrongly applied, which limits assessment and reproducibility of the results. In some cases, it seems that two different statistical tests were used for the same or linked types of data, so the results are contradictory even though, based on the figures, this appears unlikely. Addressing these limitations could strengthen the paper's impact and utility within the field of neuroscience; however, it also seems that supplementary experiments are required to improve the report.

    5. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors propose that astrocytic water channel AQP4 represents the dominant pathway for tonic water efflux without which astrocytes undergo cell swelling. The authors measure changes in astrocytic sulforhodamine fluorescence as the proxy for cell volume dynamics. Using this approach, they perform a technically elegant series of ex vivo and in vivo experiments exploring changes in astrocytic volume in response to AQP4 inhibitor TGN-020 and/or neuronal stimulation. The key finding is that TGN-020 produces an apparent swelling of astrocytes and modifies astrocytic cell volume regulation after spreading depolarizations. Additionally, systemic application of TGN-020 produced changes in diffusion-weighted MRI signal, which the authors interpret as cellular swelling. This study is perceived as potentially significant. However, several technical caveats should be strongly considered and perhaps addressed through additional experiments.

      Strengths:

      (1) This is a technically elegant study, in which the authors employed a number of complementary ex vivo and in vivo techniques to explore functional outcomes of aquaporin inhibition. The presented data are potentially highly significant (but see below for caveats and questions related to data interpretation).

      (2) The authors go beyond measuring cell volume homeostasis and probe for the functional significance of AQP4 inhibition by monitoring Ca2+ signaling in neurons and astrocytes (GCaMP6 assay).

      (3) Spreading depolarizations represent a physiologically relevant model of cellular swelling. The authors use ChR2 optogenetics to trigger spreading depolarizations. This is a highly appropriate and much-appreciated approach.

      Weaknesses:

      (1) The main weakness of this study is that all major conclusions are based on the use of one pharmacological compound. In the opinion of this reviewer, the effects of TGN-020 are not consistent with the current knowledge on water permeability in astrocytes and the relative contribution of AQP4 to this process.

      Specifically: Genetic deletion of AQP4 in astrocytes reduces plasmalemmal water permeability by ~two- to three-fold (when measured at 37 °C; Solenov et al., AJP-Cell, 2004). This is a significant difference, but it is thought to have limited/no impact on water distribution. Astrocytic volume and the degree of anisosmotic swelling/shrinkage are unchanged because the water permeability of the AQP4-null astrocytes remains high. This has been discussed at length in many publications (e.g., MacAulay et al., Neuroscience, 2004; MacAulay, Nat Rev Neurosci, 2021) and is acknowledged by Solenov and Verkman (2004).

      Keeping this limitation in mind, it is important to validate astrocytic cell volume changes using an independent method of cell volume reconstruction (diameter of sulforhodamine-labeled cell bodies? 3D reconstruction of EGFP-tagged cells? Else?)

      (2) TGN-020 produces many effects on the brain, with some but not all of the observed phenomena sensitive to the genetic deletion of AQP4. In the context of this work, it is important to note that TGN-020 does not completely inhibit AQP4 (70% maximal inhibition in the original oocyte study by Huber et al., Bioorg Med Chem, 2009). Thus, besides not knowing TGN-020 levels inside the brain, even "maximal" AQP4 inhibition would not be expected to dramatically affect water permeability in astrocytes.

      This caveat may be addressed through experiments using local delivery of structurally unrelated AQP4 blockers, or, preferably, AQP4 KO mice.

      (3) This reviewer thinks that the ADC signal changes in Figure 5 may be unrelated to cellular swelling. Instead, they may be a result of the previously reported TGN-020-induced hyperemia (e.g., H. Igarashi et al., NeuroReport, 2013) and/or changes in water fluxes across the pia mater, which is highly enriched in AQP4. To amplify this concern, AQP4 KO brains have increased water mobility due to enlarged interstitial spaces, rather than swollen astrocytes (RS Gomolka, eLife, 2023). Overall, the caveats of interpreting the DW-MRI signal deserve strong consideration.
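      The quantitative argument in weaknesses (1) and (2) can be illustrated with a short back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that AQP4 and non-AQP4 water pathways add linearly; the function name `remaining_permeability` is hypothetical, and only the two- to three-fold knockout effect and the 70% maximal inhibition are taken from the text above.

```python
def remaining_permeability(ko_fold_reduction: float, max_inhibition: float) -> float:
    """Fraction of wild-type water permeability left after partial AQP4 block.

    Illustrative assumption: AQP4 and non-AQP4 pathways add linearly, so a
    knockout that lowers permeability `ko_fold_reduction`-fold implies that
    AQP4 carries a share of 1 - 1/ko_fold_reduction of the total.
    """
    aqp4_share = 1.0 - 1.0 / ko_fold_reduction
    return 1.0 - max_inhibition * aqp4_share


# Two- and three-fold knockout effects combined with 70% maximal inhibition.
for fold in (2.0, 3.0):
    left = remaining_permeability(fold, max_inhibition=0.70)
    print(f"{fold:.0f}-fold KO reduction: {left:.0%} of wild-type permeability remains")
```

      Under these assumptions, roughly half to two-thirds of the wild-type permeability would remain even at maximal TGN-020 efficacy, which is the basis of the concern raised above.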

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is a valuable computational study that applies the machine learning method of bilinear modeling to the problem of relating gene expression to connectivity. Specifically, the author attempts to use transcriptomic data from mouse retinal neurons to predict their known connectivity. The results are promising, although the reviewers felt that demonstration of the general applicability of the approach required testing it against a second data set. Hence the present results were felt to provide borderline incomplete support for a key premise of the paper.

      We thank the reviewers for their insightful and constructive feedback. In response to the reviews, we have undertaken a comprehensive revision of our manuscript, incorporating changes and improvements as outlined below:

      (1) New results have been included showcasing the application of our bilinear model to a second dataset focusing on C. elegans gap junction connectivity. This extension validates our model with a biological context other than mouse retina and facilitates a direct comparison with the spatial connectome model (SCM).

      (2) A new section titled "Previous Approaches" has been added to the background, situating our study within the broader landscape of existing modeling methodologies.

      (3) The discussion sections have been expanded to fully incorporate the suggestions and insights offered by the reviewers. This includes a deeper exploration of the implications of our findings, potential applications of our model, and a more thorough consideration of its limitations and future directions.

      (4) To streamline the main text and ensure that the core narrative remains focused and accessible, select figures and tables have been relocated to the "Supplementary Materials" section.

      Reviewer 1 (Public Review):

      Summary of what the author was trying to achieve: In this study, the author aimed to develop a method for estimating neuronal-type connectivity from transcriptomic gene expression data, specifically from mouse retinal neurons. They sought to develop an interpretable model that could be used to characterize the underlying genetic mechanisms of circuit assembly and connectivity.

      Strengths:

      The proposed bilinear model draws inspiration from commonly implemented recommendation systems in the field of machine learning. The author presents the model clearly and addresses critical statistical limitations that may weaken the validity of the model such as multicollinearity and outliers. The author presents two formulations of the model for separate scenarios in which varying levels of data resolution are available. The author effectively references key work in the field when establishing assumptions that affect the underlying model and subsequent results. For example, correspondence between gene expression cell types and connectivity cell types from different references are clearly outlined in Tables 1-3. The model training and validation are sufficient and yield a relatively high correlation with the ground truth connectivity matrix. Seemingly valid biological assumptions are made throughout, however, some assumptions may reduce resolution (such as averaging over cell types), thus missing potentially important single-cell gene expression interactions.

      Thank you for recognizing the strengths of our work, particularly the clarity of the model presentation and its foundation in recommendation systems. In the revised manuscript we have also extended the model’s capabilities to analyze gene interactions for neural connectivity at single-cell resolution, when gene expression and connectivity of each cell are known simultaneously.

      Weaknesses:

      The main results of the study could benefit from replication in another dataset beyond mouse retinal neurons, to validate the proposed method. Dimensionality reduction significantly reduces the resolution of the model and the PCA methodology employed is largely non-deterministic. This may reduce the resolution and reproducibility of the model. It may be worth exploring how the PCA methodology of the model may affect results when replicating. Figure 5, ’Gene signatures associated with the two latent dimensions’, lacks some readability and related results could be outlined more clearly in the results section. There should be more discussion on weaknesses of the results e.g. quantification of what connectivity motifs were not captured and what gene signatures might have been missed.

      We acknowledge the significance of validating our method across different datasets. In line with this, our revised manuscript now includes an expanded analysis utilizing a C. elegans gap junction connectivity dataset, which not only broadens the method’s demonstrated applicability but also underscores its versatility across varied neuronal systems.

      To address the concern of resolution and reproducibility associated with PCA preprocessing, we have conducted a comparative analysis from five replicates of the bilinear model, presenting the results in the revised manuscript (Figure S3). This analysis confirms the consistency of the solutions, as evidenced by the similarity metrics. Furthermore, we discussed alternative methodologies, such as L1 or L2 regularization, to tackle multicollinearity, offering flexibility in preprocessing choices.

      In response to feedback on the original Figure 5’s clarity, we have replaced the original Figure 5e-h with Table S4, which summarizes the gene ontology (GO) enrichment results and quantifies the number of genes associated with aspects of neural development and synaptic organization. This revision aims to improve the interpretability and accessibility of the results, ensuring a clearer presentation of the model’s insights.

      Finally, we have expanded our discussion to address the study’s limitations more comprehensively. This includes exploration of potentially missed connections and gene signatures, such as transcription factors, which might not be captured by a linear model due to its inherent preference for predictors with strong correlations to the target variable.

      The main weakness is the lack of comparison against other similar methods, e.g. methods presented in Barabási, Dániel L., and Albert-László Barabási. "A genetic model of the connectome." Neuron 105.3 (2020): 435-445. Kovács, István A., Dániel L. Barabási, and Albert-László Barabási. "Uncovering the genetic blueprint of the C. elegans nervous system." Proceedings of the National Academy of Sciences 117.52 (2020): 33570-33577. Taylor, Seth R., et al. "Molecular topography of an entire nervous system." Cell 184.16 (2021): 4329-4347.

      We value your suggestion to compare our model with established methods. The revised manuscript now includes a comparative analysis with the spatial connectome model (SCM) using the same C. elegans dataset. In addition, a section reviewing previous approaches has been included in the background part, and the discussion part has been extended for the comparison.

      Appraisal of whether the author achieved their aims, and whether results support their conclusions: The author achieved their aims by recapitulating key connectivity motifs from single-cell gene expression data in the mouse retina. Furthermore, the model setup allowed for insight into gene signatures and interactions, however could have benefited from a deeper evaluation of the accuracy of these signatures. The author claims the method sets a new benchmark for single-cell transcriptomic analysis of synaptic connections. This should be more rigorously proven. (I’m not sure I can speak on the novelty of the method)

      In the revised manuscript, we emphasized the bilinear model’s innovative application in the context of neuronal connectivity analysis, inspired by collaborative filtering in recommendation systems. We present quantitative performance metrics, such as the ROC-AUC score and Pearson correlation coefficient, as well as its comparison with the SCM, to benchmark our model’s efficacy in reconstructing connectivity matrices. We also quantified the overlap of the genetic interactions revealed by the bilinear model and the SCM (using the C. elegans dataset), and reported the percentage of the top genes associated with neural development and synaptic organization (using the mouse retina dataset). These numbers set a precedent for future methodological comparisons.
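      For readers unfamiliar with the two metrics named above, the following is a minimal NumPy sketch of how a predicted connectivity matrix might be scored against a ground-truth matrix. The helper names and the toy 3x3 matrices are hypothetical and are not taken from the manuscript or its code.

```python
import numpy as np


def pearson_r(pred: np.ndarray, truth: np.ndarray) -> float:
    """Pearson correlation between flattened predicted and true matrices."""
    return float(np.corrcoef(pred.ravel(), truth.ravel())[0, 1])


def roc_auc(scores: np.ndarray, labels: np.ndarray) -> float:
    """ROC-AUC computed as the probability that a connected pair receives a
    higher score than an unconnected pair (ties count one half)."""
    s = scores.ravel()
    y = labels.ravel().astype(bool)
    pos, neg = s[y], s[~y]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


# Toy example: hypothetical binary connectivity and model scores.
truth = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
pred = np.array([[0.9, 0.2, 0.1], [0.3, 0.8, 0.2], [0.1, 0.4, 0.7]])
print(pearson_r(pred, truth), roc_auc(pred, truth))
```

      ROC-AUC only depends on the ranking of pairs, whereas the Pearson coefficient is sensitive to the linear agreement of the values, so reporting both gives complementary views of reconstruction quality.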

      Discussion of the likely impact of the work on the field, and the utility of methods and data to the community: This study provides an understandable bilinear model for decoding the genetic programming of neuronal type connectivity. The proposed model leaves the door open for further testing and comparison with alternative linear and/or non-linear models, such as neural network-based models. In addition to more complex models, this model can be built on to include higher resolution data such as more gene expression dimensions, different types of connectivity measures, and additional omics data.

      We are grateful for your recognition of the study’s potential impact. The bilinear model indeed offers a foundation for future explorations, allowing for integration with more complex models, higher-resolution data, and diverse connectivity measures.

      Reviewer 1 (Recommendations For The Authors):

      The inclusion of predicted connectivity (Figure 6) of unknown BC neurons is useful as it shows that this is a strong hypothesis generation tool. This utility should potentially be showcased more as it is also brought up in the abstract, "genetic manipulation of circuit wiring", with an explanation of how the model could be leveraged as such. The discussion may benefit from a summarizing sentence regarding which key gene signatures were identified and are in line with the literature, which key gene signatures/connectivity motifs may have been missed, and which gene signatures are novel.

      Thank you for the insightful recommendation on emphasizing the model’s utility in generating hypotheses, particularly regarding predicting connectivity. In the revised manuscript, we have expanded the discussion on how our model can be leveraged to guide genetic manipulations aimed at altering circuit wiring and highlighted its potential impact in the field.

      We have discussed key gene signatures identified from our model that are in line with existing literature, such as plexins and cadherins, which have been previously recognized for their involvement in synaptic connection formation and maintenance. We have also introduced potential new candidates, such as delta-protocadherins. In the revised manuscript, we summarized potentially missed gene signatures or synaptic connections, to provide a comprehensive view of our findings.

      Reviewer 2 (Public Review):

      Summary:

      In this study, Mu Qiao employs a bilinear modeling approach, commonly utilized in recommendation systems, to explore the intricate neural connections between different pre- and post-synaptic neuronal types. This approach involves projecting single-cell transcriptomic datasets of pre- and post-synaptic neuronal types into a latent space through transformation matrices. Subsequently, the cross-correlation between these projected latent spaces is employed to estimate neuronal connectivity. To facilitate the model training, connectomic data is used to estimate the ground-truth connectivity map. This work introduces a promising model for the exploration of neuronal connectivity and its associated molecular determinants. However, it is important to note that the current model has only been tested with Bipolar Cell and Retinal Ganglion Cell data, and its applicability in more general neuronal connectivity scenarios remains to be demonstrated.
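      The structure described in this summary can be sketched in a few lines of NumPy. This is only an illustration of the general bilinear form: the matrix names, the random data, and the use of a plain inner product between latent vectors as the connectivity score are assumptions made here, not details confirmed by the manuscript.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 pre-synaptic types, 4 post-synaptic types,
# 20 gene features per type, and a 2-dimensional latent space.
n_pre, n_post, n_genes, n_latent = 5, 4, 20, 2

X = rng.normal(size=(n_pre, n_genes))     # pre-synaptic expression (types x genes)
Y = rng.normal(size=(n_post, n_genes))    # post-synaptic expression (types x genes)
A = rng.normal(size=(n_genes, n_latent))  # pre-synaptic transformation matrix
B = rng.normal(size=(n_genes, n_latent))  # post-synaptic transformation matrix


def predict_connectivity(X, Y, A, B):
    """Project both populations into the latent space and score every
    pre/post pair by the inner product of their latent vectors."""
    return (X @ A) @ (Y @ B).T


Z_hat = predict_connectivity(X, Y, A, B)
print(Z_hat.shape)  # (5, 4)
```

      In training, A and B would be fitted so that the predicted matrix approximates the connectivity map estimated from connectomic data; the product of A and the transpose of B can then be read as a gene-by-gene interaction matrix.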

      Strengths:

      This study introduces a succinct yet promising computational model for investigating connections between neuronal types. The model, while straightforward, effectively integrates single-cell transcriptomic and connectomic data to produce a reasonably accurate connectivity map, particularly within the context of retinal connectivity. Furthermore, it successfully recapitulates connectivity patterns and helps uncover the genetic factors that underlie these connections.

      Thank you for your positive assessment of the paper.

      Weaknesses:

      (1) The study lacks experimental validation of the model’s prediction results.

      We recognize the importance of experimental validation in substantiating the predictions made by computational models. While the primary focus of this study remains computational, we have dedicated a section in the revised manuscript, titled "Experimental Validation of Candidate Genes", to outline proposed methodologies for the empirical verification of our model’s predictions. This section specifically discusses the experimental exploration of novel candidate genes, such as delta-protocadherins, within the mouse retina using AAV-mediated CRISPR/Cas9 genetic manipulation. We plan to collaborate with experimental laboratories to facilitate the validation. Given the extensive nature of experimental work, both in terms of time and resources, it is more pragmatic to present a comprehensive experimental investigation in a follow-up study.

      (2) The model’s applicability in other neuronal connectivity settings has not been thoroughly explored.

      The question of the model’s broader applicability is well-taken. In response, we have expanded our analysis to include additional neuronal data and connectivity settings. Specifically, the revised manuscript includes results where we apply the model to a dataset of C. elegans gap junction connectivity, demonstrating its potential in different neuronal systems. This extension serves to illustrate the model’s adaptability and potential applicability to a broader range of neuronal connectivity studies.

      (3) The proposed method relies on the availability of neuronal connectomic data for model training, which may be limited or absent in certain brain connectivity settings.

      We acknowledge the limitations posed by the model’s dependency on comprehensive connectomic data, which may not be readily available across all research contexts. To address this, we have discussed in the revised manuscript several alternative strategies to adapt our model to the available data. This includes exploring the potential of applying the model to available data such as projectome, and integrating other data modalities such as electrophysiological measurements. These initiatives aim to enhance the model’s applicability and ensure its utility in a broader spectrum of brain connectivity studies, especially in scenarios where detailed connectomic data are not available.

      Reviewer 2 (Recommendations For The Authors):

      Q1. In this work, the author has mainly studied retinal neuronal type connectivity; it would be interesting to see whether the model also works for other brain regions or other types of neuronal connectivity.

      We value your interest in the model’s applicability to other brain regions and neuronal types. To address this, we have extended our analysis in the revised manuscript to include a study on gap junction connectivity between C. elegans neurons. This extension demonstrates the model’s versatility and its potential applicability across various nervous systems and connectivity types.

      Q2. Can the author use the same transformation matrices trained on the retina data to predict neuronal connectivity in other brain regions? Or, as an easier case, the connectivity between RGC types and the neuronal types in SC, dLGN, or other post-RGC-synaptic brain regions? As neuronal connection mechanisms are conserved and widely shared between different neuronal types, one would expect the same transformation matrices to work in predicting other neuronal type connectivity as well (at least to some extent).

      The idea to use the same transformation matrices for predicting connectivity in other brain regions is intriguing. While direct application of these matrices to different regions remains challenging, we discussed the potential scalability of our model to other brain areas. By applying the model to combined datasets from various regions, we could uncover conserved neuronal connection mechanisms. This approach is theoretically feasible and is supported by the demonstrated scalability of the bilinear model and its deep learning variants in industrial applications.

      Q3. Section 5.2 Connectivity metric generation: in this work, the author uses the stratification profiles of the neurons to estimate the connectivity metric. How reliable is this method? There may be scenarios in which two neuronal types project to a similar inner plexiform layer but do not have any connection. Has the author considered combining other experimental data (such as electrophysiology or neuron tracing data)?

      We discussed the reliability of using stratification profiles for estimating connectivity metrics, acknowledging potential limitations. In the revised manuscript, we added discussion on how the integration of additional experimental data, such as electrophysiological and neuron tracing data, could enhance the accuracy of the connectivity metrics.

      Q4. Section 6 Model training and validation: does the author have a potential hypothesis as to why 2 dimensions are the best latent feature spaces dimensionality? One would imagine with more dimensionality, the model will give better results. Could it be that the connectivity data that is used to train the model is only considering the two-dimensional space of the neuronal stratification?

      The selection of two dimensions for the latent feature space was informed by 5-fold cross-validation, aimed at optimizing model generalization to unseen data. Although increasing dimensionality improves performance on the training set, it does not necessarily enhance generalization to the validation set. Thus, the choice of two dimensions ensures good performance without overfitting to the training data.
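      The selection procedure described here can be illustrated with a small synthetic example. The sketch below is not the author's training code: it fits the interaction matrix by ordinary least squares followed by an SVD truncation (a simple stand-in for the actual optimizer), holds out pre-synaptic types in each fold, and uses randomly generated data with a true rank of 2. All function and variable names are hypothetical.

```python
import numpy as np


def fit_low_rank(X, Y, Z, d):
    """Least-squares interaction matrix W (genes x genes), truncated to rank d.

    The SVD truncation is a simple stand-in, not the author's optimizer.
    """
    W = np.linalg.pinv(X) @ Z @ np.linalg.pinv(Y).T
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :d] * s[:d]) @ Vt[:d]


def cv_error(X, Y, Z, d, n_folds=5, seed=0):
    """Mean squared error on held-out pre-synaptic types for latent size d."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), n_folds)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(X.shape[0]), held_out)
        W = fit_low_rank(X[train], Y, Z[train], d)
        pred = X[held_out] @ W @ Y.T
        errors.append(np.mean((pred - Z[held_out]) ** 2))
    return float(np.mean(errors))


# Synthetic data generated from a true rank-2 interaction plus noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 6))
Y = rng.normal(size=(30, 6))
W_true = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 6))
Z = X @ W_true @ Y.T + 0.5 * rng.normal(size=(40, 30))

scores = {d: cv_error(X, Y, Z, d) for d in (1, 2, 3, 4)}
print(scores)
```

      On synthetic data of this kind, the held-out error would be expected to drop sharply up to the true rank and then level off or rise slightly, which mirrors the reasoning given above for preferring the smallest latent dimensionality that generalizes well.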

      Q5. Could the author provide the source code for the analysis? Or could the author make it a python/R package so that non-computational biologists can easily apply the method to their own data?

      We have included a "Data and Code Availability" section in the revised manuscript. This section provides a link to the source code with pointers to datasets used in our study, facilitating the application of our methods by researchers from various backgrounds.

      Q6. I know it may be difficult for the author to do, but is it possible to design and perform some experiments to validate the model prediction results, either connectivity partners of transcriptomically defined RGC types or the function of the key genetic molecules (which hasn’t been discovered before)? The author may consider collaborating with some experimental labs. The author may even consider predicting the connectivity between RGC with some of its post-synaptic neurons in the brain regions, like SC or dLGN, as recently there are a lot of single-cell sequencing data as well as connectivity data.

      We appreciate your suggestion regarding experimental validation. As a future direction, we have discussed potential experimental approaches to validate the model’s predictions in the "Experimental Validation of Candidate Genes" section. Specifically, we propose an experimental design involving the manipulation of delta-protocadherins using AAV-mediated CRISPR/Cas9 and subsequent examination of connectivity phenotypes. We are also open to collaborating with experimental labs to further explore the model’s predictions, particularly in predicting connectivity between RGCs and their post-synaptic neurons in other brain regions.

    2. Reviewer #2 (Public Review):

      Summary:

      In this study, Mu Qiao employs a bilinear modelling approach, commonly utilised in recommendation systems, to explore the intricate neural connections between different pre- and post-synaptic neuronal types. This approach involves projecting single-cell transcriptomic datasets of pre- and post-synaptic neuronal types into a latent space through transformation matrices. Subsequently, the cross-correlation between these projected latent spaces is employed to estimate neuronal connectivity. To facilitate the model training, connectomic data is used to estimate the ground-truth connectivity map. This work introduces a promising model for the exploration of neuronal connectivity and its associated molecular determinants. In the revised version of the manuscript, the author has applied and validated the model on both C. elegans gap junction connectivity and retinal neuron connectivity.

      Strengths:

      This study introduces a succinct yet promising computational model for investigating connections between neuronal types. The model, while straightforward, effectively integrates single-cell transcriptomic and connectomic data to produce a reasonably accurate connectivity map, particularly within the context of retinal connectivity. Furthermore, it successfully recapitulates connectivity patterns and helps uncover the genetic factors that underlie these connections.

      Weaknesses:

      (1) When compared with the previous method - SCM, the new model shows a similar performance level. This may be due to the limitation of the dataset itself, as it only has the innexin expression data. Is it possible to apply the SCM model to the more complete retina dataset and compare the performance with the proposed bilinear modelling approach?

      Minor Weakness:

      (1) The study lacks experimental validation of the model's prediction results.

    3. Reviewer #1 (Public Review):

      Summary:

      In this study, the author aimed to develop a method for estimating neuronal-type connectivity from transcriptomic gene expression data. They sought to develop an interpretable model that could be used to characterize the underlying genetic mechanisms of circuit assembly and connectivity in various neuronal systems.

      Strengths:

      Many of the proposed suggestions were addressed by the author from the initial review. In general the claims made by the author are more strongly supported by the data and better situated in the literature. A major improvement includes the application of the model to the C. elegans gap junction neuronal system. Despite several key differences in the dataset as compared to the mouse retina data, the proposed model performs comparably to the SCM model currently considered state of the art in the literature (the author should remain cautious about claiming better performance given extremely marginal differences). In section 7.2, the author clearly outlines additional advantages of the proposed model including superior time and space complexity. The overall model performance remains modest, but it learns the same rules as the SCM model as well as other candidate patterns.

      As in the initial submission, the bilinear model recapitulates key connectivity motifs for the mouse dataset. The algorithm is shown to converge across several runs affirming its stability/replicability. The model is also extended to predict connectivity on unknown RGC-BC cell type pairs. Without ground truth, the author posits how it should perform based on known functional properties of the RGC type. The hypotheses are confirmed for 8/10 neuronal types with unknown connectivity. The author more clearly describes how this model can be used experimentally for hypothesis testing and presents a more comprehensive future roadmap regarding validation, avenues for improving the model, and incorporation of growing datasets.

      Weaknesses:

      While the C. elegans dataset is useful because it enables benchmarking to existing models, the dataset is quite different. The gene expression dimensionality is 18 genes as opposed to over 3000 genes in the mouse dataset. It is a strength that the model still works as intended, but a weakness that the bilinear model could not be tested on a similar mouse dataset. This distinction matters because it remains an open question if the PCA methodology would hold up in a dataset with varied distributions of gene expression. Variations of the PCA methodology could be evaluated further with the present dataset to make the generalizability of the model more convincing.

      The Gene Ontology analysis requires more methodological explanation. The author claims, "(the linear nature of the model) enables the direct interpretation of gene expressions by examining their associated weights in the model. These weights signify the importance of each gene in determining the connectivity motifs between the BC and RGC types." If I am correctly understanding the methods, the model weights in each dimension are indexing the importance of a gene expression feature as opposed to the importance of a single gene alone, "the gene expression of the BCs in X and the RGCs in Y were featurized by their respective PCs, resulting in matrices of dimensions 22453 × 11323 and 3779 × 3142, respectively." It would be helpful to explain how gene weights are extracted from a gene expression feature once highlighted.
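
      To make the question concrete, one plausible way to recover per-gene weights from weights learned on principal components is to multiply the PCA loadings by the learned weights. The sketch below assumes this reading; it is not confirmed to be the author's procedure, and all names and sizes (G, V, A, W_gene) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes: 50 cells, 12 genes, 4 retained PCs, 2 latent dimensions.
n_cells, n_genes, n_pcs, n_latent = 50, 12, 4, 2

G = rng.normal(size=(n_cells, n_genes))  # gene expression (random stand-in)
G_centered = G - G.mean(axis=0)          # PCA operates on centered data

# PCA via SVD: rows of Vt are the principal axes in gene space.
_, _, Vt = np.linalg.svd(G_centered, full_matrices=False)
V = Vt[:n_pcs].T                         # loadings, shape (n_genes, n_pcs)
X_pc = G_centered @ V                    # PC features, shape (n_cells, n_pcs)

# Hypothetical model weights learned on the PC features.
A = rng.normal(size=(n_pcs, n_latent))

# Back-projection: each row gives one gene's weight in each latent dimension.
W_gene = V @ A                           # shape (n_genes, n_latent)
```

      Under this assumption the latent projection is identical whether it is computed from the PC features or directly from the centered genes, so ranking genes by the rows of W_gene would be one way to interpret the model at the gene level.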

      There could be a more rigorous analysis of the predictive capacity of the model even with the current data. The model recapitulates connectivity patterns from the full dataset and a prediction is demonstrated for unknown data. The model is thus championed as a useful tool for predicting how genetic modifications will influence connectivity, but this is not empirically evaluated.

      Appraisal of whether the author achieved their aims, and whether results support their conclusions:

      In line with the aims of the paper, the author proposed an interpretable bilinear model to learn a shared latent feature space derived from gene expression profiles to predict synaptic connectivity between various neuron types. The model was shown to generalize to two distinct neuronal systems with varying levels of genomic and cellular resolution. While the performance remains modest, the model performs comparably to the existing state of the art despite improved computational complexity.

      Discussion of likely impact of the work on the field, and utility of methods and data to the community:

      The author has elaborated substantially on the impact of this work, particularly how it could be leveraged in experimental settings. The clear methodology could be implemented by other researchers to test the model on new datasets and for benchmarking novel methods.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews

      Reviewer #1 (Public Review):

      Comment: The fact that there are Arid1a transcripts that escape the Cre system in the Arid1a KO mouse model might complicate the interpretation of the data. The phenotype of the Arid1a knockout is probably masked by the fact that many of the sequencing techniques used here are done on a heterogeneous population of knockout and wild type spermatocytes. In relation to this, I think that the use of the term "pachytene arrest" might be overstated, since this is not the phenotype truly observed. Knockout mice produce sperm, and probably litters, although a full description of the subfertility phenotype is lacking, along with identification of the stage at which cell death is happening by detection of apoptosis.

      Response: As the reviewer indicates, we did not observe a complete arrest at pachynema. In fact, the histology shows the presence of spermatids and sperm in seminiferous tubules and epididymides (Fig. Sup. 3). However, our data argue that the wild-type haploid gametes produced were derived from spermatocyte precursors that have likely escaped Cre-mediated activity (Fig. Sup. 4). Furthermore, diplotene and metaphase-I spermatocytes lacking ARID1A protein by IF were undetectable in the Arid1acKO testes (Fig. S4B). Therefore, although we do not demonstrate a strict pachytene arrest, it is reasonable to conclude that ARID1A is necessary to progress beyond pachynema. We have revised the manuscript to reflect this point (Abstract lines 17,18; Results lines 153,154).

      Comment: It is clear from this work that ARID1a is part of the protein network that contributes to silencing of the sex chromosomes. However, it is challenging to understand the timing of the role of ARID1a in the context of the well-known DDR pathways that have been described for MSCI.

      Response: With respect to the comment on the lack of clarity as to which stage of meiosis we observe cell death, our data do suggest that it is reasonable to conclude that mutant spermatocytes (ARID1A-) undergo cell death at pachynema given their inability to execute MSCI, which is a well-established phenotype.

      Comment: Staining of chromosome spreads with Arid1a antibody showed localization at the sex chromosomes by diplonema; however, analysis of gene expression in Arid1a KO was performed on pachytene spermatocytes. Therefore, it is not very clear how the chromatin remodeling activity of Arid1a in diplonema affects gene expression at a previous stage. CUTnRUN showed that ARID1a is present at the sex chromatin in earlier stages, leading to the hypothesis that immunofluorescence with ARID1a antibody might not reflect the true localization of ARID1a.

      Response: It is unclear what the reviewer means about not understanding how ARID1A activity at diplonema affects gene expression at earlier stages. Our interpretations were not based solely on the observation of ARID1A associations with the XY body at diplonema. In fact, mRNA expression and CUT&RUN analyses were performed on pachytene-enriched populations. ARID1A's association with the XY body is not exclusive to diplonema. Based on both CUT&RUN and IF data, ARID1A associates with XY chromatin as early as pachynema. Only at late diplonema did we observe ARID1A hyperaccumulation on the XY body by IF.

      Reviewer #2 (Public Review):

      Comment: The inefficient deletion of ARID1A in this mouse model does not allow any detailed analysis in a quantitative manner.

      Response: As explained in our response to these comments in the first revision, we respectfully disagree with this reviewer’s conclusions. We have been quantitative by co-staining for ARID1A, ensuring that we can score mutant pachytene spermatocytes from escapers. Additionally, we provide data to show the efficiency of ARID1A loss in the purified pachytene populations sampled in our genomic assays.

      Reviewer #3 (Public Review):

      Comment: The data demonstrate that the mutant cells fail to progress past pachytene, although it is unclear whether this specifically reflects pachytene arrest, as accumulation in other stages of Prophase also is suggested by the data in Table 1. The western blot showing ARID1A expression in WT vs. cKO spermatocytes (Fig. S2) is supportive of the cKO model but raises some questions. The blot shows many bands that are at lower intensity in the cKO, at MWs from 100-250kDa. The text and accompanying figure legend have limited information. Are the various bands with reduced expression different isoforms of ARID1A, or something else? What is the loading control 'NCL'? How was quantification done given the variation in signal across a large range of MWs?

      Response: The loading control is Nucleolin. With respect to the other bands in the range of 100-250 kDa, it is difficult to say whether they represent ARID1A isoforms. The Uniprot entry for Mouse ARID1A only indicates a large mol. wt sequence of ~242 kDa; therefore, the band corresponding to that size was quantified. There is no evidence to suggest that lower molecular weight isoforms may be translated. Although speculative, it is possible that the lower molecular weight bands represent proteolytic/proteasomal degradation products or products of antibody non-specificity. These points are addressed in the revised manuscript (Legend to Fig S2, lines 926-931). Blots were scanned on a LI-COR Odyssey CLx imager and viewed and quantified using Image Studio Version 5.2.5 (Methods, lines 640-642).

      Comment: An additional weakness relates to how the authors describe the relationship between ARID1A and DNA damage response (DDR) signaling. The authors don't see defects in a few DDR markers in ARID1A CKO cells (including a low-resolution assessment of ATR), suggesting that ARID1A may not be required for meiotic DDR signaling. However, as previously noted the data do not rule out the possibility that ARID1A is downstream of DDR signaling and the authors even indicate that "it is reasonable to hypothesize that DDR signaling might recruit BAF-A to the sex chromosomes (lines 509-510)." It therefore is difficult to understand why the authors continue to state that "...the mechanisms underlying ARID1A-mediated repression of the sex-linked transcription are mutually exclusive to DDR pathways regulating sex body formation" (p. 8) and that "BAF-A-mediated transcriptional repression of the sex chromosomes occurs independently of DDR signaling" (p. 16). The data provided do not justify these conclusions, as a role for DDR signaling upstream of ARID1A would mean that these mechanisms are not mutually exclusive or independent of one another.

      Response: The reviewer’s argument is reasonable, and we have made the recommended changes (Results, lines 212-215; Discussion, lines 499-500).

      Comment: A final comment relates to the impacts of ARID1A loss on DMC1 focus formation and the interesting observation of reduced sex chromosome association by DMC1. The authors additionally assess the related recombinase RAD51 and suggest that it is unaffected by ARID1A loss. However, only a single image of RAD51 staining in the cKO is provided (Fig. S11) and there are no associated quantitative data provided. The data are suggestive but it would be appropriate to add a qualifier to the conclusion regarding RAD51 in the discussion which states that "...loss of ARID1a decreases DMC1 foci on the XY chromosomes without affecting RAD51" given that the provided RAD51 data are not rigorous. In the long-term it also would be interesting to quantitatively examine DMC1 and RAD51 focus formation on autosomes as well.

      Response: We agree with the reviewer’s comment and have made the recommended changes (Discussion, lines 518-519).

      Response to non-public recommendations

      Reviewer 2:

      Comment: Meiotic arrest is usually judged based on testicular phenotypes. If mutant testes do not have any haploid spermatids, we can conclude that meiotic arrest is a phenotype. In this case, mutant testes have haploid spermatids and are fertile. The authors cannot conclude meiotic arrest. The mutant cells appear to undergo cell death in the pachytene stage, but the authors cannot say "meiotic arrest."

      Response: We disagree with this comment. By IF, we see that ~70% of the spermatocytes have deleted ARID1A. Furthermore, we never observed diplotene spermatocytes that lacked ARID1A. The conclusion that the absence of ARID1A results in a pachynema arrest and that the escapers produce the haploid spermatids is firm.

      Comment: Fig. S2 and S3 have wrong figure legends.

      Response: The figure legends for Fig. S2 and S3 are correct.

      Comment: The authors do not appear to evaluate independent mice for scoring (the result is about 74% deletion above, Table S1). Sup S2: how many independent mice did the authors examine?

      Response: These were Sta-Put purified fractions obtained from 14-15 WT and mutant mice. It is difficult to isolate pachytene spermatocytes by Sta-Put at the required purity in sufficient yields using one mouse at a time. We used three technical replicates to quantify the band intensity, and the error bars represent the standard error of the mean (S.E.M.) of the band intensity.
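
      As a reminder of what such error bars represent, the standard error of the mean for three replicates can be computed as below. The intensity values are invented for illustration and are not data from the study; the function name is likewise hypothetical.

```python
import numpy as np


def standard_error_of_mean(values):
    """Sample standard deviation (ddof=1) divided by sqrt(n)."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / np.sqrt(values.size)


# Hypothetical normalised band intensities from three technical replicates.
replicates = [1.0, 1.2, 0.9]
sem = standard_error_of_mean(replicates)
```

      With technical replicates of a pooled sample, this value reflects measurement variability of the blot quantification rather than variability between animals.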

      Comment: "Comparison of cKO and wild-type littermate yielded nearly identical results (Avg total conc WT = 32.65 M/ml; Avg total conc cKO = 32.06 M/ml)". This sounds like a negative result (i.e., no difference between WT and cKO).

      Response: This is correct. There is no difference between Arid1aWT and Arid1aCKO sperm production. This is because wild-type haploid gametes produced were derived from spermatocyte precursors that have escaped Cre-mediated activity (Fig. S4). These data merely serve to highlight an inherent caveat of our conditional knockout model and are not intended to support the main conclusion that ARID1A is necessary for pachytene progression.

      Comment: The authors now admit ~ 70 % efficiency in deletion, and the authors did not show the purity of these samples. If the purity of pachytene spermatocytes is ~ 80%, the real proportion of mutant cells can be ~ 56%. It is very difficult to interpret the data.

      Response: The original submission did refer to inefficient Cre-induced recombination. The reviewer asked for the % efficiency, which was provided in the revised version. Also, please refer to Fig. S2, where Western blot analysis demonstrates a significant loss of ARID1A protein levels in CKO relative to WT pachytene spermatocyte populations that were used for CUT&RUN data generation.
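
      The reviewer's ~56% figure in the comment above is the product of the two fractions. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the 80% purity is the reviewer's own hypothetical value rather than a measured one.

```python
def mutant_fraction(deletion_efficiency, purity):
    """Fraction of all cells in a sample that are mutant pachytene
    spermatocytes, assuming the two fractions combine multiplicatively."""
    return deletion_efficiency * purity


# ~70% deletion efficiency combined with a hypothetical 80% purity.
estimate = mutant_fraction(0.70, 0.80)
```

      This treats every contaminating cell as non-mutant, so it is a lower bound on the mutant share under the stated assumptions.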

      Comment: The authors should not use the other study to justify their own data. The H3.3 ChIP-seq data in the NAR paper detected clear peaks on autosomes. However, in this study, as shown in Fig. S7A, the authors detected only 4 peaks on autosomes based on MACS2 peak calling. This must be a failed experiment. Also, S7A appears to have labeling errors.

      Response: I believe the reviewer is referring to supplementary figure 8A. Here, it is not clear which labeling errors the reviewer is referring to. In the wild type, the identified peaks were overwhelmingly sex-linked intergenic sites. This is consistent with the fact that H3.3 is hyper-accumulated on the sex chromosomes at pachynema.

      The authors of the NAR paper did not perform a peak-calling analysis using MACS2 or any other peak-calling algorithm. They merely compared the coverage of H3.3 relative to input. Therefore, it is not clear on what basis the reviewer says that the NAR paper identified autosomal peaks. Their H3.3 signal appears widely distributed over a 6 kb window centered at the TSS of autosomal genes, which, compared to input, appears enriched. Our data clearly demonstrate a less noisy and narrower window of H3.3 enrichment at autosomal TSSs in WT pachytene spermatocytes, albeit at levels lower than that seen in CKO pachytene spermatocytes (Fig S8B and see data copied below for each individual replicate). Moreover, the lack of peaks does not mean that there was an absence of H3.3 at these autosomal TSSs (Supp. Fig. S8B). Therefore, we disagree with the reviewer’s comment that the H3.3 CUT&RUN was a failed experiment.

      Author response image 1.

      H3.3 Occupancy at genes mis-regulated in the absence of ARID1A

      Comment: If the author wishes to study the function of ARID2 in spermatogenesis, they may need to try other cre-lines to have more robust phenotypes, and all analyses must be redone using a mouse model with efficient deletion of ARID2.

      Response: As noted, we chose Stra8-Cre to conditionally knock out Arid1a because ARID1A is haploinsufficient during embryonic development. The lack of Cre expression in the maternal germline allows for transmission of the floxed allele, allowing for the experiments to progress.

      Comment: The inefficient deletion of ARID1A in this mouse model does not allow any detailed analysis in a quantitative manner.

      Response: In many experiments, we have been quantitative when possible by co-staining for ARID1A, ensuring that we can score mutant pachytene spermatocytes from escapers. Additionally, we provide data to show the efficiency of ARID1A loss in the purified pachytene populations sampled in our genomic assays.

      Reviewer 3:

      Comment: The Methods section refers to antibodies as being in Supplementary Table 3, but the table is labeled as Supplementary Table 2.

      Response: This has been corrected.

    2. eLife assessment

      This study presents a valuable dataset regarding chromatin remodeling by the BAF complex in the context of meiotic sex chromosome inactivation. Solid data generally support the conclusions, although there is room for improvement. This work will be of interest to researchers working on chromatin and reproductive biology.

    3. Reviewer #1 (Public Review):

      The work by Debashish U. Menon, Noel Murcia, and Terry Magnuson brings important knowledge about histone H3.3 dynamics involved in meiotic sex chromosome inactivation (MSCI). MSCI is unique to gametes and failure during this process can lead to infertility. Classically, MSCI has been studied in the context of DNA damage repair pathways, and little is known about the epigenetic mechanisms behind maintenance of the sex body as a silencing platform during meiosis. One of the major strengths of this work is the evidence provided on the role of ARID1A, a BAF subunit, in MSCI through the regulation of H3.3 occupancy in specific genic regions.

      Using RNA-seq, CUT&RUN, and ATAC-seq, the authors show that ARID1A regulates chromatin accessibility of the sex chromosomes and XY gene expression. Loss of ARID1A increases promoter accessibility of XY-linked genes with a concomitant influx of RNA pol II to the sex body and upregulation of XY-linked genes. This work suggests that ARID1A regulates chromatin composition of the sex body since, in the absence of ARID1A, spermatocytes show less enrichment of H3.3 in the sex chromosomes and stable levels of the canonical histones H3.1/3.2. By overlapping CUT&RUN and ATAC-seq data, the authors show that changes in chromatin accessibility in the absence of ARID1A result from redistribution of H3.3 occupancy. Gained open chromatin in mutants corresponds to upregulation of H3.3 occupancy at transcription start sites of genes mediated by ARID1A.

      Interestingly, ARID1A loss caused increased promoter occupancy by H3.3 in regions usually occupied by PRDM9. PRDM9 catalyzes histone H3 lysine 4 trimethylation during meiotic prophase I, and positions double strand break (DSB) hotspots. Lack of ARID1A causes reduction in occupancy of DMC1, a recombinase involved in DSB repair, in non-homologous sex regions. These data suggest that ARID1A might indirectly influence DNA DSB repair on the sex chromosomes by regulating the localization of H3.3. This is very interesting given the recently suggested role for ARID1A in genome instability in cancer cells. It raises the question of whether this role is also involved in meiotic DSB repair in autosomes and/or how this mechanism differs in sex chromosomes compared to autosomes.

      The fact that there are Arid1a transcripts that escape the Cre system in the Arid1a KO mouse model might complicate the interpretation of the data. The phenotype of the Arid1a knockout is probably masked by the fact that many of the sequencing techniques used here are done on a heterogeneous population of knockout and wild type spermatocytes. In relation to this, I think that the use of the term "pachytene arrest" might be overstated, since this is not the phenotype truly observed. Nonetheless, the authors provide evidence showing that the spermatids observed in cKO testes that progress in spermatogenesis are the ones expressing Arid1a. This work presents enough evidence to include the BAF complex as part of the MSCI process, which increases our knowledge on specific regulation of the sex chromatin during meiosis.

    4. Reviewer #2 (Public Review):

      The authors tried to characterize the function of the SWI/SNF remodeler family, BAF, in spermatogenesis. The authors focused on ARID1A, a BAF-specific putative DNA binding subunit, based on gene expression profiles.

      The authors disagreed with my previous assessments. I disagree with their response.

    5. Reviewer #3 (Public Review):

      In this manuscript, Magnuson and colleagues investigate the meiotic functions of ARID1A, a putative DNA binding subunit of the SWI/SNF chromatin remodeler BAF. The authors develop a germ cell specific conditional knockout (cKO) mouse model using Stra8-cre and observe that ARID1A-deficient cells fail to progress beyond pachytene, although due to inefficiency of the Stra8-cre system the mice retain ARID1A-expressing cells that yield sperm and allow fertility. Because ARID1A was found to accumulate at the XY body late in Prophase I, the authors suspected a potential role in meiotic silencing and by RNA-seq observe significant misexpression of sex-linked genes that typically are silenced at pachytene. They go on to show that ARID1A is required for exclusion of RNA PolII from the sex body and for limiting promoter accessibility at sex-linked genes, consistent with a meiotic sex chromosome inactivation (MSCI) defect in cKO mice. The authors proceed to investigate the impacts of ARID1A on H3.3 deposition genome-wide. H3.3 is known to be regulated by ARID1A and is linked to silencing, and here the authors find that upon loss of ARID1A, overall H3.3 enrichment at the sex body as measured by IF failed to occur, but H3.3 was enriched specifically at transcriptional start sites of sex-linked genes that are normally regulated by ARID1A. The results suggest that ARID1A normally prevents H3.3 accumulation at target promoters on sex chromosomes and based on additional data, restricts H3.3 to intergenic sites. Finally, the authors present data implicating ARID1A and H3.3 occupancy in DSB repair, finding that ARID1A cKO leads to a reduction in focus formation by DMC1, a key repair protein. Overall the paper provides new insights into the process of MSCI from the perspective of chromatin composition and structure, and raises interesting new questions about the interplay between chromatin structure, meiotic silencing and DNA repair.

      In general the data are convincing. The conditional KO mouse model has some inherent limitations due to incomplete recombination and the existence of 'escaper' cells that express ARID1A and progress through meiosis normally. This reviewer feels that the authors have addressed this point thoroughly and have demonstrated clear and specific phenotypes using the best available animal model. The data demonstrate that the mutant cells fail to progress past pachytene, although it is unclear whether this specifically reflects pachytene arrest, as accumulation in other stages of Prophase also is suggested by the data in Table 1.

      The revised manuscript more appropriately describes the relationship between ARID1A and DNA damage response (DDR) signaling. The authors don't see defects in a few DDR markers in ARID1A CKO cells (including a low resolution assessment of ATR), suggesting that ARID1A may not be required for meiotic DDR signaling. However, as previously noted the data do not rule out the possibility that ARID1A is downstream of DDR signaling, and the authors note the possibility of a role for DDR signaling upstream of ARID1A.

      A final comment relates to the impacts of ARID1A loss on DMC1 focus formation and the interesting observation of reduced sex chromosome association by DMC1. The authors additionally assess the related recombinase RAD51 and suggest that it is unaffected by ARID1A loss. However, only a single image of RAD51 staining in the cKO is provided (Fig. S11) and there are no associated quantitative data provided. The data are suggestive and conclusions about the impacts of ARID1A loss on RAD51 must be considered as preliminary until more rigorously assessed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Here we address the major points raised by the reviewers.

      Reviewer #1 (Public Review):

      Weaknesses:

      • The signaling pathway upstream of Maf1 remains unknown. In eukaryotes, Maf1 is a negative regulator of RNA pol III and is regulated by external signals via the TORC pathway. Since TORC components are absent in the apicomplexan lineage, one central question that remains open is how Maf1 is regulated in P. falciparum. Magnesium is probably not the sole stimulus involved, as suggested by the observation that Ile deprivation also down-regulates RNA pol III activity.

      We agree that there is still much to uncover relating to the PfMaf1 signaling pathway. While we still do not know each component, we have been able to link external factors (of course not limited to only magnesium) to the increased nuclear occupancy of PfMaf1. Other protein interactors that potentially regulate PfMaf1, while not confirmed, have been identified in plasma sample as candidates for future experiments to validate their potential involvement in RNA Pol III inhibition.

      • The study does not address why MgCl2 levels vary depending on the clinical state. It is unclear whether plasma magnesium is increased during asymptomatic malaria or decreased during symptomatic infection, as the study does not include control groups with non-infected individuals. Along the same line, MgCl2 supplementation in parasite cultures was done at 3mM, which is higher than the highest concentrations observed in clinical samples.

      This reviewer raised a valid point. The plasma magnesium levels for the wet symptomatic samples (averaging [0.79mM]) were within the normal range of a healthy individual (between [0.75-0.95mM]) while the dry asymptomatic levels were above the normal range (averaging [1.13mM]). Ideally, we would have liked to have control uninfected plasma samples from individuals from The Gambia. Unfortunately, field studies and human volunteer studies do not always have all the ideal controls that in vitro studies have. We recognize that [3mM] is higher than the normal range for magnesium levels, which is why we included a revised Supplementary Figure 3A. This figure shows that magnesium concentrations as low as [1mM] (similar to the levels found in dry asymptomatic samples) reduced the expression of RNA Pol III-transcribed genes.

      • Although the study provides biochemical evidence of Maf1 accumulation in the parasite nuclear fraction upon magnesium addition, this is not fully supported by the immunofluorescence experiments.

      We agree that the resolution of the IFA images is not sufficient to support the WB data. We believe that the importance of the IFA Supplementary Figure is to show that PfMaf1 clusters together in foci, which has not been previously reported.

      Reviewer #2 (Public Review):

      Weaknesses:

      However, most analyses are rather preliminary as only very few (3-5) candidate genes are analyzed by qPCR instead of carrying out comprehensive analyses with a large qPCR panel or RNA-seq experiments with GO term analyses. Data presentation lacks clarity, the number of biological replicates is rather low and the statistical analyses need to be largely revised. Although the in vivo data from wet (mildly symptomatic) and dry (asymptomatic) season parasites with different expression levels of Pol III-regulated genes, var genes, and MgCl2 are interesting, the link between the in vitro data and the in vivo virulence of P. falciparum, which is made in many sections of the manuscript, should be toned down. Especially since (i) the only endothelial receptor studied is CD36, which is associated with parasite binding during mild malaria, and (ii) several studies provide contradictory data on MgCl2 levels during malaria and in different disease states, which is not further discussed, but the authors mainly focused on this external stimulus in their experiments.

      We agree that, ideally, we would have liked to do full RNA-seq on The Gambia samples. However, that was out of the scope of this project. The RNA samples were limited which is why we did not use more primers. We believe that an appropriate number of replicates was done for the experiments. The wet symptomatic samples from this study were from mildly symptomatic individuals, as stated in the manuscript. Therefore, CD36 was a relevant receptor to use for our studies.

      We agree that the published studies about magnesium levels in infected individuals are not always consistent. What these studies do not consider is the time of year, that is, whether the infection occurred during the dry or wet season. These studies were also done in different regions of the world using different technologies. For this reason, we only highlight the difference observed in our field study data from The Gambia.

      Reviewer #3 (Public Review):

      Weaknesses:

      (1) The signals upstream of Maf1 remain rather a black box. 4 are tested - heat shock and low-glucose, which seem to suppress ALL transcription; low-Isoleucine and high magnesium, which suppress Pol3. Therefore the authors use Mg supplementation throughout as a 'starvation type' stimulus. They do not discuss why they didn't use amino acid limitation, which could be more easily rationalised physiologically. It may be for experimental simplicity (no need for dropout media) but this should be discussed, and ideally, sample experiments with low-IsoLeu should be done too, to see if the responses (e.g. cytoadhesion) are all the same.

      We agree that deprivation of isoleucine would have been another experimental assay for our study, but it also would not have been as novel as magnesium. While understanding the exact mechanism or involvement of magnesium as a stress condition was not within the scope of this manuscript, we believe that our data will be valuable in demonstrating that external stimuli act on P. falciparum virulence gene expression via RNA Pol III inhibition. Since we also had plasma level data for magnesium, and not isoleucine, we believed it made for a better external factor to use for our in vitro studies.

      (2) The proteomics, conducted to seek partners of Maf1, is probably the weakest part. From Figure S3: the proteins highlighted in the text are clearly highly selected (as ones that might be relevant, e.g. phosphatases), but many others are more enriched. It would be good to see the whole list, and which GO terms actually came top in enrichment.

      We apologize if the reviewer did not see the attached supplementary Co-IP MS data. The file includes all proteins found in each sample as well as GO term analysis. For the purpose of this work, we highlight proteins potentially involved in the canonical role of Maf1 that have been shown in model organisms to reversibly inhibit RNA Pol III (phosphatases, RNA Pol III subunits).

      (3) Figure 3 shows the Maf1-low line has very poor growth after only 5 days but it is stated that no dead parasites are seen even after 8 cycles and the merozoites number is down only ~18 to 15... is this too small to account for such poor growth (~5-fold reduced in a single cycle, day 3-5)? It would additionally be interesting to see a cell-cycle length assessment and invasion assay, to see if Maf1-low parasites have further defects in growth.

      We agree with the reviewer that the observed reduced merozoite numbers may not be the only cause of the reduced growth rate. Other factors in the PfMaf1 knock-down line may contribute to the observed poor growth.

    2. eLife assessment

      This important study links the activity of polymerase III to the regulation of virulence gene expression in the deadliest malaria parasite, Plasmodium falciparum. It identifies Maf1 as a Pol III inhibitor that enables the parasite to respond to external stimuli such as magnesium chloride plasma levels by downregulating Pol III-transcribed ruf6 genes and subsequently regulated var genes. While the evidence presented is generally convincing, some of the results are incomplete, and the mechanistic link between external signals and Maf1 activation remains unknown.

    3. Reviewer #1 (Public Review):

      Summary:

      Asymptomatic malaria infections are frequent during the dry season and have been associated with lower cytoadherence of P. falciparum parasites and lower expression of variant surface antigens. The mechanisms underlying parasite adaptation during the low transmission season remain poorly understood. The authors previously established that members of the non-coding RNA RUF6 gene family, transcribed by RNA pol III, are required for expression of the main variant surface antigens in P. falciparum, PfEMP1, which drive parasite cytoadherence and pathogenicity. In this study, the authors investigated the contribution of RNA pol III transcription in the regulation of PfEMP1 expression in different clinical states, either symptomatic malaria cases during the wet season or asymptomatic infections during the dry season.

      By reanalyzing RNAseq data from a previous study in Mali, complemented with RT-qPCR on new samples collected in The Gambia, the authors first report the down-regulation of RNA pol III genes (tRNAs, RUF6) in P. falciparum isolates collected from asymptomatic individuals during the dry season, as compared to isolates from symptomatic (wet season) individuals. They also confirm the down-regulation of var (DBLalpha) gene expression in asymptomatic infection as compared to symptomatic malaria. Plasma analysis in the two groups in the Gambian study reveals higher Magnesium levels in dry season as compared to wet season samples, pointing at a possible role of external factors. The authors tested the effect of MgCl2 supplementation on cultured parasites, as well as three other stimuli (temperature, low glucose, Ile deprivation), and show that Ile deprivation and MgCl2 both induce down-regulation of RNA pol III transcription but not pol I or pol II (except the active var gene). Using RNAseq, they show that MgCl2 supplementation predominantly inhibits RNA pol III-transcribed genes, including the entire RUF6 family. Conditional depletion of Maf1 leads to the up-regulation of RNA pol III gene transcription, confirming that Maf1 is a RNA pol III inhibitor in P. falciparum, as described in other organisms. Quantitative mass spectrometry shows that Maf1 interacts with RNA pol III complex in the nucleus, and with distinct proteins including two phosphatases in the cytoplasm. Using the Maf1 cKD parasites, the authors document that down-regulation of RNA pol III by MgCl2 is dependent on Maf1. Finally, they show that MgCl2 results in decreased cytoadherence of infected erythrocytes, associated with reduced PfEMP1 expression.

      Strengths:

      -The work is very well performed and presented.
      -The study uncovers a novel regulatory mechanism relying on RNA pol III-dependent regulation of variant surface antigens in response to external signals, which could contribute to parasite adaptation during the low transmission season.
      -Potential regulators of Maf1 were identified by mass spectrometry, including phosphatases, paving the way for future mechanistic studies.

      Weaknesses:

      -The signaling pathway upstream of Maf1 remains unknown. In eukaryotes, Maf1 is a negative regulator of RNA pol III and is regulated by external signals via the TORC pathway. Since TORC components are absent in the apicomplexan lineage, one central question that remains open is how Maf1 is regulated in P. falciparum. Magnesium is probably not the sole stimulus involved, as suggested by the observation that Ile deprivation also down-regulates RNA pol III activity.
      -The study does not address why MgCl2 levels vary depending on the clinical state. It is unclear whether plasma magnesium is increased during asymptomatic malaria or decreased during symptomatic infection, as the study does not include control groups with non-infected individuals. Along the same line, MgCl2 supplementation in parasite cultures was done at 3mM, which is higher than the highest concentrations observed in clinical samples.
      -Although the study provides biochemical evidence of Maf1 accumulation in the parasite nuclear fraction upon magnesium addition, this is not fully supported by the immunofluorescence experiments.

    4. Reviewer #2 (Public Review):

      The study by Diffendall et al. set out to establish a link between the activity of RNA polymerase III (Pol III) and its inhibitor Maf1 and the virulence of Plasmodium falciparum in vivo. Having previously found that knockdown of the ncRNA ruf6 gene family reduces var gene expression in vitro, they now present experimental evidence for the regulation of ruf6 and, subsequently, var gene expression by Pol III using a commercially available inhibitor. They confirm their findings with samples from a previously published Gambian cohort study using asymptomatic dry season and mildly symptomatic wet season samples, showing that higher levels of Pol III-dependent transcripts and var transcripts as well as lower MgCl2 plasma concentrations are present in wet season samples. From this, they hypothesize that the external stimuli heat, reduced glucose and essential amino acid supply, and increased MgCl2 levels are sensed by the parasite through the only known Pol III inhibitor Maf1 and result in lower Pol III activity and fewer ruf6 transcripts, which in turn reduces var gene expression, leading to reduced cytoadherence and virulence of P. falciparum. In their in vitro experiments they focus on investigating higher MgCl2 levels and their impact on Pol III and Maf1 activity as well as var gene expression and parasite adherence to purified CD36, thereby successfully confirming their hypothesis for MgCl2. Nicely, MgCl2-induced down-regulation of Pol III activity was shown to be dependent on Maf1 using a knock-down cell line. Additionally, they show that the Maf1-KD cell line displays a slower growth rate with fewer merozoites per schizont and that Maf1 interacts with RNA pol III subunits and some kinases/phosphatases.

      Comments on latest version:

      It is understandable that the RNA samples from the Gambian cohort were limited, but for all in vitro analyses a larger panel of qPCR primers or RNAseq would have been feasible. I also understand the rationale for using the general var primer pair (DBLa) for field isolates, but since the authors were working with a clonal parasite line (3D7) in vitro, qPCR with specific 3D7/NF54 primer pairs or RNAseq, which would also allow inferences about ruf6 regulation of specific (neighboring?) var genes and other Pol III-regulated genes, would have been a far better option.

      As far as I could see from the resubmitted manuscript, the authors did not correct the statistical analyses. For example, they continue to apply a t-test to fold-change values (which must be transformed to log2), many t-test based analyses rely on only 2-3 replicates (a non-parametric test would be more appropriate), they have not corrected for multiple testing, and it is unclear how the authors handle technical and biological replicates in their plots. Therefore, I still suspect that more appropriate statistical analyses might have an impact on the significance of their results.
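      Two of the corrections requested above, log2 transformation of fold-change values and correction for multiple testing, can be sketched as follows. This is a minimal illustration and not the authors' analysis: the function names and the example numbers are hypothetical, and Benjamini-Hochberg is used here as one common choice of multiple-testing correction, which the review does not specify.

```python
import math


def log2_fold_changes(fold_changes):
    """Convert ratio-scale fold changes to log2 scale.

    On the log2 scale a 2-fold increase (+1) and a 2-fold decrease (-1)
    are symmetric around 0, which is what a t-test implicitly assumes.
    """
    return [math.log2(fc) for fc in fold_changes]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical values for illustration only (not taken from the manuscript).
example_fold_changes = [2.0, 0.5, 4.0]
example_p_values = [0.01, 0.04, 0.03, 0.005]

print(log2_fold_changes(example_fold_changes))
print(benjamini_hochberg(example_p_values))
```

      With only 2-3 replicates per group, the choice of test itself remains the larger issue; the sketch only addresses the transformation and the correction step.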

      I agree that CD36 binding is associated with mild malaria, but since the authors only make a link between Pol III and CD36 binding in vitro, I think it is an overstatement to claim something like "Our study reveals a regulatory mechanism in P. falciparum involving RNA Polymerase III, which plays a pivotal role in the parasite's virulence."

      Finally, if the authors have checked all the relevant literature on MgCl2, it should be easy for them to give a brief explanation why they included only one study and ignored all the other contradictory results.

    5. Reviewer #3 (Public Review):

      Summary:

      This work describes a new pathway by which malaria parasites, P. falciparum, may regulate their growth and virulence (i.e. their expression of virulence-linked cytoadhesins). This is a topic of considerable interest in the field - does this important parasite sense factor(s) in its host bloodstream and regulate itself accordingly? Several fragments of evidence have come out on this topic in the past decade, showing, for example, reduced parasite growth under calorie restriction (in mice); parasite dormancy in response to amino acid starvation (in culture and in mice), and also reduced virulence in dry-season, low-parasitaemia infections in humans. The molecular mechanisms that may underlie this interesting biology remain only poorly understood.

      Here, the authors show that dry-season P. falciparum parasites have reduced expression of Pol3-transcribed tRNAs and ncRNAs that positively regulate virulence gene expression. They link the level of Pol3 activity to PfMaf1, a remnant of the largely-absent nutrient-sensing TOR pathway in this parasite. They propose that in the dry season, human hosts may be calorie-restricted, leading to Maf1 moving to the nucleus and suppressing Pol3, thus downregulating growth and virulence of parasites. The evidence is intriguing and the idea is conceptually elegant.

      Strengths:

      The use of dry/wet-season field samples from The Gambia is a strength, showing potential real-world relevance. The generation of an inducible knockdown of Maf1 in lab-cultured parasites is also a strength, allowing this pathway to be studied somewhat in isolation.

      Weaknesses:

      (1) The signals upstream of Maf1 remain rather a black box. 4 are tested - heat shock and low-glucose, which seem to suppress ALL transcription; low-Isoleucine and high magnesium, which suppress Pol3. Therefore the authors use Mg supplementation throughout as a 'starvation type' stimulus. They do not discuss why they didn't use amino acid limitation, which could be more easily rationalised physiologically. It may be for experimental simplicity (no need for dropout media) but this should be discussed, and ideally sample experiments with low-IsoLeu should be done too, to see if the responses (e.g. cytoadhesion) are all the same.

      (2) The proteomics, conducted to seek partners of Maf1, is probably the weakest part. From Fig S4 it is clear that the proteins highlighted in the text are highly selected (as ones that might be relevant, e.g. phosphatases), but many others are more enriched. It would be good to see a) the top hits from the whole list provided as a short table within the main proteomics figure, along with the GO terms that actually came top in enrichment; b) the whole list provided as a supp. spreadsheet for easy re-analysis, rather than a PDF which cannot be easily re-used.

      (3) Fig 3 shows the Maf1-low line has very poor growth after only 5 days but it is stated that no dead parasites are seen even after 8 cycles and the merozoite number is down only from ~18 to 15... is this too small to account for such poor growth (~5-fold reduced in a single cycle, day 3-5)? It would additionally be interesting to see a cell-cycle length assessment and invasion assay, to see if Maf1-low parasites have further defects in growth.
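      The arithmetic behind point (3) can be made explicit with a rough sketch. It assumes, purely for illustration, that the multiplication rate per cycle scales linearly with the number of merozoites per schizont and that nothing else differs between the lines; the function names are hypothetical and the only inputs are the approximate figures quoted above (~18 versus 15 merozoites, ~5-fold reduced growth).

```python
import math


def relative_growth(merozoites_kd, merozoites_wt, cycles):
    """Knock-down parasitaemia relative to control after a number of cycles.

    Assumes growth per cycle is proportional to merozoites per schizont.
    """
    return (merozoites_kd / merozoites_wt) ** cycles


def cycles_for_fold_reduction(merozoites_kd, merozoites_wt, fold_reduction):
    """Cycles needed for the merozoite deficit alone to give a fold reduction."""
    return math.log(fold_reduction) / math.log(merozoites_wt / merozoites_kd)


# Approximate figures quoted in the review: ~18 vs 15 merozoites per schizont.
one_cycle = relative_growth(15, 18, 1)
cycles_needed = cycles_for_fold_reduction(15, 18, 5)

print(f"Relative growth after one cycle: {one_cycle:.2f}")
print(f"Cycles needed for a 5-fold deficit: {cycles_needed:.1f}")
```

      Under these assumptions the merozoite deficit alone gives about 0.83 of control growth per cycle, and close to nine cycles would be needed to accumulate a 5-fold difference, which is consistent with the reviewer's suspicion that other growth defects contribute.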

      Other weaknesses, which are more restricted but were not addressed in revision, are highlighted below:

      Fig S1B - The downregulation of RNAPol3 transcripts caused by a commercial Pol3 inhibitor is pretty weak - mostly non-significant. The authors might comment on why they think this is, when interfering with PfMaf1 evidently has a greater effect.

      Fig 2D: the legend states ' Expressed transcripts from three replicates between control and addition of MgCl2 that are significantly up-regulated are highlighted in red while significantly down-regulated RNA Pol III genes are highlighted in blue (FDR corrected p-value of <0.05) and a FC ≥ ±1.95) with examples listed as text'. This isn't very clear. The authors could clarify whether they took ALL (Pol3 or not) upregulated genes to show in red, but only putative Pol3-regulated genes to show in blue? If so, why? Or did they take all significantly downregulated genes, and found they were all annotated as pol3 transcribed? (I cannot see any dots that are not blue. If there are some, a clearer figure is needed?)

      Line 227: 'PfMaf1 levels were shown to decrease by approximately 57% in total extracts after one cycle' - the provenance of this very precise percentage isn't clear (it does not appear on the figure). Is it densitometry of a western blot? And if so, is it an average of the 3 replicates that are stated in the legend (but not shown), or from the single example blot shown in Figure 3?

      Fig 4A: the western blot, as shown, lacks controls, both for loading and for completeness of cyto/nuclear fractionation. To avoid confusion, these should be shown in the main figure, as is standard in the field, rather than separately in a supp figure. Ideally, 3 repeats should be done, with densitometry quantification.

    1. eLife assessment

      This study assessed antibody levels, which are indicative of protection, resulting from both COVID-19 vaccination and natural infection in a representative sample of the Canadian population. The work provides solid evidence that individuals who received a booster vaccination and had a prior infection had the highest antibody levels, particularly when either the vaccination or natural infection had occurred within the past six months. These findings are of fundamental importance in supporting the value of booster vaccination in populations vulnerable to severe COVID-19.

    2. Reviewer #1 (Public Review):

      This study holds significant importance as it assessed antibody levels arising from both COVID-19 vaccination and natural infection in a representative population-based sample. The analysis was conducted with thoughtfulness and rigor. The sampling methodology ensured the representation of the broader Canadian population, including minorities and indigenous communities. Findings suggest that, despite a substantial number of individuals having been previously infected, especially following the first omicron wave, repeat booster vaccination is essential to ensure that individuals develop an optimal antibody response against new exposures to infection, given the waning of antibodies over time. The study findings carry global significance as they inform decisions about the relevance of booster vaccination for reducing infection incidence amid the ongoing challenge of vaccine hesitancy and the continual emergence of new variants.

      Among the weaknesses of the study, from my perspective, is the lack of explicit clarification that one objective of achieving repeat booster vaccination is to impart a robust level of protection against acquiring infections. Previous studies have demonstrated that the effectiveness of even only primary-series vaccination against COVID-19 severe disease was high, with slow waning over time. However, even when effectiveness against severity is high, infections may still present a risk for progression to severe COVID-19 among older individuals and those with comorbidities. Another limitation is that the study did not investigate whether there were variations in spike levels based on the last vaccine type administered. Furthermore, it is important to comment on the generalizability of the findings considering that individuals who participated in the research may have been different from those who did not participate and therefore residual confounding cannot be eliminated.

    3. Reviewer #2 (Public Review):

      Strengths
      (1) The study benefits from a large sample size, encompassing serial assessments of 4000-9000 adults over an extended period. This large cohort enhances the reliability and generalizability of the findings.
      (2) The study employs a rigorous methodology, including serial assessments, self-collected dried blood spots, and highly sensitive antibody assays. The use of multiple measures ensures a robust evaluation of hybrid immunity and SARS-CoV-2 incidence within the Canadian population.
      (3) The manuscript provides detailed analyses of antibody levels, vaccination history, infection rates, and demographic factors. The inclusion of stratified analyses by age, sex, and ethnicity enhances the understanding of population-level immunity dynamics.
      (4) The study's findings contribute valuable insights into the dynamics of hybrid immunity and SARS-CoV-2 incidence, particularly during the emergence of the Omicron variant. The observed decline in COVID-19 death rates amidst rising infection rates underscores the potential protective role of hybrid immunity against severe outcomes.

      Weaknesses
      (1) Sampling Limitations: While the study claims to be representative of the Canadian population, there are potential limitations in sampling methods, particularly reliance on an online polling platform. This approach may introduce selection bias and limit the generalizability of findings to certain demographic groups.
      (2) Assay Limitations: The study acknowledges limitations associated with antibody assays and the potential for assay saturation. The reliance on self-reported vaccination history and infection status may introduce recall bias and affect the accuracy of estimates.
      (3) Data Interpretation: While the study presents compelling data on hybrid immunity and SARS-CoV-2 incidence, some interpretations may be speculative. The assertion of a causal relationship between hybrid immunity and reduced COVID-19 mortality warrants cautious interpretation, given the complexity of factors influencing disease outcomes.
      (4) Lack of inclusion and exclusion criteria: The manuscript does not have specific inclusion and exclusion criteria for participants and the methods used for data analysis.
      (5) The protocol does not include disaggregated data; these are only available on page 25 as an annex.

    1. eLife assessment

      This study presents an important finding that Obox4 and Dux act redundantly in regulating zygotic genome activation in mice. The evidence supporting the claims of the authors is solid. The work will be of interest to researchers interested in early embryo development and epigenetic reprogramming.

    2. Reviewer #1 (Public Review):

      The authors investigated the role of OBOX4 in zygotic genome activation (ZGA) in mice. Obox4 genes form an array of duplicated genes; they were identified as a candidate ZGA factor based on expression patterns during early development. The role of OBOX4 was subsequently studied in embryonic stem cells and early embryos. It was found that transcriptional activation mediated by OBOX4 has similar features to that of DUX, which was previously identified as a zygotic transcription factor involved in ZGA and a major activator of the zygotic expression program. It was, however, unexpected that Dux knock-out did not impair embryonic development. The work by Guo et al. provides several lines of evidence that OBOX4-mediated activation of gene expression considerably overlaps with that of DUX and this redundancy might explain the loss of early developmental phenotype in Dux mutants. Consistent with this model, double mutants of Obox4 and Dux show impaired development. Given the difficulties with investigating details of the genetic model in double mutants at the preimplantation embryo stage, the authors not only crossed genetic mutants, but also used (1) nuclear transfer of mutated nuclei of ESCs, which could be characterized on their own in separate experiments, and (2) antisense oligonucleotide (ASO) microinjection, which included a rescue control demonstrating that reintroducing OBOX4 is sufficient to rescue the phenotype caused by blocking both Dux and Obox4.

      This work is important for the field because it reveals functional redundancy and plasticity of the zygotic genome activation in mammals, where the mouse model stands as a remarkable example of genome activation, which massively integrated long terminal repeat (LTR)-derived enhancers from retrotransposons and now two of the key activating zygotic factors appear to be encoded by tandemly duplicated clusters of different phylogenetic age. Identification of OBOX4 as a second factor partially redundant with DUX now allows us to decipher what constitutes the essential part of the ZGA program.

    3. Reviewer #2 (Public Review):

      In this study, Guo et al. screened a few homeobox transcription factors and identified that Obox4 can induce the 2-cell like state in mouse embryonic stem cells (mESCs) (Fig. 1 and 2). The authors also compared in detail how Obox4 and Dux activate 2C repeats and genes in mESCs (Fig. 3). Compared to Dux, Obox4 activates fewer 2C genes (Fig. 2). In addition, although both Obox4 and Dux bind to MERVL elements, Obox4 additionally binds to ERVK (Fig. 3). The authors then used three different approaches (i.e., SCNT-mediated KO, ASO-mediated KD, and genetic KO) to study how Obox4 and Dux regulate zygotic genome activation in embryos. Although there are some inconsistencies among different approaches, the authors were able to show that loss of both Obox4 and Dux causes more severe consequences than loss of a single protein in embryonic development and zygotic genome activation (Fig. 4 and 5).

      Overall, this is a comprehensive study that addresses an important question that puzzles the community. However, some comparisons to the recent work by Ji et al. (PMID: 37459895) are highly recommended. Ji et al. knocked out the entire Obox cluster (including Obox4) in mice and found that Obox cluster KO causes 2-4 cell arrest without affecting Dux. That said, Obox proteins seem more critical than Dux in regulating ZGA, and Obox cluster KO cannot be compensated by Dux. Ji et al. also reported that maternal (Obox1, 2, 5, 7) and zygotic (Obox3, 4) Obox proteins redundantly regulate embryogenesis because loss of either is compatible with development. Consistent with Ji's work, Obox4 KO embryos generated in this study can develop to adulthood and are fertile. Since these two studies are highly relevant, some comparisons of Obox4 KO and Obox4/Dux DKO with the previous Obox cluster KO will greatly benefit the community.

    1. Reviewer #3 (Public Review):

      Summary:

      The article by Huang et al. presents an in-depth study on the role of DNA methylation in regulating virulence and metabolism in Pseudomonas syringae, a model phytopathogenic bacterium. This comprehensive research utilized single-molecule real-time (SMRT) sequencing to profile the DNA methylation landscape across three model pathovars of P. syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system (HsdMSR), which includes a conserved sequence motif associated with N6-methyladenine (6mA). The study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation. The use of SMRT sequencing for methylome profiling, coupled with transcriptomic analysis and in vivo validation, establishes a robust evidence base for the findings.

      Strengths:

      The results are presented clearly, with well-organized figures and tables that effectively illustrate the study's findings.

      Weaknesses:

      It would be helpful to add more details, especially in the methods, which make it easy to evaluate and enhance the manuscript's reproducibility.

    2. eLife assessment

      This valuable study presents findings on DNA methylation as an efficient epigenetic transcriptional regulating strategy in bacteria. The authors utilized single-molecule real-time sequencing to profile the DNA methylation landscape across three model pathovars of Pseudomonas syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system, which includes a conserved sequence motif associated with N6-methyladenine. The evidence presented is solid and the study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation.

    3. Reviewer #1 (Public Review):

      Summary:

      In this work, Huang et al used SMRT sequencing to identify methylated nucleotides (6mA, 4mC, and 5mC) in Pseudomonas syringae genome. They show that the most abundant modification is 6mA and they identify the enzymes required for this modification as when they mutate HsdMSR they observe a decrease of 6mA. Interestingly, the mutant also displays phenotypes of change in pathogenicity, biofilm formation, and translation activity due to a change in gene expression likely linked to the loss of 6mA.

      Overall, the paper represents an interesting set of new data that can bring forward the field of DNA modification in bacteria.

      Major Concerns:

      • Most of the authors' data concern the Psph pathovar. I am not sure that the authors' conclusions are supported by the two other pathovars they used in the initial 2 figures. If the authors want to broaden their conclusions to Pseudomonas syringae and not restrict them to Psph, the authors should have stronger methylation data using replicates. Additionally, they should discuss why Pss is so different from Pst and Psph. Could they do a blot to confirm it is really the case and not a sequencing artefact? Is the change of methylation during bacterial growth conserved between the pathovars? The authors should obtain mutants in the other pathovars to see if they have the same phenotype. The authors have a nice set of data concerning Psph but the broadening of the results to other pathovars requires further investigation.

      • The authors should include proper statistical analysis of their data. A lot of terms are descriptive but not supported by a deeper analysis to sustain the conclusions. For example, in Figure 4E, we do not know if the overlap is significant or not. Are DEGs more overlapping to 6mA sites than non-DEGs? Here is a non-exhaustive list of terms that need to be supported by statistics: different level (L145), greater conservation (L162), significant conservation (L165), considerable similarity (L175), credible motifs (L189), Less strong (L277) and several "lower" and "higher" throughout the text.

      • The authors performed SMRT sequencing of the delta hsdMSR showing a reduction of 6mA. Could they include a description of their results similar to Figures 1-2. How reduced is the 6mA level? Is it everywhere in the genome? Does it affect other methylation marks? This analysis would strengthen their conclusions.

      • In Figure 6E to conclude that methylation is required on both strands, the authors are missing the control CAGCN6CGC construct otherwise the effect could be linked to the A on the complementary strand.
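      The question raised above about Figure 4E, whether the overlap between DEGs and 6mA-marked genes is larger than expected by chance, is commonly addressed with a hypergeometric test. The sketch below is one possible way to do this and is not taken from the manuscript; the function name and the counts in the example are hypothetical placeholders.

```python
from math import comb


def overlap_p_value(total_genes, marked_genes, degs, overlap):
    """Upper-tail hypergeometric p-value for an overlap at least this large.

    total_genes:  number of genes in the genome (the background)
    marked_genes: number of genes carrying the mark (e.g. a 6mA site)
    degs:         number of differentially expressed genes
    overlap:      number of DEGs that also carry the mark
    """
    upper = min(marked_genes, degs)
    total = comb(total_genes, degs)
    # Sum the probabilities of every overlap from the observed one upwards.
    tail = sum(
        comb(marked_genes, i) * comb(total_genes - marked_genes, degs - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only (not from the manuscript).
p = overlap_p_value(total_genes=5000, marked_genes=1500, degs=400, overlap=150)
print(f"P(overlap >= 150) = {p:.3g}")
```

      A small p-value would indicate that DEGs are enriched among marked genes relative to the genomic background; the choice of background gene set is itself an assumption that should be stated alongside the result.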

    4. Reviewer #2 (Public Review):

      In the present manuscript, Huang et al. revealed the significant roles of the DNA methylome in regulating virulence and metabolism within Pseudomonas syringae, with a particular focus on the HsdMSR system in this model strain. The authors used SMRT-seq to profile the DNA methylation patterns (6mA, 5mC, and 4mC) in three P. syringae strains (Psph, Pss, and Psa) and displayed the conservation among them. They further identified the type I restriction-modification system (HsdMSR) in P. syringae, including its specific motif sequence. HsdMSR participated in the processes of metabolism and virulence (T3SS & biofilm formation), as demonstrated through RNA-seq analyses. Additionally, the authors revealed the mechanisms of transcriptional regulation by 6mA. Strictly from the point of view of the interest of the question and the work carried out, this is a worthy and timely study that uses third-generation sequencing technology to characterize DNA methylation in P. syringae. The experimental approaches were solid, and the results obtained were interesting and provided new information on how epigenetics influences transcription in P. syringae. The conclusions of this paper are mostly well supported by data, but some aspects of data analysis and discussion need to be clarified and extended.

    1. eLife assessment

      This study is focused on the question of how Nrp1 contributes to the regulation of vascular permeability and whether or why there are differences between different vascular beds. The scientific concept of this paper suggests a possible role of Nrp1 on perivascular cells as a participant in the regulation of vascular permeability. This concept is interesting and potentially useful. However, the methodology and quantitative analysis are currently inadequate to fully support the claims.

    2. Reviewer #1 (Public Review):

      Summary:

      This study examines how blood vessels exposed to the cytokine VEGF respond to vascular leakage when the VEGF receptor NRP1 is targeted. This study compares results in two different body sites of the dermis and in a different organ, the trachea. The authors refer to the two different sites of the dermis as two different organs, but the dermis is one organ. The authors report that vascular leakage is differentially affected by NRP1 targeting in the ear skin compared to the trachea and back skin. They attribute these differences to NRP1 presence in cells other than the vascular endothelium, especially in the ear skin, where they observe higher perivascular NRP1 staining.

      The manuscript states that the aim was to uncover the role of NRP1 in VEGF-mediated vascular permeability. This was misleading, because a lot is already known about NRP1 in this pathway, as is evidenced by the large number of publications the authors themselves quote (and sometimes misquote). The main information they wish to add is the possibility that NRP1 may also play a role in other cells to regulate permeability, as they previously suggested for blood vessel growth. Several technical issues and experimental limitations call into question whether the above conclusion can be reached with the data provided.

      Strengths:

      It is an interesting concept that NRP1 regulates vascular permeability by acting in perivascular cells.

      Weaknesses:

      (A) Technical limitations due to assay type:

      A direct comparison of the skin in two body sites is not warranted given that the authors used different methods to study the two sites. Below is a list of differences reported in their methods section:

      (A1) Different tracers were used to visualize VEGF165-induced leakage in different sites.<br /> Ear skin assay: 2 kDa FITC and two different dextrans, 10 kDa TRITC dextran, and another dextran whose molecular weight is not specified. It is not explained why 3 different tracers were used. Figures 1 and 2 report data with 2 kDa TRITC dextran.<br /> Back skin assay: They describe the Miles assay using Evans Blue, which binds to albumin, making it a 67 kDa tracer. However, Figure 1 suggests that 2 kDa dextran was used, and perhaps Evans Blue was only used for the supplemental data. This is relevant because current knowledge suggests that small dyes use the junctional pathway, whereas larger proteins such as albumin can use vesicular transport. The former is thought to be a fast pathway (hence, the authors measured dye extravasation 3 min after VEGF165 injection). The latter pathway is a slower one (hence, measured 30 min after VEGF165 injection in the Miles assay).

      Quantification: For ear skin, the number of leakage sites and lag period is quantified, as well as leakage over time. For back skin, the amount of extravasated dye is quantified at a fixed time point. Such different measurements do not allow for direct comparison.

      (A2) Mice were prepared in different ways for the different body sites studied:<br /> Ear skin assay: general anesthesia with ketamine-xylazine.<br /> Back skin assay: No anesthesia is described for the back skin Miles assay. This would be a concern because intradermal injections are considered to be painful. For back skin histology, they do report having used isoflurane anesthesia before perfusion fixation. However, it is not advisable to use isoflurane anesthesia for perfusion fixation if this has been done via the conventional cardiac route, because opening the chest cavity to access the heart for perfusion causes lung collapse, meaning that the mice cannot breathe the anaesthetic, and there is a risk of them regaining consciousness. The authors should clarify what exactly they have done, for ethical reasons and also because the type of anesthesia can affect vascular studies, for example, see PMID 36418078.

      (A3) Differential anti-histamine use:<br /> Back skin assay: uses anti-histamine, as is advised with intradermal injections to minimize vascular leakage due to histamine release after local trauma.<br /> Ear skin assay: no anti-histamine was used, so histamine-induced background leakage might have been present, independently of VEGF165. The authors suggest that the ear skin injection does not cause trauma, but it is unclear how this is possible, given that skin needs to be disrupted for the needle to enter the tissue.

      (A4) Different VEGF165 doses used:<br /> The ear skin assay uses 10 ng VEGF per injection, and the back skin assay 80 ng.

      Given all these differences in experimental protocols, as well as different knockdown efficiency (see below), the results for the different sites are not directly comparable. Hence it cannot presently be concluded that the role of NRP1 in both sites is different, and further work is required to make a firm conclusion. In addition, the conflicts between the reported methods and figures need to be resolved.

      (B) It is unclear whether appropriate controls were used:

      (B1) What genotype and treatment are the control mice for NRP1 targeting? The ideal control would be wild-type mice with the same CreER, injected with tamoxifen according to the same timeline, to account for vehicle, tamoxifen, and tamoxifen-induced CreER toxicity (https://doi.org/10.1038/s44161-022-00125-6). This could be a littermate mouse or, alternatively, a separate experiment should be shown comparing wild-type mice carrying the same CreER as used for the ablation studies and injected with tamoxifen, versus wild-type mice injected with tamoxifen, to demonstrate that the induction regime does not in itself cause phenotypes.

      (B2) Has a PBS injection been performed to compare baseline leakage between genotypes, independently of VEGF165 injections? This is an essential control.

      (B3) The experimental protocol assays 4 days after 5 consecutive tamoxifen injections, which does not allow much time for drug washout. Moreover, this is a lot of tamoxifen (80 mg x 5 = 400 mg tamoxifen per kg). Due to the possibility that tamoxifen-induced effects might still be present and cause sex-differential effects, the corresponding sex for each individual data point should be indicated in all graphs.

      (B4) i.p. peanut oil is used in undefined volumes; this vehicle was shown to cause inflammation if administered i.p. (PMID 33139505). Therefore, inflammation might be present, which might affect different body sites differently.

      (C) Validation of NRP1 targeting:<br /> The authors have not performed an NRP1 knockout in the endothelium, as they repeatedly claim. In the lung, there is a good knockdown of around 75%; this may or may not be due to complete EC knockdown with preservation of NRP1 in other cell types. In the trachea, ear skin, and back skin, knockdown was not quantified, although qualitative comparisons by NRP1 immunostaining in Supplementary Figure 1 suggest that the back skin targeting worked better than the ear skin targeting, which would confound results, but in any case, it was neither a knockdown nor knockout. The staining for global targeting looks fainter than for the other genotypes, and the single-channel images seem to have different intensities than the overlays in Supplementary Figure 1 A.

      (D) Systemic permeability studies:<br /> Organs have very different baseline permeability, due to the properties of the vascular barrier, i.e. tight barriers in the brain and retina and permeable endothelium in the liver and kidney. In this assay, VEGF is not delivered from the tissue side, as would be typical during inflammation but is delivered through the circulation, which has been shown to differentially affect the VEGF response, at least in some tissues (PMID 25175707). Nevertheless, this is a helpful readout, especially given that PBS controls appear not to have been performed above to establish baseline leakage between genotypes and tissues.

      Figure Supplement 3 shows that VEGF induces vascular leakage in all body sites examined, independently of the size of the tracer used, and agreeing with current literature. An additional set of panels should be included with data shown without calculating the fold change relative to the control, set to 1, to account for the endothelium in different organs having different baseline vascular permeability. How do the authors explain that VEGF has the same effect in the ear and back skin in this assay, when NRP1 is present, given that they claim a role for perivascular NRP1 in the ear, but not back skin, for reducing VEGF/VEGFR2 signalling?
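
      The request for non-normalized panels can be illustrated with a small worked example. The sketch below uses made-up tracer values (they are not taken from the manuscript) and a hypothetical helper, `summarize`, to show how two vascular beds with very different baseline leakage can report the same fold change once each control is set to 1.

```python
def summarize(baseline: float, stimulated: float) -> dict:
    """Return fold change and absolute increase for one tissue.

    Both inputs are tracer signal in arbitrary units; the values used
    below are illustrative assumptions, not measured data.
    """
    return {
        "fold_change": stimulated / baseline,
        "absolute_increase": stimulated - baseline,
    }


# Hypothetical tight-barrier tissue: low baseline leakage.
tight = summarize(baseline=10.0, stimulated=20.0)
# Hypothetical permeable tissue: tenfold higher baseline leakage.
leaky = summarize(baseline=100.0, stimulated=200.0)

for name, result in (("tight", tight), ("leaky", leaky)):
    print(name, result)
```

      In this toy case both tissues show a twofold change, yet the absolute increase differs tenfold, which is the information that is lost when only normalized data are plotted.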

      (E) Comparing results obtained with different tools:

      - The endothelial NRP1 knockdown yielded different results for ear and back skin.<br /> - Anti-NRP1 yielded similar results for ear and back skin.<br /> - The global NRP1 ko yielded similar results for ear and back skin.<br /> Because anti-NRP1 and the global NRP1 knockdown give similar results for all tissues, the authors deduce that NRP1 acts in cell types other than endothelial cells to regulate permeability. This is an interesting idea, based on the lab's prior work in angiogenesis. In their trans-interaction scenario, NRP1 would have the same role in ECs in all sites, but non-endothelial NRP1 can override the function of endothelial NRP1 depending on its expression levels.

      Confidence in this conclusion would require additional experiments:<br /> - Show that the endothelial knockdown works equally well in different body sites, via NRP1 staining and/or by checking recombination efficiency with a reporter.<br /> - Use an analogous assay to measure permeability in different body sites.<br /> - Perform a non-endothelial knockdown, i.e. in pericytes, which are hypothesized to be the source of NRP1 that affects vascular leakage signalling in endothelial cells in trans.

      (F) Abstract, introduction, and references:<br /> The authors suggest controversy with regard to NRP1's roles in permeability. However, NRP1's function in VEGF signalling has been defined as being an accessory to VEGFR2, with a role in promoting SFK activation. This function relies on the NRP1 cytoplasmic domain, which mediates VEGFR2 trafficking and signalling; the relevant literature for the NRP1 cytoplasmic domain is mentioned for arteriogenesis (PMID 23639442), but not permeability (PMID 28289053). Another paper is mentioned which describes a VEGFR2-independent pathway for a CendR ligand, but this prior study did NOT make the claim that VEGF signalling is NRP1-independent or promotes it (PMID 27117252). In the eye, NRP1 has been implicated in both SEMA3A and VEGF165-induced permeability, which was also corroborated by the Miles assay in two prior studies (PMID 18180379, PMID 28289053). The last sentence in the abstract is incorrect, because differences in ear versus back skin do not constitute organotypic difference (as the organ is the dermis), and the potential role of perivascular cells is only inferred from the global endothelial NRP1 knockdown, which gives the same result as reported for the endothelial NRP1 knockdown in the literature.

      (1) Lines 5/.53: The references for VEGF-NRP1 signalling in age-related macular degeneration are not helpful: Raimondi investigated VEGF-independent NRP1 pathways in angiogenesis, Fernandez-Robredo investigated NRP1 pathways in angiogenesis and showed that fewer vessels correlated with less leakage but did not test VEGF signaling specifically. A more suitable reference would have been PMID 28289053.

      (2) Lines 63/64 and repeated in 84-89: The references quoted all showed that NRP1 inhibition reduces vascular permeability, and therefore do not provide evidence for the idea that NRP1 inhibition promotes permeability, as the authors report here for the ear skin; the only study supporting them is one using arterial endothelial cells, which are not permeability-relevant.

      (3) Lines 106/107: The references used to underpin organ-specific barrier properties are correct, but as stated above, the dermis is the dermis, and therefore, these references would not be useful to provide support for the idea that the ear and back skin behave differently after NRP1 knockdown.

      (G) Additional comments on the figures:<br /> Figure 4: The authors show that VEGFR2 is essential for permeability, and VEGF164 effects are VEGFR2 dependent - this is well established for VEGF164 in the Miles assay, including the accessory role of NRP1 (e.g. PMID 28289053). As the proposed trans function of NRP1 cannot make a difference in VEGFR2 signaling when VEGFR2 is not there, this experiment is only confirmatory of prior VEGFR2 knowledge.

    3. Reviewer #2 (Public Review):

      The paper by Pal et al. examines the role of Nrp1 in organ-specific permeability response to VEGF. The subject is certainly interesting, but there are a number of significant methodological problems that make data evaluation rather problematic. In particular, lung endothelial cells are used to assess the effectiveness of Nrp1 knockout when experiments focus on different organs; small numbers of data points (as few as 2 or 3) are used to claim statistically significant differences; obvious data scatter is not commented on and seems ignored; key reagents (anti-Nrp1 Ab) are not well characterized; a proposed model is not verified in vitro, etc. Some of these issues are outlined in detail below, but the list of problems is much longer than this.

      (1) Intradermal injection of anti-Nrp1 Ab: I am puzzled by this experiment: Will Ab presence be limited locally or is there a systemic distribution? This needs to be verified.

      (2) What does anti-Nrp1 Ab actually do? Does it block VEGF binding? Does it induce Nrp1 and VEGFR2 endocytosis?

      (3) How does i.v. injection of anti-Nrp1 Ab affect permeability in different organs?

      (4) Effect of endothelial Nrp1KO: Since the authors examine organ-specific effects of Nrp1, it seems illogical to assess its expression in the lung as a measure of KO as KO efficiency may differ organ by organ. Immunocytochemistry is not particularly quantitative and prone to selection bias. I'd suggest using EC bulk RNAseq from different organs to confirm the magnitude of the knockout in different beds.

      (5) Figures 1B and 2B show profoundly different levels of Nrp1 KO in lung ECs. Were different mouse strains used in Figure 1 and Figure 2 experiments? This may well explain the differences the authors have observed.

      (6) Supplementary Figure 2: why is there no leakage of 10kD dextran in the heart in response to VEGF when there is an increase in the 70kD dextran leakage? That does not seem possible. Further, the authors observed no significant increase in 70kD dextran leakage after VEGF in the skeletal muscle. That also seems very unlikely and flies in the face of the experience of many labs in the field.

      (7) Since the authors think that peri-vascular cell Nrp1 expression accounts for organ-specific Nrp1 effects, this should be studied and examined in an in vitro co-culture model.

      (8) Quantification: many quantifications (of Nrp1 expression level, VE-cadherin Y685 phosphorylation, etc.) are done on the basis of immunocytochemistry. This really is not a quantitative technique and is prone to numerous artifacts. The data should at least be confirmed by whole-tissue Westerns. I am also puzzled by the small numbers of samples. If each dot on a graph represents an individual data point, how do the authors get a p<0.05 value with an N of 3? (for example Figure 5B, but there are other examples). Also, in Figure 4F data scatter is quite enormous. This is either an experimental problem or, more likely, there is a biological message here - the tissue is not uniform. In any case, I do not see how one gets a significant result here. Figures 5B and 5C have a similar problem while Figure 5D seems to be based on only two data points?
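
      The sample-size concern can be made concrete with a short calculation. The sketch below assumes a rank-based two-group comparison (an exact rank-sum permutation test without ties). The test actually used in the manuscript is not stated here, the function name `exact_rank_sum_p` is hypothetical, and the input values are illustrative only. Under that assumption, even perfectly separated groups of three cannot reach the conventional 0.05 threshold; a parametric test such as a t-test is not bound by this floor, but then rests on distributional assumptions that are hard to check with so few points.

```python
from itertools import combinations


def exact_rank_sum_p(group_a, group_b):
    """Two-sided exact permutation p-value for the rank-sum statistic.

    Assumes no tied values. Every way of assigning ranks to group A is
    equally likely under the null hypothesis, so the p-value is the
    fraction of assignments at least as extreme as the observed one.
    """
    pooled = sorted(list(group_a) + list(group_b))
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}
    n_a, n = len(group_a), len(pooled)
    expected = n_a * (n + 1) / 2  # mean rank sum of group A under the null
    observed_dev = abs(sum(rank_of[v] for v in group_a) - expected)
    rank_sums = [sum(c) for c in combinations(range(1, n + 1), n_a)]
    extreme = sum(1 for s in rank_sums if abs(s - expected) >= observed_dev - 1e-12)
    return extreme / len(rank_sums)


# Illustrative, fully separated groups (not data from the manuscript).
p_three = exact_rank_sum_p([1.0, 1.1, 1.2], [2.0, 2.1, 2.3])
p_four = exact_rank_sum_p([1.0, 1.1, 1.2, 1.3], [2.0, 2.1, 2.3, 2.4])
print(p_three, p_four)
```

      With three points per group there are only 20 possible rank assignments, so the smallest attainable two-sided p-value is 2/20 = 0.1; with four per group it drops to 2/70.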

    4. Reviewer #3 (Public Review):

      Summary:

      Pal et al. provide valuable evidence supporting distinct vascular bed-specific VEGF-A mediated vascular permeability function of Neuropilin-1 (NRP1) in adult mice. Using a suite of genetic mouse models and state-of-the-art vascular permeability assays, the authors demonstrate that ear skin vasculature of EC-specific NRP1 adult knockout mice is hypersensitive to VEGF-A mediated high-molecular weight dye leakage from venules, as opposed to back skin and tracheal vasculature where EC-specific NRP1 loss had a more classical negative effect on permeability. Interestingly, both whole organism KO of NRP1 and a blocking antibody treatment attenuated VEGF-A mediated permeability in ear skin and had the usual attenuation of permeability phenotype in back skin and tracheal vasculature. Using a pericyte promoter-specific reporter mouse line, the authors characterize NRP1 expression in the vascular beds of the ear dermis and back skin and conclude that NRP1 expression is higher in perivascular cells in the ear dermis as opposed to back skin vasculature, thus indicating a juxtacrine NRP1-VEGFR2 signaling model in adult mice. Further, they use a homozygous Vegfr2 phosphosite mutant mouse model in the background of NRP1 iECKO to find that the hypersensitivity to VEGF-A stimulation in ear skin is abrogated and therefore prove the juxtacrine NRP1 control of VEGFR2 mediated downstream signaling leading to vascular permeability. Further, they successfully show distinctive vascular bed-specific results as above using a well-characterized VE-Cadherin Y685 antibody staining which corresponds to vascular leakage downstream of VEGF-A/VEGFR2 signaling in ear dermis and back skin vascular beds.

      Strengths:

      The question of the in vivo role of NRP1 in VEGF-A-induced hyper-permeability is an unresolved one and the elegant use of genetic mouse models to demonstrate the phenotypes is valuable to the field. The organotypic differences observed in vascular permeability upon VEGF-A treatment in ear skin versus back skin and tracheal vasculature are solid. The subsequent investigation to validate heightened VEGFR2 signaling in ear dermis downstream of VEGF-A stimulation using Vegfr2 Y949F mice, VEC Y685 antibody, and pPLCγ antibody is also very convincing.

      Weaknesses:

      The mechanism proposed by the authors by which EC-specific loss of NRP1 caused hypersensitivity to VEGF-A in ear dermis is through elevated juxtacrine signaling: NRP1 expressed in pericytes binds VEGFR2 in trans and retains it on the cell surface of ECs, sustaining downstream signaling for a longer time. This is consistent with earlier findings in Koch et al., 2014, where NRP1 was studied in the context of tumor angiogenesis. To support their claim, the authors stain the ear dermis and back skin vasculature of Pdgfrb-GFP reporter mice with NRP1 and CD31 antibodies and find that ear skin vasculature has higher NRP1 expression in perivascular cells than back skin vasculature. While this is a good experiment to prove the above point, there are no functional experiments to support this model.

      Overall, although the paper presents very useful findings in the field of NRP1-VEGFR2 biology, and most of the conclusions are well supported by the data, there are a few points that, if addressed, could significantly substantiate the model of juxtacrine signaling proposed by the authors. They are:

      (1) It will be important to know if the perivascular to vascular NRP1 expression (such as in Figure 3B) increases further in ear skin vasculatures of NRP1 iECKO mice compared to otherwise WT mice.

      (2) Does knocking out NRP1 in pericytes attenuate the VEGF-A mediated hyperpermeability observed in ear skin of NRP1 iECKO mice (similar to experiments in 1C, 2C)?

      (3) What is the status of VEGFR2 expression in ECs of ear skin and back skin of NRP1 iECKO and NRP1 iKO mice? This experiment is a proof-of-concept and is not essential to prove the point of juxtacrine NRP1 signaling since downstream readouts - pPLCγ and VEC Y685 staining have already been shown to correlate in the ear dermis.

    1. eLife assessment

      This important study uses cellular automata and evolution algorithms to offer an alternative to long-range signalling models of developmental patterning. The computational evidence that local rules suffice to produce a robust and global pattern is convincing. With some additional insights that connect the theoretical findings back to real biological examples, this work could be of interest to the broad community of developmental and systems biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      In this article, Kremser et al. set out to explore how local interactions between cells can drive pattern formation by focusing on the French flag problem whereby an initially homogeneous system breaks axial symmetry to form three distinct regions of different cell fates. The authors use a cellular automata model together with evolution searches on possible rules that determine cell state and tissue level patterning. It is assumed that three cell states are possible and that at each time iteration each cell updates its fate according to the current state of itself and its neighbours. The authors use a computational procedure based on evolution algorithms to identify "fit" update rules that can successfully drive patterning into three distinct domains and go on to provide insights with regards to the function of these rules as well as their properties such as robustness and patterning dynamics. The article is generally well-written, the results seem solid, and the analysis and methods are thorough and generally well-explained. A main concern is the lack of connection between the biology that motivated the analysis and the results. This could be improved in the discussion by making the methods somewhat more concise to allow space to make links back to potential biological mechanisms when the results are presented. We raise some general points and some more specific questions and suggestions for clarification below that we hope will help improve the MS and make it more accessible to a wider audience.
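
      For readers unfamiliar with cellular automata, the kind of model summarised above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the boundary handling (edge cells see a copy of themselves as the missing neighbour), the fitness measure `flag_fitness`, and all function names are hypothetical choices made for this example.

```python
import random
from itertools import product

N_STATES = 3  # three cell fates, as in the French flag problem


def random_rule(rng):
    """A rule is a lookup table with 3**3 = 27 entries:
    (left, self, right) -> new state."""
    return {nbhd: rng.randrange(N_STATES)
            for nbhd in product(range(N_STATES), repeat=3)}


def step(cells, rule):
    """Synchronously update every cell from itself and its two neighbours.

    Assumption: at the tissue edges the missing neighbour is replaced
    by a copy of the edge cell.
    """
    padded = [cells[0]] + list(cells) + [cells[-1]]
    return [rule[(padded[i - 1], padded[i], padded[i + 1])]
            for i in range(1, len(padded) - 1)]


def run(cells, rule, n_steps):
    """Apply the same local rule n_steps times."""
    for _ in range(n_steps):
        cells = step(cells, rule)
    return cells


def flag_fitness(cells):
    """Hypothetical fitness: fraction of cells matching a 0|1|2 pattern
    split into equal thirds."""
    n = len(cells)
    target = [min(N_STATES - 1, (i * N_STATES) // n) for i in range(n)]
    return sum(c == t for c, t in zip(cells, target)) / n


rng = random.Random(0)
rule = random_rule(rng)
start = [rng.randrange(N_STATES) for _ in range(30)]
print(flag_fitness(run(start, rule, 50)))
```

      An evolutionary search in this setting would repeatedly mutate entries of the 27-element table and keep the rules whose final patterns score highest, which is where the question of how cells could implement a given table entry becomes biologically relevant.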

      General points:

      • Although the authors motivate their work on the premise that biological patterns at the tissue level often are driven by local cell-cell interactions, by the end of the analysis any possible connection to the underlying biology is lost. For example, it would have been useful to discuss how the rules that evolved to dominate the patterning process in the results section could be implemented by cells. Is there a connection that could be made back to Notch signalling and its multiple ligands or to morphogens that diffuse only locally? Would the large number of rules possible in the cellular automata context reflect transcriptional feedback? This is an important point to bring the work "home". At the moment, it feels like a nice computational analysis of cellular automata but the links to the systems that motivate the work are lost in the process.

      • When growth is considered (p.14-15) a discussion of timescales seems pertinent. Often patterning takes place at a timescale faster than cell division so the system could be allowed to reach a steady state before a new division event takes place. What are the time scales of updating the phenotype compared with the time scales of division in the model and in relevant biological systems? How would different limiting cases impact conclusions, e.g. new cells added and pattern allowed to reach steady state before more growth versus cells added while patterning dynamics are still updating?

      • An interesting question is whether certain elements of rules (out of the 27 possible elements for the system with 3 states) are more or less likely to appear together in an evolved final rule. This may give a mechanistic understanding of what combinations of elements are likely to drive the optimal pattern and which combinations are avoided altogether.

    3. Reviewer #2 (Public Review):

      Summary:

      In this paper, the authors seek to identify strategies that can be used to generate robust one-dimensional large-scale patterns through the sequential application of only local, unchanging, space-independent rules. This is an important general question in developmental biology.

      Strengths:

      The authors do a nice job of laying out the problem, which they explore through cellular automaton (CA) modeling. The modeling framework is well described, as are the methods used for computational identification of effective (most "fit") strategies. As many biologists are unfamiliar with CA models, the clarity of description offered by these authors is especially important, as is the attention that was paid to useful visualization of results.

      Ultimately, the authors use their approach to converge on certain generic strategies for achieving robust patterns. In the case when there are only three states (no hidden or transient states) available to cells, they interpret the consensus strategy that emerges as a combination of "sorting" and "bulldozer" modules, which are relatively easy to rationalize. In cases involving a fourth state, a more complicated set of strategies arises and is considered.

      As a pure modeling paper, I find the work to be very well done, and the conclusions are well supported by the data and analyses. In terms of the long-term importance of this approach to biologists studying pattern formation, I see this paper as primarily laying a foundation for taking the next step, which is moving into two (or three) dimensions. Clearly, the complexity of rules becomes much greater, but one may expect some big qualitative differences to show up in higher dimensions, where simple strategies like sorting and bulldozing cannot work quite as simply. It will be interesting to see where this leads.

      Weaknesses:

      Ultimately, the relevance of this work to biology rests with its ability to provide insight into important biological problems. In terms of explaining the challenging nature of generating long-range patterns using short-range rules, I think the authors do a good job. However, they could do a better job of relating the results of the work back to biology. For example, are there examples of "sorting module" and "bulldozer module" behavior in biology? Could they be involved in explaining actual biological patterns?

      It also would have been helpful for the authors to generalize more about how the way in which their CA rules achieve global patterns compares with other patterning mechanisms. For example, in a Wolpert positional information model, patterning information is distributed over space in a steady-state gradient. In the CA model, no information spreads more than one cell at any one time point, but over time information still spreads, so in a sense a stationary spatial gradient has been traded for a moving spatial discontinuity. Because the discontinuity moves without decrement, any stationary state ends up being determined by the boundaries of the system, which goes a long way to explaining the robustness they observe, as well as why the result is quite sensitive to growth (which keeps changing the boundary).

    1. eLife assessment

      This is a binocular rivalry study that uses ECG to present visual stimuli pulsing in line with cardiac events, to examine whether systole-entrained stimuli (i.e. presented during the period where the heart has contracted) are suppressed within visual awareness. Arguably out of line with this idea, the dominance durations were increased for systole-entrained stimuli. The manuscript addresses an important, precisely defined, and theoretically well-motivated question using sophisticated experimental and statistical methods. The interpretation of these results is not straightforward, however, such that they currently only provide incomplete support for the claims.

    2. Reviewer #1 (Public Review):

      Summary:

      The aim of the study described in this paper was to test whether visual stimuli that pulse synchronously with the systole phase of the cardiac cycle are suppressed compared with stimuli that pulse in the diastole phase. To this end, the authors employed a binocular rivalry task and used the duration of the perceived image as the metric of interest. The authors predicted that if there was global suppression of the visual stimulus during systole, then the dominance durations of the stimulus pulsing synchronously with systole should be shorter than those of the stimulus pulsing in diastole. However, the results observed were the opposite of those predicted. The authors speculate on what this facilitation effect might mean for the baroreceptor suppression hypothesis.

      Strengths:

      This is an interesting and timely study that uses a clever paradigm to test the baroreceptor suppression hypothesis in vision. This is a refreshingly focussed paper with interesting and seemingly counterintuitive results.

      Weaknesses:

      The paper could benefit from a clearer explanation of the predicted results. For those not experts in binocular rivalry, it would be useful to explain the predicted results. Does pulsing stimuli in this way change durations in such a task? If there is global suppression of visual stimuli why would this lead to shorter/longer durations in the systole compared to the diastole conditions? In addition, the duration lengths in both conditions seem to be longer than one cardiac cycle. If the cardiac cycle modulates duration it would be interesting to discuss why this occurs on some cycles but not on others. If there is a facilitation effect why does it only occur on some cycles?

    3. Reviewer #2 (Public Review):

      Summary:

      This is a binocular rivalry study that uses electrocardiogram events to modulate visual stimuli in real-time, relative to participants' heartbeats. The main finding is that modulations during the period around when the heart has contracted (systole) increase rivalry dominance durations. This is a really neat result, that demonstrates the link between interoception and vision. I thought the Bayesian mixture modelling was a really smart way to identify cardiac non-perceivers, and the finding that the main result is preserved in this group is compelling. Overall, the study has been conducted to a high standard, is appropriately powered, and reported clearly. I have one suggestion about interpretation, which concerns the explanation of increased dominance durations with reference to contemporary models of binocular rivalry, and a few minor queries. However, I think this paper is a worthwhile addition to the literature.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript addresses a question inspired by the Baroreceptor Hypothesis and its links to visual awareness and interoception. Specifically, the reported study aimed to determine if the effects of cardiac contraction (systole) on binocular rivalry (BR) are facilitatory or suppressive. The main experiment - relying on a technically challenging procedure of presenting stimuli synchronised with the heartbeats of participants - has been conducted with great care, and numerous manipulation checks the authors report convincingly show that the methods they used work as intended. Moreover, the control experiment allows for excluding alternative explanations related to participants being aware of their heartbeats. Therefore, the study convincingly shows the effect of cardiac activity on BR - and this is an important finding. The results, however, do not allow for unambiguously determining if this effect is facilitatory or suppressive (see details below), which renders the study not as informative as it could be.

      While the authors strongly focus on interoception and awareness, this study will be of interest to researchers studying BR as such. Moreover, the code and the data the authors share can facilitate the adoption of their methods in other labs.

      Strengths:

      (1) The study required a complex technical setup and the manuscript both describes it well and demonstrates that it was free from potential technical issues (e.g. in section 3.3. Manipulation check).

      (2) The sophisticated statistical methods the authors used, at least for a non-statistician like me, appear to be well-suited for their purpose. For example, they take into account the characteristics of BR (gamma distributions of dominance durations). Moreover, the authors demonstrate that at least in one case their approach is more conservative than a more basic one (Binomial test) would be.

      (3) Finally, the control experiment, and the analysis it enabled, allow for excluding a multitude of alternative explanations of the main results.

      (4) The authors share all their data and materials, even the code for the experiment.

      (5) The manuscript is well-written. In particular, it introduces the problem and methods in a way that should be easy to understand for readers coming from different research fields.

      Weaknesses:

      (1) The interpretation of the main result in the context of the Baroreceptor Hypothesis is not clear. The manuscript states: "The Baroreceptor Hypothesis would predict that the stimulus entrained to systole would spend more time suppressed and, conversely, less time dominant, as cortical activity would be suppressed each time that stimulus pulses." The manuscript does not specify why this should be the case, and the term 'entrained' is not too helpful here (does it refer to neural entrainment? or to 'being in phase with'?). The answer to this question is provided by the manuscript only implicitly, and, to explain my concern, I try to spell it out here in a slightly simplified form.

      During systole (cardiac contraction), the visual system is less sensitive to external information, so it 'ignores' periods when the systole-synchronised stimulus is at the peak of its pulse. Conversely, the system is more sensitive during diastole, so the stimulus that is at the peak of its pulse then should dominate for longer, because its peaks are synchronised with the periods of the highest sensitivity of the visual system when the information used to resolve the rivalry is sampled from the environment. This idea, while indeed being a clever test of the hypothesis in question, rests on one critical assumption: that the peak of the stimulus pulse (as defined in the manuscript) is the time when the stimulus is the strongest for the visual system. The notion of 'stimulus strength' is widely used in the BR literature (see Brascamp et al., 2015 for a review). It refers to the stimulus property that, simply speaking, determines its tendency to dominate in BR. The strength of a stimulus is underpinned by its low-level visual properties, such as contrast and spatial frequency content. Coming back to the manuscript, the pulsing of the stimuli affected at least spatial frequency (and likely other low-level properties), and it is unknown if it was in phase with the pulsing of the stimulus strength, or not. If my understanding of the premise of the study is correct, the conclusions drawn by the authors stand only if it was.

      In other words, most likely the strength of one of the stimuli was pulsating in sync with the systole, but it is not clear which stimulus it was. It is possible that, for the visual system, the stimulus meant to pulse in sync with the systole was pulsing strength-wise in phase with the diastole (and the one intended to pulse in sync with the diastole strength-wise pulsed with the systole). If this is the case, the predictions of the Baroreceptor Hypothesis hold, which would change the conclusion of the manuscript.

      (2) Using anaglyph goggles necessitates presenting stimuli of a different colour to each eye. The way in which different colours are presented can impact stimulus strength (e.g. consider that different anaglyph foils can attenuate the light they let through to different degrees). To deal with such effects, at least some studies on BR employed procedures of adjusting the colours for each participant individually (see Papathomas et al., 2004; Patel et al., 2015 and works cited there). While I think that the counterbalancing applied in the study excludes the possibility that colour-related effects influenced the results, the effects of interest still could be stronger for one of the coloured foils.

      (3) Several aspects of the methods (e.g. the stimuli) are not described at the level of detail some readers might be accustomed to. The most important issue here is the task the participants performed. The manuscript says that they pressed a button whenever they experienced a switch in perception, but it is only implied that there were different buttons for each stimulus.

      Brascamp, J. W., Klink, P. C., & Levelt, W. J. M. (2015). The 'laws' of binocular rivalry: 50 years of Levelt's propositions. Vision Research, 109, 20-37. https://doi.org/10.1016/j.visres.2015.02.019

      Papathomas, T. V., Kovács, I., & Conway, T. (2004). Interocular grouping in binocular rivalry: Basic attributes and combinations. In D. Alais & R. Blake (Eds.), Binocular Rivalry (pp. 155-168). MIT Press.

      Patel, V., Stuit, S., & Blake, R. (2015). Individual differences in the temporal dynamics of binocular rivalry and stimulus rivalry. Psychonomic Bulletin and Review, 22(2), 476-482. https://doi.org/10.3758/s13423-014-0695-1

    1. eLife assessment

      This fundamental work unravels how female Drosophila can assess their social context via chemosensory cues and modulate the sperm storage process after copulation accordingly. A convincing set of rigorous experiments uncovers specific pheromones that influence the excitability of the female brain receptivity circuit and their propensity to discard inseminate from a mating. This insight into neuronal mechanisms of sexual behavior plasticity is of general interest to scientists working in the fields of animal behavior, neuroscience, evolution, and sexual selection, as well as insect chemosensation and reproduction.

    2. Reviewer #1 (Public Review):

      Yun et al. examined the molecular and neuronal underpinnings of changes in Drosophila female reproductive behaviors in response to social cues. Specifically, the authors measure the ejaculate-holding period, which is the amount of time females retain male ejaculate after mating (typically 90 min in flies). They find that female fruit flies, Drosophila melanogaster, display shorter holding periods in the presence of a native male or male-associated cues, including 2-Methyltetracosane (2MC) and 7-Tricosene (7-T). They further show that 2MC functions through Or47b olfactory receptor neurons (ORNs) and the Or47b channel, while 7-T functions through ppk23 expressing neurons. Interestingly, their data also indicate that two other olfactory ligands for Or47b (methyl laurate and palmitoleic acid) do not have the same effects on the ejaculate-holding period. By performing a series of behavioral and imaging experiments, the authors reveal that an increase in cAMP activity in pC1 neurons is required for this shortening of the ejaculate-holding period and may be involved in the likelihood of remating. This work lays the foundation for future studies on sexual plasticity in female Drosophila.

      The conclusions of this paper are mostly supported by the data, but aspects of the lines used for individual pC1 subtypes and visual contributions as well as the statistical analysis need to be clarified.

      (1) The pC1 subtypes (a - e) are delineated based on their morphology and connectivity. While the morphology of these neurons is distinct, they do share a resemblance that can be difficult to discern depending on the imaging performed. Additionally, genetic lines attempting to label individual neurons can easily be contaminated by low-level expression in off-target neurons in the brain or ventral nerve cord (VNC), which could contribute to behavioral changes following optogenetic manipulations. In Figures 5C - D, the authors generated and used new lines for labeling pC1a and pC1b+c. The line for pC1b+c was imaged as part of another recent study (https://doi.org/10.1073/pnas.2310841121). However, similar additional images of the pC1a line (i.e. 40x magnification and VNC expression) would be helpful in order to validate its specificity.

      (2) The authors' experiments examining olfactory and gustatory contributions to the holding period were well controlled and described. However, the experiments in Figure 1D examining visual contributions were not sufficiently convincing as the line used (w1118) has previously been shown to be visually impaired (Wehner et al., 1969; Kalmus, 1948). Using another wild-type line would have improved the authors' claims.

      (3) When comparisons between more than 2 groups are shown, as in Figures 1E, 3D, and 5E, the comparisons being made were not clear. Adding the results of a nonparametric multiple comparisons test would help with the interpretation of these results.

    3. Reviewer #2 (Public Review):

      The work by Yun et al. explores an important question related to post-copulatory sexual selection and sperm competition: Can females actively influence the outcome of insemination by a particular male by modulating the storage and ejection of transferred sperm in response to contextual sensory stimuli? The present work is exemplary for how the Drosophila model can give detailed insight into the basic mechanism of sexual plasticity, addressing the underlying neuronal circuits on a genetic, molecular, and cellular level.

      Using the Drosophila model, the authors show that the presence of other males or mated females after mating shortens the ejaculate-holding period (EHP) of a female, i.e. the time she takes until she ejects the mating plug and unstored sperm. Through a series of thorough and systematic experiments involving the manipulation of olfactory and chemo-gustatory neurons and genes in combination with exposure to defined pheromones, they uncover two pheromones and their sensory cells for this behavior. Exposure to the male-specific pheromone 2MC shortens EHP via female Or47b olfactory neurons, and the contact pheromone 7-T, present in males and on mated females, does so via ppk23 expressing gustatory foreleg neurons. Both compounds increase cAMP levels in a specific subset of central brain receptivity circuit neurons, the pC1b,c neurons. By employing an optogenetically controlled adenyl cyclase, the authors show that increased cAMP levels in pC1b and c neurons increase their excitability upon male pheromone exposure, decrease female EHP, and increase the remating rate. This provides convincing evidence for the role of pC1b,c neurons in integrating information about the social environment and mediating not only virgin but also mated female post-copulatory mate choice.

      Understanding context and state-dependent sexual behavior is of fundamental interest. Mate behavior is highly context-dependent. In animals subjected to sperm competition, the complexities of optimal mate choice have attracted a long history of sophisticated modelling in the framework of game theory. These models are in stark contrast to how little we understand so far about the biological and neurophysiological mechanisms of how females implement post-copulatory or so-called "cryptic" mate choice and bias sperm usage when mating multiple times.

      The strength of the paper is decrypting "cryptic" mate choice, i.e. the clear identification of physiological mechanisms and proximal causes for female post-copulatory mate choice. The discovery of peripheral chemosensory nodes and neurophysiological mechanisms in central circuit nodes will provide a fruitful starting point to fully map the circuits for female receptivity and mate choice during the whole gamut of female life history.

    1. eLife assessment

      This study presents valuable findings relevant to research on olfactory neurogenesis and long-term adaptation. The evidence, at this stage, is incomplete. First, the effects described could, in part, also be attributed to "downregulation" of OR subtype-specific neurogenesis upon sensory deprivation, instead of selectively increased neurogenesis. Second, additional control experiments would be needed to support the main claims and rule out alternative explanations.

    2. Reviewer #1 (Public Review):

      Summary:

      Olfactory sensory neurons (OSNs) in the olfactory epithelium detect myriads of environmental odors that signal essential cues for survival. OSNs are born throughout life and thus represent one of the few neurons that undergo life-long neurogenesis. Until recently, it was assumed that OSN neurogenesis is strictly stochastic with respect to subtype (i.e. the receptor the OSN chooses to express).

      However, a recent study showed that olfactory deprivation via naris occlusion selectively reduced birthrates of only a fraction of OSN subtypes and indicated that these subtypes appear to have a special capacity to undergo changes in birthrates in accordance with the level of olfactory stimulation. These previous findings raised the interesting question of what type of stimulation influences neurogenesis, since naris occlusion does not only reduce the exposure to potentially thousands of odors but also to more generalized mechanical stimuli via preventing airflow.

      In this study, the authors set out to identify the stimuli that are required to promote the neurogenesis of specific OSN subtypes. Specifically, they aim to test the hypothesis that discrete odorants selectively stimulate the same OSN subtypes whose birthrates are affected. This would imply a highly specific mechanism in which exposure to certain odors can "amplify" OSN subtypes responsive to those odors suggesting that OE neurogenesis serves, in part, an adaptive function.

      To address this question, the authors focused on a family of OSN subtypes that had previously been identified to respond to musk-related odors and that exhibit higher transcript levels in the olfactory epithelium of mice exposed to males compared to mice isolated from males. First, the authors confirm via a previously established cell birth dating assay in unilateral naris occluded mice that this increase in transcript levels actually reflects a stimulus-dependent birthrate acceleration of this OSN subtype family. In a series of experiments using the same assay, they show that one specific subtype of this OSN family exhibits increased birthrates in response to juvenile male exposure while a different subtype shows increased birthrates to adult mouse exposure. In the core experiment of the study, they finally exposed naris occluded mice to a discrete odor (muscone) to test if this odor specifically accelerates the birth rates of OSN types that are responsive to this odor. This experiment reveals a complex relationship between birth rate acceleration and odor concentrations showing that some muscone concentrations affect birth rates of some members of this family and do not affect two unrelated OSN subtypes.

      Strengths:

      The scientific question is valid and opens an interesting direction. The previously established cell birth dating assay in naris occluded mice is well performed and accompanied by several control experiments addressing potential other interpretations of the data.

      Weaknesses:

      (1) The main research question of this study was to test if discrete odors specifically accelerate the birth rate of OSN subtypes they stimulate, i.e. does muscone only accelerate the birth rate of OSNs that express muscone-responsive ORs, or vice versa is the birthrate of muscone-responsive OSNs only accelerated by odors they respond to?

      This question is only addressed in Figure 5 of the manuscript and the results only partially support the above claim. The authors test one specific odor (muscone) and find that this odor (only at certain concentrations) accelerates the birth rate of some musk-responsive OSN subtypes, but not two other unrelated control OSN subtypes. This does not at all show that musk-responsive OSN subtypes are only affected by odors that stimulate them and that muscone only affects the birthrate of musk-responsive OSNs, since first, only the odor muscone was tested and second, only two other OSN subtypes were tested as controls, that, importantly, are shown to be generally stimulus-independent OSN subtypes (see Figure 2 and S2).

      As a minimum, the authors should have a) tested whether additional odors that do not activate the three musk-responsive subtypes affect their birthrates and b) chosen 2-3 additional control subtypes that are known to be stimulus-dependent (from their own 2020 study) and tested whether muscone affects their birthrates.

      (2) The finding that Olfr1440-expressing OSNs do not show any increase in UNO effect size under any muscone concentration (Figure 5D, no significance in line graph for UNO effect sizes, middle) seems to contradict the main claim of this study that certain odors specifically increase birthrates of OSN subtypes they stimulate. It was shown in several studies that Olfr1440 is seemingly the most sensitive OR for muscone, yet, in this study, muscone does not further increase birthrates of OSNs expressing Olfr1440. The effect size on birthrate under muscone exposure is the same as without muscone exposure (0%).

      In contrast, the supposedly second most sensitive muscone-responsive OR, Olfr235, shows a significant increase in UNO effect size between no muscone exposure (0%) and 0.1% as well as 1% muscone.

      (3) The authors justify their choice to study this particular family of OSN subtypes with, first, the previous finding that transcripts for one of these musk-responsive subtypes (Olfr235) are downregulated in mice that are deprived of male odors and, second, the observation that musk-related odors are found in the urine of different species. This gives the misleading impression that it is known that musk-related odors are indeed excreted into male mouse urine at certain concentrations. This should be stated more clearly in the introduction (or cited, if indeed data exist that show musk-related odors in male mouse urine) because this would be a very important point from an ethological and mechanistic point of view.

      In addition, this would also be important information to assess if the chosen muscone concentrations fall at all into the natural range.

      Related: If these are male-specific cues, it is interesting that changes in OR transcripts (Figure 1) can already be seen at the age of P28 where other male-specific cues are just starting to get expressed. This should be discussed.

      (4) Figure 5: Under muscone exposure, the number of newborn neurons on the closed side fluctuates considerably. This doesn't seem to be the case in other experiments and raises some concerns about how reliably naris occlusion works under strong exposure to monomolecular odors, or what other potential mechanisms are at play.

      (5) In contrast to all other musk-responsive OSN types, the number of newborn OSNs expressing Olfr1437 increases on the closed side of the OE relative to the open side in UNO-treated male mice (Figure 1). This seems to contradict the presented theory and also does not align with the bulk RNAseq data (Figure S1).

      (6) The authors hypothesize in relation to the accelerated birthrate of musk-responsive OSN subtypes that "the acceleration of the birthrates of specific OSN subtypes could selectively enhance sensitivity to odors detected by those subtypes by increasing their representation within the OE". However, for two other OSN subtypes that detect male-specific odors, they hypothesize the opposite: "By contrast, Olfr912 (Or8b48) and Olfr1295 (Or4k45), which detect the male-specific non-musk odors 2-sec-butyl-4,5-dihydrothiazole (SBT) and (methylthio)methanethiol (MTMT), respectively, exhibited lower representation and/or transcript levels in mice exposed to male odors, possibly reflecting reduced survival due to overstimulation."

      Without any further explanation, it is hard to comprehend why exposure to male-derived odors should, on the one hand, accelerate birthrates in some OSN subtypes to potentially increase sensitivity to male odors but, on the other hand, lower transcript levels and fail to accelerate birthrates of other OSN subtypes due to overstimulation.

    3. Reviewer #2 (Public Review):

      In their paper entitled "In mice, discrete odors can selectively promote the neurogenesis of sensory neuron subtypes that they stimulate" Hossain et al. address lifelong neurogenesis in the mouse main olfactory epithelium. The authors hypothesize that specific odorants act as neurogenic stimuli that selectively promote biased OR gene choice (and thus olfactory sensory neuron (OSN) identity). Hossain et al. employ RNA-seq and scRNA-seq analyses for subtype-specific OSN birthdating. The authors find that exposure to male and musk odors accelerates the birthrates of the respective responsive OSNs. Therefore, Hossain et al. suggest that odor experience promotes selective neurogenesis and, accordingly, OSN neurogenesis may act as a mechanism for long-term olfactory adaptation.

      The authors follow a clear experimental logic, based on sensory deprivation by unilateral naris occlusion, EdU labeling of newborn neurons, and histological analysis via OR-specific RNA-FISH. The results reveal robust effects of deprivation on newborn OSN identity. However, the major weakness of the approach is that the results could, in (possibly large) parts, depend on "downregulation" of OR subtype-specific neurogenesis, rather than (only) "upregulation" based on odor exposure. While, in Figure 6, the authors show that the observed effects are, in part, mediated by odor stimulation, it remains unclear whether deprivation plays an "active" role as well. Moreover, as shown in Figure 1C, unilateral naris occlusion has both positive and negative effects in a random subtype sample.

      Another weakness is that the authors build their model (Figure 8), specifically the concept of selectivity, on a receptor-ligand pair (Olfr912, which has been shown to respond, among other odors, to the male-specific non-musk odor 2-sec-butyl-4,5-dihydrothiazole (SBT)) that would require at least some independent experimental corroboration. At least, a control experiment that uses SBT instead of muscone exposure should be performed. In this context, it is somewhat concerning that some results, which appear counterintuitive (e.g., lower representation and/or transcript levels of Olfr912 and Olfr1295 in mice exposed to male odors) are brushed off as "reflecting reduced survival due to overstimulation." The notion of "reduced survival" could be tested by, for example, a caspase3 assay.

      Important analyses that need to be done to better interpret the findings are to present (i) the OR+/EdU+ population of olfactory sensory neurons not just as a count per hemisection, but rather as the ratio of OR+/EdU+ cells among all EdU+ cells; and (ii) the ratio of EdU+ cells among all nuclei (UNO versus open naris). This way, data would be normalized to (i) the overall rate of neurogenesis and (ii) any broad deprivation-dependent epithelial degeneration.
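      The two normalizations requested above can be illustrated with a minimal sketch. The function name and all cell counts below are hypothetical and serve only to show the arithmetic; they are not taken from the manuscript.

```python
def normalized_fractions(or_edu_cells, edu_cells, total_nuclei):
    """Return the two ratios suggested in the review.

    or_edu_cells: cells positive for both the OR probe and EdU
    edu_cells:    all EdU+ (newborn) cells in the same section
    total_nuclei: all nuclei counted in the same section
    """
    if edu_cells <= 0 or total_nuclei <= 0:
        raise ValueError("EdU+ and total nuclei counts must be positive")
    if or_edu_cells > edu_cells or edu_cells > total_nuclei:
        raise ValueError("counts must be nested: OR+/EdU+ <= EdU+ <= nuclei")
    # (i) subtype share of newborn cells: controls for overall neurogenesis
    subtype_share = or_edu_cells / edu_cells
    # (ii) newborn share of all nuclei: controls for epithelial degeneration
    newborn_share = edu_cells / total_nuclei
    return subtype_share, newborn_share


# Hypothetical counts for one hemisection pair (assumed for illustration)
open_side = normalized_fractions(12, 400, 8000)
closed_side = normalized_fractions(3, 300, 8000)
print("open:  ", open_side)
print("closed:", closed_side)
```

      Reported this way, a drop in raw OR+/EdU+ counts on the closed side can be separated from a general reduction in the number of newborn cells.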

      Finally, the paper will benefit from improved data presentation and adequate statistical testing. Images in Figures 2 - 7, showing both EdU labeling of newborn neurons and OR-specific RNA-FISH, are hard to interpret. Moreover, t-tests should not be employed when data is not normally distributed (as is the case for most of their samples).

    4. Reviewer #3 (Public Review):

      Summary:

      Neurogenesis in the mammalian olfactory epithelium persists throughout the life of the animal. The process replaces damaged or dying olfactory sensory neurons. It has been tacitly assumed that replacement of the OR subtypes is stochastic, although anecdotal evidence has suggested that this may not be the case. In this study, Santoro and colleagues systematically test this hypothesis by answering three questions: is there enrichment of specific OR subtypes associated with neurogenesis? Is the enrichment dependent on sensory stimulus? Is the enrichment the result of differential generation of the OR type or from differential cell death regulated by neural activity? The authors provide some solid evidence indicating that musk odor stimulus selectively promotes the OR types expressing the musk receptors. The evidence argues against a random selection of ORs in the regenerating neurons.

      Strengths:

      The strength of the study is a thorough and systematic investigation of the expression of multiple musk receptors with unilateral naris occlusion or under different stimulus conditions. The controls are properly performed. This study is the first to formulate the selective promotion hypothesis and the first systematic investigation to test it. The bulk of the study uses in situ hybridization and immunofluorescent staining to estimate the number of OR types. These results convincingly demonstrate the increased expression of musk receptors in response to male odor or muscone stimulation.

      Weaknesses:

      A major weakness of the current study is the single-cell RNASeq result. The authors use this piece of data as a broad survey of receptor expression in response to unilateral nasal occlusion. However, several issues with this data raise serious concerns about the quality of the experiment and the conclusions. First, the proportion of OSNs, including both the immature and mature types, constitutes only a small fraction of the total cells. In previous studies of the OSNs using the scRNASeq approach, OSNs constitute the largest cell population. It is curious why this is the case. Second, the authors did not annotate the cell types, making it difficult to assess the potential cause of this discrepancy. Third, given the small number of OSNs, it is surprising to have multiple musk receptors detected in the open side of the olfactory epithelium whereas almost none in the closed side. Since each OR type only constitutes ~0.1% of OSNs on average, the number of detected musk receptors is too high to be consistent with our current understanding and the rest of the data in the manuscript. Finally, unlike the other experiments, the authors did not describe any method details, nor was there any description of quality controls associated with the experiment. The concerns over the scRNASeq data do not diminish the value of the data presented in the bulk of the study but could be used for further analysis.
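      The third point above rests on simple arithmetic, which can be sketched as follows. The OSN number and the observed count are hypothetical placeholders, and the Poisson model is an assumption made here for illustration; only the ~0.1% average share per OR type comes from the review.

```python
import math


def expected_subtype_cells(n_osns, subtype_fraction=0.001):
    """Expected number of cells of one OR subtype among n_osns sampled OSNs."""
    return n_osns * subtype_fraction


def poisson_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean), computed as 1 minus the CDF at k - 1."""
    cdf = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


# Hypothetical example: 500 OSNs captured, 3 cells of one musk subtype observed
n_osns = 500
observed = 3
mean = expected_subtype_cells(n_osns)
print("expected cells per subtype:", mean)
print("P(at least", observed, "cells):", poisson_tail(observed, mean))
```

      A small tail probability for each of several musk-responsive subtypes would quantify why their detection in a small OSN sample is unexpected.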

      A weakness of the experiment assessing musk receptor expression is that the authors do not distinguish immature from mature OSNs. Immature OSNs express multiple receptor types before they commit to the expression of a single type. The experiments do not reveal whether mature OSNs maintain an elevated expression level of musk receptors.

      There are also two conceptual issues that are of concern. The first is the concept of selective neurogenesis. The data show an increased expression of musk receptors in response to male odor stimulation. The authors argue that this indicates selective neurogenesis of the musk receptor types. However, it is not clear what the distinction is between elevated receptor expression and a commitment to a specific fate at an early stage of development. As immature OSNs express multiple receptors, a likely scenario is that some newly differentiated immature OSNs have elevated expression of not only the musk receptors but also other receptors. The current experiments do not distinguish the two alternatives. Moreover, as pointed out above, it is not clear whether mature OSNs maintain the increased expression. Although a scRNASeq experiment can clarify it, the authors, unfortunately, did not perform an in-depth analysis to determine at which point of neurogenesis the cells commit to a specific musk receptor type. The quality of the scRNASeq data unfortunately also does not lend confidence for this type of analysis.

      A second conceptual issue, the idea of homeostasis in regeneration, which the authors presented in the Introduction, needs clarification. In its current form, it is confusing. It could mean maintenance of the distribution of receptor types, or it could mean the proper replacement of a specific OR type upon the loss of this type. The authors seem to refer to the latter and should define it properly.