  1. May 2024
    1. Author response:

      Reviewer #1 (Public Review):

      We thank Reviewer #1 for the professional evaluation and for raising important points. We will address those comments in the updated manuscript and, in particular, improve the discussion with respect to the two points of concern.

      (1) How can GlnA1 activity be stimulated further by increasing 2-OG after the dodecamer is already fully assembled at 5 mM 2-OG?

      We assume a two-step requirement for 2-OG: the dodecameric assembly and the priming of the active sites. The assembly step is based on cooperative effects of 2-OG and does not require 2-OG in all 2-OG-binding pockets: 2-OG binding to one pocket also causes a domino effect of conformational changes in the adjacent 2-OG-unbound subunit, as also described for Methanothermococcus thermolithotrophicus GS in Müller et al. 2023. Because of these conformational changes, the dodecameric form becomes more favourable even without all 2-OG-binding sites being occupied. At higher 2-OG concentrations (> 5 mM), the activity increased further until finally all 2-OG-binding pockets were occupied, resulting in the priming of all active sites (all subunits) and thereby reaching the maximal activity.
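      The two-step reading can be made concrete with a toy calculation. The sketch below is illustrative only: it assumes Hill-type saturation for both steps, and every constant in it (half-saturation concentrations and Hill coefficients) is a hypothetical value chosen so that assembly is essentially complete at 5 mM 2-OG while site priming is not. None of these numbers are fitted to the data in the manuscript.

```python
def hill(conc_mm, k_half_mm, n):
    """Fractional saturation of a Hill-type process at a given ligand concentration."""
    if conc_mm <= 0:
        return 0.0
    return conc_mm**n / (k_half_mm**n + conc_mm**n)


# Hypothetical parameters, for illustration only (not fitted values).
ASSEMBLY_K_MM, ASSEMBLY_N = 1.0, 4.0  # steep, cooperative dodecamer assembly
PRIMING_K_MM, PRIMING_N = 2.0, 1.0    # independent priming of each active site


def dodecamer_fraction(two_og_mm):
    """Fraction of GlnA1 assembled into dodecamers (assumed cooperative step)."""
    return hill(two_og_mm, ASSEMBLY_K_MM, ASSEMBLY_N)


def relative_activity(two_og_mm):
    """Activity relative to maximum: assembled fraction times fraction of primed sites."""
    return dodecamer_fraction(two_og_mm) * hill(two_og_mm, PRIMING_K_MM, PRIMING_N)


for conc in (0.0, 1.0, 5.0, 12.5):
    print(
        f"{conc:5.1f} mM 2-OG: "
        f"dodecamer {dodecamer_fraction(conc):.2f}, "
        f"activity {relative_activity(conc):.2f}"
    )
```

      Under these assumed parameters the assembled fraction is already close to 1 at 5 mM, whereas the activity term keeps rising between 5 and 12.5 mM, which is the qualitative behaviour the two-step model is meant to explain.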

      (2) The contradictory results with previously published data on the structure of M. mazei by Schumacher et al. 2023.

      We certainly agree that it is confusing that Schumacher et al. 2023 obtained a dodecameric structure without the addition of 2-OG, which we claim to be essential for the dodecameric form. 2-OG is a cellular metabolite that is naturally present in E. coli, the heterologous expression host both groups used. Since our main question focused on analysing the 2-OG effect on GS, we performed thorough dialysis of the purified protein to remove all 2-OG before performing MP experiments. In the absence of 2-OG we never observed significant enzyme activity and always detected a fast disassembly after incubation on ice. We thus assume that the dodecamer without 2-OG in Schumacher et al. 2023 is an inactive oligomer of a once 2-OG-bound form, stabilized e.g. by the presence of 5 mM MgCl2.

      The GlnA1-GlnK1-structure (crystallography) by Schumacher et al. 2023 is in stark contrast to our findings that GlnK1 and GlnA1 do not interact as shown by mass photometry with purified proteins. A possible reason for this discrepancy might be that at the high protein concentrations used in the crystallization assay, complexes are formed based on hydrophobic or ionic protein interactions, which would not form under physiological concentrations.

      Reviewer #2 (Public Review):

      We thank Reviewer #2 for the detailed assessment and valuable input. We will address those comments in the updated manuscript and clarify the message.

      (1) The discrepancy of the dodecamer formation (max. at 5 mM 2-OG) and the enzyme activity (max. at 12.5 mM 2-OG).

      We assume that there are two effects caused by 2-OG: (1) cooperativity of binding (less 2-OG is needed to facilitate dodecamer formation) and (2) priming of each active site (see also Reviewer #1, R.1). We assume this is the reason why the activity of dodecameric GlnA1 can be further enhanced by increased 2-OG concentration until all catalytic sites are primed.

      (2) The lack of the structure of a 2-OG and ATP-bound GlnA1.

      Although we strongly agree that this would be a highly interesting structure, requesting new cryo-EM structures seems beyond the scope of a typical revision. We consider the findings of our present study on the 2-OG effects to be important insights into the much-debated field of glutamine synthetase regulation, even without the requested additional structures.

      (3) The observed GlnA1-filaments are an interesting finding.

      We certainly agree with the referee on that point, that the stacked polymers are potentially induced by 2-OG or ions. However, it is out of the main focus of this manuscript to further explore those filaments. Nevertheless, this observation could serve as an interesting starting point for future experiments.

      Reviewer #3 (Public Review):

      We thank Reviewer #3 for the expert evaluation and inspiring criticism.

      (1) Encouragement to examine ligand-bound states of GlnK1.

      We agree and plan to perform the suggested experiments exploring the conditions under which GlnA1 and GlnK1 might interact. We will perform the MP experiments in the presence of ATP. However, in the GlnA1 activity assays used to evaluate the effect of GlnK1 on GlnA1, ATP was always present at high concentrations, and we still did not observe a significant effect of GlnK1 on GlnA1 activity.

      (2) The exact role of 2-OG could have been dissected much better.

      We agree on that point and will improve the clarity of the manuscript. See also Reviewer #1 R.1.

      (3) The lack of studies on dimers.

      This is actually an interesting point, which we did not consider while writing the manuscript. Having re-analysed all our MP data in this respect, we find that the smallest GlnA1 species is likely a dimer. Consequently, we will add more supplementary data supporting this observation and change the text accordingly.

      (4) Previous studies and structures did not show the 2-OG.

      We assume that for other structures, no additional 2-OG was added, and the groups did not specifically analyse for this metabolite either. All methanoarchaea perform methanogenesis and contain the oxidative part of the TCA cycle exclusively for the generation of glutamate (anabolism), but not a closed TCA cycle, which enables them to use the internal 2-OG concentration as an internal signal for nitrogen availability. In the case of bacterial GS from organisms with a closed TCA cycle used for energy metabolism (oxidation of acetyl-CoA), e.g. E. coli, the formation of an active dodecameric GS form is governed by another mechanism independent of 2-OG. In the case of the recent M. mazei GS structures published by Schumacher et al. 2023, the dodecameric structure is probably a result of the heterologous expression in and purification from E. coli (see also Reviewer #1, R.2). One example of a methanoarchaeal glutamine synthetase structure that does in fact contain 2-OG is described in Müller et al. 2023.

    1. eLife assessment

      This landmark work by Lewis and Hegde represents the most significant breakthrough in membrane and secretory biogenesis in recent years. Their work reveals with outstanding clarity how nascent transmembrane segments can pass through the gate of Sec61 into the ER membrane through the coordinated motions of a conformationally and compositionally dynamic machine. Among many other insights, the authors discovered how a new factor, RAMP4, contributes to the formation and function of the lateral gate for certain substrates. The technical quality of the work is exceptional, setting the bar appropriately high.

    2. Reviewer #1 (Public Review):

      The paper meticulously explores various conformations and states of the ribosome-translocon complex. Employing advanced techniques such as cryoEM structural determination and AlphaFold modeling, the study delves into the dynamic nature of the ribosome-translocon complex. The findings from these analyses unveil crucial insights, significantly advancing our understanding of the co-translational translocation process in cellular mechanisms.

      To begin with, the authors employed a construct comprising the first two transmembrane domains of rhodopsin as a model for studying protein translocation. They conducted in vitro translation, followed by the purification of the ribosome-translocon complex, and determined its cryoEM structures. An in-depth analysis of their ribosome-translocon complex structure revealed that the nascent chain can pass through the lateral gate of translocon Sec61, akin to the behavior of a signal peptide. Additionally, Sec61 was found to interact with 28S rRNA helix 24 and the ribosomal protein uL24. In summary, their structural model aligns with the through-pore model of insertion, contradicting the sliding model.

      Secondly, the authors successfully identified RAMP4 in their ribosome-translocon complex structure. Notably, the transmembrane domain of RAMP4 mimics the binding of a signal peptide at the lateral gate of Sec61, albeit without unplugging. Intriguingly, RAMP4 is exclusively present in the non-multipass translocon ribosome-translocon complex, not in those containing multipass translocon. This observation suggests that co-translational translocation specifically occurs in the Sec61 channel that includes bound RAMP4. Additionally, the authors discovered an interaction between the C-tail of ribosomal protein uL22 and the translocon Sec61, providing valuable insights into the nascent chain's behavior.

      Moving on to the third point, the focused classification unveiled TRAP complex interactions with various components. The authors propose that the extra density observed in their novel ribosome-translocon complex can be attributed to calnexin, a major binder of TRAP according to previous studies. Furthermore, the new structure reveals a TRAP-OSTA interaction. This newly identified TRAP-OSTA interaction offers a potential explanation for why patients with TRAP delta defects exhibit congenital disorders of glycosylation.

      In conclusion, this paper presents a robust contribution to the field with its thorough structural and modeling analyses. The significance of the findings is evident, providing valuable insights into the intricate mechanisms of protein co-translational translocation. The well-crafted writing, meticulous analyses, and clear figures collectively contribute to the overall strength of the paper.

    3. Reviewer #2 (Public Review):

      Summary:

      In the manuscript, Lewis and Hegde present a structural study of the ribosome-bound multipass translocon (MPT) based on re-analysis of cryo-EM single particle data of ribosome-MPTs processing the multipass transmembrane substrate RhoTM2 from a previous publication (Smalinskaitė et al, Nature 2022) and AlphaFold2 multimer modeling. Detailed analysis of the laterally open Sec61 is obtained from PAT-less particles.

      The following major claims are made:

      - TMs can bind similarly to the Sec61 lateral gate as signal peptides.

      - Ribosomal H59 is in immediate proximity to basic residues of TMs and signal peptides, suggesting it may contribute to the positive-inside rule.

      - RAMP4/SERP1 binds to the Sec61 lateral gate and the ribosome near 28S rRNA's helices 47, 57, and 59 as well as eL19, eL22, and eL31.

      - uL22 C-terminal tail binds H24/47 blocking a potential escape route for nascent peptides to the cytosol.

      - TRAP and BOS compete for binding to Sec61 hinge.

      - Calnexin TM binds to TRAPg.

      - NOMO wedges between TRAP and MPT.

      Strengths:

      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (public review and recommendations for the authors):

      Major points:

      (1) The identification of RAMP4 is a pivotal discovery in this paper. The sophisticated AlphaFold prediction, de novo model building of RAMP4's RBD domain, and sequence analyses provide strong evidence supporting the inclusion of RAMP4 in the ribosome-translocon complex structure.

      However, it is crucial to ensure the presence of RAMP4 in the purified sample. Particularly, a validation step such as western blotting for RAMP4 in the purified samples would strengthen the assertion that the ribosome-translocon complex indeed contains RAMP4. This is especially important given the purification steps involving stringent membrane solubilization and affinity column pull-down.

      As suggested, we have added Western blots showing that RAMP4 is retained at secretory translocons (and not multipass translocons) after solubilisation, affinity purification, and recovery of ribosome-translocon complexes (Fig. 3F). This data supports both our assignment of RAMP4 in ribosome-translocon complexes, and also the structure-based proposition that its occupancy is mutually exclusive with the multipass translocon (in particular, the PAT complex).  

      (2) Despite the comprehensive analyses conducted by the authors, it is challenging to accept the assertion that the extra density observed in TRAP class 1 corresponds to calnexin. The additional density in TRAP class 1 appears to be less well-resolved, and the evidence for assigning it as calnexin is insufficient. The extra density there can be any proteins that bind to TRAP. It is recommended that the authors examine the density on the ER lumen side. An investigation into whether calnexin's N-globular domain and P-domain are present in the ER lumen in TRAP class 1 would provide a clearer understanding.

      We agree that the Calnexin assignment is less confident than the other assignments in this manuscript, and that further support would be ideal. We have exhaustively searched our maps for any unexplained density connected with the putative Calnexin TMD, and have found none. This is consistent with Calnexin's lumenal domain being flexibly linked to its TMD, and thus would not be resolved in a ribosome-aligned reconstruction.

      Our assignment of this TMD to Calnexin was based on existing biochemical data (referenced in the paper) favouring this as the best working hypothesis by far: Calnexin is TRAP’s only abundant co-purifying factor, and their interaction is sensitive to point mutations in the Calnexin TMD. Recognising that this is not conclusive, we have ensured that the text and figures consistently describe this assignment as provisional or putative.

      (3) In the section titled 'TRAP competes and cooperates with different translocon subunits,' the authors present a compelling explanation for why TRAP delta defects can lead to congenital disorders of glycosylation. To enhance this explanation, it would be valuable if the authors could provide additional analyses based on mutations mentioned in the references. Specifically, examining whether these mutations align with the TRAP delta-OSTA structure models would strengthen the link between TRAP delta defects and the observed congenital disorders of glycosylation.

      We agree that mapping disease-causing point mutants to the TRAP delta structure could be potentially informative. Unfortunately, the referenced TRAP delta disease mutants act by simply impairing TRAP delta expression, and thus admit no such fine-grained analyses. However, sequence conservation is our next best guide to mutant function. We note in the text that the contact site charges on TRAP delta and RPN2 are conserved, and that the closest-juxtaposed interaction pair (K117 on TRAPδ and D386 on RPN2) is also the most conserved.

      Here are some minor points:

      (1) In the introduction, when the EMC, PAT, and BOS complexes were initially mentioned, it would be beneficial for the authors to provide more context or cite relevant references. This additional information will aid readers in better understanding these complexes, ensuring a smoother comprehension of their significance in the context of the study.

      The Introduction has been edited to provide more context with relevant references. 

      (2) In Figure 7, it would be valuable for the authors to include details on how they sampled the sequence alignments. 

      To clarify this methodological point, we have revised the Figure 7 caption to include these sentences: “The logo plots in panels A and D represent an HMM generated by jackHMMER upon convergence after querying UniProtKB’s metazoan sequences with the human TRAPα sequence. Only signal above background is shown, as rendered by Skylign.org.”

      Reviewer #2 (public review and recommendations for the authors):

      Strengths:

      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.

      Weaknesses:

      A minor downside of the manuscript is the sheer volume of analyses and mechanistic hypotheses, which makes it sometimes difficult to follow. The authors might consider offloading some analyses based on weaker evidence to the supplement to maximize impact.

      We agree that the manuscript is long, but we have retained what we feel are the most important findings in the main text because the supplement is often undiscoverable via literature searches. Indeed, we chose eLife for its flexibility regarding article length and suitability for extended and detailed analyses. 

      Major:

      - Figure S1 does not capture the fact that a PAT-free subset of particles is analyzed. The PAT classification step should be added.

      We apologise for having caused some confusion on this point: we do not show a PAT classification step because there was none. Instead we reanalysed the whole dataset with a focus on Sec61 and TRAP. The very little PAT present (9% of particles, per Smalinskaitė et al. 2022) appeared as a very weak density in some of the closed-Sec and weak-TRAP classes.

      - The assignment of calnexin appears highly speculative. As the authors acknowledge, the EM density is clearly of insufficient resolution for identification, and AF2 also does not provide orthogonal support for the interpretation. The binding to TRAPg also does not explain complex formation in lower eukaryotes that do not have TRAPg. The authors may consider moving the calnexin assignment and interpretation to the supplement as it appears highly speculative. In any case, it should be referred to as a hypothesis and not a structure.

      We agree that the Calnexin assignment is less confident than the other assignments in this manuscript, and that further support would be ideal. Our assignment of this TMD to Calnexin was based on existing biochemical data (referenced in the paper) favouring this as the best working hypothesis by far: Calnexin is TRAP’s only abundant co-purifying factor, and their interaction is sensitive to point mutations in the Calnexin TMD. Recognising that this is not conclusive, we have ensured that the text and figures consistently describe this assignment as provisional or putative.

      - P. 8: "This extensive competition explains why prior studies found TRAP in only 40% of MPT complexes, but at high occupancy at all other RTCs (ref. 29)". The interpretation is at odds with a recent re-analysis of the same dataset (preprint: Gemmer et al 2023, https://doi.org/10.1101/2023.11.28.569136), which finds TRAP occupancy to negatively correlate with PAT, not BOS.

      The reviewer is correct that the Gemmer study demonstrates a negative correlation between PAT and TRAP occupancy, but it does not, as the reviewer claims, argue against a negative correlation between BOS and TRAP. In fact it agrees that Sec61•BOS•PAT complex would clash with TRAP, and that therefore “BOS could trigger release of TRAP from the multipass translocon.” Thus, there is no conflict between the two studies. The revised text in this passage now cites the Gemmer et al. preprint and clarifies that TRAP is partially displaced by competition with BOS, but retained at the translocon via its ribosome-binding domain.  

      - P. 7/8: the authors suggest that TRAPd may be important for OSTA recruitment and hence TRAPd deletion may cause glycosylation defects in patients by failure to recruit OSTA. However, cryo-ET studies (Pfeffer et al, Nat. Comms 2017) showed that OSTA still binds in patient-derived microsomes (and the OSTA-TRAPd interaction). The authors should discuss their model in the light of these data.

      As explained in the text, our hypothesis predicts that TRAPδ is more important for OSTA’s recruitment to the RTC than for its RTC affinity: “OSTA’s attraction to TRAPδ is weak compared to its binding to the ribosome, but TRAPδ may nonetheless help recruit OSTA, since TRAPδ would attract OSTA from most possible angles of approach, whereas OSTA’s ribosome contacts are stereospecific.” Therefore the fact that Pfeffer et al. 2017 found OSTA at some TRAPδ-negative RTCs is not surprising. For confirmation we would look for TRAPδ-dependent glycosylation sites in fast-folding domains or otherwise kinetically sensitive loci, and indeed TRAP-dependence screens return complex profiles that could be consistent with such a mechanism (Phoomak et al. 2021).

      - Some confidence measure for the assignment of SERP1/RAMP4 should be provided, adding support for the claim "The resolution of the RBD density was sufficient for de novo modelling". Indeed, the N-terminal ribosome-bound segment appears well resolved and programs like Modelangelo or FindMySequence should provide a confidence measure for the assignment of the density to SERP1. The TM part appears less well resolved, but the connectivity to the N-terminus may justify the assignment, which should be elaborated on.

      Although we appreciate the value of tools like Modelangelo or FindMySequence, and would have used them if we were resting our assignment of RAMP4 on its RBD alone, we feel that such analyses would be superfluous here. They would quantify only the buildability of RAMP4’s RBD, whereas the real question of RAMP4’s assignability is independently supported by AlphaFold’s confirmation of RAMP4’s TMD as the Sec61-binding density, and further biochemical data provided or cited in the paper.

      - P. 3: "Because PAT complex recruitment and MPT assembly are just beginning, ..." the implicit kinetic model seems to be that the MPT subcomplexes assemble on ribosome and Sec61. What is the evidence for this model and later recruitment of PAT (as opposed to GEL, BOS, and PAT binding pre-assembled)?

      The work of Sundaram et al. (PMID 36261522) established that PAT, GEL and BOS do not co-associate appreciably in the absence of the ribosome-Sec61 complex. This is consistent with the structural data in Smalinskaitė et al. (PMID 36261528), which show that PAT, GEL, and BOS each contact the ribosome (and Sec61 in the case of PAT and BOS), but have few if any specific contacts among themselves. Finally, data in both of these studies show that recruitment of each complex to the RNC is not lost when any of the others is missing, arguing that each is capable of independent recruitment to ribosome-Sec61 complexes.

      - p. 4: the meaning of the sentence "Stabilising interactions with this widely conserved motif may help Sec61 respond to its diverse substrates with a consistent open state." is not entirely clear. Published single-particle cryo-EM structures of RTC appear to have resulted in various degrees of openness.

      Here we were referring not to RTC structures in general, but to substrate-engaged RTCs in particular. The two substrate-engaged RTC structures under discussion in this paragraph are nearly identical (Figure 2c) despite large differences in substrate sequence (RhoTM2 vs preprolactin’s SP). We were surprised to find that this engaged structure creates noncovalent bonds between the Sec61 N-half and the ribosome. This bonding would tend to stabilise this particular engaged structure, and this stabilisation helps explain why the newly observed TM-engaged structure is so similar to the previously observed SP-engaged structure. Without this stabilising N-half interaction, one might instead expect to see more variability, such as the reviewer suggests.

      - A recent analysis of heimdallarchaea already hypothesized TRAP in these organisms and should be cited: Eme et al, Nature 618:992-999 (2023). The novel findings of this manuscript compared to Eme et al should be discussed.

      We thank the reviewer for bringing this relevant contemporaneous work to our attention. Reviewing the putative TRAP homologs identified by Eme et al., we find that most do not in fact appear to be TRAP homologs at all, judged by the measures used in our work (reciprocal HHpred queries against the human proteome and predicted structural similarity). This is not surprising since Eme et al. relied on low-threshold sequence similarity searches rather than structural measures. To acknowledge this work, we have added a sentence as follows (italics): “To test whether these candidates are also similar to TRAPαβγ in sequence, we used them to perform reciprocal HHpred queries of the human proteome, and in each case the corresponding human TRAP protein was the top hit (E = 0.031 for TRAPα, 9.4×10⁻¹⁴ for TRAPβ, and 110 for TRAPγ). A contemporaneous study has also claimed to find TRAP homologs in Heimdallarchaeota (Eme et al. 2023), although some caution is warranted in these assignments because they do not seem to share predicted structural similarity to TRAP subunits and do not find human homologs in reciprocal HHpred queries.”
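      The reciprocal-query criterion in the quoted sentence amounts to a simple check: when a candidate is used to query the human proteome, the expected human protein should come back as the top hit. The sketch below is a hypothetical illustration of that logic, not the authors' pipeline. The hit table is a hand-written placeholder rather than HHpred output, and the names `top_hit` and `is_reciprocal_top_hit` are invented for this example.

```python
def top_hit(hits):
    """Return the target with the lowest E-value from a {target: e_value} mapping."""
    if not hits:
        return None
    return min(hits, key=hits.get)


def is_reciprocal_top_hit(candidate_hits_in_human, expected_human_protein):
    """True if the expected human protein is the best-scoring hit for the candidate."""
    return top_hit(candidate_hits_in_human) == expected_human_protein


# Placeholder hit table for one candidate (target names and E-values are made up).
example_hits = {
    "human_TRAP_beta": 1e-13,
    "unrelated_protein_A": 2.5,
    "unrelated_protein_B": 40.0,
}

print(is_reciprocal_top_hit(example_hits, "human_TRAP_beta"))
```

      Note that a top-hit check of this kind says nothing about how significant the hit is, which is why the response pairs it with predicted structural similarity.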

      - Given that the authors expand the evolutionary analysis of TRAP to archaea it would be helpful if sampling for RAMP4 were consistent (i.e., is TRAP present in the early eukaryotes that do not feature RAMP4? Is RAMP4 absent from heimdallarchaea?).

      As stated in the text, RAMP4’s absence from early-branching eukaryotic taxa indicates that it was also absent from their archaeal ancestors. We did of course run such queries for completeness and indeed find no archaeal RAMP4. TRAP, for its part, is generally present in early-branching eukaryotic taxa, as stated in the text, and this necessarily includes those from which RAMP4 is absent.

      - The authors may consider discussing (Gemmer et al 2023, https://doi.org/10.1101/2023.11.28.569136), which comes to similar conclusions for NOMO integration into the MPT.

      We thank the reviewer for bringing this relevant work to our attention. We have added the following sentence to the section on NOMO: “Contemporaneous work has arrived at a similar model for PLD10-12 but did not model PLD1 (Gemmer et al. 2023).”

      - The abundance approximation of RAMP4 in the native translocon by OccuPy should probably be taken with a grain of salt. The '80%' mentioned in the conclusion may stick around and could eventually turn out to be closer to 100%.

      It is certainly possible that the occupancy of RAMP4 is higher than OccuPy estimates. Unfortunately, no available method can provide occupancy estimates with confidence intervals. The Western blots we have added to the revised manuscript are consistent with high occupancy, but cannot discriminate between 80% and 100%.

      Minor

      - p. 5: The following sentence is incomplete: "Together, these factors explain why RAMP4's occupancy in prior cryo-EM maps was low enough to be overlooked, although in hindsight seems to be visible in several (refs. 7, 68, 69)"

      Thank you for catching this typo. We have revised the sentence as follows: “Together, these factors explain why RAMP4's occupancy in prior cryo-EM maps was low enough to be overlooked, although in hindsight it is visible in several of them.”

    1. eLife assessment

      The manuscript describes a valuable method to boost WNT signaling in a tissue-specific manner. The work extends previous data from the authors based on fusing an RSPO2 mutant protein to an antibody that binds ASGR1/2. In the current manuscript, two new antibodies with similar effects are described, that expand this solid approach and provide alternatives for potential future clinical applications. This manuscript will be of interest to all scientists studying protein engineering and cellular targeting.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors have previously described a way to boost WNT/CTNNB1 signaling in a tissue-specific manner, by directing an RSPO2 mutant protein (RSPO2RA) to a liver-specific receptor (ASGR1/2). This is done by fusing the RSPO2RA to an antibody that binds ASGR1/2.

      Here the authors describe two new antibodies, 8M24 and 8G8, with similar effects. 8M24 shows specificity for ASGR1, while 8G8 has broader affinity for mouse/human ASGR1/2.

      The authors resolve and describe the crystal structure of the hASGR1CRD:8M24 complex and the hASGR2CRD:8G8 complex in great detail, which helps explain the specificities of the 8M24 and 8G8 antibodies. Their epitopes are non-overlapping.

      Upon fusion of the antibodies to an RSPO2RA (an RSPO mutant), these antibodies are able to enhance WNT signaling by promoting the ASGR1-mediated clearance of ZNRF3/RNF43, thereby increasing cell surface expression of FZD. This has previously also been shown to be the case for RSPO2RA fused to an anti-ASGR1 antibody 4F3 - and the paper also tests how the antibodies compare to the 4F3 fusion.

      Strengths:

      (1) One challenge in treating diseases is that one would like therapeutics to be highly specific - not just in terms of their target (e.g. aimed at a specific protein of interest) but also in terms of tissue specificity (i.e. affecting only tissue X but leaving all others unaffected). This study broadens the collection of antibodies that can be used for this purpose and thus expands a potential future clinical toolbox.

      (2) The authors have addressed questions raised after a first round of review, e.g. by showing that ASGR1 is itself indeed ubiquitinated.

      Weaknesses:

      (1) Some questions remain as to how 8M24 and 8G8 compare to 4F3.

      (2) Some questions remain as to the specificity of the approach: the initial goal was not to also downregulate ASGR1 per se, so this targeting to a specific receptor/membrane protein is not trivial and/or neutral.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Weaknesses:

      The authors demonstrate that ASGR1 is degraded in response to RSPO2RA-antibody treatment through both the proteasomal and the lysosomal pathway, suggesting that this is due to the RSPO2RA-mediated recruitment of ZNRF3/RNF43, which have E3 ubiquitin ligase activity. The paper doesn't show, however, if ASGR1 is indeed ubiquitinated.

      We thank the reviewer for this comment. We have now conducted ASGR1 ubiquitination assays by immunoprecipitation (IP) of ubiquitin in the membrane protein extract, followed by immunoblotting (IB) for ASGR1, after treating HepG2 cells with our SWEETS molecules or controls. The new data demonstrate ubiquitination of ASGR1 upon SWEETS treatment (new Fig. S3A and S3B). Additionally, we blocked the potential ubiquitination of ASGR1 by mutating the two lysine residues in the cytoplasmic domain and compared ASGR1 degradation after SWEETS treatment. The new data show that removing the potential Lys ubiquitination sites prevented ASGR1 degradation after SWEETS treatment (new Fig. S3C). These new results provide direct evidence that ASGR1 is ubiquitinated and then undergoes lysosomal or proteasomal degradation.

      The authors conclude that the RSPO2RA-Ab fusions can act as a targeted protein degradation platform, because they can degrade ASGR. While I agree with this statement, I would argue that the goal of these Abs would not be to degrade ASGR per se. The argumentation is a bit confusing here. This holds for both the results and the discussion section: The authors focus on the dual role of their agents, i.e. on promoting both WNT signaling AND on degrading ASGR1. They might want to reconsider how they present their data (e.g. it may be interesting to target ASGR1, but one would presumably then like to do this without also increasing WNT responsiveness?).

      We thank the reviewer for this comment. As the reviewer states, the initial goal of the RSPO2RA-ab fusions was to generate tissue-specific RSPO mimetics that focus on elimination of E3. As an unintended consequence, we observed enhanced elimination of ASGR as well. While this was unintended, the results did provide POC that when an E3 ligase is brought into proximity of another protein, ubiquitination and degradation of this protein may occur. Additionally, our results highlight that one needs to be careful in fully assessing the impact of bispecific molecules on the intended target as well as unintended targets to understand the potential side effects of such bispecific molecules. We have revised the manuscript to make this more clear, both in the Results and Discussion sections.

      Lines 326-331: The authors use a lot of abbreviations for all of the different protein targeting technologies, but since they are hinting at specific mechanisms, it would be better to actually describe the biological activity of LYTAC versus AbTAC/PROTAB/REULR so non-experts can follow.

      We thank the reviewer for this suggestion. We have added more details in the Discussion to highlight the different mechanisms of the various systems described.

      Can the authors comment on how 8M24 and 8G8 compare to 4F3? The latter seems a bit more specific (ie. lower background activity in the absence of ASGR1 in 5C)? Are there any differences/advances between 8M24 and 8G8 over 4F3? This remains unclear.

      These three antibodies bind different regions/epitopes on ASGR. 8M24 and 8G8 bind non-overlapping epitopes on the carbohydrate recognition domain (CRD), while 4F3 binds the stalk region outside of the CRD. This information is in the Results section of the manuscript. We do not believe that the difference in the ASGR binding epitopes contributes to the slight differences in the background activity. The slight differences may be due to differences in the conformation of the antibodies resulting from the differences in their primary sequences, and these differences may not be significant. We have now repeated the experiments in Fig. 5C and 5D to address the reviewer’s next comment on the axis. These new data (new Fig. 5C and 5D) show less background differences between the molecules.

      Can the authors ensure that the axes are labelled/numbered similarly for Fig 5B-D? This will make it easier to compare 5C and 5D.

      We thank the reviewer for this suggestion. The y-axes in Fig. 5B–D now have the same scale and number format. For Figs. 5C and 5D, we focus on the potency increases of the SWEETS molecules post ASGR1 overexpression.

      Reviewer #2 (Public Review):

      Weaknesses:

      The authors show crystal structures for binding of these antibodies to ASGR1/2, and hypothesize about why specificity is mediated through specific residues. They do not test these hypotheses.

      We thank the reviewer for this comment. We did not further test the residue contributions to binding and specificity as this is not the main focus of the current manuscript. We have revised the section and tuned down the claims for specificity.

      The authors demonstrate in hepatocyte cell lines that these function as mimetics, and that they do not function in HEK cells, which do not express ASGR1. They do not perform an exhaustive screen of all non-hepatocyte cells, nor do they test these molecules in vivo.

      We agree with the reviewer. For the 4F3-based SWEETS molecule, additional in vitro and in vivo specificity characterizations were performed and described in Zhang et al., Sci Rep, 2020. Since 8M24 is human specific and 8G8 only weakly interacts with mouse receptors, in vivo experiments in mouse were not performed. While we did not extensively test the 8M24- and 8G8-based SWEETS on additional cell lines or in vivo, we do believe the data presented strongly support the hepatocyte-specific effects of these molecules.

      Surprisingly, these molecules also induced loss of ASGR1, which the authors hypothesize is due to ubiquitination and degradation, initiated by the E3 ligases recruited to ASGR1. They demonstrate that inhibition of either the proteasome or lysosome abrogates this effect and that it is dependent on E1 ubiquitin ligases. They do not demonstrate direct ubiquitination of ASGR1 by ZNRF3/RNF43.

      We thank the reviewer for this comment. We have now conducted ASGR1 ubiquitination assays by immunoprecipitation (IP) of ubiquitin in the membrane protein extract, and immunoblotting (IB) ASGR1 after treating HepG2 cells with our SWEETS molecules or controls. The new data demonstrate ubiquitination of ASGR1 with SWEETS treatment (new Figs. S3A and S3B). Additionally, we blocked the potential ubiquitination of ASGR1 by mutating the two lysine residues in the cytoplasmic domain and compared the ASGR1 degradation after SWEETS treatment. The new data show that removing the potential ubiquitylation Lys sites prevented ASGR1 degradation post SWEETS treatment (new Fig. S3C). These new results provide direct evidence that ASGR1 is ubiquitinated to undergo lysosome or proteasome degradation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      There are multiple instances where articles (i.e. the use of "the") are missing.

      We thank the reviewer for this comment. Following the suggestion, the manuscript has gone through a detailed review by an editorial service, and these and other grammatical errors have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      The best I can think of is to inject these into Wnt reporter mice (or maybe humanized mice) and see if the liver lights up while other tissues do not.

      We thank the reviewer for this suggestion. The liver specificity was demonstrated in vivo in our earlier publication (SciRep, 10:13951, 2020) with the 4F3-RSPO2RA molecule. Unfortunately, as the results in this manuscript show, the new ASGR binders 8M24 and 8G8 either do not bind or only weakly interact with mouse receptors. Therefore, the in vivo experiments were not performed here.

      You could also consider addressing some of the statements in the manuscript that are currently hypothetical experimentally.

      We thank the reviewer for this comment. We did not further test the residues’ contribution to binding and specificity as this is not the main focus of the current manuscript. We have revised the section and tuned down the claims for specificity.

      It would be easier to compare the graphs in 5B-D if all Y-axes were the same scale, with the same scientific notation.

      We thank the reviewer for this suggestion. The y-axes in Fig. 5B-D now have the same scale and number format. For Figs. 5C and 5D, we focus on the potency increases of the SWEETS molecules post ASGR1 overexpression.

      Some of the western blots in Figure 6 do not have antibody/target labels, making them harder to interpret.

      The antibody/target labels for all Western blots are on the right side of the blots in each panel. We have now made the text bold so that it is easier to identify.

      Figure 6 and Supplementary Figure 2 are the same I think.

      Figure 6 and Supplementary Figure 2 show the same experimental set-up performed on two different cell lines: Fig. 6 is on Huh7 cells and Supplementary Fig. 2 is on HepG2 cells. The results from these two cell lines are quite consistent, making their appearance very similar.

    1. eLife assessment

      This important research article provides a novel approach to measure imaginal disc growth and uses this approach to explore the roles of Fat and Dachsous, two conserved protocadherins, in late larval development. The authors have addressed all referee concerns and the evidence supporting the authors' findings is, overall, compelling.

    2. Reviewer #1 (Public Review):

      The manuscript presents novel results on the regulation of Drosophila wing growth by the protocadherins Ds and Fat. The manuscript performs a more careful analysis of disc volume, larval size, and the relationship between the two, in normal and mutant larvae, and after localized knockdown or overexpression of Fat and Ds. Not all of the results are equally surprising given the previous work on Fat, Ds, and their regulation of disc growth, pupariation, and the Hippo pathway, but the presentation and detail of the presented data is new. The most novel results concern the scaling of gradients of Fat and Ds protein during development, a largely unstudied gradient of Fat protein, and using overexpression of Ds to argue that changes in the Ds gradient do not underlie the slowing and halting of cell divisions during development.

    3. Reviewer #2 (Public Review):

      This manuscript from Liu et al. examines the role of Fat and Dachsous, two transmembrane proto-cadherins that function both in planar cell polarity and in tissue growth control mediated by the Hippo pathway. The authors developed a new method for measuring growth of the wing imaginal disc during late larval development and then used this approach to examine the effects of disruption of Fat/Dachsous function on disc growth. The authors show that during mid to late third instar the wing imaginal disc normally grows in a linear rather than exponential fashion and that this occurs due to slowing of the mitotic cell cycle as the disc grows during this period. Consistent with their known role in regulating Hippo pathway activity, this slowing of growth is disrupted by loss of Fat/Dachsous function. The authors also observed a previously unreported gradient of Fat protein across the wing blade. However, graded expression of Fat or Dachsous is not necessary for proper growth regulation in the late third instar because ectopic Dachsous expression, which affects gradients of both Dachsous and Fat, has no growth phenotype.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviews

      All reviewers were positive about the rigor and impact of our work and offered a number of very helpful suggestions. We have done a number of suggested experiments, whose results have been added to the revision. We have also used their suggestions to improve the clarity and precision with which we describe and interpret our results.

      Reviewer 1 found the paper to be clearly written, with novel results, and the conclusions relevant and solid. This review offered many insights and thoughtful suggestions, which we have adopted to greatly improve the manuscript. The referee’s points are listed below with our responses.

      The study chooses to examine growth only in the prospective wing blade (the "pouch") rather than the wing disc as a whole. This can create biases, as fat and ds manipulations often cause stronger effects on growth, and on Hippo signaling targets, in the adjacent hinge regions of the disc. So I am curious about this choice. 

      Actually, several experiments described in the manuscript measured growth in regions of the wing disc that did not include the pouch (Fig 1 supplement 4). We found that in the second phase of allometric growth, growth of the pouch was greater than growth of the hinge-notum (Fig.1G and Fig 1 supplement 4).  We also looked at the effect of Ds and Fat on growth of the hinge-notum (Fig 4 supplement 1 and Fig 5 supplement 2). Loss of Ds or Fat also affected allometric growth of the pouch differently from their effects on allometric growth of the hinge-notum. We therefore treated analysis of each region independently. Greater focus was given to wing pouch growth because it was in this region that we detected the interesting gradient properties in Fat and Ds expression.

      The limitation to the wing region also creates some problems for the measurements themselves. The division between wing and pouch is not a strict lineage boundary, and thus cells can join or leave this region, creating two different reasons for changes in wing pouch size; growth of cells already in the region, or recruitment of cells into or out of the region. The authors do not discuss the second mechanism.

      We agree with this assessment that pouch growth can occur via lineage-restricted growth or by recruitment of cells into the region. This has now been clarified in the Introduction and the Discussion with discussion of the second mechanism.

      It is not at all clear that the markers for the pouch used by the authors are stable during development. One of these is Vg expression, or the Vg quadrant enhancer. But the Vg-expressing region is thought to increase by recruitment over late second and third instar through a feed-forward mechanism by which Vg-expressing cells induce Vg expression in adjacent cells. In fact, this process is thought to be driven in part by Fat and Ds (Zecca et al 2010). So when the authors manipulate Fat and Ds are they increasing growth or simply increasing Vg recruitment? I would prefer that this limitation be addressed.

      There is the possibility that the feedforward recruitment of disc cells to express Vg leads to some expansion of the measured pouch domain. However, we argue that the recruitment mechanism may not be contributing significantly to the phenomena we measured in this study. 1) We limited our analysis of pouch growth to the third instar stage. In Fig.2, Zecca and Struhl (2007 doi 10.1242/dev.006411) found that recruitment was much stronger in clones induced at first instar rather than third instar, and so they limited their clonal analysis throughout the paper to first instar induced clones. Thus, it is unclear how much the feedforward recruitment mechanism contributes to pouch growth in the mid-to-late third instar. 2) We detected an effect of Ds and Fat on how rapidly the cell cycle slows down over time in pouch cells. The effect is entirely consistent with it having a causal effect on wing pouch growth. For example, nub>Ds(RNAi) causes the average third instar pouch cell to divide ~25% more rapidly than normal, when comparing the slopes in Figure 6. Note that at the beginning of the third instar, the average pouch cell has a similar doubling time whether lacking Ds or not (Figure 6). When we measured the final size of the wing pouch at the end of the third instar, nub>Ds(RNAi) caused the pouch to be ~30% larger than normal (Figure 5). This effect is quite comparable to the effect of Ds RNAi on cell doubling.

      To provide more rigorous evidence that the effect of Fat and Ds on cell cycle dynamics is primarily responsible for their effects on wing growth that we measured, we have adapted the simple growth modeling framework from Wartlick et al (2011) and fit our cell cycle measurements made for different genotypes. These fits give us estimates for instantaneous cell growth rates over time, and using these estimates, we simulated the theoretical growth trajectory of the entire wing pouch for wildtype and ds / fat RNAi animals. When we compare these model predictions of wing growth to our pouch volume measurements over time, they agree very well with one another. These analyses and results are now discussed in the Results and presented in Fig. 6 supplement 2. Overall, it supports a model that Fat and Ds regulate cell cycle dynamics in the wing pouch during third instar and this effect is primarily responsible for Fat and Ds’s effect on overall wing pouch growth in that timeframe. It does not rule out that Fat and Ds might also affect Vg recruitment at third instar, but such effects must be small relative to the primary effect on the cell cycle. It is feasible that Fat and Ds work via the feedforward mechanism at earlier larval stages. We have now discussed all this in detail in the Discussion considering the limitation of recruitment.
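
      The modeling step described above can be illustrated with a minimal sketch. It is not the authors' fitted model: it simply assumes that the cell doubling time T increases linearly with pouch volume V (T = a + b·V, as the linear coupling reported in the responses suggests) and that the pouch grows at the instantaneous rate ln(2)/T. The function name, the parameter values, and the arbitrary volume units are all hypothetical and chosen only for illustration.

```python
import math


def simulate_pouch_growth(v0, a, b, hours, dt=0.1):
    """Forward-Euler integration of dV/dt = (ln 2 / T(V)) * V.

    Assumption (hypothetical, for illustration): the cell doubling time
    is T(V) = a + b * V, in hours, with V in arbitrary volume units.
    """
    volume = v0
    trajectory = [volume]
    steps = int(round(hours / dt))
    for _ in range(steps):
        doubling_time = a + b * volume             # lengthens as the pouch grows
        growth_rate = math.log(2) / doubling_time  # instantaneous rate, per hour
        volume += growth_rate * volume * dt
        trajectory.append(volume)
    return trajectory


# Hypothetical parameters: same initial doubling time (a), but a shallower
# slope (b) in the knockdown, i.e. the cell cycle slows less as the pouch grows.
control = simulate_pouch_growth(v0=1.0, a=6.0, b=4.0, hours=48)
knockdown = simulate_pouch_growth(v0=1.0, a=6.0, b=3.0, hours=48)
print(f"final volume, control:   {control[-1]:.2f}")
print(f"final volume, knockdown: {knockdown[-1]:.2f}")
```

      Under these assumptions, a shallower slope of doubling time versus size yields a larger final pouch, which is the qualitative comparison the responses draw between cell cycle measurements and pouch volume.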

      The second pouch marker the authors use is epithelial folding, but this also has problems, as Fat and Ds manipulations change folding. Even in wild type, the folding patterns are complex. For instance, to make folding fit the Vg-QE pattern at late third the authors appear to be jumping in the dorsal pouch between two different sets of folds (Fig 1S2A). The authors also do not show how they use folding patterns in younger, less folded discs, nor provide evidence that the location of the folds are the same and do not shift relative to the cells. They also do not explain how they use folds and measure at later wpp and bpp stages, as the discs unfold and evert, exposing cells that were previously hidden in the folds.

      The primary marker we used for the pouch boundary was the folds. We agree with the reviewer that our original description of how we defined the pouch boundary using the folds was inadequate. We now have substantially expanded the Methods section describing how we defined the boundary at all stages using the folds, including a supplementary figure (Fig 1 supplement 2). Importantly, in our measurements, we did not exclude the pouch regions within the folds but included them (see also the next point). Our microscopy detected fluorescence in the folds, and surface rendering allowed us to visualize fold structure and its contents. In younger discs with less folding, we defined the boundary by the location of the Wg inner ring. The folds were more prominent in older L3 larval discs and in the WPP and later stages since the wings had not fully everted yet. Therefore, we used accepted morphological definitions of the pouch boundary from the literature to define the boundaries. We were able to do so even though, as the reviewer notes, the fold architecture evolves as the larvae age. We agree with the reviewer that defining a boundary based on morphology could be error prone, especially prone to systematic error based on age. It is the main reason we directly compared the morphologically defined boundaries to boundaries defined by the Vg quadrant expression domain for many wing discs across all ages. As seen in Fig 1 supplement 3C, the two methods are in strong agreement with one another for discs of all ages. There is a slight overestimate of the pouch boundary using the morphological method, but the error is small (2.5%) and independent of disc size.
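
      The kind of comparison described above (a small overestimate that does not depend on disc size) can be sketched as follows. This is an illustrative calculation only, not the authors' analysis pipeline: the function names and the paired measurements are hypothetical, and the Vg-defined size is treated as the reference by assumption.

```python
import math


def percent_overestimate(morphological, reference):
    """Per-disc percent difference of the morphological estimate vs the reference."""
    return [100.0 * (m - r) / r for m, r in zip(morphological, reference)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined here; reported as 0 by convention
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired pouch sizes (arbitrary units) for discs of increasing age.
vg_defined = [10.0, 20.0, 40.0, 80.0]
fold_defined = [10.3, 20.4, 41.2, 81.6]

errors = percent_overestimate(fold_defined, vg_defined)
mean_error = sum(errors) / len(errors)
print(f"mean overestimate: {mean_error:.1f}%")
print(f"correlation of error with size: {pearson(vg_defined, errors):.2f}")
```

      A mean error near a few percent together with a weak correlation between error and size would correspond to a small, size-independent bias.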

      Finally, the authors limit their measurements to cells with exposed apical faces and thus a measurable area but apparently ignore the cells inside the folds. At late third, however, a substantial amount of the prospective wing blade is found within the folds, especially where they are deepest near the A/P compartment boundary. Using the third vein sensory organ precursors as markers, the L3-2 sensillum is found just distal to the fold, the L3-1 and the ACV sensilla are within the fold, and the GSR of the distal hinge is found just proximal to the fold. That puts the proximal half of the central wing blade in the fold, and apparently uncounted in their assays. These cells will however be exposed at wpp and especially bpp stages. How are the authors adjusting for this? 

      We apologize for not describing the methods of measurement thoroughly in the original submission. In fact, we did make measurements of cells located within the folds of the wing pouch at all stages. Z stacks of optical sections were collected that traversed the disc, including the folds. Using surface detection algorithms, we could make spatial measurements (xyz distances and areas) of the material within the folds enveloping the apical pouch. Therefore, we could measure the surface area and volume of the wing pouch that included the folds. This was indeed what we did and reported in the original submission. A much more complete description of the process has now been added to the Methods.

      On the other hand, we could not reliably measure Fat-GFP or Ds-GFP fluorescence intensity in cells deep in the folds due to light scattering. Therefore, we did not assay the entire gradient across the pouch. Of the cells we did measure, we know their relative distance to the center of the pouch, defined as the intersection of the AP and DV boundaries. Therefore, fluorescence intensities could be directly compared across stages since they were calibrated by the centerpoint of the pouch. We have added text to the Methods to clarify this.

      Stabilizing and destabilizing interactions between Fat and Ds- The authors describe a distal accumulation of Fat protein in the wing, and show that this is unlikely to be through Fat transcription. They further try to test whether the distal accumulation depends on destabilization of proximal Fat by proximal Ds by looking at Fat in ds mutant discs. However, the authors do not describe how they take into account the stabilizing effects of heterophilic binding between the extracellular domains (ECDs) of Fat and Ds; without one, the junctional levels and stability of the other is reduced (Ma et al., 2003; Hale et al. 2015). So when they show that the A-P gradient of Fat is reduced in a ds mutant, is this because of the loss of a destabilizing effect of Ds on Fat, as they assume, or is it because all junctional Fat has been destabilized by loss of extracellular binding to Ds? The description of the Fat gradient in Ds mutants is also confusing (see note 6 below), making this section difficult for the reader to follow.

      We did not intend to imply that Ds actively inhibits Fat. We now describe the implications of the result more clearly in the Results and Discussion with reference to the prior Hale and Ma studies of heterophilic stabilization. It is worth noting that Ma et al 2003 saw elevated junctional Fat in ds mutant cells if they were surrounded by other ds mutant cells. This is consistent with our results. We also apologize for the confusion in describing the Fat gradient and have reworded the section in the Results to make it clearer.

      The authors do not propose or test a mechanism for the proposed destabilization. Fat and Ds bind not only through their ECDs, but binding has now also been demonstrated through their ICDs (Fulford et al. 2023)

      We now discuss possible mechanisms in the Discussion and include the Fulford reference in the Results.

      Ds gradient scales by volume, rather than cell number - This is an intriguing result, but the authors do not discuss possible mechanisms.

      We have now added discussion of possible mechanisms in the Discussion.

      Fat and Ds are already known to have autonomous effects on growth and Hippo signaling from clonal analyses and localized knockdowns. One novelty here is showing that localized knockdown does not delay pupariation in the way that whole animal knockdown does, although the mechanism is not investigated. Another novelty is that the authors find stronger wing pouch overgrowth after localized ds RNAi or whole disc loss of fat than after localized fat RNAi, the latter being only 11% larger. The fat RNAi result would have been strengthened by testing different fat RNAi stocks, which vary in their strength and are commonly weaker than null mutations, or stronger drivers such as the ap-gal4 they used for some of their ds-RNAi experiments or use of UAS-dcr2. Another reason for caution is that Garoia (2005) found much stronger overgrowth in fat mutant clones, which were about 75% larger than control clones.

      We thank the reviewer for this suggestion. Indeed, the weak effect of Fat RNAi had been due to the specific RNAi driver. We followed the reviewer’s suggestion and tested other RNAi stocks. We had in hand an RNAi driver against GFP that we had found in unrelated studies to be a very potent repressor of GFP expression. Since we had been using a knock-in allele of GFP inserted in frame to Fat throughout this study, we applied nub>Gal4 UAS-GFP RNAi to knock down homozygous Fat-GFP. The effect of the knockdown was very strong, as measured by residual 488nm fluorescence above background autofluorescence after knockdown. Correcting for background autofluorescence, we estimate that only 4.5% of Fat-GFP remained under RNAi conditions (Figure 5 - figure supplement 3). 
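
      The background correction mentioned above amounts to a simple ratio, sketched here under stated assumptions: the residual fraction is taken as (knockdown signal minus background) divided by (control signal minus background), with all three values measured under identical imaging settings. The function name and the intensity values are hypothetical; they are chosen only so that the result matches the 4.5% figure quoted in the response and are not the authors' measurements.

```python
def residual_fraction(knockdown_signal, control_signal, background):
    """Fraction of tagged protein remaining after knockdown.

    Assumes background autofluorescence adds equally to both samples and
    that all intensities were acquired with identical imaging settings.
    """
    corrected_control = control_signal - background
    if corrected_control <= 0:
        raise ValueError("control signal must exceed background")
    return (knockdown_signal - background) / corrected_control


# Hypothetical mean intensities (arbitrary units).
fraction = residual_fraction(knockdown_signal=104.5,
                             control_signal=200.0,
                             background=100.0)
print(f"residual Fat-GFP: {fraction:.1%}")
```

      Without the subtraction, the same hypothetical numbers would suggest that roughly half of the signal remained, which is why the correction matters when the knockdown is strong.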

      Using the more potent RNAi reagent, we repeated the various experiments related to Fat. We observed a 42% increase in wing pouch growth, which is similar to that of Ds RNAi. We also observed an effect of Fat RNAi on the average cell cycle time of wing pouch cells. There was still a linear coupling between the cell cycle duration and wing pouch size, but the slope of the coupling was smaller with Fat RNAi. This was very similar to what Ds RNAi does to the cell cycle. Therefore, we have replaced the data from the original Fat RNAi experiments with the new data and modified the text throughout the manuscript to describe the new results.

      Flattening of Ds gradient does not slow growth. One model suggests that the flattening of the Ds gradient, and thus polarized Ds-Fat binding, account for slowed growth in older discs. The difficulty in the past has been that two ways of flattening the Ds gradient, either removing Ds or overexpressing Ds uniformly, give opposite results; the first increases growth, while the latter slows it. Both experiments have the problem of not just flattening the gradient, but also altering overall levels of Ds-Fat binding, which will likely alter growth independent of the gradients. Here, the authors instead use overexpression to create a strong Ds gradient (albeit a reversely oriented one) that does not flatten, and show that this does not prevent growth from slowing and arresting.

      To make sure that this is not some effect caused by using a reverse gradient, one might instead induce a more permanent normally oriented Ds gradient and see if this also does not alter growth; there is a ds Trojan gal4 line available that might work for this, and several other proximal drivers.

      Again, we thank the reviewer for this suggestion. We followed the reviewer’s suggestion and generated Trojan-Gal4 mediated overexpression of Ds. The Ds protein gradient was strongly amplified by Trojan-Gal4 but remained normally oriented. However, it only caused a modest (12%) increase in wing pouch volume. It did not significantly alter Fat expression dynamics nor the dynamics of cell cycle duration. This new data has been added to the Results (Fig. 7 and Fig 7 supplement 2) and discussed at length in the text.

      Another possible problem is that, unlike previous studies, the authors have not blocked the Four-jointed gradient; Fj alters Fat-Ds binding and might regulate polarity independently of Ds expression. A definitive test would be to perform the tests above in four-jointed mutant discs.

      We examined a fj null mutant (fjp1/d1) and found that it did not alter final wing pouch size (Fig. 2 - figure supplement 3E). Moreover, neither Fat nor Ds expression was altered in the fj mutant (Fig. 2 - figure supplement 3C,D).

      The Discussion of these data should be improved. The authors state in the Discussion "The significance of these dynamics is unclear, but the flattening of the Fat gradient is not a trigger for growth cessation." While the Discussion mentions the effects of Ds on Fat distribution in some detail, this is the only phrase that discusses growth, which is surprising given how often the gradient model of growth control is mentioned elsewhere. The reader would be helped if details are given about what experiment supports this conclusion, the effect on not only growth cessation but cell cycle time, and why the result differs from those of Rogulja 2008 and Willecke 2008 using Ds and Fj overexpression.

      We have rewritten the Discussion to better reflect the results and incorporate the reviewer’s criticisms.

      The authors spend much of the discussion speculating on the possibility that Fat and Ds control growth by changing the wing's sensitivity to the BMP Dpp. As the manuscript contains no new data on Dpp, this is somewhat surprising. The discussion also ignores Schwank (2011), who argues that Fat and Dpp are relatively independent. There have also been studies showing genetic interactions between Fat and signaling pathways such as Wg (Cho and Irvine 2004) and EGF (Garoia 2005).

      We have modified the discussion to be more inclusive of mechanisms connecting Fat and other signaling pathways, and we deleted some of the speculation about Dpp. However, since Dpp is the only known growth factor whose local concentration linearly scales with average cell doubling time (the process we found Ds/Fat regulates), there is a logical connection that readers deserve to know about. Therefore, we have retained some discussion of the hypothesis that the two might be linked through cell cycle duration. It is for future studies to test that hypothesis as it is beyond the scope of this paper.

      That said, there are studies that discount Wartlick’s Dpp model, e.g. Schwank et al 2012, arguing that Dpp regulates growth permissively by limiting an antigrowth factor, Brinker. We have added this reference and the others in the Discussion to discuss alternative models where Fat/Ds act in parallel to Dpp.

      Wpp and Bpp- First, the charts treat wpp as if it is a fixed number of hours after 5 day larvae, but this will not be true in fat and ds mutants with extended larval life. This should be mentioned.

      We have clarified this distinction in the figure legends.

      How are the authors limiting bpp to 1 hr from wpp? Prepupa are brown and lack air bubbles, but that spans 5 hours of disc changes from barely everted to fully wing-like.

      We deliberately chose 1 hour post WPP because we wanted to measure final wing volume with minimal eversion. We agree with the reviewer’s concerns about calling this BPP, and we now call it WPP+1.

      "However, growth of the wing pouch ceased at the larva-pupa molt and its size remained constant".

      The transition from late third to wpp shown in the figure is not the pupal molt. Unlike in most insects, in Drosophila the larval cuticle is not molted away, it is remodeled during pupariation into the prepupal case. The pupal cuticle is not formed until 6 hr APF, which is why the initial stages are termed pre-pupal. Also, there is at least one more set of cell divisions that occur in later pupal stages (for instance, see recent work from the Buttitta lab).

      We have changed the reference of pupal molt to larva-prepupal transition throughout the manuscript.

      "In contrast, the notum-hinge exhibited simpler linear-like positive allometric growth (Fig. 1 - figure supplement 3C)"

      This oversimplifies, as there is still a strong inflection after the third time point, albeit not as large as with the wing because there is less notal growth.

      We have reworded the text as suggested. 

      "whereas at the WPP stage, dividing cells were only found in a narrow zone where sensory organ precursor cells undergo two divisions to generate future sensory organs (Fig. 1 - figure supplement 4C-E)."

      While there are more dividing cells at the anterior D/V, which will form sensory bristles, there are also dividing cells elsewhere, including in the posterior and scattered through the pouch, where there are no sensory precursors. Sensory organs are limited to the wing margin and the very few campaniform sensilla found on the prospective third vein. The Sens-GFP shown here, meant to identify sensory precursors, does not look much like the Sens expression in Nolo et al 2000. Anterior is on the left in 1S4A-D, but on the right in E.

      We thank the reviewer for this observation. Indeed, the Sens-GFP signal in the figure is too broad. This was owing to bleed-through of the PHH3 signal. Since the pattern of dividing cells at the WPP stage has been so well characterized in the literature, as has the pattern of Sens+ cells at that stage (ie, Nolo et al 2000), we have removed these panels and now simply cite the relevant literature.  

      "The gradient was asymmetric along the AP axis, being lower at the A margin than the P margin."

      The use of "margin" here is a bit confusing, as the term is usually used to describe the wing margin; that is, the D/V compartment boundary in the disc that forms the edge of the wing. Can the authors use a different term? It would also be helpful to point out that the A and P extremes are also, because of the geometry of the disc, the prospective proximal portions of the wing margin, and the hinge, especially since the authors are including the regions proximal to the most distal fold.

      We have reworded it as suggested.   

      The graphed loss of the Fat A-P gradient between day 5 third and wpp is dramatic. Given that the changes in folding at wpp might alter which cells are being graphed, can the authors show a photo?

      We have now included a photo of Fat-GFP at WPP in Fig 2 - figure supplement 2E.

      "Since Ds levels are highest and most steep near the margins, perhaps Ds inhibits Fat expression in a dose- or gradient-dependent manner. We also followed Fat-GFP dynamics in the ds mutant. We did not observe the progressive flattening of the Fat-GFP profile to the WPP wing (Fig. 2 - figure supplement 3A). Instead, the Fat-GFP profile was graded at the WPP stage and flattened somewhat more by the BPP stage (Fig. 2 - figure supplement 3B)."

      This description does not tell the reader if there is any less grading of Fat in the ds mutant compared with wild type; instead, it sounds like it is more graded, as gradation continues at wpp. This would then contradict the hypothesis that proximal Ds is required to create the distal Fat gradient.

      The Fat signals for the two genotypes are directly comparable as the samples were imaged together with the same microscope settings. Fig 2M shows that the Fat profile in the ds mutant is less graded than in the wildtype. We have reworded the text to make this clearer. What is greater in the mutant is the persistence of this graded expression, which lasts into the WPP stage, not the level of gradation. The reason for this is not understood.

      The figure, on the other hand, looks like Fat is less graded, although as noted above this could instead be caused by loss of the stable Ds-bound Fat normally found at junctions. 

      Fig 2M shows an increase in Fat levels at the proximal regions of the ds mutant pouch, where Ds is normally most concentrated. This makes the overall profile look less graded. 

      Confusingly, in the Discussion the authors state: "Loss of Ds affects the Fat gradient such that distribution of Fat is uniformly upregulated to peak levels." There is no mention of "peak levels" in the Results, and no mention of "graded" expression in the Discussion. I am unclear on how the absolute levels are being determined and would be surprised if there were peak levels after loss of Ds-bound Fat from junctions.

      The absolute levels between the genotypes were determined by carefully calibrated fluorescence of Fat-GFP from samples imaged at the same time with the same settings. We used the word peak to refer to the highest level of Fat-GFP within a given gradient profile. Clearly, the description is confusing and so we have deleted the word and modified the text to clarify the meaning.

      "Interestingly, the reversed Ds gradient caused a change in the Fat gradient (Fig. 7E). Its peak also became skewed to the anterior and did not normally flatten at the WPP stage."

      This result contradicts the authors' earlier model that proximal Ds destabilizes Fat. Instead, the result fits the stabilization of Fat caused by binding to endogenous or overexpressed Ds or Ds ECD (Ma et al. 2003; Matakatsu and Blair, 2004; 2006; Hale et al. 2015).

      We agree that the reversed Ds affects Fat differently than the loss-of-function ds phenotype. We were not intending to propose a model based on the ds mutant, but a simple interpretation of the result. The reversed Ds experiment generates on its own a simple interpretation that is not consistent with the other. This speaks to the complexity of the system. We have changed the text in the Results to make this less confusing.

      Reviewer 2 found the paper to provide insights into normal growth of the wing and useful tools for measurement of growth features. This review offered many insights and thoughtful suggestions, which we have adopted to greatly improve the manuscript. The referee’s points are listed below with our responses.

      Although the approach used to measure volume is new to this study, the basic finding that imaginal disc growth slows at the mid-third instar stage has been known for some time from studies that counted disc cell number during larval development (Fain and Stevens, 1982; Graves and Schubiger, 1982). Although these studies did not directly measure disc volume, because cell size in the disc is not known to change during larval development, cell number is an accurate measure of tissue volume. However, it is worth noting that the approach used here does potentially allow for differential growth of different regions of the disc.

      We had cited the older literature in reference to our results. We have now noted the approach’s usefulness in measuring different disc regions such as the pouch.

      Related to point 1, a main conclusion of this study, that cell cycle length scales with growth of the wing, is based on a developmentally limited analysis that is restricted to the mid-third instar larval stage and later (early third instar begins at 72 hr - the authors' analysis started at 84 hr). The previous studies cited above made measurements from the beginning of the 3rd instar and combined them with previous histological analyses of cell numbers starting at the beginning of the 2nd instar. Interestingly, both studies found that cell number increases exponentially from the start of the 2nd instar until mid-third instar, and only after that point does the cell cycle slow resulting in the linear growth reported here. The current study states that growth is linear due to scaling of cell cycle with disc size as though this is a general principle, but from the earlier studies, this is not the case earlier in disc development and instead applies only to the last day of larval life.

      We apologize for not making this distinction clearer in the original manuscript. Indeed, growth is initially exponential and shifts to a more linear-like regime in the mid third instar. Our focus in the manuscript is primarily this latter phase. We have now rewritten the text in the Introduction, Results and Discussion to make this very clear. 

      While cell number and pouch volume increase exponentially from the start of the 2nd instar, the cell cycle already begins to slow down during the 2nd instar, as found with mitotic index measurements done by Wartlick et al 2011. Using their data to model cell cycle duration as a function of pouch area, we find that during the 2nd instar, cell cycle duration also increases as the size of the wing pouch increases. This is shown in the figure (panel C) below. Note that this relationship appears nonlinear and is quantitatively distinct from the relationship for third instar wing growth.

      Author response image 1.

      The analysis of the roles of Fat and Dachsous presented here has weaknesses that should be addressed. It is very curious that the authors found that depletion of Fat by RNAi in the wing blade had essentially no effect on growth while depletion of Dachsous did, given that the loss of function overgrowth phenotype of null mutations in fat is more severe than that of null mutations in dachsous (Matakatsu and Blair, 2006). An obvious possibility is that the Fat RNAi transgene employed in these experiments is not very efficient. The authors tried to address this by doubling the dose of the transgene, but it is not clear to me that this approach is known to be effective. The authors should test other RNAi transgenes and additionally include an analysis of growth of discs from animals homozygous for null alleles, which as they note survive to the late larval stages.

      We thank the reviewer for this suggestion. Indeed, the weak effect of Fat RNAi had been due to the specific RNAi driver. We followed the reviewer’s suggestion and tested other RNAi stocks. We had in hand an RNAi driver against GFP that we had found in unrelated studies to be a very potent repressor of GFP expression. Since we had been using a knock-in allele of GFP inserted in frame to Fat throughout this study, we applied nub>Gal4 UAS-GFP RNAi to knock down homozygous Fat-GFP. The effect of the knockdown was very strong, as measured by remaining 488nm fluorescence above background fluorescence after knockdown. Correcting for background fluorescence, we estimated that only 4.5% of Fat-GFP remained under RNAi conditions (Figure 5 - figure supplement 3). 
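
      The background correction described above can be written out as a short calculation. This is only a sketch of one common way to do it: the function name and the intensity values are hypothetical and were chosen for illustration, and the response does not give the exact formula that was used.

```python
def remaining_fraction(knockdown, control, background):
    """Fraction of signal left after knockdown, with background subtracted from both."""
    control_signal = control - background
    if control_signal <= 0:
        raise ValueError("control intensity must exceed background")
    # Clamp at zero so noise below background is not reported as negative signal.
    return max(knockdown - background, 0.0) / control_signal


# Hypothetical mean intensities (arbitrary units), not values from the study.
fraction = remaining_fraction(knockdown=145.0, control=1100.0, background=100.0)
print(f"{fraction:.1%} of the signal remains")
```

      Subtracting the same background from both the knockdown and the control keeps the ratio from being inflated by signal that is unrelated to the tagged protein.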

      Using the more potent RNAi reagent, we repeated the various experiments related to Fat. We observed a 42% increase in wing pouch growth, which is similar to that of Ds RNAi. We also observed an effect of Fat RNAi on the average cell cycle time of wing pouch cells. There was still a linear coupling between the cell cycle duration and wing pouch size, but the slope of the coupling was smaller with Fat RNAi. This was very similar to what Ds RNAi does to the cell cycle. Therefore, we have replaced the data from the original Fat RNAi experiments with the new data and modified the text throughout the manuscript to describe the new results.

      It is surprising that the authors detect a gradient of Fat expression that has not been seen previously given that this protein has been extensively studied. It is also surprising that they find that expression of Nubbin Gal4 is graded across the wing blade given that previous studies indicate that it is uniform (ie. Martín et al. 2004). These two surprising findings raise the possibility that the quantification of fluorescence could be inaccurate. The curvature of the wing blade makes it a challenging tissue to image, particularly for quantitative measurements.

      Non-uniform Fat protein expression has been observed before but not carefully quantified (see Mao et al., 2009, Strutt and Strutt 2002). Martin et al. 2004 (doi 10.1242/dev.013) claimed that Nub-Gal4 is uniform without actually measuring it. Please consult Fig 1A and 2A in their paper, which clearly show stronger expression in the center/distal region of the pouch.

      Regarding systematic errors in quantification, we took great pains to minimize them. We carefully divided the complex folded disc’s z stack into an apical region of interest (ROI) that included the distal domain of the wing pouch and a basal ROI that included the folds encompassing the pouch. We then used a published and widely used surface detection algorithm (ImSAnE) that captures a 3D region of interest (ROI) that can be curved and complex in shape (in z space) because the user creates a surface spline of the ROI. The resulting output treats the ROI as a virtual 2D object. This obviates the need to perform max projections of confocal stacks, which often create artifacts that the reviewer speaks of. Instead, ImSAnE eliminates such artifacts, and it is the gold standard for image processing of ROIs with 3D curvature. 

      Moreover, our pipeline does detect uniform expression if it is there. We used a da-Gal4 driver in Fig. 2K,L - this driver is widely acknowledged to be uniformly expressed in tissues of the fly. When it drives a control fluorescent marker (Bazooka-mCherry), our analysis pipeline detects a uniform expression pattern across the wing pouch (Fig. 2L). When the same Gal4 transgene drives Fat-HA in the same tissue, our pipeline detects a graded expression pattern of Fat-HA (Fig. 2L). In fact, this experiment co-expressed both Fat-HA and the control marker in the same disc. Thus, we feel confident that our analysis is not inaccurate.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

      Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

      I think the authors should state how many parameters require fitting to the data vs the total number of model parameters. It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      (1) I think the authors should state how many parameters require fitting to the data vs the total number of model parameters.

      All of the model parameters are listed in Table 1. Each parameter has, in addition, references listed for the source of data (if one exists) along with how the data were used (’C’ for calculated, ’F’ for fit, ’E’ for estimated, or ’S’ for scaled) for the specific simulations that appear in this paper. While this is a daunting number of parameters, only a few of these parameters must be updated when modeling a new musculotendon.

      Similar to a Hill-type muscle model, at least 5 parameters are needed to fit the VEXAT model to a specific musculotendon: maximum isometric force (fiso), optimal contractile element (CE) length, pennation angle, maximum shortening velocity, and tendon slack length. However, similar to a Hill model, it is only possible to use this minimal set of parameters by making use of default values for the remaining set of parameters. The defaults we have used have been extracted from mammalian muscle (see Table 1) and may not be appropriate for modeling muscle tissue that differs widely in terms of the ratio of fast/slow twitch fibers, titin isoform, temperature, and scale.

      Even when these defaults are appropriate, variation is the rule for biological data rather than the exception. It will always be the case that the best fit can only be obtained by fitting more of the model’s parameters to additional data. Standard measurements of the active force-length relation, passive force-length relation, and force-velocity relations are quite helpful for improving the accuracy of the model for a specific muscle. It is challenging to improve the fit of the model’s cross-bridge (XE) and titin models because the data required are so rare. The experiments of Kirsch et al., Prado et al., and Trombitás et al. are unique to our knowledge. However, if more data become available, it is relatively straightforward to update the model’s parameters using the methods described in Appendix B or the code that appears online (https://github.com/mjhmilla/Millard2023VexatMuscle).

      We have modified the manuscript to make it clear that, in some circumstances, the burden of parameter identification for the VEXAT model can be as low as a Hill model:

      - Section 3: last two sentences of the 2nd paragraph, found at: Page 10, column 2, lines 1-12 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      - Table 1: last two sentences of the caption, found at: Page 11 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      (2) It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      All of the experiments simulated in this work are in-situ or ex-vivo. So far the main challenges of simulating any experiment have been quite consistent across both in-situ and ex-vivo datasets: there are insufficient data to fit most model parameters to a specific specimen and, instead, defaults from the literature must be used. In an ideal case, a specimen would have roughly ten extra trials collected so that the maximum isometric force, optimal fiber length, active force-length relation, passive force-length relation (up to ≈ 0.6 f_o^M), and the force-velocity relations could be identified from measurements rather than relying on literature values. Since most lab specimens are viable for a small number of trials (with the exception of cat soleus), we don’t expect this situation to change in future.

      However, if data are available, the fitting process is fairly straightforward for either in-situ or ex-vivo data: use a standard numerical method (for example non-linear least squares, or the bisection method) to adjust the model parameters to reduce the errors between simulation and experiment. The main difficulty, as described in the previous paragraph, is the availability of data to fit as many parameters as possible for a specific specimen. As such, the fitting process really varies from experiment to experiment and depends mainly on the richness of measurements taken from a specific specimen, and from the literature in general.
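
      As a minimal sketch of the bisection approach mentioned above, the example below fits a single shape parameter of a toy passive force-length curve to one measured point. The exponential curve, the parameter name k, the bracket, and the data point are all assumptions made for illustration; they are not the curves or the fitting code used in the manuscript.

```python
import math


def passive_force(norm_length, k):
    """Toy normalized passive force: zero at optimal length, exponential beyond it."""
    return math.exp(k * (norm_length - 1.0)) - 1.0


def bisect(residual, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of residual(p) inside a bracket [lo, hi] with a sign change."""
    r_lo, r_hi = residual(lo), residual(hi)
    if r_lo * r_hi > 0:
        raise ValueError("bracket does not contain a sign change")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        r_mid = residual(mid)
        if abs(r_mid) < tol or (hi - lo) < tol:
            return mid
        if r_lo * r_mid <= 0:
            hi = mid               # root lies in the lower half
        else:
            lo, r_lo = mid, r_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


# Hypothetical measurement: normalized force 0.2 at normalized length 1.3.
measured_length, measured_force = 1.3, 0.2
k_fit = bisect(lambda k: passive_force(measured_length, k) - measured_force, 0.01, 10.0)
print(f"fitted k = {k_fit:.4f}")
```

      Bisection only handles one parameter at a time and needs a bracket in which the error changes sign, so fitting several parameters at once would call for a least-squares routine instead.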

      Working from in-vivo data presents an entirely different set of challenges. When working with human data, for example, it’s just not possible to directly measure muscle force with tendon buckles, and so it is never completely clear how force is distributed across the many muscles that typically actuate a joint. Further, there is also uncertainty in the boundary condition of the muscle because optical motion capture markers will move with respect to the skeleton. Video fluoroscopy offers a method of improving the accuracy of measured boundary conditions, though only for a few labs due to its great expense. A final boundary condition remains impossible to measure in any case: the geometry and forces that act at the boundaries as muscle wraps over other muscles and bones. Fitting to in-vivo data is very difficult.

      While this is an interesting topic, it is tangential to our already lengthy manuscript. Since these reviews are public, we’ll leave it to the motivated reader to find this text here.

      Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model’s ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although it is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      (1) It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch.

      While many muscle physiologists are aware of the limitations of the Hill model, these limitations are not so well known among computational biomechanists. There are at least two reasons for this gap: there are few comprehensive evaluations of Hill models against several experiments, and some of the differences are quite nuanced. For example, active lengthening experiments can be replicated reasonably well using a Hill model if the lengthening is done on the ascending limb of the force length curve. Clearly the story is quite different on the descending limb as shown in Figure 9. Similarly, as Figure 8 shows, by choosing the right combination of tendon model and perturbation bandwidth it is possible to get reasonably accurate responses from the Hill model to stochastic length changes. Yet when a wide variety of perturbation bandwidths, magnitudes, and tendon models are tested it is clear that the Hill model cannot, in general, replicate the response of muscle to stochastic perturbations. For these reasons we think many of the Hill model’s drawbacks have not been clearly understood by computational biomechanists for many years now.

      (2) Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      We agree that it will be valuable to benchmark other models in the literature using the same set of experiments. Hopefully we, or perhaps others, will have the good fortune to secure research funding to continue this benchmarking work. This will, however, be quite challenging: few muscle models are accompanied by a professional-quality open-source implementation. Without such an implementation it is often impossible to reproduce published results let alone provide a fair and objective evaluation of a model.

      (3) For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening.

      The titin model described in the paper will provide an enhancement of force during a stretch-shortening cycle. This certainly would be an interesting next experiment to simulate in a future paper.

      (4) In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      We can only respond to what drives the frequency-dependent stiffness in the model, though we’re quite interested in what happens physiologically. Hopefully some new experiments will be done to examine this phenomenon in the future. In the case of the model, the reasons are fairly straightforward: the formulation of Eqn. 16 is responsible for this shift.

      Equation 16 has been formulated so that the acceleration of the attachment point of the XE is driven by the force difference between the XE and a reference Hill model (numerator of the first term in Eqn. 16) which is then low pass filtered (denominator of the first term in Eqn. 16). Due to this formulation the attachment point moves less when the numerator is small, or when the differences in the numerator change rapidly and effectively become filtered out. When the attachment point moves less, more of the CE’s force output is determined by variations in the length of the XE and its stiffness.

      On the other hand, the attachment point will move when the numerator of the first term in Eqn. 16 is large, or when those differences are not short lived. When the attachment point moves to reduce the strain in the XE, the force produced by the XE’s spring-damper is reduced. As a result, the CE’s force output is less influenced by variations of the length of the XE and its stiffness.
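
      The filtering behaviour described in the two paragraphs above can be illustrated with a toy simulation. This is a deliberately simplified, first-order sketch and not Eqn. 16 itself: here the velocity (rather than the acceleration) of a hypothetical attachment point s is driven by the spring strain x - s with an assumed time constant tau, and the reference force is taken to be zero.

```python
import math


def attachment_amplitude(freq_hz, tau=0.05, amp=1.0, dt=1e-4, cycles=20):
    """Peak displacement of the attachment point during the final cycle."""
    omega = 2.0 * math.pi * freq_hz
    n_steps = round(cycles / freq_hz / dt)
    last_cycle_start = n_steps - round(1.0 / freq_hz / dt)
    s = 0.0     # attachment point position
    peak = 0.0
    for i in range(n_steps):
        x = amp * math.sin(omega * i * dt)  # imposed length perturbation
        s += dt * (x - s) / tau             # explicit Euler step of ds/dt = (x - s)/tau
        if i >= last_cycle_start:
            peak = max(peak, abs(s))
    return peak


slow = attachment_amplitude(0.5)   # slow perturbation
fast = attachment_amplitude(50.0)  # fast perturbation
print(f"attachment point amplitude: slow {slow:.3f}, fast {fast:.3f}")
```

      In this toy system the attachment point follows slow perturbations almost completely, so the spring is barely strained, while it hardly moves during fast perturbations, so nearly all of the length change is taken up by the spring and the force response is dominated by its stiffness.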

      Reviewer #2 (Recommendations for the Authors):

      I find the clarity of the manuscript to be much improved following revision. While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction, the revised description of small length changes makes the interpretation much less confusing.

      Similarly, while I agree that Hill-type models are widely used, their limitations have been addressed extensively and are very well established. Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      (1) While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction ...

      We have had to abstract some of the details of reality to have a model that can be used to simulate hundreds of muscles. In contrast, FiberSim produced by Kenneth Campbell’s group uses much less abstraction and might be of greater interest to you. FiberSim’s models include individual cross-bridges, titin molecules, and an explicit representation of the spatial geometry of a sarcomere. While this model is a great tool for testing muscle physiology questions through simulation, it is computationally expensive to use this model to simulate hundreds of muscles simultaneously.

      Kosta S, Colli D, Ye Q, Campbell KS. FiberSim: A flexible open-source model of myofilament-level contraction. Biophysical Journal. 2022 Jan 18;121(2):175-82. https://campbell-muscle-lab.github.io/FiberSim/

      (2) Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established.

      Please see our response 1 to Reviewer #1.

      (3) Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      Please see our response 2 to Reviewer #1.

    2. eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

    3. Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

    4. Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model's ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although it is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

    1. eLife assessment

      This proof-of-concept study focuses on an A->G DNA base editing strategy that converts CAG repeats to CAA repeats in the human HTT gene, which causes Huntington's disease (HD). These studies are conducted in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats. The findings of this study are valuable for the HD field, applying state-of-the-art techniques; however, the key experiments have yet to be performed in neuronal systems or brains of these mice: actual disease-rectifying effects relevant to patients have yet to be observed.

    2. Reviewer #1 (Public Review):

      Summary:

      In the paper by Choi et al., the authors aimed to develop base editing strategies to convert CAG repeats to CAA repeats in the huntingtin gene (HTT), which causes Huntington's disease (HD). They hypothesized that this conversion would delay disease onset by shortening the uninterrupted CAG repeat. Using HEK-293T cells as a model, the researchers employed cytosine base editors and guide RNAs (gRNAs) to efficiently convert CAG to CAA at various sites within the CAG repeat. No significant indels, off-target edits, transcriptome alterations, or changes in HTT protein levels were detected. Interestingly, somatic CAG repeat expansion was completely abolished in HD knock-in mice carrying CAA-interrupted repeats.

      Strengths:

      This study represents the first proof-of-concept exploration of the cytosine base editing technique as a potential treatment for HD and other repeat expansion disorders with similar mechanisms.

      Weaknesses:

      Given that HD is a neurodegenerative disorder, it is crucial to determine the efficiency of the base editing strategies tested in this manuscript and their feasibility in relevant cells affected by HD and the brain, which needed to be improved in this manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      In a proof-of-concept study with the aspiration of developing a treatment to delay HD onset, Choi et al. design and test an A>G DNA base editing strategy to exploit the recently established inverse relationship between the number of uninterrupted CAG repeats in polyglutamine repeat expansions and the age-of-onset of Huntington's Disease (HD). Most of the study is devoted to optimizing a base editing strategy typified by BE4max and gRNA2. The base editing is performed in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats.

      Weaknesses:

      Genotypic data on DNA editing are not portrayed in a clear manner consistent with the study's goal, namely reducing the number of uninterrupted CAG repeats by a clinically relevant amount according to the authors' least square approximated mean age-at-onset. No phenotypic data are presented to show that editing performed in either model would lead to reduced hallmarks of HD onset.

      More evidence is needed to support the central claims and therapeutic potential needs to be more adequate.

    4. Reviewer #3 (Public Review):

      Summary:

      In human patients with Huntington's disease (HD), caused by a CAG repeat expansion mutation, the number of uninterrupted CAG repeats at the genomic level influences age-at-onset of clinical signs independent of the number of polyglutamine repeats at the protein level. In most patients, the CAG repeat terminates with a CAA-CAG doublet. However, CAG repeat variants exist that either do not have that doublet or have two doublets. These variants consequently differ in their number of uninterrupted CAG repeats, while the number of glutamine repeats is the same, as both CAA and CAG code for glutamine. The authors first confirm that a shorter uninterrupted CAG repeat number in human HD patients is associated with developing the first clinical signs of HD later. They predict that introducing a further CAA-CAG doublet will result in years of delay of clinical onset. Based on this observation, the authors tested the hypothesis that turning CAG to CAA within a CAG repeat sequence using base editing techniques will benefit HD biology. They show that, indeed, in HD cell models (HEK293 cells expressing 16/17 CAG repeats; a single human stem cell line carrying a CAG repeat expansion in the fully penetrant range with 42 CAG repeats), their base editing strategies do induce the desired CAG-CAA conversion. The efficiency of conversion differed depending on the strategy used. In stem cells, delivery posed a problem, so to test allele specificity, the authors then used a HEK 293 cell line with 51 CAG repeats on the expanded allele. Conversion occurred in both alleles with huntingtin protein and mRNA levels; transcriptomics data was unchanged. In knock-in mice carrying 110 CAG repeats, however, base editing did not work as well for different, mainly technical, reasons.
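
      As an illustrative aside, the difference between uninterrupted CAG length and polyglutamine length described in this summary can be sketched in a few lines of Python. This is only a toy example: the function names and the three allele sequences are hypothetical and are not taken from the manuscript.

```python
def split_codons(seq):
    """Split an in-frame DNA string into 3-letter codons (trailing bases are dropped)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def longest_run(codons, allowed):
    """Length of the longest consecutive stretch of codons drawn from `allowed`."""
    best = run = 0
    for codon in codons:
        run = run + 1 if codon in allowed else 0
        best = max(best, run)
    return best


def uninterrupted_cag_length(seq):
    """Longest stretch of pure CAG codons (the DNA-level measure)."""
    return longest_run(split_codons(seq), {"CAG"})


def polyglutamine_length(seq):
    """Longest stretch of glutamine codons, CAG or CAA (the protein-level measure)."""
    return longest_run(split_codons(seq), {"CAG", "CAA"})


# Hypothetical alleles that all encode 44 consecutive glutamines.
canonical = "CAG" * 42 + "CAA" + "CAG"           # one CAA-CAG doublet
loss_of_interruption = "CAG" * 44                # no CAA interruption
duplicated_interruption = "CAG" * 40 + "CAACAG" * 2  # two CAA-CAG doublets

for name, allele in [
    ("canonical", canonical),
    ("loss of interruption", loss_of_interruption),
    ("duplicated interruption", duplicated_interruption),
]:
    print(name, uninterrupted_cag_length(allele), polyglutamine_length(allele))
```

      In this toy setting the three alleles share the same glutamine count but differ in uninterrupted CAG length, which is the contrast the summary draws on.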

      Strengths:

      The authors use state-of-the-art methods and carefully and thoroughly designed experiments. The data support the conclusions drawn. This work is a very valuable translation from the insight gained from large GWAS studies into HD pathogenesis. It rightly emphasises the potential this has as a causal treatment in HD, while the authors also acknowledge important limitations.

      Weaknesses:

      They could dedicate a little more to discussing several of the mentioned challenges. The reader will better understand where base editing is in HD currently and what needs to be done before it can be considered a treatment option. For instance,

      - It is important to clarify what can be gained by examining again the relationship between uninterrupted CAG repeat length and age-at-onset. Could the authors clarify why they do this and what it adds to their already published GWAS findings? What is the n of datasets?

      - What do they think an ideal conversion rate would be, and how that could be achieved?

      - Is there a dose-effect relationship for base editing, and would it be realistic to achieve the ideal conversion rate in target cells, given the difficulties described by the authors in differentiated neurons from stem cells?

      - The liver is a good tool for in-vivo experiments examining repeat instability in mouse models. However, the authors could comment on why they did not examine the brain.

      - Is there a limit to judging the effects of base editing on somatic instability with longer repeats, given the difficulties in measuring long CAG repeat expansions?

      - Given the methodological challenges for assessing HTT fragments, are there other ways to measure the downstream effects of base editing rather than extrapolate what it will likely be?

      - Sequencing errors could mask low-level, but biologically still relevant, off-target effects (such as gRNA-dependent and gRNA-independent DNA, Off-targets, RNA off-targets, bystander editing). How likely is that?

      - How worried are the authors about immune responses following base editing? How could this be assessed?

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      In the paper by Choi et al., the authors aimed to develop base editing strategies to convert CAG repeats to CAA repeats in the huntingtin gene (HTT), which causes Huntington's disease (HD). They hypothesized that this conversion would delay disease onset by shortening the uninterrupted CAG repeat. Using HEK-293T cells as a model, the researchers employed cytosine base editors and guide RNAs (gRNAs) to efficiently convert CAG to CAA at various sites within the CAG repeat. No significant indels, off-target edits, transcriptome alterations, or changes in HTT protein levels were detected. Interestingly, somatic CAG repeat expansion was completely abolished in HD knock-in mice carrying CAA-interrupted repeats. 

      Correction of factual errors

      We analyzed HEK293 cells, not "HEK-293T".

      Strengths: 

      This study represents the first proof-of-concept exploration of the cytosine base editing technique as a potential treatment for HD and other repeat expansion disorders with similar mechanisms. 

      Weaknesses: 

      Given that HD is a neurodegenerative disorder, it is crucial to determine the efficiency of the base editing strategies tested in this manuscript and their feasibility in relevant cells affected by HD and the brain, which needed to be improved in this manuscript. 

      We appreciate the reviewer's constructive recommendations. Our genetic investigation focused on understanding observations in HD patients to develop genetic-based treatment strategies and test their feasibility. We agree with the reviewer regarding the importance of data from relevant cell types. Unfortunately, the levels of CAG-to-CAA conversion in the patient-derived neurons were modest, as described in our manuscript (approximately 2%). In addition, AAV did not produce detectable conversions in the brain of HD knock-in mice (data not shown), which was somewhat expected from the literature (PMID: 31937940). We believe some technical hurdles can be overcome by developing efficient delivery methods. Nonetheless, it will be an important follow-up study to perform preclinical studies employing optimized base editing strategies and efficient brain delivery methods to fully demonstrate the therapeutic potential of BE strategies. 

      Reviewer #2 (Public Review):

      Summary: 

      In a proof-of-concept study with the aspiration of developing a treatment to delay HD onset, Choi et al. design and test an A>G DNA base editing strategy to exploit the recently established inverse relationship between the number of uninterrupted CAG repeats in polyglutamine repeat expansions and the age-of-onset of Huntington's Disease (HD). Most of the study is devoted to optimizing a base editing strategy typified by BE4max and gRNA2. The base editing is performed in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats. 

      Correction of factual errors

      We tested base editing strategies aimed at C > T conversion, not A > G DNA base editing. In addition to HEK293 and knock-in mice, we tested base editing strategies in patient-derived iPSC and neurons.

      Weaknesses: 

      Genotypic data on DNA editing are not portrayed in a clear manner consistent with the study's goal, namely reducing the number of uninterrupted CAG repeats by a clinically relevant amount according to the authors' least square approximated mean age-at-onset. No phenotypic data are presented to show that editing performed in either model would lead to reduced hallmarks of HD onset. 

      More evidence is needed to support the central claims and therapeutic potential needs to be more adequate. 

      Our strategies for converting CAG to CAA in model systems resulted in quantitative DNA modification in a population of cells. Consequently, individual cells may carry different genotypes, some harboring CAA and others CAG at the same genomic location. Therefore, using a standard genotype format for DNA to present base editing outcomes may not be ideal. Instead, we presented the resulting genotype data in a quantitative fashion to provide the percentage of conversion at each site. This approach allows for an intuitive interpretation of both the extent of repeat length reduction and the proportion of such modifications.
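
      To make the idea of a quantitative, per-site readout concrete, the following is a minimal Python sketch. It assumes reads that have already been aligned and split into codons of equal length; the data, the function names, and the summary of uninterrupted run lengths are hypothetical illustrations, not the analysis pipeline used in the manuscript.

```python
from collections import Counter


def percent_conversion_per_site(reads):
    """Percentage of reads carrying CAA at each repeat position.

    `reads` is a list of equal-length codon lists (an assumption of this sketch).
    """
    n_reads = len(reads)
    n_sites = len(reads[0])
    return [
        100.0 * sum(1 for read in reads if read[i] == "CAA") / n_reads
        for i in range(n_sites)
    ]


def longest_cag_run(read):
    """Longest stretch of consecutive CAG codons in one read."""
    best = run = 0
    for codon in read:
        run = run + 1 if codon == "CAG" else 0
        best = max(best, run)
    return best


def uninterrupted_length_distribution(reads):
    """Percentage of reads at each longest-uninterrupted-CAG length."""
    counts = Counter(longest_cag_run(read) for read in reads)
    return {length: 100.0 * n / len(reads) for length, n in sorted(counts.items())}


# Hypothetical population of four reads over a six-codon repeat.
toy_reads = [
    ["CAG", "CAG", "CAG", "CAG", "CAG", "CAG"],  # unedited
    ["CAG", "CAG", "CAA", "CAG", "CAG", "CAG"],  # converted at position 3
    ["CAG", "CAG", "CAA", "CAG", "CAG", "CAG"],  # converted at position 3
    ["CAG", "CAG", "CAG", "CAG", "CAA", "CAG"],  # converted at position 5
]

per_site = percent_conversion_per_site(toy_reads)
length_distribution = uninterrupted_length_distribution(toy_reads)
print(per_site)
print(length_distribution)
```

      The per-site percentages correspond to the kind of quantitative presentation described above, while the second summary shows one way the same reads could also be expressed as a distribution of uninterrupted repeat lengths.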

      Currently, genetically precise HD mouse models with robust motor and behavioral phenotypes are unavailable. While some HD mouse models, such as the BAC and YAC models, feature pronounced behavioral phenotypes, they consist of interrupted CAG repeat sequences, making them unsuitable for base conversion studies due to their inherently short uninterrupted repeats. Although genetically precise HD knock-in mouse models exist, they do not manifest motor symptom-like phenotypes. Given that CAG repeat expansion is the primary driver of the disease and knock-in mice recapitulate this phenomenon, our genetic investigation focused on assessing the effects of base conversion on CAG repeat instability in knock-in mice. However, as emphasized by the reviewer, subsequent preclinical studies to evaluate the therapeutic efficacy of CAG-to-CAA conversion strategies using mouse models harboring uninterrupted adult-onset CAG repeats and robust HD-like phenotypes remain crucial.

      Reviewer #3 (Public Review):

      Summary: 

      In human patients with Huntington's disease (HD), caused by a CAG repeat expansion mutation, the number of uninterrupted CAG repeats at the genomic level influences age-at-onset of clinical signs independent of the number of polyglutamine repeats at the protein level. In most patients, the CAG repeat terminates with a CAA-CAG doublet. However, CAG repeat variants exist that either do not have that doublet or have two doublets. These variants consequently differ in their number of uninterrupted CAG repeats, while the number of glutamine repeats is the same, as both CAA and CAG code for glutamine. The authors first confirm that a shorter uninterrupted CAG repeat number in human HD patients is associated with developing the first clinical signs of HD later. They predict that introducing a further CAA-CAG doublet will result in years of delay of clinical onset. Based on this observation, the authors tested the hypothesis that turning CAG to CAA within a CAG repeat sequence using base editing techniques will benefit HD biology. They show that, indeed, in HD cell models (HEK293 cells expressing 16/17 CAG repeats; a single human stem cell line carrying a CAG repeat expansion in the fully penetrant range with 42 CAG repeats), their base editing strategies do induce the desired CAG-CAA conversion. The efficiency of conversion differed depending on the strategy used. In stem cells, delivery posed a problem, so to test allele specificity, the authors then used a HEK 293 cell line with 51 CAG repeats on the expanded allele. Conversion occurred in both alleles with huntingtin protein and mRNA levels; transcriptomics data was unchanged. In knock-in mice carrying 110 CAG repeats, however, base editing did not work as well for different, mainly technical, reasons.

      Correction of factual errors

      "HD cell models (HEK293 cells expressing 16/17 CAG repeats" is an incorrect description. It should be "HD cell models (HEK293 cells expressing 51/17 CAG repeats".

      Strengths: 

      The authors use state-of-the-art methods and carefully and thoroughly designed experiments. The data support the conclusions drawn. This work is a very valuable translation from the insight gained from large GWAS studies into HD pathogenesis. It rightly emphasises the potential this has as a causal treatment in HD, while the authors also acknowledge important limitations. 

      Weaknesses: 

      They could dedicate a little more to discussing several of the mentioned challenges. The reader will better understand where base editing is in HD currently and what needs to be done before it can be considered a treatment option. For instance, 

      - It is important to clarify what can be gained by examining again the relationship between uninterrupted CAG repeat length and age-at-onset. Could the authors clarify why they do this and what it adds to their already published GWAS findings? What is the n of datasets? 

      Published HD GWAS (PMID: 31398342) compared the onset age of duplicated interruption and loss of interruption to that of canonical repeats to determine whether uninterrupted CAG repeat or polyglutamine determines age at onset. However, GWAS findings did not quantify the magnitude of the unexplained remaining variance in age at onset in duplicated interruption and loss of interruption. Our study investigated this further, quantifying the additional impact of duplicated interruption in order to estimate the maximum clinical benefit of base editing strategies for CAG-to-CAA conversion. Since the purpose of this genetic analysis is already described in the Results section, we added the following sentence to the Introduction to state what remains unknown.

      "Still, age at onset of loss of interruption and duplicated interruption was not fully accounted for by uninterrupted CAG repeat, suggesting additional effects of non-canonical repeats."

      We added sample size for the least square approximation analysis in the text and corresponding figure legend. Sample sizes for molecular and animal experiments can be found in the corresponding figure legend.

      - What do they think an ideal conversion rate would be, and how that could be achieved? 

      This is a very important question. However, speculating on the ideal conversion levels is outside the scope of this genetic investigation. A series of preclinical studies using relevant models may generate data that shed light on the conversion rates required to produce meaningful clinical benefits. In the discussion section, we added the following sentence.

      "Currently, the ideal levels of CAG-to-CAA conversion that produce significant clinical benefits are unknown. A series of preclinical studies using relevant model systems may generate data that may shed light on the optimal conversion rate levels that are required to produce significant clinical benefits."

      - Is there a dose-effect relationship for base editing, and would it be realistic to achieve the ideal conversion rate in target cells, given the difficulties described by the authors in differentiated neurons from stem cells? 

      We observed a clear dose-response relationship between the amount of BE reagents and the levels of conversion in non-neuronal cells. Unfortunately, the conversion rate was low in neuronal cells, potentially due to limited delivery, as speculated in the result section. As described in the discussion section, we predict that efficient delivery methods will be crucial to produce significant CAG-to-CAA conversion to achieve therapeutic benefits.

      - The liver is a good tool for in-vivo experiments examining repeat instability in mouse models. However, the authors could comment on why they did not examine the brain.

      We focused on liver instability because of 1) the expectation that delivery/targeting efficiency is significantly lower in the brain (PMID: 31937940) and 2) shared underlying mechanisms between the brain and liver (described in the result section). The following sentence was added in the method section to provide a rationale for liver analysis. 

      "Since significantly lower delivery/targeting efficiency was expected in the brain 34, we focused on analyzing liver instability."

      - Is there a limit to judging the effects of base editing on somatic instability with longer repeats, given the difficulties in measuring long CAG repeat expansions? 

      Determining the levels of base conversion using sequencing technologies gets harder as repeats become longer. Fragment analysis can overcome such technical difficulty if conversion efficiency is high. As pointed out, the repeat expansion measure is also challenging because amplification is biased toward shorter alleles. However, if repeat sizes are relatively similar, the levels of repeat expansion as a function of base conversion can be determined relatively precisely without a significant bias by a standard fragment analysis approach. 

      - Given the methodological challenges for assessing HTT fragments, are there other ways to measure the downstream effects of base editing rather than extrapolate what it will likely be?

      Our CAG-to-CAA conversion strategies are not expected to directly generate fragments of huntingtin DNA, RNA, or protein. In contrast, immediate downstream effects of CAG-to-CAA conversion include sequence changes (DNA and RNA) and alteration of repeat instability, which are presented in the manuscript. If repeat instability is associated with HTT exon 1A fragment, base conversion strategies may indirectly alter the levels of such putative toxic species, which remains to be determined.  

      - Sequencing errors could mask low-level, but biologically still relevant, off-target effects (such as gRNA-dependent and gRNA-independent DNA, Off-targets, RNA off-targets, bystander editing). How likely is that? 

      We agree with the reviewer that increased editing efficiency is expected to increase the levels of off-target editing. However, the field is actively developing base editors with minimal off-target effect (PMID: 35941130), which will increase the safety aspects of this technology for clinical use. We added the following sentence.  "In addition, developing base editors with high level on-target gene specificity and minimal off-target effects is a critical aspect to address 100."

      - How worried are the authors about immune responses following base editing? How could this be assessed? 

      We added the following sentence in the discussion section as the reviewer raised an important safety issue.  

      "Thorough assessments of immune responses against base editing strategies (e.g., development of antibody, B cell, and T cell-specific immune responses) and subsequent modification (e.g., immunosilencing) 101 will be critical to address immune response-associated safety issues of BE strategies."

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The following points could be considered to improve the overall quality of the manuscript: 

      (1) The authors mentioned that the reason for checking repeat instability in the non-neuronal cells was due to the availability of specific types of AAV; there are other subtypes of AAVs available to infect neurons and iPSCs. 

      Our pilot experiments testing several AAV serotypes in patient-derived iPSC and HD knock-in mice showed that only AAV9 converted CAG to CAA at detectable levels in the liver, not in the brain or neurons. We also speculate that difficulties in targeting the CAG repeat region due to GC-rich sequence contributed to low conversion efficiency. Therefore, subsequent optimization of base editor and delivery may improve BE strategies for HD, permitting robust conversion at the challenging locus. 

      (2) Despite its bold nature, minimal data in the manuscript demonstrate that this gene editing strategy is disease-modifying.

      Resources required to demonstrate the therapeutic benefits of CAG-to-CAA conversion strategies are not fully available. In particular, relevant HD mouse models that carry uninterrupted adult-onset CAG repeats and that permit measuring the extent of disease modification are lacking, as described in our response to the second reviewer. Given that CAG repeat expansion is the primary driver of the disease, this genetic investigation focused on determining the impacts of base editing strategies on CAG repeat expansion. Still, as indicated by the reviewer, follow-up preclinical studies to evaluate the extent of disease modification by CAG-to-CAA conversion strategies using relevant mouse models represent important next steps.

      (3) Off-target analysis at the DNA level was limited to "predicted" off-target sites. What about possible translocations that can result from co-nicking on different chromosomes, as a large number of potential targets exist? 

      Among the gRNAs we tested, we focused on gRNAs 1 and 2, for which small numbers of off-target sites were predicted. Therefore, our off-target analysis at the DNA level was focused on validating those predicted off-targets. As pointed out, thoroughly evaluating off-target effects will be necessary when candidate BE strategies take the next steps for therapeutic development.

      Genomic translocation caused by double-strand breaks can produce negative consequences, such as cancer. Importantly, although paired nicks efficiently induced translocations, translocations were not detected when a single nick was introduced on each chromosome (PMID: 25201414). Therefore, it is predicted that BE strategies using nickase confer little risk of translocation.

      (4) For in vivo work, somatic repeat expansion was analyzed only in peripheral tissue samples. Since the main affected cellular population in HD is the brain, the outcome of this treatment on a disease-relevant organ still needs to be determined. 

      Challenges in delivery to the brain led us to measure instability in the liver, since many mechanistic components of somatic CAG repeat instability are shared between the liver and striatum, as rationalized in the manuscript. However, we agree with the reviewer regarding the importance of determining the effects of base conversion on brain instability. We added the following sentence in the method section to provide a rationale. "Since significantly lower delivery/targeting efficiency was expected in brain 34, we focused on analyzing liver instability."

      Reviewer #2 (Recommendations For The Authors):

      Throughout the manuscript, the authors apologize for techniques that do not work when workarounds seem readily apparent to an expert in the field. In its current form, the manuscript reads verbose, speculative, apologetic, and preliminary. 

      Drug development programs that are supported by human genetics data show increased success rates in clinical trials (PMID: 26121088, 31827124, 31830040). This is why this genetic study focused on 1) investigating observations in HD subjects and 2) subsequently developing treatment strategies that are supported by patient genetics. As the first illustration of base editing in HD, the main scope of our manuscript is to justify the genetic rationale of CAG-to-CAA conversion and demonstrate the feasibility of therapeutic strategies rooted in patient genetics. As our study was not aimed at entirely demonstrating the clinical benefits of base editing strategies in HD, some of our data were based on tools and approaches that were not fully optimal. We agree with the reviewer that it will be an important next step to employ optimized approaches to evaluate the efficacy of base editing strategies in model systems. Nevertheless, our novel base conversion strategies derived from HD patient genetics represent a significant advancement as they may contribute to developing effective treatments for this devastating disorder. 

      Reviewer #3 (Recommendations For The Authors):

      It would make for an easier read if abbreviations were kept to a minimum. 

      As recommended, we decreased the use of abbreviations. The following has been spelled out throughout the manuscript: CR (canonical repeat), LI (loss of interruption), DI (duplicated interruption), and CBE (cytosine base editor). Other abbreviations with infrequent usage (e.g., ABE, SS, QC) were also spelled out in the text.

    1. eLife assessment

      This study provides a valuable contribution to our understanding of the mechanisms underlying the limited capacity to process rapid sequences of visual stimuli by reporting convincing evidence that the attentional blink affects neurally separable processes of visual detection and discrimination. The motivation for some of the analyses and the connection to previous empirical and theoretical work can be improved. The study will be of interest to neuroscientists and psychologists investigating perception and attention.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors used a multi-alternative decision task and a multidimensional signal-detection model to gain further insight into the cause of perceptual impairments during the attentional blink. The model-based analyses of behavioural and EEG data show that such perceptual failures can be unpacked into distinct deficits in visual detection and discrimination, with visual detection being linked to the amplitude of late ERP components (N2P and P3) and discrimination being linked to the coherence of fronto-parietal brain activity.

      Strengths:

      The main strength of this paper lies in the fact that it presents a novel perspective on the cause of perceptual failures during the attentional blink. The multidimensional signal-detection modelling approach is explained clearly, and the results of the study show that this approach offers a powerful method to unpack behavioural and EEG data into distinct processes of detection and discrimination.

      Weaknesses:

      While the model-based analyses are compelling, the paper also features some analyses that seem misguided, or, at least, insufficiently motivated and explained. Specifically, in the introduction, the authors raise the suggestion that the attentional blink could be due to a reduction in sensitivity or a response bias. The suggestion that a response bias could play a role seems misguided, as any response bias would be expected to be constant across lags, while the attentional blink effect is only observed at short lags. Thus, it is difficult to understand why the authors would think that a response bias could explain the attentional blink.

      A second point of concern regards the way in which the measures for detection and discrimination accuracy were computed. If I understand the paper correctly, a correct detection was defined as either correctly identifying T2 (i.e., reporting CW or CCW if T2 was CW or CCW, respectively, see Figure 2B), or correctly reporting T2's absence (a correct rejection). Here, it seems that one should also count a misidentification (i.e., incorrect choice of CW or CCW when T2 was present) as a correct detection, because participants apparently did detect T2, but failed to judge/remember its orientation properly in case of a misidentification. Conversely, the manner in which discrimination performance is computed also raises questions. Here, the authors appear to compute accuracy as the average proportion of T2-present trials on which participants selected the correct response option for T2, thus including trials in which participants missed T2 entirely. Thus, a failure to detect T2 is now counted as a failure to discriminate T2. Wouldn't a more proper measure of discrimination accuracy be to compute the proportion of correct discriminations for trials in which participants detected T2?
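
      The two scoring schemes contrasted in this paragraph can be compared side by side in a short Python sketch. It follows the definitions as stated in the paragraph above; the trial data, the dictionary keys, and the function name are hypothetical and do not come from the paper.

```python
def accuracy_measures(trials):
    """Compare two ways of scoring detection and discrimination.

    `trials` is a list of (stimulus, response) pairs, each drawn from
    {"CW", "CCW", "absent"}. All names here are illustrative assumptions.
    """
    present = [(s, r) for s, r in trials if s != "absent"]
    absent = [(s, r) for s, r in trials if s == "absent"]
    # T2-present trials on which any orientation was reported, right or wrong.
    detected = [(s, r) for s, r in present if r != "absent"]

    correct_identifications = sum(1 for s, r in present if r == s)
    correct_rejections = sum(1 for s, r in absent if r == "absent")
    n_trials = len(trials)

    return {
        # Correct identification or correct rejection counts as correct.
        "detection_as_described": (correct_identifications + correct_rejections) / n_trials,
        # Misidentifications also count as successful detections.
        "detection_alternative": (len(detected) + correct_rejections) / n_trials,
        # Proportion correct over all T2-present trials, misses included.
        "discrimination_as_described": correct_identifications / len(present),
        # Proportion correct only among trials on which T2 was reported.
        "discrimination_given_detection": (
            correct_identifications / len(detected) if detected else float("nan")
        ),
    }


# Hypothetical data: 8 T2-present trials and 2 T2-absent trials.
toy_trials = (
    [("CW", "CW")] * 2 + [("CCW", "CCW")] * 2      # 4 correct identifications
    + [("CW", "CCW"), ("CCW", "CW")]               # 2 misidentifications
    + [("CW", "absent"), ("CCW", "absent")]        # 2 misses
    + [("absent", "absent"), ("absent", "CW")]     # 1 correct rejection, 1 false alarm
)

measures = accuracy_measures(toy_trials)
print(measures)
```

      With data of this kind, the two detection scores diverge whenever misidentifications occur, and the two discrimination scores diverge whenever misses occur, which is the confound the paragraph raises.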

      My last point of critique is that the paper offers little if any guidance on how the inferred distinction between detection and discrimination can be linked to existing theories of the attentional blink. The discussion mostly focuses on comparisons to previous EEG studies, but it would be interesting to know how the authors connect their findings to extant, mechanistic accounts of the attentional blink. A key question here is whether the finding of dissociable processes of detection and discrimination would also hold with more meaningful stimuli in an identification task (e.g., the canonical AB task of identifying two letters shown amongst digits). There is evidence to suggest that meaningful stimuli are categorized just as quickly as they are detected (Grill-Spector & Kanwisher, 2005; Grill-Spector K, Kanwisher N. Visual recognition: as soon as you know it is there, you know what it is. Psychol Sci. 2005 Feb;16(2):152-60. doi: 10.1111/j.0956-7976.2005.00796.x. PMID: 15686582.). Does that mean that the observed distinction between detection and discrimination would only apply to tasks in which the targets consist of otherwise meaningless visual elements, such as lines of different orientations?

    3. Reviewer #2 (Public Review):

      Summary:

      The authors had two aims: First, to decompose the attentional blink (AB) deficit into the two components of signal detection theory: sensitivity and bias. Second, the authors aimed to assess the two subcomponents of sensitivity: detection and discrimination. They observed that the AB is only expressed in sensitivity. Furthermore, detection and discrimination were doubly dissociated. Detection modulated N2p and P3 ERP amplitude, but not frontoparietal beta-band coherence, whereas this pattern was reversed for discrimination.

      Strengths:

      The experiment is elegantly designed, and the data - both behavioral and electrophysiological - are aptly analyzed. The outcomes, in particular the dissociation between detection and discrimination blinks, are consistently and clearly supported by the results. The discussion of the results is also appropriately balanced.

      Weaknesses:

      The lack of an effect of stimulus contrast does not seem very surprising from what we know of the nature of AB already. Low-level perceptual factors are not thought to cause AB. This is fine, as there are also other, novel findings reported, but perhaps the authors could bolster the importance of these (null) findings by referring to AB-specific papers, if there are indeed any, that would have predicted different outcomes in this regard.

      On an analytical note, the ERP analysis could be finetuned a little more. The task design does not allow measurement of the N2pc or N400 components, which are also relevant to the AB, but the N1 component could additionally be analyzed. In doing so, I would furthermore recommend selecting more lateral electrode sites for both the N1, as well as the P1. Both P1 and N1 are likely not maximal near the midline, where the authors currently focused their P1 analysis.

      Impact & Context:

      The results of this study will likely influence how we think about selective attention in the context of the AB phenomenon. However, I think its impact could be further improved by extending its theoretical framing. In particular, there has been some recent work on the nature of the AB deficit, showing that it can be discrete (all-or-none) and gradual (Sy et al., 2021; Karabay et al., 2022, both in JEP: General). These different faces of target awareness in the AB may be linked directly to the detection and discrimination subcomponents that are analyzed in the present paper. I would encourage the authors to discuss this potential link and comment on the bearing of the present work on these previous behavioral findings.

    4. Reviewer #3 (Public Review):

      Summary:

      In the present study, the authors aimed to achieve a better understanding of the mechanisms underlying the attentional blink, that is, a deficit in processing the second of two target stimuli when they appear in rapid succession. Specifically, they used a concurrent detection and identification task in- and outside of the attentional blink and decoupled effects of perceptual sensitivity and response bias using a novel signal detection model. They conclude that the attentional blink selectively impairs perceptual sensitivity but not response bias, and link established EEG markers of the attentional blink to deficits in stimulus detection (N2p, P3) and discrimination (fronto-parietal high-beta coherence), respectively. Taken together, their study suggests distinct mechanisms mediating detection and discrimination deficits in the attentional blink.

      Strengths:

      Major strengths of the present study include its innovative approach to investigating the mechanisms underlying the attentional blink, an elegant, carefully calibrated experimental paradigm, a novel signal detection model, and multifaceted data analyses using state-of-the-art model comparisons and robust statistical tests. The study appears to have been carefully conducted and the overall conclusions seem warranted given the results. In my opinion, the manuscript is a valuable contribution to the current literature on the attentional blink. Moreover, the novel paradigm and signal detection model are likely to stimulate future research.

      Weaknesses:

      Weaknesses of the present manuscript mainly concern the negligence of some relevant literature, unclear hypotheses, potentially data-driven analyses, relatively low statistical power, potential flaws in the EEG methods, and the absence of a discussion of limitations. In the following, I will list some major and minor concerns in detail.

      Major points

      Hypotheses:

      I appreciate the multifaceted, in-depth analysis of the given dataset including its high amount of different statistical tests. However, neither the Introduction nor the Methods contain specific statistical hypotheses. Moreover, many of the tests (e.g., correlations) rely on selected results of previous tests. It is unclear how many of the tests were planned a priori, how many more were performed, and how exactly corrections for multiple tests were implemented. Thus, I find it difficult to assess the robustness of the results.

      Power:

      Some important null findings may result from the rather small sample sizes of N = 24 for behavioral and N = 18 for ERP analyses. For example, the correlation between detection and discrimination d' deficits across participants (r=0.39, p=0.059) (p. 12, l. 263) and the attentional blink effect on the P1 component (p=0.050, no test statistic) (p. 14, 301) could each have been significant with one more participant. In my opinion, such results should not be interpreted as evidence for the absence of effects.

      Neural basis of the attentional blink:

      The introduction (e.g., p. 4, l. 56-76) and discussion (e.g., p. 19, 427-447) do not incorporate the insights from the highly relevant recent review by Zivony & Lamy (2022), which is only cited once (p. 19, l. 428). Moreover, the sections do not mention some relevant ERP studies of the attentional blink (e.g., Batterink et al., 2012; Craston et al., 2009; Dell'Acqua et al., 2015; Dellert et al., 2022; Eiserbeck et al., 2022; Meijs et al., 2018).

      Detection versus discrimination:

      Concerning the neural basis of detection versus discrimination (e.g., p. 6, l. 98-110; p. 18, l. 399-412), relevant existing literature (e.g., Broadbent & Broadbent, 1987; Hillis & Brainard, 2007; Koivisto et al., 2017; Straube & Fahle, 2011; Wiens et al., 2023) is not included.

      Pooling of lags and lag 1 sparing:

      I wonder why the authors chose to include 5 different lags when they later pooled early (100, 300 ms) and late (700, 900 ms) lags, and whether this pooling is justified. This is important because T2 at lag 1 (100 ms) is typically "spared" (high accuracy) while T2 at lag 3 (300 ms) shows the maximum AB (for reviews, see, e.g., Dux & Marois, 2009; Martens & Wyble, 2010). Interestingly, this sparing was not observed here (p. 43, Figure 2). Nevertheless, considering the literature and the research questions at hand, it is questionable whether lag 1 and 3 should be pooled.

      Discrimination in the attentional blink:

      Concerning the claims that previous attentional blink studies conflated detection and discrimination (p. 6, l. 111-114; p. 18, l. 416), there is a recent ERP study (Dellert et al., 2022) in which participants did not perform a discrimination task for the T2 stimuli. Moreover, since the relevance of all stimuli except T1 was uncertain in this study, irrelevant distractors could not be filtered out (cf. p. 19, l. 437). Under these conditions, the attentional blink was still associated with reduced negativities in the N2 range (cf. p. 19, l. 427-437) but not with a reduced P3 (cf. p. 19, l 439-447).

      General EEG methods:

      While most of the description of the EEG preprocessing and analysis (p. 31/32) is appropriate, it also lacks some important information (see, e.g., Keil et al., 2014). For example, it does not include the length of the segments, the type and proportion of artifacts rejected, the number of trials used for averaging in each condition, specific hypotheses, and the test statistics (in addition to p-values).

      EEG filters:

      P. 31, l. 728: "The data were (...) bandpass filtered between 0.5 to 18 Hz (...). Next, a bandstop filter from 9-11 Hz was applied to remove the 10 Hz oscillations evoked by the RSVP presentation." These filter settings do not follow common recommendations and could potentially induce filter distortions (e.g., Luck, 2014; Zhang et al., 2024). For example, the 0.5 high-pass filter could distort the slow P3 wave. Mostly, I am concerned about the bandstop filter. Since the authors commendably corrected for RSVP-evoked responses by subtracting T2-absent from T2-present ERPs (p. 31, l. 746), I wonder why the additional filter was necessary, and whether it might have removed relevant peaks in the ERPs of interest.

      Coherence analysis:

      P. 33, l. 786: "For subsequent, partial correlation analyses of coherence with behavioral metrics and neural distances (...), we focused on a 300 ms time period (0-300 ms following T2 onset) and high-beta frequency band (20-30 Hz) identified by the cluster-based permutation test (Fig. 5A-C)." I wonder whether there were any a priori criteria for the definition and selection of such successive analyses. Given the many factors (frequency bands, hemispheres) in the analyses and the particular shape of the cluster (p. 49, Fig 5C), this focus seems largely data-driven. It remains unclear how many such tests were performed and whether the results (e.g., the resulting weak correlation of r = 0.22 in one frequency band and one hemisphere in one part of a complexly shaped cluster; p. 15, l. 327) can be considered robust.

      References

      Batterink, L., Karns, C. M., & Neville, H. (2012). Dissociable mechanisms supporting awareness: The P300 and gamma in a linguistic attentional blink task. Cerebral Cortex, 22(12), 2733-2744. https://doi.org/10.1093/cercor/bhr346

      Broadbent, D. E., & Broadbent, M. H. P. (1987). From detection to identification: Response to multiple targets in rapid serial visual presentation. Perception & Psychophysics, 42(2), 105-113. https://doi.org/10.3758/BF03210498

      Craston, P., Wyble, B., Chennu, S., & Bowman, H. (2009). The attentional blink reveals serial working memory encoding: Evidence from virtual and human event-related potentials. Journal of Cognitive Neuroscience, 21(3), 550-566. https://doi.org/10.1162/jocn.2009.21036

      Dell'Acqua, R., Dux, P. E., Wyble, B., Doro, M., Sessa, P., Meconi, F., & Jolicœur, P. (2015). The attentional blink impairs detection and delays encoding of visual information: Evidence from human electrophysiology. Journal of Cognitive Neuroscience, 27(4), 720-735. https://doi.org/10.1162/jocn_a_00752

      Dellert, T., Krebs, S., Bruchmann, M., Schindler, S., Peters, A., & Straube, T. (2022). Neural correlates of consciousness in an attentional blink paradigm with uncertain target relevance. NeuroImage, 264C, 119679. https://doi.org/10.1016/j.neuroimage.2022.119679

      Dux, P. E., & Marois, R. (2009). The attentional blink: A review of data and theory. Attention, Perception, & Psychophysics, 71(8), 1683-1700. https://doi.org/10.3758/APP.71.8.1683

      Hillis, J. M., & Brainard, D. H. (2007). Distinct mechanisms mediate visual detection and identification. Current Biology, 17(19), 1714-1719. https://doi.org/10.1016/j.cub.2007.09.012

      Keil, A., Debener, S., Gratton, G., Junghöfer, M., Kappenman, E. S., Luck, S. J., Luu, P., Miller, G. A., & Yee, C. M. (2014). Committee report: Publication guidelines and recommendations for studies using electroencephalography and magnetoencephalography. Psychophysiology, 51(1), 1-21. https://doi.org/10.1111/psyp.12147

      Koivisto, M., Grassini, S., Salminen-Vaparanta, N., & Revonsuo, A. (2017). Different electrophysiological correlates of visual awareness for detection and identification. Journal of Cognitive Neuroscience, 29(9), 1621-1631. https://doi.org/10.1162/jocn_a_01149

      Luck, S. J. (2014). An introduction to the event-related potential technique. MIT Press.

      Martens, S., & Wyble, B. (2010). The attentional blink: Past, present, and future of a blind spot in perceptual awareness. Neuroscience & Biobehavioral Reviews, 34(6), 947-957. https://doi.org/10.1016/j.neubiorev.2009.12.005

      Meijs, E. L., Slagter, H. A., de Lange, F. P., & Gaal, S. van. (2018). Dynamic interactions between top-down expectations and conscious awareness. Journal of Neuroscience, 38(9), 2318-2327. https://doi.org/10.1523/JNEUROSCI.1952-17.2017

      Straube, S., & Fahle, M. (2011). Visual detection and identification are not the same: Evidence from psychophysics and fMRI. Brain and Cognition, 75(1), 29-38. https://doi.org/10.1016/j.bandc.2010.10.004

      Wiens, S., Andersson, A., & Gravenfors, J. (2023). Neural electrophysiological correlates of detection and identification awareness. Cognitive, Affective, & Behavioral Neuroscience. https://doi.org/10.3758/s13415-023-01120-5

      Zhang, G., Garrett, D. R., & Luck, S. J. (2024). Optimal filters for ERP research II: Recommended settings for seven common ERP components. Psychophysiology, e14530. https://doi.org/10.1111/psyp.14530

    5. Author response:

      Reviewer #1: 

      Summary:

      In this study, the authors used a multi-alternative decision task and a multidimensional signal-detection model to gain further insight into the cause of perceptual impairments during the attentional blink. The model-based analyses of behavioural and EEG data show that such perceptual failures can be unpacked into distinct deficits in visual detection and discrimination, with visual detection being linked to the amplitude of late ERP components (N2p and P3) and discrimination being linked to the coherence of fronto-parietal brain activity.

      Strengths:

      The main strength of this paper lies in the fact that it presents a novel perspective on the cause of perceptual failures during the attentional blink. The multidimensional signal-detection modelling approach is explained clearly, and the results of the study show that this approach offers a powerful method to unpack behavioural and EEG data into distinct processes of detection and discrimination.

      Weaknesses:

      (1.1) While the model-based analyses are compelling, the paper also features some analyses that seem misguided, or, at least, insufficiently motivated and explained. Specifically, in the introduction, the authors raise the suggestion that the attentional blink could be due to a reduction in sensitivity or a response bias. The suggestion that a response bias could play a role seems misguided, as any response bias would be expected to be constant across lags, while the attentional blink effect is only observed at short lags. Thus, it is difficult to understand why the authors would think that a response bias could explain the attentional blink.

      A deficit in T2 identification accuracy could arise from either sensitivity or criterion effects; the criterion effect may manifest as a choice bias. For example, in short T1-T2 lag trials, when T2 closely follows T1, participants may adopt a more conservative choice criterion for reporting the presence of T2. Moreover, criterion effects need not be uniform across lags: A participant could infer the T1-T2 lag interval based on various factors, including trial length, thereby permitting them to adjust their choice criterion variably across different lags. We will provide a more detailed illustration of this claim in the revision.
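
      The distinction between a sensitivity effect and a criterion effect can be made concrete with the textbook equal-variance yes/no signal detection formulas. The sketch below is only an illustration under that simplifying assumption; it is not the multidimensional model used in the manuscript, and the function name and the hit and false-alarm rates are hypothetical.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, fa_rate):
    """Equal-variance yes/no SDT (an assumption, not the manuscript's model).

    d' = z(H) - z(FA)           sensitivity
    c  = -(z(H) + z(FA)) / 2    criterion (positive = conservative)
    Rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(hit_rate), z(fa_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


# Hypothetical rates, chosen only for illustration.
baseline = dprime_and_criterion(0.84, 0.16)
# Fewer hits with unchanged false alarms: mostly a sensitivity loss.
lower_sensitivity = dprime_and_criterion(0.69, 0.16)
# Fewer hits AND fewer false alarms: a more conservative criterion.
conservative = dprime_and_criterion(0.69, 0.07)

for label, (d, c) in [("baseline", baseline),
                      ("lower sensitivity", lower_sensitivity),
                      ("conservative criterion", conservative)]:
    print(f"{label}: d'={d:.2f}, c={c:.2f}")
```

      In both hypothetical short-lag scenarios the hit rate drops by the same amount, so hit rate alone cannot separate the two accounts; the false-alarm rate is what distinguishes them.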

      (1.2) A second point of concern regards the way in which the measures for detection and discrimination accuracy were computed. If I understand the paper correctly, a correct detection was defined as either correctly identifying T2 (i.e., reporting CW or CCW if T2 was CW or CCW, respectively, see Figure 2B), or correctly reporting T2's absence (a correct rejection). Here, it seems that one should also count a misidentification (i.e., incorrect choice of CW or CCW when T2 was present) as a correct detection, because participants apparently did detect T2, but failed to judge/remember its orientation properly in case of a misidentification. Conversely, the manner in which discrimination performance is computed also raises questions. Here, the authors appear to compute accuracy as the average proportion of T2-present trials on which participants selected the correct response option for T2, thus including trials in which participants missed T2 entirely. Thus, a failure to detect T2 is now counted as a failure to discriminate T2. Wouldn't a more proper measure of discrimination accuracy be to compute the proportion of correct discriminations for trials in which participants detected T2?

      Detection and discrimination accuracies were computed with precisely the same procedure, and under the same conditions, as described by the Reviewer (underlined text, above). We regret our poor description; we will improve upon it in the revised manuscript.
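
      The procedure described in comment 1.2 can be sketched as follows. This is a minimal illustration, assuming each trial is recorded as a (stimulus, response) pair with the hypothetical labels "CW", "CCW" and "absent"; the function names are likewise hypothetical and this is not the analysis code used for the manuscript.

```python
def detection_accuracy(trials):
    """Proportion of trials on which T2's presence or absence was judged correctly.

    A misidentification (e.g. T2 was CW, response was CCW) still counts as a
    correct detection, because the participant reported T2 as present.
    """
    correct = sum((stim != "absent") == (resp != "absent") for stim, resp in trials)
    return correct / len(trials)


def discrimination_accuracy(trials):
    """Proportion of correct orientation reports among detected T2-present trials.

    Trials on which T2 was present but reported as absent (misses) are
    excluded, so a detection failure is not counted as a discrimination failure.
    """
    detected = [(stim, resp) for stim, resp in trials
                if stim != "absent" and resp != "absent"]
    if not detected:
        return float("nan")
    return sum(stim == resp for stim, resp in detected) / len(detected)


# Hypothetical trials: (T2 stimulus, participant response)
trials = [
    ("CW", "CW"),          # detected, correctly discriminated
    ("CW", "CCW"),         # detected, misidentified
    ("CCW", "absent"),     # miss
    ("absent", "absent"),  # correct rejection
    ("absent", "CW"),      # false alarm
    ("CCW", "CCW"),        # detected, correctly discriminated
]
print(detection_accuracy(trials), discrimination_accuracy(trials))
```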

      (1.3) My last point of critique is that the paper offers little if any guidance on how the inferred distinction between detection and discrimination can be linked to existing theories of the attentional blink. The discussion mostly focuses on comparisons to previous EEG studies, but it would be interesting to know how the authors connect their findings to extant, mechanistic accounts of the attentional blink. A key question here is whether the finding of dissociable processes of detection and discrimination would also hold with more meaningful stimuli in an identification task (e.g., the canonical AB task of identifying two letters shown amongst digits). There is evidence to suggest that meaningful stimuli are categorized just as quickly as they are detected (Grill-Spector & Kanwisher, 2005; Grill-Spector K, Kanwisher N. Visual recognition: as soon as you know it is there, you know what it is. Psychol Sci. 2005 Feb;16(2):152-60. doi: 10.1111/j.0956-7976.2005.00796.x. PMID: 15686582.). Does that mean that the observed distinction between detection and discrimination would only apply to tasks in which the targets consist of otherwise meaningless visual elements, such as lines of different orientations?

      Our results are consistent with previous literature suggested by the Reviewer. Specifically, we do not claim that detection and discrimination are sequential processes; in fact, we modeled them as concurrent computations (Figs. 3A-B). Yet, our results suggest that these processes possess distinct neural bases. We have discussed this idea briefly in the Discussion section (e.g., “Yet, we found no evidence for these two computations being sequential…”). We will discuss this further in the revised manuscript in the context of previous literature.

      Reviewer #2:

      Summary:

      The authors had two aims: First, to decompose the attentional blink (AB) deficit into the two components of signal detection theory: sensitivity and bias. Second, the authors aimed to assess the two subcomponents of sensitivity: detection and discrimination. They observed that the AB is only expressed in sensitivity. Furthermore, detection and discrimination were doubly dissociated. Detection modulated N2p and P3 ERP amplitude, but not frontoparietal beta-band coherence, whereas this pattern was reversed for discrimination.

      Strengths:

      The experiment is elegantly designed, and the data - both behavioral and electrophysiological - are aptly analyzed. The outcomes, in particular the dissociation between detection and discrimination blinks, are consistently and clearly supported by the results. The discussion of the results is also appropriately balanced.

      Weaknesses:

      (2.1) The lack of an effect of stimulus contrast does not seem very surprising from what we know of the nature of AB already. Low-level perceptual factors are not thought to cause AB. This is fine, as there are also other, novel findings reported, but perhaps the authors could bolster the importance of these (null) findings by referring to AB-specific papers, if there are indeed any, that would have predicted different outcomes in this regard.

      While there is consensus that low-level perceptual factors are not affected by the attentional blink, some studies suggest evidence to the contrary (e.g., Chua et al., Percept. Psychophys., 2005). In the revised manuscript, we will highlight the significance of our findings in the context of such conflicting evidence in the literature.

      (2.2) On an analytical note, the ERP analysis could be finetuned a little more. The task design does not allow measurement of the N2pc or N400 components, which are also relevant to the AB, but the N1 component could additionally be analyzed. In doing so, I would furthermore recommend selecting more lateral electrode sites for both the N1, as well as the P1. Both P1 and N1 are likely not maximal near the midline, where the authors currently focused their P1 analysis.

      We will incorporate these additional analyses in the revised manuscript.

      (2.3) Impact & Context:

      The results of this study will likely influence how we think about selective attention in the context of the AB phenomenon. However, I think its impact could be further improved by extending its theoretical framing. In particular, there has been some recent work on the nature of the AB deficit, showing that it can be discrete (all-or-none) and gradual (Sy et al., 2021; Karabay et al., 2022, both in JEP: General). These different faces of target awareness in the AB may be linked directly to the detection and discrimination subcomponents that are analyzed in the present paper. I would encourage the authors to discuss this potential link and comment on the bearing of the present work on these behavioural findings.

      Thank you. We will discuss our findings in the context of these recent studies.

      Reviewer #3:

      Summary:

      In the present study, the authors aimed to achieve a better understanding of the mechanisms underlying the attentional blink, that is, a deficit in processing the second of two target stimuli when they appear in rapid succession. Specifically, they used a concurrent detection and identification task in- and outside of the attentional blink and decoupled effects of perceptual sensitivity and response bias using a novel signal detection model. They conclude that the attentional blink selectively impairs perceptual sensitivity but not response bias, and link established EEG markers of the attentional blink to deficits in stimulus detection (N2p, P3) and discrimination (fronto-parietal high-beta coherence), respectively. Taken together, their study suggests distinct mechanisms mediating detection and discrimination deficits in the attentional blink.

      Strengths:

      Major strengths of the present study include its innovative approach to investigating the mechanisms underlying the attentional blink, an elegant, carefully calibrated experimental paradigm, a novel signal detection model, and multifaceted data analyses using state-of-the-art model comparisons and robust statistical tests. The study appears to have been carefully conducted and the overall conclusions seem warranted given the results. In my opinion, the manuscript is a valuable contribution to the current literature on the attentional blink. Moreover, the novel paradigm and signal detection model are likely to stimulate future research.

      Weaknesses:

      Weaknesses of the present manuscript mainly concern the negligence of some relevant literature, unclear hypotheses, potentially data-driven analyses, relatively low statistical power, potential flaws in the EEG methods, and the absence of a discussion of limitations. In the following, I will list some major and minor concerns in detail.

      Major points

      (3.1) Hypotheses:

      I appreciate the multifaceted, in-depth analysis of the given dataset including its high amount of different statistical tests. However, neither the Introduction nor the Methods contain specific statistical hypotheses. Moreover, many of the tests (e.g., correlations) rely on selected results of previous tests. It is unclear how many of the tests were planned a priori, how many more were performed, and how exactly corrections for multiple tests were implemented. Thus, I find it difficult to assess the robustness of the results.

      As outlined in the Introduction, we hypothesized that neural computations associated with target detection would be characterized by regional neuronal markers (e.g., parietal or occipital ERPs), whereas computations linked to feature discrimination may involve neural coordination across multiple brain regions (e.g. fronto-parietal coherence). We planned and conducted our statistical tests based on this hypothesis. All multiple comparison corrections (e.g., Bonferroni-Holm correction, see Methods) were performed separately for each class of analyses. We will clarify these hypotheses and provide further details in the revised manuscript.
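
      For readers unfamiliar with the Bonferroni-Holm procedure mentioned above, the step-down rule applied within one family of tests can be sketched as below. This is a generic illustration of the standard procedure, not the authors' code; the function name and the p-values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Step-down Holm correction for one family of tests.

    Sort the p-values in ascending order and compare the k-th smallest
    (k = 0, 1, ...) against alpha / (m - k). Stop at the first failure;
    that test and all larger p-values are not rejected.
    Returns a list of booleans in the original order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values for one class of analyses.
print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

      Because the correction depends on the number of tests in the family, applying it separately for each class of analyses, as described above, is less strict than correcting across all tests at once.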

      (3.2) Power:

      Some important null findings may result from the rather small sample sizes of N = 24 for behavioral and N = 18 for ERP analyses. For example, the correlation between detection and discrimination d' deficits across participants (r=0.39, p=0.059) (p. 12, l. 263) and the attentional blink effect on the P1 component (p=0.050, no test statistic) (p. 14, 301) could each have been significant with one more participant. In my opinion, such results should not be interpreted as evidence for the absence of effects.

      We agree and will revise the manuscript accordingly. We will also report Bayes factor (BF) values, where relevant, to further evaluate these claims.
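
      One simple way to see why such a result should not be read as evidence of absence is to look at the uncertainty around the correlation. The sketch below uses the Fisher z approximation with the values quoted in comment 3.2 (r = 0.39, N = 24); it is an illustrative approximation with a hypothetical function name, not the authors' analysis and not a Bayes factor.

```python
import math
from statistics import NormalDist


def pearson_r_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transform: atanh(r) is approximately normal with
    standard error 1 / sqrt(n - 3). Requires n > 3 and |r| < 1.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# r and N as quoted in the reviewer's comment.
low, high = pearson_r_confidence_interval(0.39, 24)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

      Under this approximation the interval includes zero but also extends to roughly 0.69, so the data are compatible with both a negligible and a substantial correlation.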

      (3.3) Neural basis of the attentional blink:

      The introduction (e.g., p. 4, l. 56-76) and discussion (e.g., p. 19, 427-447) do not incorporate the insights from the highly relevant recent review by Zivony & Lamy (2022), which is only cited once (p. 19, l. 428). Moreover, the sections do not mention some relevant ERP studies of the attentional blink (e.g., Batterink et al., 2012; Craston et al., 2009; Dell'Acqua et al., 2015; Dellert et al., 2022; Eiserbeck et al., 2022; Meijs et al., 2018).

      We will motivate and discuss our study in the context of these previous studies. 

      (3.4) Detection versus discrimination:

      Concerning the neural basis of detection versus discrimination (e.g., p. 6, l. 98-110; p. 18, l. 399-412), relevant existing literature (e.g., Broadbent & Broadbent, 1987; Hillis & Brainard, 2007; Koivisto et al., 2017; Straube & Fahle, 2011; Wiens et al., 2023) is not included.

      Thank you for these suggestions. We will include these important studies in our discussion.

      (3.5) Pooling of lags and lag 1 sparing:

      I wonder why the authors chose to include 5 different lags when they later pooled early (100, 300 ms) and late (700, 900 ms) lags, and whether this pooling is justified. This is important because T2 at lag 1 (100 ms) is typically "spared" (high accuracy) while T2 at lag 3 (300 ms) shows the maximum AB (for reviews, see, e.g., Dux & Marois, 2009; Martens & Wyble, 2010). Interestingly, this sparing was not observed here (p. 43, Figure 2). Nevertheless, considering the literature and the research questions at hand, it is questionable whether lag 1 and 3 should be pooled.

      Lag-1 sparing is not always observed in attentional blink studies; there are notable exceptions that do not report such sparing (Hommel et al., Q. J. Exp. Psychol., 2005; Livesay et al., Attention, Percept. Psychophys., 2011). Our statistical tests revealed no significant difference in accuracies between short lag (100 and 300 ms) trials or between long lag (700 and 900 ms) trials but did reveal significant differences between the short and long lag trials (ANOVA, followed by post-hoc tests). To simplify the presentation of the findings, we pooled together the short lag (100 and 300 ms) and, separately, the long lag (700 and 900 ms) trials. We will present these analyses, and clarify the motivation for pooling in the revised manuscript. 

      (3.6) Discrimination in the attentional blink

      Concerning the claims that previous attentional blink studies conflated detection and discrimination (p. 6, l. 111-114; p. 18, l. 416), there is a recent ERP study (Dellert et al., 2022) in which participants did not perform a discrimination task for the T2 stimuli. Moreover, since the relevance of all stimuli except T1 was uncertain in this study, irrelevant distractors could not be filtered out (cf. p. 19, l. 437). Under these conditions, the attentional blink was still associated with reduced negativities in the N2 range (cf. p. 19, l. 427-437) but not with a reduced P3 (cf. p. 19, l 439-447).

      We will address the difference between our findings and those of Dellert et al (2022) in the revised manuscript.

      (3.7) General EEG methods:

      While most of the description of the EEG preprocessing and analysis (p. 31/32) is appropriate, it also lacks some important information (see, e.g., Keil et al., 2014). For example, it does not include the length of the segments, the type and proportion of artifacts rejected, the number of trials used for averaging in each condition, specific hypotheses, and the test statistics (in addition to p-values).

      We regret the oversight. We will include these details in the revised Methods.

      (3.8) EEG filters:

      P. 31, l. 728: "The data were (...) bandpass filtered between 0.5 to 18 Hz (...). Next, a bandstop filter from 9-11 Hz was applied to remove the 10 Hz oscillations evoked by the RSVP presentation." These filter settings do not follow common recommendations and could potentially induce filter distortions (e.g., Luck, 2014; Zhang et al., 2024). For example, the 0.5 high-pass filter could distort the slow P3 wave. Mostly, I am concerned about the bandstop filter. Since the authors commendably corrected for RSVP-evoked responses by subtracting T2-absent from T2-present ERPs (p. 31, l. 746), I wonder why the additional filter was necessary, and whether it might have removed relevant peaks in the ERPs of interest.

      Thank you for this suggestion. We will repeat this analysis without these additional filters.

      (3.9) Coherence analysis:

      P. 33, l. 786: "For subsequent, partial correlation analyses of coherence with behavioral metrics and neural distances (...), we focused on a 300 ms time period (0-300 ms following T2 onset) and high-beta frequency band (20-30 Hz) identified by the cluster-based permutation test (Fig. 5A-C)." I wonder whether there were any a priori criteria for the definition and selection of such successive analyses. Given the many factors (frequency bands, hemispheres) in the analyses and the particular shape of the cluster (p. 49, Fig 5C), this focus seems largely data-driven. It remains unclear how many such tests were performed and whether the results (e.g., the resulting weak correlation of r = 0.22 in one frequency band and one hemisphere in one part of a complexly shaped cluster; p. 15, l. 327) can be considered robust.

      Please see responses to comments #3.1 and #3.2 (above). In addition to reporting further details regarding statistical tests and multiple comparisons corrections, we will compute and report Bayes factors to quantify the strength of the evidence for correlations, as appropriate.
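
      As a rough illustration of what such a Bayes factor conveys, the sketch below uses the BIC approximation for a single correlation. This is an assumption chosen for simplicity and not necessarily the method the authors will use. The function name and the sample sizes are hypothetical, since the number of participants is not stated in this exchange.

```python
import math


def bf10_from_r(r, n):
    """Approximate Bayes factor (alternative over null) for a Pearson r.

    Uses the BIC approximation: BF01 ~= sqrt(n) * (1 - r^2)^(n / 2),
    which follows from comparing a one-predictor regression with an
    intercept-only model.
    """
    bf01 = math.sqrt(n) * (1.0 - r * r) ** (n / 2.0)
    return 1.0 / bf01


# r = 0.22 is the correlation quoted by the reviewer; the n values are
# hypothetical and only show how strongly the evidence depends on n.
for n in (24, 100, 400):
    print(n, bf10_from_r(0.22, n))
```

      Values below 1 favour the null hypothesis and values above 1 favour a non-zero correlation, so the same r = 0.22 can count as evidence for either, depending on the sample size.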

    1. eLife assessment

      This study employs a modified protocol for single-nuclei RNA sequencing of adipose tissue that preserves RNA quality and nuclei integrity. Using this protocol, the study provides valuable insights into the cellular heterogeneity and molecular landscape of murine adipose tissue from lean mice and mice with diet-induced obesity. The study is solid in its approach and analysis, providing a comprehensive description of a dysfunctional hypertrophic adipocyte subpopulation that emerges in association with obesity.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript from So et al. describes what is suggested to be an improved protocol for single-nuclei RNA sequencing (snRNA-seq) of adipose tissue. The authors provide evidence that modifications to the existing protocols result in better RNA quality and nuclei integrity than previously observed, with ultimately greater coverage of the transcriptome upon sequencing. Using the modified protocol, the authors compare the cellular landscape of murine inguinal and perigonadal white adipose tissue (WAT) depots harvested from animals fed a standard chow diet (lean mice) or those fed a high-fat diet (mice with obesity).

      Strengths:

      Overall, the manuscript is well-written, and the data are clearly presented. The strengths of the manuscript rest in the description of an improved protocol for snRNA-seq analysis. This should be valuable for the growing number of investigators in the field of adipose tissue biology that are utilizing snRNA-seq technology, as well as those other fields attempting similar experiments with tissues possessing high levels of RNAse activity.

      Moreover, the study makes some notable observations that provide the foundation for future investigation. One observation is the correlation between nuclei size and cell size, allowing for the transcriptomes of relatively hypertrophic adipocytes in perigonadal WAT to be examined. Another notable observation is the identification of an adipocyte subcluster (Ad6) that appears "stressed" or dysfunctional and likely localizes to crown-like inflammatory structures where pro-inflammatory immune cells reside.

      Weaknesses:

      Analogous studies have been reported in the literature, including a notable study from Savari et al. (Cell Metabolism). This somewhat diminishes the novelty of some of the biological findings presented here. Moreover, a direct comparison of the transcriptomic data derived from the new vs. existing protocols (i.e. fully executed side by side) was not presented. As such, the true benefit of the protocol modifications cannot be fully understood.

    3. Reviewer #2 (Public Review):

      Summary:

      In the present manuscript So et al utilize single-nucleus RNA sequencing to characterize cell populations in lean and obese adipose tissues.

      Strengths:

      The authors utilize a modified nuclear isolation protocol incorporating VRC that results in higher-quality sequencing reads compared with previous studies.

      Weaknesses:

      The use of VRC to enhance snRNA-seq has been previously published in other tissues. The snRNA-seq data sets presented in this manuscript, when compared with numerous previously published single-cell analyses of adipose tissue, do not represent a significant scientific advance.

      Figure 1-3: The snRNA-seq data obtained by the authors using their enhanced protocol do not represent a significant improvement in cell profiling for the majority of the highlighted cell types, including APCs, macrophages, and lymphocytes. These cell populations have been extensively characterized by cytoplasmic scRNA-seq, which can achieve sufficient sequencing depth, and thus this study does not contribute meaningful additional insight into these cell types. The authors note an increase in the number of rare endothelial cell types recovered; however, this is not translated into any kind of functional analysis of these populations.

      Figure 4: The authors did not provide any evidence that the relative fluorescent brightness of GFP and mCherry is a direct measure of nuclear size, and nuclear size correlates only moderately with cell size. Thus, sorting the nuclei based on GFP/mCherry brightness is not a reliable proxy for adipocyte diameter. Furthermore, no meaningful insights are provided about the functional significance of the reported transcriptional differences between small and large adipocyte nuclei.

      Figure 5-6: The Ad6 population is highly transcriptionally analogous to the mAd3 population from Emont et al, and is thus not a novel finding. Furthermore, in the present data set, the authors conclude that Ad6 are likely stressed/dying hypertrophic adipocytes with a global loss of gene expression, which is a well-documented finding in eWAT > iWAT, for which the snRNA-seq reported in the present manuscript does not provide any novel scientific insight.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors aimed to improve single-nucleus RNA sequencing (snRNA-seq) to address current limitations and challenges with nuclei and RNA isolation quality. They successfully developed a protocol that enhances RNA preservation and yields high-quality snRNA-seq data from multiple tissues, including a challenging model of adipose tissue. They then applied this method to eWAT and iWAT from mice fed either a normal or high-fat diet, exploring depot-specific cellular dynamics and gene expression changes during obesity. Their analysis included subclustering of SVF cells and revealed that obesity promotes a transition in APCs from an early to a committed state and induces a pro-inflammatory phenotype in immune cells, particularly in eWAT. In addition to SVF cells, they discovered six adipocyte subpopulations characterized by a gradient of unique gene expression signatures. Interestingly, a novel subpopulation, termed Ad6, comprised stressed and dying adipocytes with reduced transcriptional activity, primarily found in eWAT of mice on a high-fat diet. Overall, the methodology is sound, the writing is clear, and the conclusions drawn are supported by the data presented. Further research based on these findings could pave the way for potential novel interventions in obesity and metabolic disorders, or for similar studies in other tissues or conditions.

      Strengths:

      • The authors developed a robust snRNA-seq technique that preserves the integrity of the nucleus and RNA across various tissue types, overcoming the challenges of existing methods.

      • They identified adipocyte subpopulations that follow adaptive or pathological trajectories during obesity.

      • The study reveals depot-specific differences in adipose tissues, which could have implications for targeted therapies.

      Weaknesses:

      • The adipose tissues were collected after 10 weeks of high-fat diet treatment, lacking the intermediate time points for identifying early markers or cell populations during the transition from healthy to pathological adipose tissue.

      • The expansion of the Ad6 subpopulation in obese iWAT and gWAT is interesting. The authors claim that Ad6 exhibited a substantial increase in eWAT and a moderate rise in iWAT (Figure 4C). However, this adipocyte subpopulation remains the most altered in iWAT upon obesity. Could the authors elaborate on why there is a scarcity of adipocytes with ROS reporter and B2M in obese iWAT?

      • While the study provides extensive data on mouse models, the potential translation of these findings to human obesity remains uncertain.

    1. eLife assessment

      This valuable study shows how an intersecting network of regulators acting on genes with differences in their RNA metabolism explains why the loss of some regulators of RNAi in C. elegans can selectively impair the silencing of some target genes. The evidence presented is convincing, as the authors use a combination of computational modeling and RNAi assays to support their conclusions.

    2. Reviewer #1 (Public Review):

      The goal of Knudsen-Palmer et al. was to define a biological set of rules that dictate the differential RNAi-mediated silencing of distinct target genes, motivated by facilitating the long-term development of effective RNAi-based drugs/therapeutics. To achieve this, the authors use a combination of computational modeling and RNAi function assays to reveal several criteria for effective RNAi-mediated silencing. This work provides three insights: (1) cis-regulatory elements influence the RNAi-mediated regulation of genes; (2) genes can "recover" from RNAi-silencing signals in an animal; and (3) pUGylation occurs exclusively downstream of the dsRNA trigger sequence, suggesting 3º siRNAs are not produced. In addition, the authors show that the speed at which RNAi-silencing is triggered does not correlate with the longevity of the silencing. These insights are significant because they suggest that if we understand the rules by which RNAi pathways effectively silence genes with different transcription/processing levels, then we can design more effective synthetic RNAi-based therapeutics targeting endogenous genes. The conclusions of this study are mostly supported by the data, but there are some aspects that need to be clarified.

      (1) The methods do not describe the "aged RNAi plates feeding assay" in Figure 2E. The figure legend states that "aged RNAi plates" were used to trigger weaker RNAi, but the detail explaining the experiment is insufficient. How aged is aged? If the goal was to effectively reduce the dsRNA load available to the animals, why not quantitatively titrate the dsRNA provided? Were worms previously fed on the plates, or was simply a lawn of bacteria grown until presumably the IPTG on the plate was exhausted?

      (2) Is the data presented in Figure 2F completed using the "aged RNAi plates" to achieve the partial silencing of dpy-7 observed? Clarification of this point would be helpful.

      (3) Throughout the manuscript the authors refer to "non-dividing cells" when discussing animals' ability to recover from RNA silencing. It is not clear what the authors specifically mean with the phrase "non-dividing cells", but as this is referred to in one of their major findings, it should be clarified. Do they mean the cells are somatic cells in aged animals, thus if they are "non-dividing" the siRNA pools within the cells cannot be diluted by cell division? Based on the methods, the animals of RNAi assays were L4/Young adults that were scored over 8 days after the initial pulse of dsRNA feeding. If this is the case, wouldn't these animals be growing into gravid adults after the feeding, and thus have dividing cells as they grew?

      (4) What are the typical expression levels/turnover of unc-22 and bli-1? Based on the results from the altered cis-regulatory regions of bli-1 and unc-22 in Figure 5, it seems like the transcription/turnover rates of each of these genes could also be used as a proof of principle for testing the model proposed in Figure 4. The strength of the model would be further increased if the RNAi sensitivity of unc-22 reflects differences in its transcription/turnover rates compared to bli-1.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript by Knudsen-Palmer et al. describes and models the contribution of MUT-16 and RDE-10 in the silencing through RNAi by the Argonaute protein NRDE-3 or others. The authors show that MUT-16 and RDE-10 constitute an intersecting network that can be redundant or not depending on the gene being targeted by RNAi. In addition, the authors provide evidence that increasing dsRNA processing can compensate for NRDE-3 mutants. Overall, the authors provide convincing evidence to understand the factors involved in RNAi in C. elegans by using a genetic approach.

      Major Strengths:

      The authors' work presents a compelling case for understanding the intricacies of RNA interference (RNAi) within the model organism Caenorhabditis elegans through a meticulous genetic approach. By harnessing genetic manipulation, they delve into the role of MUT-16 and RDE-10 in RNAi, offering a nuanced understanding of the molecular mechanisms at play in two independent case study targets (unc-22 and bli-1).

      Major Weaknesses:

      (1) It is unclear how the molecular mechanisms of amplification differ between the MUT-16 and RDE-10 branches of the regulatory pathway, since they are clearly distinct proteins structurally. It would be interesting to do some small-RNA-seq of products generated from unc-22 and bli-1, under wild-type conditions and in some of the mutants studied (e.g., mut-16, rde-10, and mut-16 + rde-10). That would provide some insights into whether the products of the two amplifications are the same in all conditions, just changing in abundance, or whether they are distinct in sequence patterns.

      (2) Along the same lines, Figure 5 aims to provide insights into the sequence determinants that influence the RNAi of bli-1. It is unclear whether the changes in transcript stability dictated by the 3'UTR are the sole factor governing the preference for the MUT-16 and RDE-10 branches of the regulatory pathway. In line with the mutant jam297, it might be interesting to test whether factors like codon optimality, splicing, etc., of the ORF region upstream from bli-1-dsRNA can affect its sensitivity to the MUT-16 and RDE-10 branches of the regulatory pathway.

    1. eLife assessment

      This work investigated the mechanisms by which sperm DNA is excluded from the meiotic spindle after fertilization. The finding that kinesin-13, katanin and Ataxin-2 proteins are involved in this process is useful in uncovering the mechanisms underlying healthy embryo formation. The overall conclusions of the work are supported by solid evidence obtained by microscopy and RNAi experiments, though more robust data analyses and rescue experiments would have strengthened the study.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper by Beath et al. identifies a potential regulatory role for proteins involved in cytoplasmic streaming and maintaining the grouping of paternal organelles: holding sperm contents in the fertilized embryos away from the oocyte meiotic spindle so that they don't get ejected into the polar body during meiotic chromosome segregation. The authors show by time-lapse video that paternal mitochondria (used as a readout for sperm and its genome) are excluded from yolk granules and maternal mitochondria, even when moving long distances by cytoplasmic streaming. To understand how this exclusion is accomplished, they first show that it is independent of both internal packing and the engulfment of the paternal chromosomes by maternal endoplasmic reticulum creating an impermeable barrier. They then test whether the control of cytoplasmic streaming affects this exclusion by knocking down two microtubule motors, Katanin and kinesin-13. They find that the ER ring, which is used as a proxy for paternal chromosomes, undergoes extensive displacement with these treatments during anaphase I and interacts with the meiotic spindle, supporting their hypothesis that the exclusion of paternal chromosomes is regulated by cytoplasmic streaming. Next, they test whether a regulator of maternal ER organization, ATX-2, disrupts sperm organization so that they can combine the double depletion of ATX-2 and KLP-7, presumably because klp-7 RNAi (unlike mei-1 RNAi) does not affect polar body extrusion and they can report on what happens to paternal chromosomes. They find that the knockdown of both ATX-2 and KLP-7 produces a higher incidence of what appears to be the capture of paternal chromosomes by the meiotic spindle (5/24 vs 1/25). However, this capture event appears to halt the cell cycle, preventing the authors from directly observing whether this would result in the paternal chromosomes being ejected into the polar body.
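
      One way to gauge how robust a difference such as 5/24 vs 1/25 is would be a two-sided Fisher's exact test on the corresponding 2x2 table. The sketch below is an illustration only and is not an analysis reported by the authors: the function name is hypothetical, and it assumes the two counts come from independent groups of embryos scored as "captured" or "not captured".

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * tolerance
    )


# Assumed layout: 5 captures among 24 embryos vs 1 capture among 25 embryos.
print(fisher_exact_two_sided(5, 19, 1, 24))
```

      Reporting such a p-value, or an effect size with a confidence interval, alongside the raw counts would let readers judge the strength of the double-depletion phenotype directly.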

      Strengths:

      This is a useful, descriptive paper that highlights a potential challenge for embryos during fertilization: when fertilization results in the resumption of meiotic divisions, how are the paternal and maternal genomes kept apart so that the maternal genome can undergo chromosome segregation and polar body extrusion without endangering the paternal genome? In general, the experiments are well-executed and analyzed. In particular, the authors' use of multiple ways to knock down ATX-2 shows rigor.

      Weaknesses:

      The paper makes a case that this regulation may be important but the authors should do some additional work to make this case more convincing and accessible for those outside the field. In particular, some of the figures could include greater detail to support their conclusions, they could explain the rationale for some experiments better and they could perform some additional control experiments with their double depletion experiments to better support their interpretations. Also, the authors' inability to assess the functional biological consequences of the capture of the sperm genome by the oocyte spindle should be discussed, particularly in light of the cell cycle arrest that they observe.

    3. Reviewer #2 (Public Review):

      Summary

      In this manuscript, Beath et al. use primarily C. elegans zygotes to test the overarching hypothesis that cytoplasmic mechanisms exist to prevent interaction between paternal chromosomes and the meiotic spindle, which are present in a shared zygotic cytoplasm after fertilization. Previous work, much of it by this group, had characterized cytoplasmic streaming in the zygote and the behavior of paternal components shortly after fertilization, primarily the clustering of paternal mitochondria and membranous organelles around the paternal chromosomes. This work set out to identify the molecular mechanisms responsible for that clustering and test the specific hypothesis that the "paternal cloud" helps prevent the association of paternal chromosomes with the meiotic spindle.

      Strengths

      This work is a collection of technical achievements. The data are primarily 3- and 4-channel time-lapse images of zygotes shortly after fertilization, which were performed inside intact animals. There are many instances in which the experiments show extreme technical skill, such as tracking the paternal chromosomes over large displacements throughout the volume of the embryo. The authors employ a wide variety of fluorescent reporters to provide a remarkably clear picture of what is going on in the zygote. These reagents and the novel characterization of these stages that they provide will be widely beneficial to the community.

      The data provide direct visualization of what had previously been a mostly hypothetical structure, the "paternal cloud," using simultaneous labeling of paternal DNA and mitochondria in combination with a variety of maternal proteins including maternal mitochondria, yolk granules, tubulin, and plasma membrane. Together, these images provided convincing evidence of the existence of this specified cytoplasmic domain. They go on to show that the knockdown of the ataxin-2 homolog ATX-2, a protein previously shown to affect ER dynamics, disrupted the paternal cloud, identifying a role for ER organization in this structure.

      The authors then used the system to test the functional consequences of perturbing the cytoplasmic organization. Consistent with the paternal cloud being a stable structure, it stayed intact during large movements the authors generated using previously published knockdowns (of mei-1/katanin and kinesin-13/klp-7) that increased cytoplasmic streaming. They used these data to document instances in which the paternal chromosomes were likely to have been attached to the spindle. They concluded with direct evidence of spindle fibers connecting to the paternal chromatin upon knockdown of ATX-2 in combination with increased cytoplasmic streaming, providing strong, direct support for their overarching hypothesis.

      Weaknesses

      While the data is convincing, the narrative of the paper could be streamlined to highlight the novelty of the experiments and better articulate the aims. For example, the cloud of paternal mitochondria and membranous organelles was previously shown, but Figures 1-2 largely reiterate that observation. The innovation seems to be that the combination of ER, yolk, and maternal mitochondrial markers makes the existence of a specified domain more concrete. There are also some instances where more description is needed to make the conclusions from the images clear.

      The manuscript intersperses what read like basic characterizations of fluorescent markers that, as written, can distract from the main story. The authors characterized the dynamics of ER organization throughout the substages of meiosis and the permeability of the envelope of ER that surrounds the paternal chromatin, but it could be more clearly established how the ability to visualize these structures allowed them to address their aims. More background on what was previously known about ER organization in M-phase and the role of ataxin proteins specifically may help provide more continuity.

    4. Reviewer #3 (Public Review):

      Summary:

      This study by Beath et al. investigated the mechanisms by which sperm DNA is excluded from the meiotic spindle after fertilization. Time-lapse imaging revealed that sperm DNA is surrounded by paternal mitochondria and maternal ER that is permeable to proteins. By increasing cytoplasmic streaming using kinesin-13 or katanin RNAi, the authors demonstrated that limiting cytoplasmic streaming in the embryo is an important step that prevents the capture of sperm DNA by the oocyte meiotic spindle. Further experiments showed that the Ataxin-2 protein is required to hold paternal mitochondria together and close to the sperm DNA. Finally, double depletion of kinesin-13 and Ataxin-2 suggested an increased risk of meiotic spindle capture of sperm DNA.

      Overall, this is an interesting finding that could provide a new understanding of how meiotic spindle capture of sperm DNA and its accidental expulsion into the polar body is prevented. However, some conceptual gaps need to be addressed and further experiments and improved data analyses would strengthen the paper.

      • It would be helpful if the authors could discuss in detail how they think maternal ER surrounds the sperm DNA and why it is not disrupted following Ataxin disruption.

      • Since important phenotypes revealed in RNAi experiments (e.g. kinesin-13 and ataxin-2 double depletion) are not very robust, the authors should consider toning down their conclusions and revising some of their section headings. I appreciate that they are upfront about some limitations, but they do nonetheless make strong concluding sentences.

      • The discussion section could be improved further to present the authors' findings in the larger context of current knowledge in the field.

      • The authors previously demonstrated that F-actin prevents meiotic spindle capture of sperm DNA in this system. However, the current manuscript does not discuss how the katanin, kinesin-13 and Ataxin-2 mechanisms could work together with previously established functions of F-actin in this process.

      • How can the authors exclude off-target effects in their RNAi depletion experiments? Can kinesin-13, katanin, and Ataxin phenotypes be rescued for instance?

      • How are the authors able to determine if the paternal genome was actually captured by the spindle? Does lack of movement definitively suggest capture without using a spindle marker?

    1. eLife assessment

      This important study identifies biallelic variants of DNAH3 in four unrelated infertile men. In addition, it reports that DNAH3 knockout (KO) mice are infertile, and that compromised DNAH3 activity decreases the expression of IDA-associated proteins in the spermatozoa of human patients and the KO mice. Of note, the infertility of both can be rescued by intracytoplasmic sperm injection (ICSI). In aggregate, the work provides solid evidence to demonstrate that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility. It will be of substantial interest to clinicians, reproductive counselors, embryologists, and basic researchers working on infertility and assisted reproductive technology.

    2. Joint Public Review:

      Summary:

      The study identified biallelic variants of DNAH3 in four unrelated Han Chinese infertile men through whole-exome sequencing, which contribute to abnormal sperm flagellar morphology and ultrastructure. To investigate the importance of DNAH3 in male infertility, the authors generated crispant DNAH3 knockout (KO) male mice. They observed that KO mice are also infertile, showing a severe reduction in sperm movement with abnormal IDA (inner dynein arms) and mitochondrion structure. Moreover, nonfunctional DNAH3 expression decreased the expression of IDA-associated proteins in the spermatozoa of patients and KO mice, which are involved in the disruption of sperm motility. Interestingly, the infertility of patients and KO mice was rescued by intracytoplasmic sperm injection (ICSI). Taken together, the authors propose that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility.

      Strengths:

      This work investigates the role of DNAH3 in sperm mobility and male infertility and utilised gold-standard molecular biology techniques, showing strong evidence of its role in male infertility. All aspects of the study design and methods are well described and appropriate to address the main question of the manuscript. The conclusions drawn are consistent with the analyses conducted and supported by the data.

      Weaknesses:

      (1) The manuscript lacks a comparison with previous studies on DNAH3 in the Discussion section.

      (2) The variants of DNAH3 in four infertile men were identified through whole-exome sequencing. Providing an overview of the WES data would be beneficial to offer additional insights into whether other variants may contribute to the infertility. This could also help explain why ICSI only works for two out of four patients with DNAH3 variants.

      (3) Quantification of images would help substantiate the conclusions, particularly in Figures 2, 3, 4, and 6. Improved images in Figures 3A, 4B, and 4C, would help increase confidence in the claims made.

    1. eLife assessment

      This work presents valuable information on the structure of the spirosome's native extended conformation as the active form of the enzyme aldehyde-alcohol dehydrogenase (AdhE). However, the data supporting this claim are incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      Clostridium thermocellum serves as a model for consolidated bioprocessing (CBP) in lignocellulosic ethanol production, yet it faces limitations in the solid contents and ethanol titers achieved by engineered strains thus far. The primary ethanol production pathway involves the enzyme aldehyde-alcohol dehydrogenase (AdhE), which forms long oligomeric structures known as spirosomes, previously characterized via the 3.5 Å resolution E. coli AdhE structure using single-particle cryo-EM. The present study describes the cryo-EM structure of the C. thermocellum ortholog, sharing 62% sequence identity with E. coli AdhE, resolved at 3.28 Å resolution. Detailed comparative structural analysis, including the Vibrio cholerae AdhE structure, was conducted. Integrating cryo-EM data with molecular dynamics simulations indicated that the aldehyde intermediate resides longer in the channel of the extended form, supporting the hypothesis that the extended spirosome represents the active form of AdhE.

      Strengths:

      The study conducts a comprehensive structural comparative analysis of oligomerization interfaces and the acetaldehyde channel across compact and extended conformations. Structural and computational results suggest the extended spirosome as the most likely active state of AdhE.

      Weaknesses:

      The overall resolution of the C. thermocellum structure is similar to the E. coli ortholog, which shares 62% sequence identity, and the oligomerization interfaces and the acetaldehyde channel were previously described.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Ziegler et al., entitled "Structural characterization and dynamics of AdhE ultrastructure from Clostridium thermocellum: A containment strategy for toxic intermediates?" presents the atomic resolution cryo-EM structure of C. thermocellum AdhE, showing that it predominantly adopts an extended form while E. coli AdhE predominantly adopts a compact form. Through comparative analysis of their C. thermocellum structure and the previous E. coli AdhE structure, they tried to reveal the mechanism by which C. thermocellum and E. coli show different dominant conformations. In addition, they analyzed the substrate channel by comparative and computational approaches. Lastly, their computational analysis using CryoDRGN reveals conformational heterogeneity in the sample. Although this manuscript suggests a potential mechanism for the different features of AdhEs, it is very descriptive and does not provide sufficient data to support the authors' conclusions, which may be due to the lack of experimental data to support their findings from the computational analysis.

      Strengths:

      This manuscript provides the first C. thermocellum (Ct) AdhE structure and compares it with the E. coli AdhE structure.

      Weaknesses:

      Their main conclusions obtained mostly by computational and comparative analysis are not supported by experimental data.

    4. Reviewer #3 (Public Review):

      This study describes the first structure of Gram-positive bacterial AdhE spirosomes that are in a native extended conformation. All the previous structures of AdhE spirosomes obtained come from Gram-negative bacterial species with native compact spirosomes (E. coli, V. cholerae). In E. coli, AdhE spirosomes can be found in two different conformational states, compact and extended, depending on the substrates and cofactors they are bound to.

      The high-resolution cryoEM structure of the extended C. thermocellum AdhE spirosomes produced in E. coli in an apo state (without any substrate or cofactors) is compared to the E. coli extended and compact AdhE spirosomes structures previously published. The authors have modeled (in Swiss-Model) the structure of compact C. thermocellum AdhE spirosomes, using E. coli compact AdhE spirosome conformation as a template, and performed molecular dynamics simulations. They have identified a channel in which the toxic reaction intermediate aldehyde could transit from the aldehyde dehydrogenase active site to the alcohol dehydrogenase active site, in an analogous manner to E. coli spirosomes. These findings are in line with the hypothesis that the extended spirosomes could correspond to the active form of the enzyme.

      In this work, the authors speculate that the C. thermocellum AdhE spirosomes could switch from the native extended conformation to a compact conformation, in a way that is the inverse of E. coli spirosomes. Although attractive, this hypothesis is not supported by the literature. Strikingly, in some Gram-positive bacterial species (S. pneumoniae, S. sanguinis or C. difficile...), AdhE spirosomes are natively extended and have never been observed in a compact conformation. Conversely, E. coli (and other Gram-negative bacteria) native AdhE spirosomes are compact and are able to switch to an extended conformation in the presence of the cofactors (NAD+, CoA, and iron). As presented, the data do not convincingly confirm the existence of C. thermocellum AdhE spirosomes in a compact conformation.

    1. eLife assessment:

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    2. Reviewer #1 (Public Review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The experiments support the main message of the paper regarding durotaxis by amoeboid cells. In my opinion, a few clarifications on the mechanism proposed to explain this phenomenon could strengthen this research:

      (1) According to your model, the rear end of the cell, which is in contact with softer substrates, will have slower diffusion rates of NMIIA. Does this mean that bigger cells will durotax better than smaller cells because the stiffness difference between front and rear is higher? Is it conceivable to attenuate the slope of the durotactic gradient to a degree where smaller cells lose their ability to durotax, while longer cells retain their capacity for directional movement?

      (2) Where did you place the threshold for soft, middle, and stiff regions (Figure 6)? Is it possible that you only have a linear rigidity gradient in the center of your gel and the more you approach the borders, the flatter the gradient gets? In this case, cells would migrate randomly on uniform substrates. Did you perform AFM over the whole length of the gel or just in the central part?

      (3) In which region (soft, middle, stiff) did you perform all the cell tracking of the previous figures?

      (4) What is the level of confinement experienced by the cells? Is it possible that cells on the soft side of the gels experience less confinement due to a "spring effect" whereby the coverslips descending onto the cells might exert diminished pressure because the soft hydrogels act as buffers, akin to springs? If this were the case, cells could migrate following a confinement gradient.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The study includes well-designed control experiments, and its results are therefore convincing.

      Weaknesses:

      Overall this study is well performed but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the role of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      (2) The expansion microscopy assay is not clearly described and some details are missing, such as how the assay is performed on cells under confinement.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion on the comparison between the present theory and those prior models.

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?

    1. eLife assessment

      This manuscript presents important observations on the early changes in calcium signaling, TMEM16a activation, and mitochondrial dysfunction in salivary gland cells in an inflammation murine model of autoimmune Sjögren's disease. Convincing changes are shown in saliva release, calcium signaling, TMEM16a activation, mitochondrial function, and sub-cellular morphology of the endoplasmic reticulum following DMXAA treatment. The work will be of strong interest to physiologists working on secretion, calcium signaling, and mitochondria.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors address cellular mechanisms underlying the early stages of Sjogren's syndrome, using a mouse model in which 5,6-Dimethyl-9-oxo-9H-xanthene-4-acetic acid (DMXAA) is applied to stimulate the interferon gene (STING) pathway. They show that, in this model, salivary secretion in response to neural stimulation is greatly reduced, even though individual secretory cell calcium responses were enhanced. They attribute the secretion defect to reduced activation of Ca2+-activated Cl- channels (TMEM16a), due to an increased distance between Ca2+ release channels (IP3 receptors) and TMEM16a which is expected to reduce the [Ca2+] sensed by TMEM16a. A variety of disruptions in mitochondria were also observed after DMXAA treatment, including reduced abundance, altered morphology, depolarization, and reduced oxygen consumption rate. The results of this study shed new light on some of the early events leading to the loss of secretory function in Sjogren's syndrome, at a time before inflammatory responses cause the death of secretory cells.

      Strengths:

      Two-photon microscopy enabled Ca2+ measurements in the salivary glands of intact animals in response to physiological stimuli (nerve stimulation). This approach has been shown previously by the authors as necessary to preserve the normal spatiotemporal organization of calcium signals that lead to secretion under physiological conditions.

      Superresolution (STED) microscopy allowed precise measurements of the spacing of IP3R and TMEM16a and the cell membranes that would otherwise be prevented by the diffraction limit. The measured increase of distance (from 84 to 155 nm) would be expected to reduce [Ca2+] at the TMEM16a channel.

      The authors effectively ruled out a variety of alternative explanations for reduced secretion, including changes in AQP5 expression, TMEM16a expression, localization, and Ca2+ sensitivity as indicated by Cl- current in response to defined levels of Ca2+.

      Weaknesses:

      While the Ca2+ distribution in the cells was less restricted to the apical region in DMXAA-treated cells, it is not clear that this is relevant to the reduced activation of TMEM16a. The way in which the change in Ca2+ distribution is quantified (apical/basal ratio) is not informative, as this is not what activates TMEM16a, but rather the local [Ca2+] at the channel.

      Despite the decreased level of secretion, Ca2+ signal amplitudes were higher in the treated cells, raising the question of how much this might compensate for the increased distance between IP3R and TMEM16a. The authors assume that the increased separation of IP3R and TMEM16a (and the resulting decrease in local [Ca2+]) outweighed the effect of higher global [Ca2+], but this important point was not addressed.

      The description of mitochondrial changes in abundance, morphology, membrane potential, and oxygen consumption rate was not well integrated into the rest of the paper. While they may be a facet of the multiple effects of STING activation and may occur during Sjogren's syndrome, their possible role in reducing secretion was not examined. As it stands, the mitochondrial results are largely descriptive and there is no evidence here that they contribute to the secretory phenotype.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript describes a very elegant study of disrupted stimulus-secretion coupling in salivary acinar cells in the early stages of an animal model (DMXAA) of Sjogren's syndrome (SS). The study utilizes a range of technically innovative in vivo imaging of Ca signaling, in vivo salivary secretion, patch clamp electrophysiology to assess TMEM16a activity, immunofluorescence and electron microscopy, and a range of morphological and functional assays of mitochondrial function. Results show that in mice with DMXAA-induced Sjogren's syndrome, there was a reduced nerve-stimulation-induced salivary secretion, yet surprisingly the nerve-stimulation-induced Ca signaling was enhanced. There was also a reduced carbachol (CCh)-induced activation of TMEM16a currents in acinar cells from DMXAA-induced SS mice, whereas the intrinsic Ca-activated TMEM16a currents were unaltered, further supporting that stimulus-secretion coupling was impaired. Consistent with this, high-resolution STED microscopy revealed that there was a loss of close physical spatial coupling between IP3Rs and TMEM16a, which may contribute to the impaired stimulus-secretion coupling. Furthermore, the authors show that the mitochondria were both morphologically and functionally impaired, suggesting that bioenergetics may be impaired in salivary acinar cells of DMXAA-induced SS mice.

      Strengths:

      Overall, this is an outstanding manuscript that will have a huge impact on the field. The manuscript is beautifully written with a very clear narrative. The experiments are technically innovative, very well executed, and logically designed. The data are very well presented and appropriately analyzed and interpreted.

    4. Reviewer #3 (Public Review):

      Summary:

      The pathomechanism underlying Sjögren's syndrome (SS) remains elusive. The authors have studied whether altered calcium signaling might be a factor in SS development in a commonly used mouse model. They provide a thorough and straightforward characterization of the salivary gland fluid secretion, cytoplasmic calcium signaling, mitochondrial morphology, and respiration. A special strength of the study is the spectacular in vivo imaging; very few groups, if any, could have succeeded with these studies. The authors show that the cytoplasmic calcium signaling is upregulated in the SS model and that the Ca2+-regulated Cl- channels are normally localized and function normally, but fluid secretion is still suppressed. They also find altered localization of the IP3R and speculate about lesser exposure of Cl- channels to high local [Ca2+]. In addition, they describe changes in mitochondrial morphology and function that might also contribute to the attenuated secretory response. Although the exact contribution of calcium and mitochondria to secretory dysfunction remains to be determined, the results seem to be useful for a range of scientists.

      Specific points to consider:

      (1) Are all the effects of DMXAA mediated through STING? DMXAA has been reported to inhibit NAD(P)H quinone oxidoreductase (NQO1) PMID: 10423172, which might be relevant both for the calcium and mitochondrial phenotypes. I would recommend that the authors either test the dependency of the DMXAA effects on STING or avoid attributing all effects of DMXAA to STING.

      (2) "mitochondrial membrane potential (ΔΨm), the driving force of ATP production": the driving force is the electrochemical H+ gradient, of which ΔΨm is only one component.

      (3) ΔΨm is assessed as decreased in the DMXAA model without a change in TMRE steady state. Higher post-uncoupler fluorescence caused a lesser uncoupler-sensitive pool. This is not a very common observation. Was the autofluorescence of the DMXAA-treated cells higher in the red channel?

      (4) The EM study indicated ER structure disruption. Are there any clues to the contribution of this to the augmented agonist/electrical stimulation-induced calcium signaling and decreased fluid secretion?

    1. eLife assessment

      Gain-of-function mutations and amplifications of PPM1D are found across several human cancers and are associated with advanced tumor stage and worse prognosis. Thus far, clinical translation has not been possible due to the lack of PPM1D inhibitors with favorable pharmacokinetic properties. This useful study leverages CRISPR/Cas9 screening to determine that loss of SOD1 is synthetic lethal with PPM1D mutation in leukemia. The mechanistic analyses are still incomplete.

    1. eLife assessment

      This important study expands our understanding of the role of two axon guidance factors in a specific axon guidance decision. The strength of the study is the compelling axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

    2. Reviewer #1 (Public Review):

      Summary:

      The current manuscript provides an extensive in vivo analysis of two guidance pathways identifying multiple mechanisms that shape the bifurcation of DRG axons when forming the dorsal funiculus in the DREZ.

      Strengths:

      Multiple mouse mutant lines were used, together with complementary techniques; the results are very clear and compelling.

      The findings are very significant and clearly move forward our understanding of the regulation of axonal development at the DREZ.

      Weaknesses:

      No major weaknesses were found. As it is I have no recommendations that would increase the clarity or quality of the manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors conduct a detailed analysis of the molecular cues that control guidance of bifurcated dorsal root ganglion axons in a key region of the spinal cord called the dorsal funiculus. This is a specific case of axon guidance that occurs in a precise way. The authors knew that Slit was important but many axons still target correctly in Slit knockouts, suggesting a role for other guidance factors. Netrin1 is also expressed in this region, so they looked at netrin mutants. The authors found axons outside the DREZ in the Ntn1 mutants, and they show by single neuron genetic labeling that many of these come from DRG neurons. Quantified axonal tracing studies in Slit1/2, Ntn1, or triple mutant embryos support the idea that Slit and Ntn1 have distinct functions in guidance and that the effect of their loss is additive. Interestingly, none of these knockouts affect bifurcation itself but rather the guidance of one or both of the bifurcated axon terminals. Knockout of the Slit receptors (Robo1/2) or the Netrin 1 receptor (DCC) in embryos causes similar guidance defects to loss of the ligands, providing an additional confirmation of the requirement for both guidance pathways. This study expands understanding of the role of the axon guidance factors Ntn1/DCC and Slit/Robo in a specific axon guidance decision. The strength of the study is the careful axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

    4. Reviewer #3 (Public Review):

      Summary:

      In this paper, Curran et al. investigate the role of Ntn, Slit1 and Slit2 in axon patterning of DRG neurons. The paper uses mouse genetics to perturb each guidance molecule and its corresponding receptor. Cre-based approaches and immunostaining of DRG neurons are used to assess the phenotypes. Overall, the study uses the strength of mouse genetics and imaging to reveal new genetic modifiers of DRG axons. The conclusions of the experiments match the presented results. The paper is an important contribution to the field, as it provides evidence that dorsal funiculus formation is impacted by Ntn and Slit signaling. The paper clearly demonstrates molecules that impact the patterning of the dorsal funiculus, which can provide a foundation for future studies on the specific steps in that patterning that require the studied molecules.

      Strengths:

      The manuscript uses the advantage of mouse genetics to investigate axon patterning of DRG neurons. The work does a great job of assessing individual phenotypes in single and double mutants. This reveals an intriguing cooperative and independent function of Ntn, Slit1 and Slit2 in DRG axon patterning. The sophisticated triple mutant analysis is lauded and provides important insight.

      Weaknesses:

      Overall, the manuscript is sound in technique and analysis. While not a weakness, the paper provides the foundation for future studies that investigate the specific molecular mechanisms of each step in the patterning of the dorsal funiculus.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      The current manuscript provides an extensive in vivo analysis of two guidance pathways identifying multiple mechanisms that shape the bifurcation of DRG axons when forming the dorsal funiculus in the DREZ. 

      Strengths: 

      Multiple mouse mutant lines were used, together with complementary techniques; the results are very clear and compelling. 

      The findings are very significant and clearly move forward our understanding of the regulation of axonal development at the DREZ. 

      Weaknesses: 

      No major weaknesses were found. As it is I have no recommendations that would increase the clarity or quality of the manuscript. 

      Reviewer #2 (Public Review):

      Summary: 

      In this manuscript, the authors conduct a detailed analysis of the molecular cues that control the guidance of bifurcated dorsal root ganglion axons in a key region of the spinal cord called the dorsal funiculus. This is a specific case of axon guidance that occurs in a precise way. The authors knew that Slit was important but many axons still target correctly in Slit knockouts, suggesting a role for other guidance factors. Netrin1 is also expressed in this region, so they looked at netrin mutants. The authors found axons outside the DREZ in the Ntn1 mutants, and they show by single-neuron genetic labeling that many of these come from DRG neurons. Quantified axonal tracing studies in Slit1/2, Ntn1, or triple mutant embryos support the idea that Slit and Ntn1 have distinct functions in guidance and that the effect of their loss is additive. Interestingly, none of these knockouts affect bifurcation itself but rather the guidance of one or both of the bifurcated axon terminals. Knockout of the Slit receptors (Robo1/2) or the Netrin 1 receptor (DCC) in embryos causes similar guidance defects to loss of the ligands, providing additional confirmation of the requirement for both guidance pathways.

      Strengths: 

      This study expands understanding of the role of the axon guidance factors Ntn1/DCC and Slit/Robo in a specific axon guidance decision. The strength of the study is the careful axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

      Weaknesses: 

      There are some places in the text where the discussion of these data is compared with other studies and models, but additional details would help clarify the arguments. 

      The details were added to the first section of the Discussion in the revision to address this weakness. Also see the response to the recommendations below.

      Reviewer #3 (Public Review):

      Summary: 

      In this paper, Curran et al. investigate the role of Ntn, Slit1, and Slit2 in the axon patterning of DRG neurons. The paper uses mouse genetics to perturb each guidance molecule and its corresponding receptor. Cre-based approaches and immunostaining of DRG neurons are used to assess the phenotypes. Overall, the study uses the strength of mouse genetics and imaging to reveal new genetic modifiers of DRG axons. The conclusions of the experiments match the presented results. The paper is an important contribution to the field, as it provides evidence that dorsal funiculus formation is impacted by Ntn and Slit signaling. However, there are some potential areas of the manuscript that should be edited to better match the results with the conclusions of the work.

      Strengths: 

      The manuscript uses the advantage of mouse genetics to investigate the axon patterning of DRG neurons. The work does a great job of assessing individual phenotypes in single and double mutants. This reveals an intriguing cooperative and independent function of Ntn, Slit1, and Slit2 in DRG axon patterning. The sophisticated triple mutant analysis is lauded and provides important insight. 

      Weaknesses: 

      Overall, the manuscript is sound in technique and analysis. However, the majority of the manuscript is about the dorsal funiculus and not the bifurcation of the axons, as the title would make a reader believe. Further, the manuscript would benefit from a more scholarly discussion of the current knowledge of DRG axon patterning and how the authors' work fits into that knowledge.

      We revised the title as suggested. Additional discussion of DRG axon growth at the DREZ was added to the last section of the Discussion in the revision. Also see the response to the recommendations below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Given the reasons stated above, I have no specific recommendations for the authors. 

      There is a typo in the Abstract (... mice with triple deletion of Ntn1, Slit2, and Slit2....). 

      Corrected in the revision.

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors twice repeated that their data on DRG guidance defects in the Ntn1 mutants differ from studies previously published in references 19 and 26. However it is unclear to me, without having read those other studies, what is actually different between this study and those, and why there would be differences between the results from two groups. If the authors think this is an important point to make they need to more clearly say what the other group saw and offer an explanation of why the data may be different. 

      We added a detailed comparison of the defects from different studies to the first section of the Discussion and suggested multiple roles of Ntn1 in controlling sensory axon growth at the DREZ in the revision.

      (2) In the final section of the discussion it says, "The guidance regulation of DRG axon bifurcation by Slit and Ntn1 may be similar to but overshadowed by their function in midline guidance [43]." The meaning of this sentence was unclear to me. I had been thinking that since there are total knockout embryos (not conditional) there could be patterning effects that happen before the DRG branching that influence the formation of the DREZ. Is this what the authors mean to say here? How can the authors show that the guidance factors they have knocked out are actually functioning in the DRG neurons? 

      We agree with the reviewer that the first sentence is vague, so we edited the paragraph and included the discussion of the regulation of DRG axons at the DREZ, which was the main theme of this last section.  In addition, we agree with the reviewer’s suggestion of the possible indirect role of Ntn1 on DRG axons via the control of interneuron migration.  This possibility was included in the last paragraph of the Discussion.

      (3) In several of the figures (3T, 5I, 5J) there are distance measurements that are presumably averages of multiple axons in 3 or 4 embryos because 3-4 points are shown per graph. However, the figure and methods do not say how many axons were measured per embryo and I could not find if it says these numbers are averages. Clarifying the details of these panels would be useful. 

      The n is the number of animals analyzed and is now added to the figure legends. From each animal, multiple sections (2-4) were analyzed for various parameters in Fig. 3 and 5. This information was added to the Methods section of the revision.

      Reviewer #3 (Recommendations For The Authors):

      Overall the data matches the conclusions in the paper. However, to this reviewer, the title suggests that Ntn and Slit will have defects in bifurcation. This is not the presented phenotype. I recommend the authors change the title to better reflect the findings of the work. 

      We edited the title of the revised manuscript to reflect the control of growth direction in the context of bifurcation.  

      The introduction of the work clearly outlines what is known about DREZ formation in mice but could extend its discussion to other systems like chick and zebrafish (Jaeda Coutinho-Budd et al. 2008, Wang and Scott 2000, Golding et al 1997, Nichols and Smith 2019, Kikel-Coury et al 2021). These studies are particularly important given that pioneer events, including bifurcation, can be visualized. Acknowledging the contribution of other model systems to the understanding of DRG axon patterning is important to improve the scholarly discussion of the paper. 

      We added a more detailed discussion of the current knowledge of DRG axon growth at the DREZ from several relevant studies of the rodent and zebrafish models in the last section of the Discussion.

      In the data presented, the authors see defects in the axon patterning of DRG neurons and conclude it is a defect in the dorsal funiculus formation. Another interpretation is that a subset of axons cannot invade the spinal cord boundary properly. This phenotype was observed in zebrafish with timelapse imaging (Kikel-Coury et al 2021). It may not be necessary to specifically test the axons' ability to enter the spinal cord in this paper, but the possibility that this could drive the presented phenotypes should be more clearly stated in the results. Entry is not thoroughly addressed in this paper and would need to be confirmed by labeling the edge of the spinal cord with a second reporter. No entry would obviously impact axon targeting. However, delayed entry could place the axon in a navigation environment that is atypical, causing it to navigate aberrantly and present as a funiculus phenotype. 

      We thank the reviewer for raising this very interesting point. In our present view, dorsal funiculus formation is related to DRG axon patterning, which involves growth, guidance, and bifurcation of the incoming afferents at the dorsal spinal cord. We believe that these events are highly coordinated by various environmental cues to generate the DREZ and the dorsal funiculus. The defects we observed could result from the disruption of such coordination that leads to misregulation of DRG axon entry at the dorsal spinal cord, as suggested by the reviewer. We propose that further analysis by time-lapse imaging, as done in zebrafish, would provide a better understanding of such coordination. This discussion was included in the last section of the Discussion.

      The authors should clarify that their approach does not knock out molecules in a cell-specific way. This would specifically impact the interpretation of the Dcc phenotypes. It is possible that UNC-40/DCC is guiding cells that are not labeled. The non-autonomous role of UNC-40/DCC should be clearly stated as a possibility. 

      This discussion was added to the last paragraph of the Discussion section.

    1. eLife assessment

      This study presents an important finding on the structural role of glycosylation at position N343 of the SARS-CoV-2 spike protein's receptor-binding domain in maintaining its stability, with implications across different variants of concern. The evidence supporting the claims of the authors is convincing, since appropriate and validated methodology in line with the current state of the art has been applied. The work will be of interest to evolutionary virologists.

    2. Reviewer #2 (Public Review):

      The authors sought to establish the role played by N343 glycosylation on the SARS-CoV-2 S receptor binding domain structure and binding affinity to the human host receptor ACE2 across several variants of concern. The work includes both computational analysis in the form of molecular dynamics simulations and experimental binding assays between the RBD and ganglioside receptors.

      The work extensively samples the conformational space of the RBD beginning with atomic coordinates representing both the bound and unbound states and computes molecular dynamics trajectories until equilibrium is achieved with and without removing N343 glycosylation. Through comparison of these simulated structures, the authors are able to demonstrate that N343 glycosylation stabilizes the RBD. Prior work had demonstrated that glycosylation at this site plays an important role in shielding the RBD core and in this work the authors demonstrate that removal of this glycan can trigger a conformational change to reduce water access to the core without it. This response is variant dependent and variants containing interface substitutions which increase RBD stability, including Delta substitution L452R, do not experience the same conformational change when the glycan is removed. The authors also explore structures corresponding to Alpha and Beta in which no structure-reinforcing substitutions were identified and two Omicron variants in which other substitutions with an analogous effect to L452R are present.

      The authors experimentally assessed these inferred structural changes by measuring the binding affinity of the RBD for the oligosaccharides of the monosialylated gangliosides GM1os and GM2os with and without the glycan at N343. While GM1os and GM2os binding is influenced by additional factors in the Beta and Omicron variants, the comparison between Delta and Wuhan-hu-1 is clear: removal of the glycan abrogated binding for Wuhan-hu-1 and minimally affected Delta as predicted by structural simulations.

      In summary, these findings suggest, in the words of the authors, that SARS-CoV-2 has evolved to render the N-glycosylation site at N343 "structurally dispensable". This study emphasizes how glycosylation impacts both viral immune evasion and structural stability which may in turn impact receptor binding affinity and infectivity. Mutations which stabilize the antigen may relax the structural constraints on glycosylation opening up avenues for subsequent mutations which remove glycans and improve immune evasion. This interplay between immune evasion and receptor stability may support complex epistatic interactions which may in turn substantially expand the predicted mutational repertoire of the virus relative to expectations which do not take into account glycosylation.

    3. Reviewer #3 (Public Review):

      Summary:

      The receptor binding domain of the SARS-CoV-2 spike protein contains two N-glycans that have been conserved across the variants observed in the last 4 years. Through the use of extensive molecular dynamics, the authors demonstrate that even if glycosylation is conserved, the stabilizing role of glycans at N343 differs among the strains. They also investigate the effect of this glycosylation on the binding of the RBD to sialylated gangliosides, also as a function of evolution.

      Strengths:

      The molecular dynamics characterization is well performed and demonstrates differences in the effect of glycosylation as a function of evolution. The binding of different strains to human gangliosides shows variations of strong interest. Analyzing the structure-function relationship of glycans on the SARS-CoV-2 surface as a function of evolution is important for the surveillance of novel variants, since it can influence their virulence.

      Weaknesses:

      The revised article does not have significant weaknesses.

    4. Author response:

      The following is the authors’ response to the original reviews.

      We are thankful to all reviewers and to you for your careful analysis of our work and for the feedback you all provided. The reviews were fundamentally positive with very minor modifications suggested, which we have addressed in this new version as follows.

      (1) We changed Figure 1 to include a high resolution image of the 3D structure of the low affinity complex between the RBD and the GM1 tetrasaccharide (GM1os), see panel d. We predicted this structure through extensive sampling with MD simulations as part of earlier work aimed at guiding the resolution of a crystal structure. Due to insurmountable difficulties in the crystallization of such a complex, the work was only published as an extended abstract (Garozzo, Nicotra, and Sonnino 2022). Following one of the reviewer’s suggestions we added all the details on the computational approach we used as Supplementary Material.

      (2) We added the comment and corresponding references to the Discussion section in relation to earlier work flagged by one of the Reviewers (Rochman et al. 2022): “Further to this, our results show that taking into consideration the effects on _N-_glycosylation on protein structural stability and dynamics in the context of specific protein sequences may be key to understanding epistatic interactions among RBD residues, which would be otherwise very difficult, where not impossible, to decipher.”

      References

      Garozzo, Domenico, Francesco Nicotra, and Sandro Sonnino. 2022. “‘Glycans and Glycosylation in SARS-COV2 Infection’ Session at the XVII Advanced School in Carbohydrate Chemistry, Italian Chemical Society. July 4th -7th 2021, Pontignano (Si), Italy.” Glycoconjugate Journal 39 (3): 327–34.

      Rochman, Nash D., Guilhem Faure, Yuri I. Wolf, Peter L. Freddolino, Feng Zhang, and Eugene V. Koonin. 2022. “Epistasis at the SARS-CoV-2 Receptor-Binding Domain Interface and the Propitiously Boring Implications for Vaccine Escape.” MBio 13 (2): e0013522.

    1. Author response:

      eLife assessment

      This study presents potentially valuable insights into the role of climbing fibers in cerebellar learning. The main claim is that climbing fiber activity is necessary for optokinetic reflex adaptation, but is dispensable for its long-term consolidation. There is evidence to support the first part of this claim, though it requires a clearer demonstration of the penetrance and selectivity of the manipulation. However, support for the latter part of the claim is incomplete owing to methodological concerns, including unclear efficacy of longer-duration climbing fiber activity suppression.

      We sincerely appreciate the thoughtful feedback provided by the reviewer regarding our study on the role of climbing fibers in cerebellar learning. Each point raised has been carefully considered, and we are committed to addressing them comprehensively. We acknowledge the importance of addressing methodological concerns, particularly regarding the efficacy of long-term suppression of CF activity, as well as ensuring clarity regarding penetrance and selectivity of our manipulation. To this end, we have outlined plans for substantial revisions to the manuscript to adequately address these issues.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study by Seo et al highlights knowledge gaps regarding the role of cerebellar complex spike (CS) activity during different phases of learning related to optokinetic reflex (OKR) in mice. The novelty of the approach is twofold: first, specifically perturbing the activity of climbing fibers (CFs) in the flocculus (as opposed to disrupting communication between the inferior olive (IO) and its cerebellar targets globally); and second, examining whether disruption of the CS activity during the putative "consolidation phase" following training affects OKR performance.

      The first part of the results provides adequate evidence supporting the notion that optogenetic disruption of normal CF-Purkinje neuron (PN) signaling results in the degradation of OKR performance. As no effects are seen in OKR performance in animals subjected to optogenetic irradiation during the memory consolidation or retrieval phases, the authors conclude that CF function is not essential beyond memory acquisition. However, the manuscript does not provide a sufficiently solid demonstration that their long-term activity manipulation of CF activity is effective, thus undermining the confidence of the conclusions.

      Strengths:

      The main strength of the work is the aim to examine the specific involvement of the CF activity in the flocculus during distinct phases of learning. This is a challenging goal, due to the technical challenges related to the anatomical location of the flocculus as well as the IO. These obstacles are counterbalanced by the use of a well-established and easy-to-analyse behavioral model (OKR), that can lead to fundamental insights regarding the long-term cerebellar learning process.

      Weaknesses:

      The impact of the work is diminished by several methodological shortcomings.

      Most importantly, the key finding concerning prolonged optogenetic inhibition of CFs (for 30 min to 6 hours after the training period) must be complemented by a demonstration that the manipulation maintains its efficacy. In its current form, the authors only show inhibition by short-term optogenetic irradiation in the context of electrical-stimulation-evoked CSs in an ex vivo preparation. As the inhibitory effect of even eNpHR3.0 is greatly diminished during seconds-long stimulations (especially when using the yellow laser, as is done in this work; see Zhang, Chuanqiang, et al. "Optimized photo-stimulation of halorhodopsin for long-term neuronal inhibition." BMC biology 17.1 (2019): 1-17), we remain skeptical of the extent of inhibition during the long manipulations. In short, without a demonstration of effective inhibition throughout the putative consolidation phase (for example by showing a significant decrease in CS frequency throughout the irradiation period), the main claim of the manuscript, that CF activity is involved in OKR learning in a phase-specific manner, cannot be considered to be based on evidence.

      Second, the choice of viral targeting strategy leaves gaps in the argument for CF-specific mechanisms. CaMKII promoters are not selective for the IO neurons, and even the most precise viral injections always lead to the transfection of neurons in the surrounding brainstem, many of which project to the cerebellar cortex in the form of mossy fibers (MF). Figure 1Bii shows sparsely-labelled CFs in the flocculus, but possibly also MFs. While obtaining homogenous and strong labeling in all floccular CFs might be impossible, at the very least the authors should demonstrate that their optogenetic manipulation does not affect simple spiking in PNs.

      Finally, while the paper explicitly focuses on the effects of CF-evoked complex spikes in the PNs and not, for example, on those mediated by molecular layer interneurons or via direct interaction of the CF with vestibular nuclear neurons, it would be best if these other dimensions of CF involvement in cerebellar learning were candidly discussed.

      We appreciate the thorough review and recognize both the strengths and weaknesses highlighted.

      We concur with the reviewer’s assessment of the novelty of our approach, particularly in specifically perturbing the activity of CFs in the flocculus and examining the effects during different phases of learning. The use of the OKR behavioral paradigm also adds strength to our study by providing a well-established model for investigating cerebellar learning processes.

      Regarding concerns about the efficacy of long-term optogenetic inhibition and the specificity of viral targeting, we are committed to addressing these issues through additional experiments. Specifically, we aim to demonstrate sustained inhibition of CF transmission by verifying the maintenance of inhibition throughout the putative consolidation phase. This may involve monitoring CF activity during the irradiation period in vivo. Furthermore, we plan to provide further characterization of viral targeting to ensure specificity of our approach.  

      Additionally, we recognize the importance of discussing alternative mechanisms of CF involvement in cerebellar learning. Hence, we will expand the manuscript with a more comprehensive discussion of these dimensions of CF function to give a clearer understanding of the broader implications of our findings.

      Reviewer #2 (Public Review):

      Summary:

      The authors aimed to explore the role of climbing fibers (CFs) in cerebellar learning, with a focus on optokinetic reflex (OKR) adaptation. Their goal was to understand how CF activity influences memory acquisition, memory consolidation, and memory retrieval by optogenetically suppressing CF inputs at various stages of the learning process.

      Strengths:

      The study addresses a significant question in the cerebellar field by focusing on the specific role of CFs in adaptive learning. The authors use optogenetic tools to manipulate CF activity. This provides a direct method to test the causal relationship between CF activity and learning outcomes.

      Weaknesses:

      Despite shedding light on the potential role of CFs in cerebellar learning, the study is hampered by significant methodological issues that question the validity of its conclusions. The absence of detailed evidence on the effectiveness of CF suppression and concerns over tissue damage from optogenetic stimulation weakens the argument that CFs are not essential for memory consolidation. These challenges make it difficult to confirm whether the study's objectives were fully met or if the findings conclusively support the authors' claims. The research commendably attempts to unravel the temporal involvement of CFs in learning but also underscores the difficulties in pinpointing specific neural mechanisms that underlie the phases of learning. Addressing these methodological issues, investigating other signals that might instruct consolidation, and understanding CFs' broader impact on various learning behaviors are crucial steps for future studies.

      We appreciate the reviewer’s recognition of the significance of our study in addressing the fundamental question of the role of CF in adaptive learning within the cerebellar field. The use of optogenetic tools indeed provides a direct means to investigate the causal relationship between CF activity and learning outcomes.

      To address concerns regarding the effectiveness of CF suppression during consolidation, we plan to conduct further in-vivo recordings. These will demonstrate how reliably CF transmission can be suppressed through optogenetic manipulation over an extended period.

      In response to the concern about potential tissue damage from laser stimulation, we believe that our optogenetic manipulation was not strong enough to induce significant heat-induced tissue damage in the flocculus. According to Cardin et al. (2010), light applied through an optic fiber may cause critical damage if the intensity exceeds 100 mW, which is eight times stronger than the intensity we used in our OKR experiment. Furthermore, if there had been tissue damage from chronic laser stimulation, we would expect to see impaired long-term memory reflected in abnormal gain retrieval results tested the following day. However, as shown in Figures 2 and 3, there were no significant abnormalities in consolidation percentages even after the optogenetic manipulation.

      Finally, we appreciate the reviewer’s recognition of the challenges involved in pinpointing specific neural mechanisms. We plan to expand the discussion to address these complexities and outline future research directions.

    2. eLife assessment

      This study presents potentially valuable insights into the role of climbing fibers in cerebellar learning. The main claim is that climbing fiber activity is necessary for optokinetic reflex adaptation, but is dispensable for its long-term consolidation. There is evidence to support the first part of this claim, though it requires a clearer demonstration of the penetrance and selectivity of the manipulation. However, support for the latter part of the claim is incomplete owing to methodological concerns, including unclear efficacy of longer-duration climbing fiber activity suppression.

    3. Reviewer #1 (Public Review):

      Summary:

      The study by Seo et al highlights knowledge gaps regarding the role of cerebellar complex spike (CS) activity during different phases of learning related to optokinetic reflex (OKR) in mice. The novelty of the approach is twofold: first, specifically perturbing the activity of climbing fibers (CFs) in the flocculus (as opposed to disrupting communication between the inferior olive (IO) and its cerebellar targets globally); and second, examining whether disruption of the CS activity during the putative "consolidation phase" following training affects OKR performance.

      The first part of the results provides adequate evidence supporting the notion that optogenetic disruption of normal CF-Purkinje neuron (PN) signaling results in the degradation of OKR performance. As no effects are seen in OKR performance in animals subjected to optogenetic irradiation during the memory consolidation or retrieval phases, the authors conclude that CF function is not essential beyond memory acquisition. However, the manuscript does not provide a sufficiently solid demonstration that their long-term activity manipulation of CF activity is effective, thus undermining the confidence of the conclusions.

      Strengths:

      The main strength of the work is the aim to examine the specific involvement of the CF activity in the flocculus during distinct phases of learning. This is a challenging goal, due to the technical challenges related to the anatomical location of the flocculus as well as the IO. These obstacles are counterbalanced by the use of a well-established and easy-to-analyse behavioral model (OKR), that can lead to fundamental insights regarding the long-term cerebellar learning process.

      Weaknesses:

      The impact of the work is diminished by several methodological shortcomings.

      Most importantly, the key finding concerning prolonged optogenetic inhibition of CFs (for 30 min to 6 hours after the training period) must be complemented by a demonstration that the manipulation maintains its efficacy. In its current form, the authors only show inhibition by short-term optogenetic irradiation in the context of electrical-stimulation-evoked CSs in an ex vivo preparation. As the inhibitory effect of even eNpHR3.0 is greatly diminished during seconds-long stimulations (especially when using the yellow laser, as is done in this work; see Zhang, Chuanqiang, et al. "Optimized photo-stimulation of halorhodopsin for long-term neuronal inhibition." BMC biology 17.1 (2019): 1-17), we remain skeptical of the extent of inhibition during the long manipulations. In short, without a demonstration of effective inhibition throughout the putative consolidation phase (for example by showing a significant decrease in CS frequency throughout the irradiation period), the main claim of the manuscript, that CF activity is involved in OKR learning in a phase-specific manner, cannot be considered to be based on evidence.

      Second, the choice of viral targeting strategy leaves gaps in the argument for CF-specific mechanisms. CaMKII promoters are not selective for the IO neurons, and even the most precise viral injections always lead to the transfection of neurons in the surrounding brainstem, many of which project to the cerebellar cortex in the form of mossy fibers (MF). Figure 1Bii shows sparsely-labelled CFs in the flocculus, but possibly also MFs. While obtaining homogenous and strong labeling in all floccular CFs might be impossible, at the very least the authors should demonstrate that their optogenetic manipulation does not affect simple spiking in PNs.

      Finally, while the paper explicitly focuses on the effects of CF-evoked complex spikes in the PNs and not, for example, on those mediated by molecular layer interneurons or via direct interaction of the CF with vestibular nuclear neurons, it would be best if these other dimensions of CF involvement in cerebellar learning were candidly discussed.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to explore the role of climbing fibers (CFs) in cerebellar learning, with a focus on optokinetic reflex (OKR) adaptation. Their goal was to understand how CF activity influences memory acquisition, memory consolidation, and memory retrieval by optogenetically suppressing CF inputs at various stages of the learning process.

      Strengths:

      The study addresses a significant question in the cerebellar field by focusing on the specific role of CFs in adaptive learning. The authors use optogenetic tools to manipulate CF activity. This provides a direct method to test the causal relationship between CF activity and learning outcomes.

      Weaknesses:

      Despite shedding light on the potential role of CFs in cerebellar learning, the study is hampered by significant methodological issues that question the validity of its conclusions. The absence of detailed evidence on the effectiveness of CF suppression and concerns over tissue damage from optogenetic stimulation weakens the argument that CFs are not essential for memory consolidation. These challenges make it difficult to confirm whether the study's objectives were fully met or if the findings conclusively support the authors' claims. The research commendably attempts to unravel the temporal involvement of CFs in learning but also underscores the difficulties in pinpointing specific neural mechanisms that underlie the phases of learning. Addressing these methodological issues, investigating other signals that might instruct consolidation, and understanding CFs' broader impact on various learning behaviors are crucial steps for future studies.

    1. eLife assessment:

      This important study combines experiments that rely on the use of target-agnostic memory B cell sorting and screening approaches and thorough characterization of antibodies with specificities to the sexual stages of Plasmodium falciparum. The authors present solid findings that one antibody, B1E11K, is cross-reactive with multiple proteins containing glutamate-rich repeats. B1E11K binds to the repeats through homotypic interactions, similar to what has been observed for Plasmodium circumsporozoite protein repeat-directed antibodies. Despite the importance of the findings beyond the field of malaria, the writing, in several places, lacks clarity.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors used target-agnostic MBC sorting and activation methods to identify B cells and antibodies against sexual stages of Plasmodium falciparum. While they isolated some mAbs against Pfs48/45 and Pfs230, two well-known candidates for "transmission blocking" vaccines, these antibodies did not perform as well, as measured by TRA, as other known antibodies. They also isolated one mAb that is cross-reactive to proteins containing glutamic acid-rich repetitive elements, which are expressed at different stages of the parasite life cycle. They then determined the structure of the Fab bound to the strongest binder they could identify through protein microarray, RESA, and observed homotypic interactions.

      Strengths:

      - Target agnostic B cell isolation (although not a novel methodology).
      - New cross-reactive antibody and mechanism (homotypic interactions) as demonstrated by structural data and other biophysical data.

      Weaknesses:

      The paper lacks clarity at times and could benefit from more transparency (showing all the data) and explanations.
      In particular:
      - define SIFA
      - define TRAbs
      - it is not possible to read the Supplementary Figure 6B and C panels.

    3. Reviewer #2 (Public Review):

      This manuscript by Amen, Yoo, Fabra-Garcia et al describes a human monoclonal antibody, B1E11K, targeting EENV repeats which are present in parasite antigens such as Pfs230, RESAs, and 11.1. The authors isolated B1E11K using an initial target-agnostic approach for antibodies that would bind gamete/gametocyte lysate, from which they made 14 mAbs. Following a suite of highly appropriate characterization methods, from Western blotting of recombinant proteins to native parasite material, use of knockout lines to validate specificity, ITC, peptide mapping, SEC-MALS, negative stain EM, and crystallography, the authors have built a compelling case that B1E11K does indeed bind EENV repeats. In addition, using X-ray crystallography they show that two B1E11K Fabs bind to a 16 aa RESA repeat in a head-to-head conformation using homotypic interactions, and provide an example, separate from CSP, of affinity-matured homotypic interactions.

      There are some minor comments and considerations identified by this reviewer. These include that one of the main conclusions in the paper is the binding of B1E11K to RESAs, which are blood stage antigens that are exported to the infected parasite surface. It would have been interesting if immunofluorescence assays with the B1E11K mAb had been performed with blood-stage parasites to understand its cellular localization in those stages.

    4. Reviewer #3 (Public Review):

      The manuscript from Amen et al reports the isolation and characterization of human antibodies that recognize proteins expressed at different sexual stages of Plasmodium falciparum. The isolation approach was antigen agnostic and based on the sorting, activation, and screening of memory B cells from a donor whose serum displays high transmission-reducing activity. From this effort, 14 antibodies were produced and further characterized. The antibodies displayed a range of transmission-reducing activities and recognized different Pf sexual stage proteins. However, none of these antibodies had higher TRA than previously described antibodies.

      The authors then performed further characterization of antibody B1E11K, which was unique in that it recognized multiple proteins expressed during sexual and asexual stages. Using protein microarrays, B1E11K was shown to recognize glutamate-rich repeats, following an EE-XX-EE pattern. An impressive set of biophysical experiments was performed to extensively characterize the interactions of B1E11K with various repeat motifs and lengths. Ultimately, the authors succeeded in determining a 2.6 Å resolution crystal structure of B1E11K bound to a 16AA repeat-containing peptide. Excitingly, the structure revealed that two Fabs bound simultaneously to the peptide and made homotypic antibody-antibody contacts. This had only previously been observed with antibodies directed against CSP repeats.

      Overall I found the manuscript to be very well written, although there are some sections that are heavy on field-specific jargon and abbreviations that make reading unnecessarily difficult. For instance, 'SIFA' is never defined. Strengths of the manuscript include the target-agnostic screening approach and the thorough characterization of antibodies. The demonstration that B1E11K is cross-reactive to multiple proteins containing glutamate-rich repeats, and that the antibody recognizes the repeats via homotypic interactions, similar to what has been observed for CSP repeat-directed antibodies, should be of interest to many in the field.

    1. eLife assessment

      This work presents an important study of the relationship between morphogen signaling and cell fate choices in the forming zebrafish neural tube, addressing a topical question in developmental biology. The authors provide a solid characterization of the precision limit for gene regulatory networks interpreting Shh, with single-cell resolution and state-of-the-art in vivo approaches. However, the analyses are at times incomplete and would benefit from a higher number of cell traces. With the analyses strengthened, this work will be of interest to developmental biologists interested in cellular decision-making.

    2. Reviewer #1 (Public Review):

      Throughout the paper, the authors do a fantastic job of highlighting caveats in their approach, from image acquisition to analysis. Despite this, some conclusions and viewpoints portrayed in this study do not appear well-supported by the provided data. Furthermore, there are a few technical points regarding the analysis that should be addressed.

      (1) Analysis of signaling traces

      - Relevance of "modeled signaling level": It is not clear whether this added complexity and potential for error (below) provides benefits over a more simple analysis such as taking the derivative (shown in Figure 3C). Could the authors provide evidence for the benefits? For example, does the "maximal response" given a simpler metric correlate less well with cell fate than that calculated from the fitted response?

      - Assumptions for "modeled signaling level": According to equation (1) Kaede levels are monotonically increasing. This is assumed given the stability of the fluorescent protein. However, this only holds for the "totally produced Kaede/fluorescence". Other metrics such as mean fluorescence can very well decrease over time due to growth and division. Does "intensity" mean total fluorescence? Visual inspection of the traces shown in Figure 2 suggests that "fluorescence intensity" can decrease. What does this mean for the inferred traces?

      - Estimation of Kaede reporter half-life: It is not clear how the mRNA stability of Kaede is estimated. It sounds like it was just assessed visually, which seems not entirely appropriate given the quantitative aspects of the rest of the study. Also, given that Shh signaling was inhibited at the level of Smoothened, it is not obvious how the dynamics of signaling shutdown affect the estimate. Most results in Figure 7 seem to be quite robust to the estimate of the half-life, which might suggest that the whole analysis is unnecessary in the first place. However, not all are. Thus, it would be important to make this estimate more quantitative.
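
      The comparison raised in the points above, between a plain derivative of the reporter trace and a decay-corrected "signaling level", can be illustrated with a minimal sketch. It assumes a generic two-stage reporter model that is not taken from the manuscript and may differ from the authors' equation (1): reporter mRNA m is produced at a rate s(t) proportional to signaling and decays with a given half-life, while the stable protein F accumulates at a rate proportional to m. Under those assumptions, s(t) is proportional to F'' + (ln 2 / half-life) F'. The function names, the sampling interval, and the half-life argument are hypothetical.

```python
import math


def derivative(values, dt):
    """Finite-difference derivative: central inside, one-sided at both ends."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    out = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        out.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return out


def inferred_signaling(fluorescence, dt, mrna_half_life):
    """Estimate s(t) up to a constant factor.

    Assumed (hypothetical) model, not the manuscript's equation (1):
        dm/dt = s(t) - lam * m      (mRNA with first-order decay)
        dF/dt = m                   (stable protein, no degradation)
    which gives s(t) = F'' + lam * F'.
    """
    lam = math.log(2) / mrna_half_life
    first = derivative(fluorescence, dt)
    second = derivative(first, dt)
    return [d2 + lam * d1 for d1, d2 in zip(first, second)]
```

      Running `inferred_signaling` on the same trace with several plausible half-lives, and comparing the resulting maximal responses with those obtained from `derivative` alone, is one way to check how much a conclusion depends on the half-life estimate. Finite differences amplify noise, so real traces would likely need smoothing first.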

      (2) Assignment of fates and correlations

      - Error estimate for cell-type assignment: Trying to correlate signaling traces to cell fate decisions requires accurate cell fate assignment post-tracking. The provided protocol suggests a rather manual, expert-directed process of making those decisions. Can the authors provide any error-bound on those decisions, for example comparing the results obtained by two experts or something comparable? I am particularly concerned about the results regarding the higher degree of variability in the correlation between signaling dynamics and cell fate in the posterior neural tube. Here, the expression of Olig2 does not seem to segregate between different assigned fates, while it does so nicely in the anterior neural tube. This would suggest to me that cells in the posterior neural tube might not yet be fully committed to a fate or that there could be a relatively high error rate in assigning fates. Thus, the results could emerge from technical errors or differences in pure timing. Could the authors please comment on these possibilities?

      - Clustering and fates: One approach the authors use to analyze the correlation between signaling and fate is clustering of cell traces and comparison of the fate distributions in those clusters. There is a large number of clusters with only single traces, suggesting that the data (number of traces) might not be sufficient for this analysis. Furthermore, I am skeptical about clustering cells of different anterior-posterior identities together, given potential differences in the timing of signal reception and signaling. I am not convinced that this analysis reveals enough about how signaling maps to fate given the heterogeneity in traces in large clusters and the prevalence of extremely small clusters.

      - Signaling vector and hand-picked metrics: As an alternative approach that might be better suited to their data, the authors then pick three metrics (based on their model-predicted signaling dynamics) and show that the maximal response is a very good predictor of fate for different anterior-posterior identities. Previous information-theoretic analysis of signaling dynamics has found that a whole time-vector of signaling can carry much more information than individual metrics (Selimkhanov et al, 2014, PMID: 25504722). Have the authors tried approaches that make use of the whole trace (such as simple classifiers; Granados et al, 2018, PMID: 29784812), or can they comment on why this is not feasible for their data? The authors should at least make clear that their results present a lower bound on how accurately cells can make cell-fate decisions based on signaling dynamics.
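
      The error bound on fate assignment requested in the first point above (comparing the calls of two experts) is commonly summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is illustrative only: the function name is hypothetical, the fate labels in the usage note are placeholders, and nothing here reflects how the authors actually assigned fates.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators' labels for the same cells."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of cells on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labeled independently
    # with their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # formally undefined (0/0), and 1.0 is returned here by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

      For example, `cohens_kappa(["pMN", "LFP", "pMN"], ["pMN", "LFP", "LFP"])` compares two annotators' calls for three cells. Values near 1 indicate that fate assignment is reproducible, whereas values near 0 would suggest that differences between anterior and posterior correlations could partly reflect annotation error.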

      (3) Consequences of signaling heterogeneity

      The authors focus heavily on portraying that signaling dynamics are highly variable, which seems visually true at first glance. However, there is no metric used or a description given of what this actually means. Mainly, the variability seems to relate to the correlation between signaling and fate. However, given the data and analysis, I would argue that the decoding of signaling dynamics into fate is surprisingly accurate. So signaling dynamics that seem quite noisy and variable by visual inspection can actually be very well discriminated by cells, which to me appears very exciting.

      Indeed, simple features of signaling traces can predict cell fate as well as position (for anterior progenitors). Given that signaling should be a function of position, it naively seems as if signaling read-out could be almost perfect. It might be interesting to plot dorsal-ventral position vs the signaling metrics, to also investigate how Shh concentration/position maps to signaling dynamics; this would give an even more comprehensive view of signal transmission.

      There remains the discrepancy between signaling traces and fate in the posterior neural tube. The authors point towards differences in tissue architecture and difficulties in interpreting a "small" Shh gradient. However, the data seem consistent with differences in the timing of cell-fate decisions between anterior and posterior cells. The authors show that fate initially does not correlate well with position in the posterior neural tube. Signaling dynamics likely should not either, as they should be a function of position, given that they are downstream of the Shh gradient. As mentioned above, not even Olig2 expression segregates the assigned fates well. All this points towards a difference in the time of fate assignment between the anterior and posterior. Given likely delays in reporter protein production and maturation, it can thus not be expected that signaling dynamics correlate better with cell fate than the reporter "83%". Can the authors please discuss this possibility in the paper?

      Thus, while this paper represents an example of what the community needs to do to gain a better understanding of robust patterning under variability, the provided data is not always sufficient to make clear conclusions regarding the functional consequences of signaling dynamics.

    3. Reviewer #2 (Public Review):

      Summary:

      In this work, Xiong and colleagues examine the relationship between the profile of the morphogen Shh and the resulting cell fate decisions in the zebrafish neural tube. For this, the authors combine high-resolution live imaging of an established Shh reporter with reporter lines for the different progenitor types arising in the forming neural tube. One of the key observations in this manuscript is that, while, on average, cells respond to differences in Shh activity to adopt distinct progenitor fates, at the single cell level there is strong heterogeneity between Shh response and fate choices. Further, the authors showed that this heterogeneity was particularly prominent for the pMN fate, with similar Shh response dynamics to those observed in neighboring LFP progenitors.

      Strengths:

      It is important to directly correlate Shh activity with the downstream TFs marking distinct progenitor types in vivo and with single cell resolution. This additional analysis is in line with previous observations from these authors, namely in Xiong, 2013. Further, the authors show that cells in different anterior-posterior positions within the neural tube show distinct levels of heterogeneity in their response to Shh, which is a very interesting observation and merits further investigation.

      Weaknesses:

      This is convincing work; however, adding a few more analyses and clarifications would, in my view, strengthen the key finding of heterogeneity between Shh response and the resulting cell fate choices.

    1. eLife assessment

      The authors address key assumptions underlying current models of the formation of value-based decisions. They provide solid evidence that the subjective values human participants assign to choice options change across sequences of multiple decisions and establish valuable methods to detect these changes in frequently used behavioral task designs. That said, the description of the fMRI results requires further elaboration in order to support the claim that the authors' algorithm reveals neural valuation processes better than the current standard approach.

    2. Reviewer #1 (Public Review):

      Summary:

      There is a long-standing idea that choices influence evaluation: options we choose are re-evaluated to be better than they were before the choice. There has been some debate about this finding, and the authors developed several novel methods for detecting these re-evaluations in task designs where options are repeatedly presented against several alternatives. Using these novel methods the authors clearly demonstrate this re-evaluation phenomenon in several existing datasets.

      Strengths:

      The paper is well-written and the figures are clear. The authors provided evidence for the behaviour effect using several techniques and generated surrogate data (where the ground truth is known) to demonstrate the robustness of their methods.

      Weaknesses:

      The description of the results of the fMRI analysis in the text is not complete, weakening the claim that their re-evaluation algorithm better reveals neural valuation processes.

    3. Reviewer #2 (Public Review):

      Summary:

      Zylberberg and colleagues show that food choice outcomes and BOLD signal in the vmPFC are better explained by algorithms that update subjective values during the sequence of choices compared to algorithms based on static values acquired before the decision phase. This study presents a valuable means of reducing the apparent stochasticity of choices in common laboratory experiment designs. The evidence supporting the claims of the authors is solid, although currently limited to choices between food items because no other goods were examined. The work will be of interest to researchers examining decision-making across various social and biological sciences.

      Strengths:

      The paper analyses multiple food choice datasets to check the robustness of its findings in that domain.

      The paper presents simulations and robustness checks to back up its core claims.

      Weaknesses:

      To avoid potential misunderstandings of their work, I think it would be useful for the authors to clarify their statements and implications regarding the utility of item ratings/bids (e-values) in explaining choice behavior. Currently, the paper emphasizes that e-values have limited power to predict choices without explicitly stating the likely reason for this limitation given its own results or pointing out that this limitation is not unique to e-values and would apply to choice outcomes or any other preference elicitation measure too. The core of the paper rests on the argument that the subjective values of the food items are not stored as a relatively constant value, but instead are constructed at the time of choice based on the individual's current state. That is, a food's subjective value is a dynamic creation, and any measure of subjective value will become less accurate with time or new inputs (see Figure 3 regarding choice outcomes, for example). The e-values will change with time, choice deliberation, or other experiences to reflect the change in subjective value. Indeed, most previous studies of choice-induced preference change, including those cited in this manuscript, use multiple elicitations of e-values to detect these changes. It is important to clearly state that this paper provides no data on whether e-values are more or less limited than any other measure of eliciting subjective value. Rather, the paper shows that a static estimate of a food's subjective value at a single point in time has limited power to predict future choices. Thus, a more accurate label for the e-values would be static values because stationarity is the key assumption rather than the means by which the values are elicited or inferred.

      There is a puzzling discrepancy between the fits of a DDM using e-values in Figure 1 versus Figure 5. In Figure 1, the DDM using e-values provides a rather good fit to the empirical data, while in Figure 5 its match to the same empirical data appears to be substantially worse. I suspect that this is because the value difference on the x-axis in Figure 1 is based on the e-values, while in Figure 5 it is based on the r-values from the Reval algorithm. However, the computation of the value difference measure on the two x-axes is not explicitly described in the figures or methods section and these details should be added to the manuscript. If my guess is correct, then I think it is misleading to plot the DDM fit to e-values against choice and RT curves derived from r-values. Comparing Figures 1 and 5, it seems that changing the axes creates an artificial impression that the DDM using e-values is much worse than the one fit using r-values.

      Relatedly, do model comparison metrics favor a DDM using r-values over one using e-values in any of the datasets tested? Such tests, which use the full distribution of response times without dividing the continuum of decision difficulty into arbitrary hard and easy bins, would be more convincing than the tests of RT differences between the categorical divisions of hard versus easy.

      Revaluation and reduction in the imprecision of subjective value representations during (or after) a choice are not mutually exclusive. The fact that applying Reval in the forward trial order leads to lower deviance than applying it in the backwards order (Figure 7) suggests that revaluation does occur. It doesn't tell us if there is also a reduction in imprecision. A comparison of backwards Reval versus no Reval would indicate whether there is a reduction in imprecision in addition to revaluation. Model comparison metrics and plots of the deviance from the logistic regression fit using e-values against backward and forward Reval models would be useful to show the relative improvement for both forms of Reval.

      Did the analyses of BOLD activity shown in Figure 9 orthogonalize between the various e-value- and r-value-based regressors? I assume they were not because the idea was to let the two types of regressors compete for variance, but orthogonalization is common in fMRI analyses so it would be good to clarify that this was not used in this case. Assuming no orthogonalization, the unique variance for the r-value of the chosen option in a model that also includes the e-value of the chosen option is the delta term that distinguishes the r and e-values. The delta term is a scaled count of how often the food item was chosen and rejected in previous trials. It would be useful to know if the vmPFC BOLD activity correlates directly with this count or the entire r-value (e-value + delta). That is easily tested using two additional models that include only the r-value or only the delta term for each trial.
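
      The decomposition described above (r-value = e-value + delta term) can be illustrated with a minimal sketch. It assumes, following the description in this review rather than the authors' code, that each choice adds a fixed amount `delta` to the chosen item and subtracts it from the rejected item. The function names, item labels, and numbers are hypothetical and chosen only for illustration.

```python
def reval_values(e_values, trials, delta):
    """Sketch of a Reval-style update (an assumption, not the authors' code).

    e_values: dict mapping item -> static value elicited before the choices.
    trials:   list of (chosen, rejected) item pairs in presentation order.
    delta:    fixed amount added to the chosen and removed from the rejected item.

    Returns the values in force at the time of each trial and the final values.
    """
    values = dict(e_values)          # start every r-value at its e-value
    per_trial = []
    for chosen, rejected in trials:
        # Values used on this trial reflect only the choices made before it.
        per_trial.append((values[chosen], values[rejected]))
        values[chosen] += delta
        values[rejected] -= delta
    return per_trial, values


def delta_terms(e_values, r_values):
    """Delta term per item: delta * (times chosen - times rejected)."""
    return {item: r_values[item] - e_values[item] for item in e_values}


# Hypothetical items and values, for illustration only.
e_values = {"apple": 1.0, "chips": 0.5, "candy": 0.0}
trials = [("apple", "chips"), ("apple", "candy"), ("candy", "chips")]
per_trial, final_values = reval_values(e_values, trials, delta=0.25)
print(per_trial)
print(delta_terms(e_values, final_values))
```

      Under these assumptions, the two additional models suggested above would use either the per-trial value of the chosen item or only its delta term as the parametric regressor.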

      Please confirm that the correlation coefficients shown in Figure 11 B are autocorrelations in the MCMC chains at various lags. If this interpretation is incorrect, please give more detail on how these coefficients were computed and what they represent.
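
      For reference, the quantity being asked about can be sketched as follows. This is a generic sample autocorrelation of a single chain at a given lag, not necessarily the computation behind Figure 11 B; the function name and the toy chain are assumptions made for illustration.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of a 1-D chain at a non-negative integer lag.

    Uses the common estimator that normalises the lagged covariance by the
    lag-0 sum of squares, so the value at lag 0 is exactly 1.
    """
    n = len(chain)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(chain)")
    mean = sum(chain) / n
    total = sum((x - mean) ** 2 for x in chain)
    if total == 0:
        raise ValueError("chain is constant; autocorrelation is undefined")
    lagged = sum((chain[t] - mean) * (chain[t + lag] - mean)
                 for t in range(n - lag))
    return lagged / total


# Toy sequence standing in for one parameter's samples (not real MCMC output).
toy_chain = [1.0, 2.0, 3.0, 4.0, 5.0]
for k in range(3):
    print(k, autocorrelation(toy_chain, k))
```

      Coefficients that decay quickly towards zero with increasing lag would indicate that successive samples are close to independent.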

      The paper presents the ceDDM as a proof-of-principle type model that can reproduce certain features of the empirical data. There are other plausible modifications to bounded evidence accumulation (BEA) models that may reproduce these features as well as or better than the ceDDM. For example, a DDM in which the starting point bias is a function of how often the two items were chosen or rejected in previous trials. My point is not that I think other BEA models would be better than the ceDDM, but rather that we don't know because the tests have not been run. Naturally, no paper can test all potential models and I am not suggesting that this paper should compare the ceDDM to other BEA processes. However, it should clearly state what we can and cannot conclude from the results it presents.

      This work has important practical implications for many studies in the decision sciences that seek to understand how various factors influence choice outcomes. By better accounting for the context-specific nature of value construction, studies can gain more precise estimates of the effects of treatments of interest on decision processes. That said, there are limitations to the generalizability of these findings that should be noted.

      These limitations stem from the fact that the paper only analyzes choices between food items and the outcomes of the choices are not realized until the end of the study (i.e., participants do not eat the chosen item before making the next choice). This creates at least two important limitations. First, preferences over food items may be particularly sensitive to mindsets/bodily states. We don't yet know how large the choice deltas may be for other types of goods whose value is less sensitive to satiety and other dynamic bodily states. Second, the somewhat artificial situation of making numerous choices between different pairs of items without receiving or consuming anything may eliminate potential decreases in the preference for the chosen item that would occur in the wild outside the lab setting. It seems quite probable that in many real-world decisions, the value of a chosen good is reduced in future choices because the individual does not need or want multiples of that item. Naturally, this depends on the durability of the good and the time between choices. A decrease in the value of chosen goods is still an example of dynamic value construction, but I don't see how such a decrease could be produced by the ceDDM.

    1. eLife assessment

      This landmark paper introduces the generation and analysis of a connectome resource of the entire ventral nerve cord of a fruit fly which is one of the top model organisms to investigate how a nervous system forms and functions. The work introduces new and improved approaches - from tissue preparation to automated reconstruction - to generate a detailed connectome from a complex adult ventral nerve cord. This extensive new dataset provides cell type and lineage annotations, putative neurotransmitter expression information, and the potential to link to genetic driver lines, with compelling evidence to support the claims made.

    2. Reviewer #1 (Public Review):

      Summary:

      Drosophila is one of the most studied model organisms to understand how neural circuits form and function to control intricate animal behaviors. The ventral nerve cord (VNC) part of the fly's CNS serves as a sensory processing and motor output center just like our spinal cord. Over the last decade, the VNC has become a fruitful platform to understand neural circuits responsible for motor behavior such as walking and flying. The missing resource was the complete connectome of the VNC neurons. This study provides this needed resource. The authors documented their approaches on how to generate the data from tissue preparation to computer-assisted reconstruction in a simple manner and left the in-depth analysis of the network features of the connecting neurons to two other well-written companion articles.

      Strengths:

      Unlike many other previously published EM datasets, the authors presented a ready-to-view connectome dataset of the adult fly VNC. Readers, without needing permission, can access the dataset to find their neurons of interest and determine their synaptic partners with a few clicks. The authors also share their novel approaches in a detailed manner for others to reproduce similar EM volumes for other tissues.

      Weaknesses:

      The reconstruction completion, around 50%, might be considered a weakness. However, the data appear to have ~50% completion across all different neuropils, suggesting that sampling is homogeneous and does not induce bias. Nevertheless, a higher percentage will give a more complete picture.

    3. Reviewer #2 (Public Review):

      Summary:

      Takemura et al. achieved a milestone in connectomics with their dense reconstruction of the Male Adult Nerve Cord (MANC) in Drosophila, revealing the neural circuitry of the primary premotor and motor domains in the CNS of the fruit fly. The team meticulously reconstructed neuron morphologies and synaptic connections and registered these data with light microscopy datasets (of driver lines for example), made neuronal lineage annotations and neurotransmitter predictions, providing the basis for new hypotheses about motor control. A description of the dataset and methods is presented here, while cell type annotations and characterisation of connectivity between brain descending neurons and motor neurons are provided in two companion papers, Marin et al. and Cheong, Eichler, Stürner et al., respectively. This dataset and analysis will provide a rich resource for future neuroscientific exploration.

      Strengths:

      The authors fully utilise a wealth of tools and techniques developed over the course of over a decade to produce a new publicly available dataset with an impressive number of reconstructed neurons and synapses. The precision and recall of connections are as high or higher than past datasets (e.g. the Hemibrain), pointing to the reliability of any downstream analyses performed on this connectome. These data are augmented with neurotransmitter identities, providing essential information for modelling and computational analysis. The MANC connectome can also be linked to genetic tools through registration to pre-existing light microscopy datasets, allowing experimentalists to test hypotheses made based on the connectome.

      Weaknesses:

      This dataset presents the nerve cord connectome of just a single animal, so connectivity variability and validity will be hard to assess. However, it is bilaterally reconstructed, which does allow comparison between bilaterally symmetrical neurons on the left and right sides of the nerve cord, increasing confidence in connections observed on both sides. Damage occurred to the nerves during sample preparation, which will have to be considered when analysing sensory connectivity.

    1. eLife assessment

      Work described in this manuscript reveals the importance of the zinc transporter SLC30A1 in the antimicrobial function of macrophages, specifically against Salmonella. Cell-targeted deletion of the zinc transporter increased susceptibility of mice to systemic infection with Salmonella, leading to decreases in several cell functions such as nos2 expression. The authors argue that zinc homeostasis promotes macrophage cell function that is not conducive to the intracellular proliferation of Salmonella. This study provides novel and supportive evidence for a new pathway in nutritional immunity.

    2. Reviewer #1 (Public Review):

      This is an important and very well conducted study providing novel evidence on the role of zinc homeostasis for the control of infection with the intracellular bacterium S. typhimurium also disentangling the underlying mechanisms and providing clear evidence on the importance of spatio-temporal distribution of (free) zinc within the cell.

      Comments:

      It would be important to provide more information on the genotype of mice. It is rather unlikely that C57Bl6 mice survive up to two weeks after i.p. injection of 1x10E5 bacteria.

      To be sure that macrophages of Slc30A1 fl/fl LysMcre mice really have an impaired clearance of bacteria, it would be important to rule out an effect of Slc30A1 deletion on bacterial phagocytosis and containment (e.g., evaluation of bacterial numbers after 30 min of infection).

      Does the addition of zinc to macrophages negatively affect iNOS transcription as previously observed for the divalent metal iron and is a similar mechanism also employed (CEBPß/NF-IL6 modulation) (Dlaska M et al. J Immunol 1999)?

      How does Zinc or TPEN supplementation to bacteria in LB medium affect the log growth of Salmonella?

    3. Reviewer #2 (Public Review):

      This paper explores the importance of zinc metabolism in host defense against the intracellular pathogen Salmonella Typhimurium. Using conditional mice with a deletion of the Slc30a1 zinc exporter, the authors show a critical role for zinc homeostasis in the pathogenesis of Salmonella. Specifically, mice deficient in the Slc30a1 gene in LysM+ myeloid cells are hypersusceptible to Salmonella infection, and their macrophages show altered phenotypes in response to Salmonella. The study adds important new information on the role metal homeostasis plays in microbe-host interactions. Despite the strengths, the manuscript has some weaknesses. The authors conclude that lack of slc30a1 in macrophages impairs nos2-dependent anti-Salmonella activity. However, this idea is not tested experimentally. In addition, the research presented on Mt1 is preliminary. The text related to Figure 7 could be deleted without affecting the overall impact of the findings.

    4. Reviewer #3 (Public Review):

      Na-Phatthalung et al observed that transcripts of the zinc transporter Slc30a1 were upregulated in Salmonella-infected murine macrophages and in human primary macrophages; they therefore sought to determine if, and how, Slc30a1 could contribute to the control of bacterial pathogens. Using a reporter mouse, the authors show that Slc30a1 expression increases in a subset of peritoneal and splenic macrophages of Salmonella-infected animals. Specific deletion of Slc30a1 in LysM+ cells resulted in a significantly higher susceptibility of mice to Salmonella infection which, counter to the authors' conclusions, is not explained by the small differences in the bacterial burden observed in vivo and in vitro. Although loss of Slc30a1 resulted in reduced iNOS levels in activated macrophages, the study lacks experiments that mechanistically link loss of NO-mediated bactericidal activity to Salmonella survival in Slc30a1 deficient cells. The additional deletion of Mt1, another zinc binding protein, resulted in even lower nitrite levels of activated macrophages but only modest effects on Salmonella survival. By combining genetic approaches with molecular techniques that measure variables in macrophage activation and the labile zinc pool, Na-Phatthalung et al successfully demonstrate that Slc30a1 and metallothionein 1 regulate zinc homeostasis in order to modulate effective immune responses to Salmonella infection. The authors have done a lot of work and the information that Slc30a1 expression in macrophages contributes to control of Salmonella infection in mice is a new finding that will be of interest to the field. Whether the mechanism by which SLC30A1 controls bacterial replication and/or lethality of infection involves nitric oxide production by macrophages remains to be shown.

    1. eLife assessment

      Serotonin is an important neurotransmitter and its synaptic concentration is controlled by re-uptake by the sodium-coupled serotonin transporter SERT. The manuscript by Chan et al reports results from a systematic deep mutagenesis approach to study the surface expression and APP+ (5HT analogue) transport mechanism of the human serotonin transporter. The authors complement this experimental evidence with large-scale molecular simulations of the transporter in the presence of APP+. The use of deep mutagenesis and large-scale adaptive sampling simulations is impressive, and could contribute to understanding the structural requirements for folding and how transporters evolve to recognize different substrates.

    2. Reviewer #1 (Public Review):

      Serotonin is an important neurotransmitter and its synaptic concentration is controlled by re-uptake by the sodium-coupled serotonin transporter SERT. In this paper, some 6000 mutations of SERT were made and tested for surface expression and uptake of a serotonin analogue APP+. The SERT mutants were analysed and compared to the SERT structure and dynamics based on MD simulations. The authors have concluded that mutations located on surface exposed regions are tolerated whilst those involved in packing and structural integrity are not. Gain-of-function mutations map onto regions that in most cases favour opening of a solvent-exposed intracellular vestibule. Closure of the intracellular gate is thought to be rate-limiting to the transport cycle, and thus the evolutionary-based screen is consistent with the clustering of gain-of-function mutations.

      Strengths:

      This paper uses a large unbiased data-set to probe the evolution of the serotonin transporter SERT for the substrate APP+. The authors have been able to compare both localisation and transport data, which is an interesting data-set. Using MD simulations they are further able to provide some rational basis for the gain-of-function mutants.

      Weaknesses:

      They can only detect surface expression of myc-tagged SERT based on conjugation with a fluorescent anti-myc antibody. As such, they cannot distinguish between SERT mutants that abolish expression vs. those that are no longer trafficking to the plasma membrane. This is a downside, as it would have been interesting to know the fraction of SERT mutations that disrupt trafficking. Indeed, the relationship between misfolding and targeting is poorly understood beyond the calnexin-calreticulin cycle. Furthermore, there seems to be a gap between the large-scale mutagenesis data and the MD simulations on which the main mechanistic conclusions seem to be based (carried out in a separate publication). Thus, overall, while the mutation data-set is impressive, it's not clear how this aids our mechanistic understanding of SERT.

    3. Reviewer #2 (Public Review):

      The manuscript by Chan et al reports results of a systematic mutagenesis approach to study the surface expression and APP+ transport mechanism of serotonin transporter. They complement this experimental evidence with large-scale molecular simulations of the transporter in the presence of APP+. The use of deep mutagenesis and large-scale adaptive sampling simulations is impressive and could be very exciting contributions to the field.

      On the whole, the results appear to provide a fascinating insight into the effects of mutations on transport mechanisms, and how those interrelate with the structural fold and biophysical properties of a dynamic protein and its substrate pathways. A weakness of the conclusions based on the molecular simulation is that they rely on comparison with previously-published work involving non-identical simulation systems (i.e. different protonation states).

      Conclusions in this work about the origins of the sodium:serotonin 1:1 stoichiometry should also be considered in the context of the fact that there are two sodium ions bound in the structures of SERT, and more work is needed to explain why this ion is not also released/co-transported.

      Some of the methods require additional information to be provided to be reproducible, for example, for the Transition Path Theory results, and so it is not possible to assess these conclusions with the manuscript in its current form.

    4. Reviewer #3 (Public Review):

      The results of the deep mutagenesis screen represent a wealth of information on the expression and function of SERT that everyone studying this protein will appreciate. However, as the authors explain, the screen identified mutations that increased APP+ transport but inhibited transport of the cognate substrate, 5-HT. Because of the methods used, 5-HT could not be used as a substrate, somewhat limiting the usefulness of the screen.

      However, the authors have taken advantage of this limitation to address the mechanistic features of SERT that discriminate between 5-HT and APP+. From the position of mutations that augment APP+ transport, they have identified the aqueous pathway created in inward facing SERT conformations as a region of importance. Based on the MD simulations, transition to inward facing conformations is facilitated by 5-HT but less so by APP+. The authors conclude, quite reasonably, that mutations interfering with the stability of inward-closed SERT states could overcome the reduced ability of APP+ to open the pathway.

      Another reasonable conclusion based on the mutant screen is that mutations detrimental to surface expression were found in packed hydrophobic regions of the protein, but similar mutations in the permeation pathways were less likely to decrease expression. The authors postulate that this provides an evolutionary advantage by maintaining the structural fold while allowing modification of ion and substrate binding and coupling sites, a reasonable but speculative conclusion.

      Not all gain-of-function mutations have to be specific to APP+. The authors point out that Ala173Gly converts SERT to the residue found in NET and DAT at this position. It would have been interesting to know how this mutation and others affect 5-HT transport. Indeed, the lack of any 5-HT transport measurements with the mutants is a glaring weakness of the manuscript.

    1. eLife assessment

      The authors provide a high-quality genome of the xenacoelomorph worm Xenoturbella bocki and discuss its structure and evolution. Understanding the genomic structure of this group provides important insights into bilaterian evolution. The authors make a solid case that the data they present can support the placement of Xenacoelomorpha within the deuterostomes rather than as a sister group to all other bilaterians, but do not unequivocally reject the competing scenario.

    2. Reviewer #1 (Public Review):

      The authors report a high-quality genome assembly for a member of Xenacoelomorpha, a taxon that is at the center of one of the last remaining great controversies in animal evolution. The taxon and the species in question have "jumped around" the animal tree of life over the past 25 years, and seemed to have found their place as a sister-group to all remaining bilaterians. This hypothesis posits that the earliest split within Bilateria includes Xenacoelomorpha on the one hand and a clade known as Nephrozoa (Protostomia + Deuterostomia) on the other, and is thus referred to as the Nephrozoa hypothesis. Nephrozoa is supported by phylogenomic evidence, by a number of synapomorphic morphological characters in the Nephrozoa (namely, the presence of nephridia) and lack of some key bilaterian characters in Xenacoelomorpha, and by the presence of unique miRNAs in Nephrozoa.

      The Nephrozoa hypothesis has been challenged several times by the authors' groups who alternatively suggest placing Xenacoelomorpha within Deuterostomia as a sister group to a clade known as Ambulacraria. This hypothesis (the Xenambulacraria hypothesis) is supported by alternative phylogenomic datasets and by the shared presence of a number of unique molecular signatures. In this contribution, the authors aim to strengthen their case by providing full genome data for Xenoturbella bocki.

      The actual sequencing and analysis are technically and methodologically excellent. Some of the analyses were done several years ago using approaches that may now seem obsolete, but there is no reason not to include them. As a detailed report of a newly sequenced genome, the manuscript meets the highest standards.

      The authors emphasize a number of key findings. One is the fact that the genome is not as simple as one might expect from a "basal" taxon, and is on par with other bilaterian genomes and even more complex than the genome of secondarily simplified bilaterians. There is an implicit expectation here that the sister group to all Bilateria would represent the primitive state. This is of course not true, and the authors are aware of this, but it sometimes feels as though they are using this implicit assumption as a straw dog argument to say that since the genome is not as simple as expected, X. bocki must be nested within Bilateria. The authors get around this by acknowledging that their finding is consistent with a "weak version of the Nephrozoa hypothesis", which is essentially the Nephrozoa phylogenetic hypothesis without implicit assumptions of simplicity.

      Another finding is a refutation of the miRNA data supporting Nephrozoa. This is an important finding although it is somewhat flogging a dead horse, since there is already a fair amount of skepticism about the validity of the miRNA data (now over 20 years old) for higher-level phylogenetics.

      The finding that the authors feel is most important is gene presence-absence data that recovers a topology in which X. bocki is sister to Ambulacraria. The problem is that the same tree does not support the monophyly of Xenacoelomorpha. This may be an artifact of fast evolving acoel genomes, as the authors suggest, but it still raises questions about the robustness of the data.

      In sum, the authors' results and analyses leave an open window for the Xenambulacraria hypothesis, but do not refute the Nephrozoa hypothesis. The manuscript is a valuable contribution to the debate but does not go a significant way towards its resolution.

      The manuscript has gone through several rounds of review and revision on a preprint server and is thus fairly clear of typos, inconsistencies and lack of clarity. The authors are honest and open in their interpretation of the results and their strengths.

    3. Reviewer #2 (Public Review):

      The manuscript describes the genome assembly and analysis of Xenoturbella bocki, a worm that bears many morphological features ascribed to basal Bilateria. The authors aim to analyse this genome in an attempt to determine the phylogenetic position of X. bocki as a representative of Xenacoelomorpha and its associated acoelomorphs. In doing so, they want to inform the debate as to whether xenacoelomorphs belong among, or are in fact paraphyletic to, all bilaterians.

      This paper presents a high-quality assembly of the X. bocki genome. By virtue of the phylogenetic position of this species, this genome has considerable scientific interest. This assembly appears to be highly complete and is a strength of the paper. The further characterisation of the genome is well executed and presented. Solid results from this paper include a comprehensive description of the Hox genes, miRNA and neuropeptide repertoire, as well as a description of the linkage groups and how they relate to the ancestral linkage groups.

      Where this paper is weaker is in its central claims and questions: the phylogenetic position of Xenacoelomorpha, and whether X. bocki is a slowly evolving but otherwise representative member of this clade, remain insufficiently resolved.

      The authors have achieved the goal of describing the X. bocki genome very well. By contrast, it is unclear, based on the presented evidence, whether xenacoelomorph is truly a monophyletic group. The balance of the evidence seems to suggest that the X. bocki genome belongs within the bilateria group. However, it is unclear as to what is driving the position of the other acoels. Assuming that X. bocki and the other two species in that group are monophyletic, then the evidence will favour the authors' conclusion (but without clearly rejecting the alternatives).

      This paper will likely further animate the debate regarding this basal species, and also questions related to the ancestral characters of bilateria as a whole. In particular the results from the HOX and paraHOX clusters, may provide an interesting counterpoint to the previous results based on the acoels.

    1. eLife assessment

      The study presents a valuable finding on quantifying the orientation and organization of chondrocyte columns in the prenatal and postnatal growth plate cartilage using advanced 3D imaging and a sophisticated image analysis pipeline. The evidence supporting the authors' conclusions regarding the lack of columns in the fetal growth plate is considered inadequate due to technical caveats, inconsistencies in the data and corresponding model, and failure to correctly put the findings in context.

    2. Reviewer #1 (Public Review):

      Rubin et al. study chondrocyte columns in the prenatal and postnatal growth plate in 3D for the first time, using a novel analysis pipeline in which Confetti clones in the murine growth plate are analysed morphometrically. Prenatal chondrocytes were found not to be organised in columns parallel to the main orientation of the long bone, but rather, prenatal chondrocytes were commonly organised perpendicular to the main direction of growth. In the postnatal (P40) growth plate there was a diverse arrangement of columns, but more of the columns were vertically aligned.

      I enjoyed reading the work and the analysis is rigorous. However, I think that it is not valid to state that columns do not form in the embryo. The data only supports the finding that strictly vertical columns do not form in the embryo, as the cells are still organised into columns, albeit with a range of orientations. I do not like the term "typically" aligned, as how can we know what is "typical" when orientation has never before been assessed in 3D... And the authors' data demonstrates that it is certainly not "typical" for chondrocytes to organise into vertical columns prenatally.

      It would be very interesting to delve deeper into the reason for the change in orientation of columns between pre- and post-natal. For example, does more circumferential growth happen prenatally as compared to postnatally? Is the rate of circumferential vs longitudinal growth different between prenatal and postnatal, and could the change in column orientation be responsible for a (possible) shift in the balance between longitudinal vs circumferential growth before vs after birth? The first sentence of the Discussion refers to the role of chondrocyte columns in driving bone elongation, but aren't they also involved in driving bone morphology?

      I feel that describing the activity of the cells as "mis-rotations" implies the orientations are not intentional. It is likely not accidental or mistaken that the chondrocytes align in the ways they do: the diaphysis is largely for longitudinal growth, while at the epiphyses lateral expansion of the joint is also important. I find the data in Figure 4 fascinating, especially the variation in orientations between the regions of the growth plate (from proximal to distal), with the most lateral orientation at the most proximal and distal ends; it would be nice to see more discussion of these variations and what they may be contributing to.

      The abstract focuses solely on the analysis of columns prenatally and would benefit from the inclusion of the data from the postnatal growth plate and from the chondrocyte rotations.

    3. Reviewer #2 (Public Review):

      The origin and function of proliferative chondrocyte columns in the growth plate that are generally aligned with predicted longitudinal growth vectors have been robustly debated since the implementation of clonal analysis and live cell imaging techniques more than a decade ago. In particular, live cell imaging demonstrated that in the proliferative zone, most daughter pairs rotate fully or partially after division to form columns of stacked cells and a minority of pairs fail to rotate. These observations and others led to a mechanistic model of column formation, but limitations in the live cell imaging methods that only visualize a single round of division and rotation left open an important question - what is the effect of different rotation profiles on column formation, bone growth, and morphology?

      This manuscript describes the use of an inducible lineage tracing system in the mouse combined with a novel image analysis pipeline to analyze column formation over multiple cell divisions. The main conclusion is that many clones generate single columns in postnatal mice (as expected), but clones in embryonic growth plate cartilage form clusters distributed laterally, not aligned with longitudinal growth. These findings are interpreted to suggest that column formation is not required for long bone growth in the embryo and that lateral expansion of proliferative chondrocyte clusters may drive an increase in bone width.

      Although these findings are intriguing and potentially impactful, there are important caveats to the approach that generate significant uncertainty in both the measurements and the conclusions. (1) The claim that embryonic growth plate chondrocytes do not form columns conflicts with the observation of columnar stacks in the clusters. (2) Interpretation of nuclear elevation data is based on the unproven assumption that nuclei should be stacked in cell columns. (3) Clonal analysis of proliferative chondrocyte cell division and stacking behaviors is only valid if clone labeling is initiated in a proliferative chondrocyte, not when the founder cell is a resting chondrocyte. The data are insufficient to validate this absolute requirement.

    4. Reviewer #3 (Public Review):

      The manuscript by Rubin and Agrawal et al presents a very nice imaging analysis of clonal cell organization in the fetal and late juvenile mouse growth cartilages. The authors have performed a thorough quantification of the orientations of clusters and of clones of cells with respect to the growth axis. They conclude that growth cartilage is not as strictly 'columnar' as has been commonly described, especially at the fetal stage. There is value to having such quantifications in the literature as a reminder that interpretations of phenotypes need to be rooted in the cell biology of the stage at hand, as emphasized by the authors. However, although the approach is comprehensive, aspects of the quantification methods are not described adequately to determine if they are correct for the questions. There are also some inequivalent comparisons to prior literature and an oversight of important published observations showing that some of these conclusions have been known for decades, though not as thoroughly quantitative. There have long been observations that some growth cartilages do not have proliferative columns oriented in the axis of growth and that not all columns of a growth cartilage are perfectly organized; these facts do not negate the observations that columnar organization does exist, as re-confirmed here, and that it correlates with and contributes to rapid growth rates. Each of these points is further elaborated below.

    1. eLife assessment

      This fundamental work advances our understanding of the central coding and control mechanisms regulating sympathetic nervous system efferent signals to bone. The evidence supporting the conclusion is mostly convincing, although the inclusion of higher resolution images for certain data and further discussions would strengthen the study. This paper holds potential interest for skeletal biologists and neuroscientists who study the brain-bone sympathetic neural circuits.

    2. Reviewer #1 (Public Review):

      This manuscript presents, for the first time, the utilization of PRV viral transneuronal tracing to elucidate the central coding and control mechanisms governing sympathetic nervous system (SNS) efferent signals to bone. This groundbreaking work not only holds promising research prospects but also establishes a robust foundation for understanding the neural regulation of bone metabolism.

    3. Reviewer #2 (Public Review):

      Summary:<br /> In this study, the authors have used viral transneuronal tracing technology to identify for the first time the central sympathetic nervous system outflow sites that innervate bone.

      Strengths:<br /> The study provides a comprehensive atlas of the brain regions that potentially play a role in coding and decoding sympathetic nervous system signals to bone.

      Weaknesses:<br /> While the study provides compelling evidence for the brain-bone sympathetic nervous system neuroaxis, it is unclear if diseases that affect bone (e.g. diabetes, osteoporosis, kidney failure) disrupt brain-bone sympathetic neural circuits.

    4. Reviewer #3 (Public Review):

      It has been reported that the sympathetic nervous system (SNS) mediates bone metabolism and nociceptive functions. However, the exact localization and organization of the central SNS circuitry innervating bone and the brain sites involved have not been mapped, and efferent SNS outflow to bone has not yet been characterized. The authors used a pseudorabies virus (PRV) transneuronal tracing approach to identify central SNS outflow sites that innervate bone. The authors found that the central SNS outflow to bone originates from brain nuclei, sub-nuclei and regions of six brain divisions (midbrain and pons, hypothalamus, hindbrain medulla, forebrain, cerebral cortex, and thalamus). The authors provided compelling evidence for a brain-bone SNS neuroaxis that may regulate bone metabolism and nociceptive functions, which provides a greater understanding of the neural regulation of bone metabolism and should stimulate further research into bone pain and the neural regulation of bone metabolism. The authors may wish to discuss and summarize their results in more detail, to aid understanding of their findings and enhance the manuscript's utility for readers.

    1. eLife assessment

      This paper is valuable in that it provides a critical missing link between measures of structural connectivity and rhythmic tapping abilities, pointing to some interesting possibilities for how tapping synchronization is carried out. The methodology and findings are solid, and of interest to those studying the neural mechanisms of timing.

    2. Reviewer #1 (Public Review):

      Garcia-Saldivar and colleagues present a manuscript investigating connections between diffusion-weighted imaging (DWI) parameters and paced finger tapping measures. A cohort of human participants (n=32) performed a paced finger tapping task with a synchronization-continuation paradigm, in which they were required to listen to a paced metronome, begin tapping in synchrony with it, and then continue tapping at the same rate without it. Both auditory and visual metronomes were used, at a range of intervals. All subjects received structural scans measuring DWI, with an emphasis on superficial and deep white matter structures. This latter analysis was the most innovative, as it allowed the authors to examine microstructural effects in short-range cortical connections.

      Behaviorally, the authors replicated some well-known effects in paced finger tapping, with better performance for auditory over visual rhythms, negative lag-1 autocorrelations, and best performance at a range of ~1.5Hz. For the DWI analyses, a large number of correlations were observed across a wide variety of connections with various brain regions. The most salient effects observed were a connection between asynchrony, only for the auditory condition, and connections between the right auditory and motor systems, around the duration of peak performance, as well as a "chronotopic" organization across parts of the corpus callosum, most notably in areas linking motor regions between hemispheres.
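
      As a reading aid for the "negative lag-1 autocorrelations" mentioned above, the following is a minimal sketch of how such a statistic is commonly computed from a sequence of inter-tap intervals. The function name and the example intervals are hypothetical illustrations, not the authors' pipeline or data.

```python
from statistics import mean


def lag1_autocorrelation(intervals):
    """Lag-1 autocorrelation of a sequence of inter-tap intervals.

    Uses the standard estimator: the sum of products of successive
    deviations from the mean, divided by the sum of squared deviations.
    """
    m = mean(intervals)
    dev = [x - m for x in intervals]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("intervals have zero variance")
    num = sum(a * b for a, b in zip(dev, dev[1:]))
    return num / denom


# Hypothetical inter-tap intervals (seconds) that alternate short/long,
# the pattern that produces a negative lag-1 autocorrelation.
example_intervals = [0.60, 0.70, 0.60, 0.70]
example_r1 = lag1_autocorrelation(example_intervals)
```

      A negative value indicates that a longer-than-average interval tends to be followed by a shorter one, which is usually read as error correction from one tap to the next.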

      Overall, this paper provides a critical missing link between measures of structural connectivity and rhythmic tapping abilities, pointing to some interesting possibilities for how tapping synchronization (at least for auditory intervals) is carried out. Negative aspects of the paper come from the largely exploratory aspects of the analysis, as well as potential biases from the low sample size.

    3. Reviewer #2 (Public Review):

      This is a valuable study of the relationships between aspects of white matter structure in the brain and the accuracy of tapping performance on auditory and visual versions of a synchronization-continuation task. The authors find brain-behaviour relationships between absolute asynchrony (precision of phase alignment between taps and stimulus events), but only for certain temporal rates (650 and 750 ms ISI, not 550, 850, or 950 ms ISI). Other behavioural metrics do not significantly correlate with white matter measures, and no visual condition behavioural metrics correlate either. The methodology and findings are solid, and of interest to those studying the neural mechanisms of timing.

      The question is interesting, as the neural mechanisms of timing, and the nature of how modality differences in timing arise, are important, given that certain modality differences in timing accuracy (e.g., auditory benefits relative to visual) are less striking in our closest evolutionary relatives. Overall, the methods are well-presented and both behavioural and neural measures are appropriate.

      The results are generally well-reported, although there is a lack of clarity about multiple comparison corrections for the number of separate behavioural metrics, different interval lengths examined, and the two sensory modalities.

      Some weaknesses:<br /> The use of absolute (unsigned) asynchrony as a measure of 'predictive' ability is not fully justified. Signed asynchrony may be a more informative measure of predictive ability, as (small) negative asynchronies (taps prior to event onset) are often interpreted as indicating prediction, whereas positive asynchronies (taps after the event onset) are not.<br /> The work may benefit from considering the 'phase' and 'period' nature of the different behavioural measures, as they may tap different aspects of timing. Separating the behavioural metrics into those reflecting phase synchrony versus period matching may be a useful distinction, as the period-related metrics are the ones that do not have evidence of correlation with brain metrics.<br /> The manuscript does not present a very clear framework for why certain measures might be predicted to correlate with white matter structure and others not, and the pattern of results is also not easily interpretable. This may just be the nature of the data, but it would help clarify if more justification for the selection of task and stimulus rates was presented, along with an idea of the predictions made by different theoretical approaches for what relationships between this particular set of behavioural and brain data might exist. Similarly, a more nuanced discussion might further explore the potential reasons for the lack of evidence for a relationship at shorter and longer auditory interval lengths, as well as for any of the visual condition measures.
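
      The distinction between signed and absolute asynchrony raised above can be illustrated with a short sketch. All names and numbers below are hypothetical; the point is only that two tapping patterns with opposite signed asynchrony (anticipatory versus reactive) can share the same absolute asynchrony.

```python
def asynchronies(tap_times, stimulus_times):
    """Signed asynchronies: tap time minus stimulus onset, paired by index."""
    if len(tap_times) != len(stimulus_times):
        raise ValueError("tap_times and stimulus_times must have equal length")
    return [t - s for t, s in zip(tap_times, stimulus_times)]


def mean_signed(asyncs):
    """Mean signed asynchrony; negative means taps precede the stimulus."""
    return sum(asyncs) / len(asyncs)


def mean_absolute(asyncs):
    """Mean absolute (unsigned) asynchrony; discards the direction of error."""
    return sum(abs(a) for a in asyncs) / len(asyncs)


# Hypothetical stimulus onsets (seconds) at a 650 ms interval.
stimulus = [0.0, 0.65, 1.30, 1.95]
early_taps = [s - 0.03 for s in stimulus]  # anticipatory: 30 ms before onset
late_taps = [s + 0.03 for s in stimulus]   # reactive: 30 ms after onset

early = asynchronies(early_taps, stimulus)
late = asynchronies(late_taps, stimulus)
```

      Here `mean_absolute(early)` and `mean_absolute(late)` are both about 0.03 s, whereas `mean_signed` separates the anticipatory pattern from the reactive one.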

      Overall, the authors find white-matter structure relationships with absolute asynchrony measures during auditory (but not visual) synchronization-continuation at certain rates. These findings appear reasonably justified.

    1. Author Response:

      eLife assessment

      We thank the Editors for identifying qualified reviewers. We agree that the “evidence supporting this claim (that ‘many breast cancer mutations are mildly deleterious’) is incomplete”. Much more detail is needed to state this decisively and we do not claim completeness here. As for validation, we carried out synthetic testing of the models as suggested by Reviewer #1 and the results seem good.

      Reviewer #1:

      We thank the Reviewer for a very thorough examination of not only the current paper but also our previous paper. We agree that the illustration material can be overwhelming and we plan to use the Reviewer’s advice in that matter. In addition, we originally put some textbook material in the Appendix, and arguably some of it may be considered superfluous.

      Most of the references the Reviewer provides are known to us, although it is likely we should cite and discuss more. All of the above will be included in the revision we are planning.

      The Reviewer is certainly correct that population growth and spatial effects play a major role in cancer. However, the effects of constraining environment are quite strong and the reality lies somewhere between the Moran and branching process models; exactly what we attempt to clarify. As for spatial effects, most tumors extracted in clinic are dissected in bulk and sub-sampling is rare, so the spatial information is rarely accessible.

      The subsequent point of importance concerns the weak specificity of the site frequency spectra (SFS) with respect to the underlying genetic and demographic forces. This cannot be denied. However, we just meant to state that our SFS are consistent with a model involving slightly deleterious passengers.

      Regarding the validation of the estimation procedures which is a point well-taken, we carried out synthetic testing of the models as suggested by Reviewer #1 and the results seem good. This will be discussed in full in the revision.

      In our view, the most important remark is the one concerning scaling of the models. The Reviewer is certainly correct that 100 stem cells are insufficient to drive a realistic tumor. However, what we had in mind, but did not explain sufficiently, is that a sample of 100 cells corresponds to average-depth coverage in bulk sequencing. Therefore, the strict interpretation is that the model mirrors what is observed in the sample. A more accurate approach would be to up-scale the model and then sample 100 cells from it. The Moran-type model can be up-scaled using diffusion approximation, and we hope to include these computations in the revision. The associated criticism concerning tumor growth seems less relevant, since we experimented with less or more stringent constraints in our models.

      Reviewer #2:

      We thank Reviewer #2 for studying our paper and some very positive comments. Among others, the Reviewer underscores the fact that the Moran-type model generates SFS concordant with the data (with all necessary reservations). The Reviewer concurs with us that conditioning on non-extinction is not very common in the literature, while it should be.

      Like the Reviewer, we are somewhat puzzled by the differences in behavior between models A and B. Model B seems more parsimonious, but Model A looks more similar to the critical or slightly supercritical branching process. We will work to clarify these observations.

    2. eLife assessment

      This study uses numerical simulations to characterize and compare variants of two widely used mathematical models and then applies those models to inferring evolutionary parameters from breast cancer data. The copious numerical results will be of some interest to mathematical biologists working with similar models. The finding that many breast cancer mutations are mildly deleterious is valuable but the evidence supporting this claim is incomplete because the mathematical modelling and statistical methods are insufficiently justified and inadequately validated.

    3. Reviewer #1 (Public Review):

      This paper can be seen as an extension of a recent study by two of the same authors [1]. In the previous paper, the authors considered two variants of the Moran process, labelled Model A and Model B, and examined differences between the evolutionary dynamics of these two models. They further described the site frequency spectra, expected allele counts, and expected singleton counts of these models, building on analytical results from prior studies, and used numerical simulations to investigate the models' evolutionary dynamics. Finally, they compared the site frequency spectra of the two models (using numerical simulations) to spectra derived from a small breast cancer data set (two sets of three samples).

      In the new paper, the authors consider the same two Moran process variants (Model A and Model B) and some related branching processes. As before, they compare the site frequency spectra and various summary statistics of these models, but here they present only numerical simulations (except that some prior analytical results are summarized in Appendix A, which are never referred to in the main text and seem unconnected to the study). They then compare the site frequency spectra of these models (again using numerical simulations) to those derived from the same breast cancer samples as before and thus infer some evolutionary parameters.

      The first main conclusion is that the critical branching process and the Moran process models behave similarly and generate similar site frequency spectra. This finding is unsurprising (indeed, the authors acknowledge that the result "has been expected"). For a reasonably large population size, the population size in the critical branching process has been shown to vary relatively little over time and the model is thus essentially a continuous time Moran process (see, for example, Equation 8.55 in ref 2). Nor is it surprising that the authors see stronger similarities when they select only the subset of branching process replicates in which the final population size is particularly close to the initial population size (this is because, in these replicates, the population size likely varies even less than usual).

      The second main conclusion is that, although "the mutational SFS alone is not adequate" to quantify the strength of selection, "All fitted values for the selective disadvantage of passenger mutations are nonzero, supporting the view that they exert deleterious selection during tumorigenesis". Although the question of whether mildly deleterious mutations play an important role in cancer evolution is of considerable interest, it's debatable whether the results presented here help resolve the issue.

      Many prominent researchers have called into question whether cancer evolutionary parameters can be reliably inferred from site frequency spectra (e.g., [3-7]), even using sophisticated statistical methods. The statistical approach used here (though not named as such in the paper) is a crude kind of approximate Bayesian computation. To improve the accuracy of the results, it would have been better to have set reasonably vague priors for the uncertain mutation rates, rather than fixing them arbitrarily. It would also have been better to have chosen a likelihood function explicitly based on an analysis of the sampling and error distributions, rather than just summing the absolute logged deviations. It is well known that "Checking the model is crucial to statistical analysis" and "A good Bayesian analysis, therefore, should include at least some check of the adequacy of the fit of the model to the data and the plausibility of the model for the purposes for which the model will be used" [8]. The authors' failure to describe any attempt to validate or check their model, using simulated data or otherwise, casts doubt on the reliability of their inferences.
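
      To make the statistical point above concrete, here is a minimal sketch of rejection-style approximate Bayesian computation that draws the uncertain parameter from a vague prior and scores each draw by the sum of absolute logged deviations. It is an illustration under stated assumptions, not the authors' procedure: `toy_spectrum` is a hypothetical, deterministic stand-in for a stochastic simulator, and the single parameter is a power-law exponent rather than a mutation rate or selection coefficient.

```python
import math
import random


def log_abs_distance(observed, simulated):
    """Sum of absolute differences between logged spectrum entries."""
    return sum(abs(math.log(o) - math.log(s)) for o, s in zip(observed, simulated))


def toy_spectrum(exponent, n_classes=20):
    """Hypothetical stand-in simulator: S_k = k^(-exponent), k = 1..n_classes."""
    return [k ** -exponent for k in range(1, n_classes + 1)]


def abc_rejection(observed, prior_low, prior_high,
                  n_draws=2000, keep_fraction=0.05, seed=0):
    """Draw parameters from a uniform prior and keep the closest fraction."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)
        simulated = toy_spectrum(theta, len(observed))
        scored.append((log_abs_distance(observed, simulated), theta))
    scored.sort()
    n_keep = max(1, int(n_draws * keep_fraction))
    return [theta for _, theta in scored[:n_keep]]


# Synthetic "observed" data generated with a known exponent of 1.5, so the
# accepted draws can be checked against the truth (a basic validation step).
observed = toy_spectrum(1.5)
posterior = abc_rejection(observed, prior_low=0.5, prior_high=3.0)
posterior_mean = sum(posterior) / len(posterior)
```

      Running the same procedure on data simulated from known parameters, as done here, is the kind of check the quoted passage from Bayesian Data Analysis calls for; a likelihood derived from the sampling and error distributions would replace `log_abs_distance` in a more careful analysis.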

      Putting aside the potential biassing effects of sampling error, measurement error, and the limitations of the authors' statistical method, it is well established that both population growth and spatial structure profoundly alter the shape of site frequency spectra in ways that can mimic the effects of selection (e.g. [9-11]). Indeed, Figures 3, 4 and 5 show that the critical and super-critical branching processes generate markedly different site frequency spectra. It follows that if the population dynamics and spatial structure of the mathematical model used for inference don't match those of the biological process that produced the data then any inferred evolutionary parameter values will be unreliable. Breast cancer has two indisputable ecological features that shape its evolutionary dynamics: the cell population expands by many orders of magnitude from a single cell, and the population is spatially structured. In the authors' mathematical model, the population size is initially 100 cells and either remains constant or varies little, and there is no spatial structure. These profound mismatches between model and data cast further doubt on what is supposed to be the paper's most important biological finding.

      In this paper the authors offer no justification for their decision to model breast cancer as a non-growing, non-spatial cell population. Nor do they engage with the extensive recent literature on the challenges of inferring evolutionary parameters from cancer site frequency spectra (they cite none of the many relevant papers listed at https://www.sottorivalab.org/neutral-evolution.html). Their 2022 paper [1] claims that, "it sometimes makes sense to consider cancer growth in the framework of constant-population models. Our models correspond to the situation in which a constant population of N "healthy" stem cells is gradually replaced by a growing clone of transformed cells with increasing fitness." No evidence was presented to support this hypothesis regarding breast cancer progression. On the other hand, a wealth of evidence supports the consensus view that, in breast cancer and other human solid tumours, the number of cells with unlimited proliferative potential is several orders of magnitude greater than 100 and grows over time (e.g. [12]).

      Analytic expressions for the site frequency spectra with neutral mutations are already known. It is well known that the site frequency spectrum of an exponentially growing population has a tail following a power law S_k ~ k^(-2) [13, 14]. Similarly, it is known that for the critical branching process or the Moran process, the site frequency spectrum at equilibrium is S_k ~ k^(-1) [13, 15]. Especially noteworthy yet uncited studies that use those results about site frequency spectra to make inferences based on sequencing data include ref 16, in which selection is inferred, and ref 17, in which evolutionary parameters of constant populations (healthy cell populations) are inferred.
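
      The two power laws quoted above differ in their slope on a log-log plot, which the following sketch makes explicit using idealized, noise-free spectra. The function name is hypothetical, and real spectra follow these laws only over part of their range and with sampling noise, so this is an illustration of the scaling rather than an inference method.

```python
import math


def loglog_slope(spectrum):
    """Least-squares slope of log S_k against log k for k = 1..len(spectrum)."""
    xs = [math.log(k) for k in range(1, len(spectrum) + 1)]
    ys = [math.log(s) for s in spectrum]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Idealized spectra: S_k ~ k^(-2) (exponential growth) and
# S_k ~ k^(-1) (critical branching process or Moran process at equilibrium).
growing = [k ** -2 for k in range(1, 51)]
constant = [k ** -1 for k in range(1, 51)]

slope_growing = loglog_slope(growing)
slope_constant = loglog_slope(constant)
```

      The fitted slopes are -2 and -1 respectively, so in this idealized setting the exponent alone separates a growing population from a constant-size one.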

      Although the paper is well written, the figures are ineffective in communicating the results. As others have put it, "A figure is meant to express an idea or introduce some facts or a result that would be too long (or nearly impossible) to explain only with words" and "If your figure is able to convey a striking message at first glance, chances are increased that your article will draw more attention from the community" [18]. On the contrary, Figures 3, 4, 5 and 6 are bewilderingly complicated, crowded, and repetitive. These figures comprise no fewer than fifty-six plots, each containing numerous curves or histograms, spread across four pages. To compare the results of different scenarios, the reader is presumably expected to put these figures side by side and try to spot the differences, hampered by inconsistent axis ranges, absence of axis labels, absence of titles, absence of legends, and unreliable captions ("cyan" seems to refer to pale blue, and "orange" to something closer to red). For example, the only notable difference between Figures 3 and 4 is in the shape of a single green curve in panel I. In the main text of a published paper, one would expect fewer, more carefully curated figures drawing attention to salient features, so that the reader can infer the main results with minimal effort. The rest can be put in supplementary figures.

      In summary, this paper adds somewhat to our understanding of some standard mathematical models; whether it tells us anything new about cancer is open to debate.

      References<br /> (1) Kurpas, Monika K., and Marek Kimmel. "Modes of selection in tumors as reflected by two mathematical models and site frequency spectra." Frontiers in Ecology and Evolution 10 (2022): 889438.<br /> (2) Bailey, Norman TJ. The elements of stochastic processes with applications to the natural sciences. John Wiley & Sons, 1964.<br /> (3) Tarabichi, Maxime, et al. "Neutral tumor evolution?." Nature Genetics 50.12 (2018): 1630-1633.<br /> (4) McDonald, Thomas O., Shaon Chakrabarti, and Franziska Michor. "Currently available bulk sequencing data do not necessarily support a model of neutral tumor evolution." Nature Genetics 50.12 (2018): 1620-1623.<br /> (5) Balaparya, Abdul, and Subhajyoti De. "Revisiting signatures of neutral tumor evolution in the light of complexity of cancer genomic data." Nature Genetics 50.12 (2018): 1626-1628.<br /> (6) Noorbakhsh, Javad, and Jeffrey H. Chuang. "Uncertainties in tumor allele frequencies limit power to infer evolutionary pressures." Nature Genetics 49.9 (2017): 1288-1289.<br /> (7) Bozic, Ivana, Chay Paterson, and Bartlomiej Waclaw. "On measuring selection in cancer from subclonal mutation frequencies." PLoS Computational Biology 15.9 (2019): e1007368.<br /> (8) Neher, Richard A., and Oskar Hallatschek. "Genealogies of rapidly adapting populations." Proceedings of the National Academy of Sciences 110.2 (2013): 437-442.<br /> (9) Gelman, Andrew, et al. Bayesian data analysis (Third Edition). Chapman and Hall/CRC, 2014.<br /> (10) Fusco, Diana, et al. "Excess of mutational jackpot events in expanding populations revealed by spatial Luria-Delbrück experiments." Nature Communications 7.1 (2016): 12760.<br /> (11) Noble, Robert, et al. "Spatial structure governs the mode of tumour evolution." Nature Ecology & Evolution 6.2 (2022): 207-217.<br /> (12) Lawson, Devon A., et al. "Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells." Nature 526.7571 (2015): 131-135.<br /> (13) Gunnarsson, Einar B., Leder, Kevin, and Foo, Jasmine. "Exact site frequency spectra of neutrally evolving tumors: A transition between power laws reveals a signature of cell viability" Theoretical Population Biology 142 (2021) 67-90<br /> (14) Durrett, Richard "Branching Process Models of Cancer" Springer (2015)<br /> (15) Durrett, Richard "Probability Models for DNA Sequence Evolution" Springer Science & Business media (2008)<br /> (16) Williams, Mark J. et al. "Quantification of subclonal selection in cancer from bulk sequencing data." Nature Genetics 50 (6). 895-903 (2018)<br /> (17) Moeller, Marius E. et al. "Measures of genetic diversification in somatic tissues at bulk and single-cell resolution" eLife (2024) 12:RP89780<br /> (18) Rougier, Nicolas P., Michael Droettboom, and Philip E. Bourne. "Ten simple rules for better figures." PLoS Computational Biology 10.9 (2014): e1003833.

    4. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors present a comparison of two models of cancer evolution with advantageous drivers and deleterious passengers: a fixed-population "Moran" model, and a "Branching Process" (BP) model with dynamic population size. The Moran model is more mathematically tractable, but since cancer is a disease of uncontrolled growth, it is unclear to me how clinically relevant it is to consider a model with constant population size. Intriguingly, both models can explain observed Site Frequency Spectrums (SFSs) in three breast cancers, which suggests that the Moran model may have some value. This distinction between the two models is addressed well.

      Strengths:

      The comparisons of the various BP models (extinction/non-extinction, and balanced/supercritical) are very interesting. The survivability of rare, fitness-disadvantaged clones has huge implications for treatment resistance in general - drug resistant clones are very often disadvantaged in the absence of drug. Clinical sequencing is, most decidedly, investigating population dynamics conditioned on non-extinction, however most published models do not condition on non-extinction - an unfortunate community oversight that this publication rectifies.

      Site Frequency Spectrums in three breast cancers are measured with unprecedented resolution to my knowledge (allele abundances below one in a thousand).

      Detailed description of the behavior of the various models.

      Weaknesses:

      I do not believe Moran B is a useful theoretical distinction from Moran A. Incorporating fitness effects into the birth process, instead of the death process, is generally mathematically equivalent when time is measured in generations (or cell divisions). Visible differences between the two models in Figures 2-6 by all accounts seem to be due to the fact that Moran B experiences more evolution in the balanced/driver-dominated case, and less evolution in the passenger-dominated case. We generally do not use arbitrary time steps for this reason; we quantify time in 'generations'.
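
      The equivalence argument above can be illustrated with a minimal sketch. It assumes a generic two-type Moran model (i mutants among n cells, relative fitness r), not the manuscript's Moran A or Moran B; the function names are hypothetical. Whether r weights the birth draw or 1/r weights the death draw, the ratio of "mutant count up" to "mutant count down" probabilities is r, so the sequence of state changes is the same; only the probability that a given step changes anything differs, which is the time-scale effect the comment attributes to arbitrary time steps.

```python
def step_probs_birth_selection(i, n, r):
    """(up, down) one-step probabilities when fitness r weights the birth draw."""
    total_fitness = r * i + (n - i)
    up = (r * i / total_fitness) * ((n - i) / n)    # mutant divides, wild type dies
    down = ((n - i) / total_fitness) * (i / n)      # wild type divides, mutant dies
    return up, down


def step_probs_death_selection(i, n, r):
    """(up, down) one-step probabilities when weight 1/r is applied to the death draw."""
    d = 1.0 / r                                     # mutants are less likely to die
    total_death_weight = d * i + (n - i)
    up = (i / n) * ((n - i) / total_death_weight)           # mutant divides, wild type dies
    down = ((n - i) / n) * (d * i / total_death_weight)     # wild type divides, mutant dies
    return up, down


if __name__ == "__main__":
    n, r = 10, 2.0
    for i in (1, 3, 9):
        ub, db = step_probs_birth_selection(i, n, r)
        ud, dd = step_probs_death_selection(i, n, r)
        # Same up/down ratio in both variants, different chance that a step changes the state.
        print(i, ub / db, ud / dd, ub + db, ud + dd)
```

      Because the up/down ratio is identical at every state, quantities that depend only on the order of events (such as fixation probability) coincide, while anything measured per arbitrary time step does not.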

    1. eLife assessment

      This investigation marks an important advancement in our understanding of motor thalamus connectivity, illustrating a complex integration of inputs that reshapes previous models. The study utilizes compelling methodologies that expose a dynamic synaptic network, although the evidence of triple-input convergence on individual neurons and for multiple driver type inputs onto motor thalamic neurons remains incomplete. Despite this, the findings provide a persuasive rationale for revisiting our perceptions of the thalamic role in motor control, with a call for further studies to substantiate the breadth of these functional interactions.

    2. Reviewer #1 (Public Review):

      The manuscript demonstrates an analysis of the synaptic organization within the motor thalamus, emphasizing the interplay between the ventrolateral (VL) and ventroanterior (VA) nuclei and their respective inputs. The primary aim is to unravel the complexities of synaptic interactions among the motor cortex's layer 5 (M1L5), the cerebellum (Cb), and the basal ganglia output nuclei (GPi and SNr), which converge upon the VA/VL nuclei of the motor thalamus. This examination is executed using a combination of anatomical tracing, optogenetics, and electrophysiological recordings in mouse brain slices, which together yield novel insights into the motor control circuitry.

      The study uncovers that contrary to traditional models that presumed segregation, some motor thalamic neurons simultaneously integrate inputs from the cerebellum and basal ganglia. Furthermore, a subset of these neurons also receive convergent inputs from M1L5 and basal ganglia, underscoring the complexity of these synaptic networks. Notably, the study reveals that both M1L5 and Cb inputs exhibit driver-type synaptic properties, suggesting a significant impact on thalamic relay neurons.

      The functional implications of this synaptic convergence suggest a complex gating mechanism by the inhibitory outputs of the basal ganglia, which could modulate information flow within the motor thalamus. This modulation is significant not only for transthalamic information processing but also for the integration of cerebellar inputs to the motor cortex. The study also highlights direct projections from M1L5 to the motor thalamus, indicating a potential direct influence on thalamic activity, in addition to the known indirect influence through the cortico-basal ganglia-thalamo-cortical loop.

      The manuscript suggests that the traditional understanding of motor thalamic connectivity requires reconsideration, and it emphasizes the necessity of further investigation to understand fully the functional implications of this synaptic convergence. Future research may focus on more direct demonstrations of triple-input convergence and its behavioral consequences, as well as cross-species comparative studies to enhance the findings' applicability.

      The study provides valuable contributions to our knowledge of the motor thalamus, illuminating its intricate synaptic architecture and setting the stage for future explorations that will deepen our comprehension of motor control and thalamic function.

    3. Reviewer #2 (Public Review):

      This study assesses how inputs from primary motor cortex layer 5 (M1L5), basal ganglia output nuclei (GPi and SNr), and cerebellum (Cb) converge onto motor thalamus nuclei (VA/VL).

      Methodology includes anatomical tracing, optogenetics and electrophysiological recordings in mouse brain slices.

      The major findings are:<br /> - Some motor thalamic neurons receive input from both the cerebellum and basal ganglia. This is contrary to the common belief that these two inputs are segregated in the motor thalamus.

      - Some motor thalamus neurons receive converging input from both motor cortex (M1L5) and basal ganglia.

      - Both M1L5 and Cb inputs to the motor thalamus have driver-type synaptic properties, indicating a strong influence on thalamic relay neurons.

      Functional implications are:<br /> - Given the inhibitory nature of basal ganglia output neurons, the converging inputs can allow the basal ganglia to gate information flow through the motor thalamus. This applies to transthalamic information, i.e., information conveyed through the thalamus across cortical regions, as well as cerebellar information flow to motor cortex.

      - The direct projection from M1L5 to motor thalamus suggests that motor cortex can affect motor thalamic activity not only indirectly, through the traditional cortico-basal ganglia-thalamo-cortical loop, but also through direct projections.

      The study is convincing and has important implications for the field. Methodology involves elegant viral techniques.

      The main weakness is that there is no direct functional demonstration of all the 3 inputs from motor cortex, cerebellum, and basal ganglia, converging onto the same cells in motor thalamus. All the recordings concern dual area stimulations, and the anatomical studies show a very small overlap of all the 3 inputs onto motor thalamus.

    1. eLife assessment

      This paper presents a new method for separating organelles in an unbiased way. The method is applied to the separation of distinct subpopulations of insulin vesicles. There are concerns around whether the vesicles measured are in fact insulin vesicles and whether the observed changes in vesicle populations upon glucose stimulation are biologically meaningful, and thus it is difficult to assess at this stage how well the technique performs. This paper is likely to be of wide interest to cell biologists studying a variety of compartments, as well as to researchers in the beta cell field.

    2. Reviewer #1 (Public Review):

      This manuscript presents an exciting new method for separating insulin secretory granules using insulator-based dielectrophoresis (iDEP) of immunolabeled vesicles. The method has the advantage of being able to separate vesicles by subtle biophysical differences that do not need to be known by the experimenter, and hence could in principle be used to separate any type of organelle in an unbiased way. Any individual organelle ("particle") will have a characteristic ratio of electrokinetic to dielectrophoretic mobilities (EKMr) that will determine where it migrates in the presence of an electric field. Particles with different EKMr will migrate differently and thus can be separated. The present manuscript is primarily a methods paper to show the feasibility of the iDEP technique applied to insulin vesicles. Experiments are performed on cultured cells in low or high glucose, with the conclusion that there are several distinct subpopulations of insulin vesicles in both conditions, but that the distributions in the two conditions are different. As it is already known that glucose induces release of mature insulin vesicles and stimulates new vesicle biosynthesis and maturation, this finding is not necessarily new, but is intended as a proof of principle experiment to show that the technique works. This is a promising new technology based on solid theory that has the possibility to transform the study of insulin vesicle subpopulations, itself an emerging field. The technique development is a major strength of the paper. Also, cellular fractionation and iDEP experiments are performed well, and it is clear that the distribution of vesicle populations is different in the low and high glucose conditions. However, more work is needed to characterize the vesicle populations being separated, leaving open the possibility that the separated populations are not only insulin vesicles, but might consist of other compartments as well. 
It is also unclear whether the populations might represent immature and mature vesicles, distinct pools of mature vesicles such as the readily releasable pool and the reserve pool, or vesicles of different age. Without a better characterization of these populations, it is not possible to assess how well the iDEP technique is doing what is claimed.

      Major comments:

      (1) There is no attempt to relate the separated populations of vesicles to known subpopulations of insulin vesicles such as immature and mature vesicles, or the more recently characterized Syt9 and Syt7 vesicle subpopulations that differ in protein and lipid composition (Kreutzberger et al. 2020). Given that it is unclear exactly what populations of vesicles will be immunolabeled (see point #2 below), it is also possible that some of the "subpopulations" are other compartments being separated in addition to insulin vesicles. It will be important to examine other markers on these separated populations or to perform EM to show that they look like insulin vesicles.

      (2) An antibody to synaptotagmin V is used to immunolabel vesicles, but there has been confusion between synaptotagmins V and IX in the literature and it isn't clear what exactly is being recognized by this antibody (this reviewer actually thinks it is Syt 9). If it is indeed recognizing Syt 9, it might already be labeling a restricted population of insulin vesicles (Kreutzberger et al. 2020). The specificity of this antibody should be clarified. Furthermore, Figure 2 is not convincing at showing that this synaptotagmin antibody specifically labels insulin vesicles nor is there convincing colocalization of this synaptotagmin antibody with insulin vesicles. In the image shown, several cells show very weak or no staining of both insulin and the synaptotagmin. The highlighted cell appears to show insulin mainly in a perinuclear structure (probably the Golgi) rather than in mature vesicles (which should be punctate), and insulin is not particularly well-colocalized with the synaptotagmin. Other cells in the image appear to have even less colocalization of insulin and synaptotagmin, and there is no quantification of colocalization. It seems possible that this antibody is recognizing other compartments in the cell, which would change the interpretation of the populations measured in the iDEP experiments. It would also be good to perform synaptotagmin staining under glucose-stimulating conditions, in case this alters the localization.

      (3) The EKMr values of the vesicle populations between the low and high glucose conditions don't seem to precisely match. It is unclear if this is just a technical limitation in comparing between experiments or instead suggests that glucose stimulation does not just change the proportion of vesicles in the subpopulations (i.e. the relative fluorescent intensities measured), but rather the nature of the subpopulations (i.e. they have distinct biophysical characteristics). This again gets to the issue of what these vesicle subpopulations represent. If glucose stimulation is simply converting immature to mature vesicles, one might expect it to change the proportion of vesicles, but not the biophysical properties of each subpopulation.

      (4) The title of the paper promises "isolation" of insulin vesicles, but the manuscript only presents separation and no isolation of the separated populations. Isolation of the separated populations is important to be able to better define what these populations are (see point #1 above). Isolation is also critical if this is to be a valuable technique in the future. Yet the paper is unclear on whether it is actually technically feasible to isolate the populations separated by iDEP. In line 367, it states "this method provides a mechanism for the isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis (imaging, proteomics, lipidomics, etc.)." However, in line 361 it says "developing the capability to port the collected individual boluses will enable downstream analyses such as mass spectrometry or electron microscopy," suggesting that true isolation of these populations is not yet feasible. This should be clarified.

    3. Reviewer #2 (Public Review):

      This manuscript used DC-iDEP, a technology previously used on other organelle preparations, to isolate insulin secretory granules from INS1 cells based on differences in the dielectrophoretic and electrokinetic properties of synaptotagmin V-positive insulin granules.

      The major motivation presented for this work is to provide a methodology to allow for more sensitive isolation of subpopulations of granules, allowing better understanding of the biochemical composition of these populations. This manuscript clearly demonstrates the ability of this technology to separate these subpopulations, which will allow for biochemical characterization of insulin granules in future studies.

      After proving these subpopulations can be observed, this method was then utilized to show there are shifts in these subpopulations when granules are isolated from glucose stimulated cells. Overall the method of isolation is novel and could provide a tool for further characterization of purified secretory granules.

      The observation of glucose stimulation causing shifts in subpopulations is unsurprising. Glucose stimulation could cause a depletion of insulin and other secretory content from a subset of granules. It would be expected that this loss of content would cause a shift in electrochemical properties of the granules, but this is a nice confirmation that the isolation method has the sensitivity to delineate these changes.

      Major comments:

      (1) It is unclear which Synaptotagmin isoform is being looked at. Synaptotagmin V and IX have been repeatedly interchanged in the literature. See the note in the syt IX section of "Moghadam and Jackson 2013 Front. Endocrinology" or read "Fukuda and Sagi-Eisenberg Calcium Binding Proteins 2008".

      The 386 aa. isoform that is abundant in PC12 cells has been robustly observed in INS1 cells in multiple studies and has been frequently referred to as syt IX. The sequence the antibody was raised against should be obtained from the company where it was purchased; this sequence should then be mapped to the corresponding Synaptotagmin isoform, and the result clarified in the text.

      (2) Immunofluorescence of insulin and syt V is confusing. The example images do not appear to show robust punctate structures that are characteristic of secretory granules (in both the insulin and syt V stain).

      (3) In the discussion it says, "Finally, this method provides a mechanism for the isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis (imaging, proteomics, lipidomics, etc.) that otherwise would not be possible given the low-abundance components of these subpopulations."

      It would help to elaborate more on the yield and concentrations of isolated granules. This would give a better sense of what level of biochemical characterization could be performed on sub-populations of granules.

    4. Reviewer #3 (Public Review):

      The manuscript from Barekatain et al. investigates heterogeneity within the population of insulin vesicles from an insulinoma cell line (INS-1E) in response to glucose stimulation. Prevailing dogma in the beta-cell field suggests that there are distinct pools of mature insulin granules, such as a readily releasable pool and a reserve pool, which contribute to distinct phases of insulin release in response to glucose stimulation. Whether these pools (and others) are distinct in protein/lipid composition or other aspects is not known, but has been suggested. In this manuscript, the authors use density gradient sedimentation to enrich for insulin vesicles, noting the existence of a number of co-purifying contaminants (ER and mitochondrial markers). Following immunolabeling with synaptotagmin V and fluorescent-conjugated secondary antibodies, insulin vesicles were applied to a microfluidic device and separated by dielectrophoretic and electrokinetic forces following an applied voltage. The equilibrium between these opposing forces was used to physically separate insulin granules. Here some differences were observed in the insulin (Syt V positive) granule populations when isolated from cells that were either non-stimulated or stimulated with glucose, which has been suggested previously by other studies as noted by the authors; however, in the current manuscript, the inclusion of a number of control experiments may provide a better context for what the data reveal about these changes.

      The major strength of the paper is in the use of the novel, highly sophisticated methodology to examine physical attributes of insulin granules and thus begin to provide some insight into the existence of distinct insulin granule populations within a beta-cell; these include insulin granules that are maturing, membrane-docked (i.e. readily releasable), in reserve, newly-synthesized, aged, etc. Whether physical differences exist between these various granule pools is not known. In this capacity, the technical abilities of the current manuscript may begin to offer some insight into whether these perceived distinctions are physical.

      The major weakness of the manuscript is that the study falls short in terms of linking the biology to the sophisticated changes observed and primarily focuses on differences in response to glucose. Without knowing what the various populations of granules are, it is challenging to understand what the changes in response to glucose mean.

      Specific concerns are as follows:

      (1) There is confusion on what the DC-iDEP separation between unstimulated and stimulated cells reveals. Do these changes reflect the maturation state of granules, nascent vs. old granules? Readily releasable vs. reserve pool? The comments in the text seem to offer all possibilities.

      (2) It is unclear what we can infer regarding the physical changes of granules between the stimulated states of the cells. Without an understanding of the magnitude of the effect, it is unclear how biologically significant these changes are. For example, what degree of lipid or protein remodeling would be necessary to give a similar change?

      (3) The reliance on a single vesicle marker, Syt V, is concerning given that granule remodeling is the focus.

      (4) Additional confirmation that the isolated vesicles are in fact insulin granules would be helpful. As noted, granules were gradient enriched, but did carry contaminants. Note that the microscopy image provided does not provide any real validation for this marker.

      Further confirmation that the immune-isolated vesicles are in fact insulin granules should be included. EM with immunogold labeling post-SytV enrichment would be a potential methodology to confirm.

      (5) It would be useful to understand if the observed effects are specific to the INS-1E cell line or are a more universal effect of glucose on beta-cells.

    1. eLife assessment

      Using the continuum theory of elastic solids, the authors present evidence that periodic muscle contraction leads to elongation of C. elegans embryos by storing elastic energy that is subsequently released by extending the embryo's long axis. This important finding could apply to other developmental processes and be exploited in soft robotics. The presented evidence is convincing on the phenomenological level adopted in the work. How bending energy is converted into elongation on a more microscopic level remains to be worked out.

    1. eLife assessment

      This is an important computational study that applies the machine learning method of bilinear modeling to the problem of relating gene expression to connectivity. Specifically, the author attempts to use transcriptomic data from mouse retinal neurons to predict their known connectivity with promising results. On revision, the approach was tested against a second data set from C. elegans. A limited number of genes studied in this second dataset may have resulted in performance that matched but did not exceed prior models. However, taken together, the results were felt to provide solid evidence for the value of the approach.

    1. eLife assessment

      In this important study, the authors report a novel measurement of the Escherichia coli chemotactic response and demonstrate that these bacteria display an attractant response to potassium, which is connected to intracellular pH level. The experimental evidence provided is convincing and the work will be of interest to microbiologists studying chemotaxis.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      In this important study, the authors report a novel measurement of the Escherichia coli chemotactic response and demonstrate that these bacteria display an attractant response to potassium, which is connected to intracellular pH level. Whilst the experiments are mostly convincing, there are some confounders regards pH changes and fluorescent proteins that remain to be addressed.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper shows that E. coli exhibits a chemotactic response to potassium by measuring both the motor response (using a bead assay) and the intracellular signaling response (CheY phosphorylation level via FRET) to step changes in potassium concentration. They find that an increase in potassium concentration induces a considerable attractant response, with amplitude comparable to aspartate, and cells can quickly adapt (and generally over-adapt). The authors propose that the mechanism for potassium response is through modifying intracellular pH; they find both that potassium modifies pH and that other pH modifiers induce similar attractant responses. It is also shown, using Tar- and Tsr-only mutants, that these two chemoreceptors respond to potassium differently. Tsr has a standard attractant response, while Tar has a biphasic response (repellent-like then attractant-like). Finally, the authors use computer simulations to study the swimming response of cells to a periodic potassium signal secreted from a biofilm and find a phase delay that depends on the period of oscillation.

      Strengths:

      The finding that E. coli can sense and adapt to potassium signals and the connection to intracellular pH is quite interesting, and this work should stimulate future experimental and theoretical studies regarding the microscopic mechanisms governing this response. The evidence (from both the bead assay and FRET) that potassium induces an attractant response is convincing, as is the proposed mechanism involving modification of intracellular pH. The updated manuscript controls for the impact of pH on the fluorescent protein brightness that can bias the measured FRET signal. After correction, the response amplitude and sharpness (Hill coefficient) are comparable to conventional chemoattractants (e.g. aspartate), indicating the general mechanisms underlying the response may be similar. The authors suggest that the biphasic response of Tar mutants may be due to pH influencing the activity of other enzymes (CheA, CheR or CheB), which will be an interesting direction for future study.

      Weaknesses:

      The measured response may be biased by adaptation, especially for weak potassium signals. For other attractant stimuli, the response typically shows a low plateau before it recovers (adapts). In the case of potassium, the FRET signal does not have an obvious plateau following the stimuli of small potassium concentrations, perhaps due to the faster adaptation compared to other chemoattractants. It is possible cells have already partially adapted when the response reaches its minimum, so the measured response may be a slight underestimate of the true response. Mutants without adaptation enzymes appear to be sensitive to potassium only at much larger concentrations, where the pH significantly disrupts the FRET signal; more accurate measurements would require development of new mutants and/or measurement techniques.

      We acknowledge and appreciate the reviewer's concerns regarding the potential impact of adaptation on the measured response magnitude. We have estimated the effect of adaptation on the measured response magnitude. The half-time of adaptation at 30 mM KCl was measured to be approximately 80 s, corresponding to a time constant of t = 80/ln(2) = 115.4 s, which is significantly longer than the time required for medium exchange in the flow chamber (less than 10 s). Consequently, the relative effect of adaptation on the measured response magnitude should be less than 1-exp(-10/t) = 8.3%. Even for the fastest adaptation (at the lowest KCl concentration) we measured, the effect should be less than 20%, which is within experimental uncertainties. Nevertheless, we agree that developing new techniques to measure the dose-response curve more precisely would be beneficial.
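
      The estimate above can be reproduced with a short sketch. It assumes, as the response implicitly does, that adaptation follows a single-exponential recovery; the function names are illustrative only.

```python
import math


def adaptation_time_constant(half_time_s):
    """Time constant of an exponential recovery with the given half-time."""
    return half_time_s / math.log(2)


def max_relative_underestimate(exchange_time_s, half_time_s):
    """Upper bound on the fraction of the response lost to adaptation
    during the medium exchange, assuming exponential adaptation."""
    tau = adaptation_time_constant(half_time_s)
    return 1.0 - math.exp(-exchange_time_s / tau)


if __name__ == "__main__":
    # Values quoted in the response: 80 s half-time at 30 mM KCl, exchange under 10 s.
    print(round(adaptation_time_constant(80.0), 1))
    print(round(100 * max_relative_underestimate(10.0, 80.0), 1))
```

      With the quoted 80 s half-time and a 10 s exchange, this gives a time constant of about 115.4 s and a bound of about 8.3%, matching the figures in the response.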

      Reviewer #2 (Public Review):

      Zhang et al investigated the biophysical mechanism of potassium-mediated chemotactic behavior in E coli. Previously, it was reported by Humphries et al that the potassium waves from an oscillating B subtilis biofilm attract P aeruginosa through chemotactic behavior of motile P aeruginosa cells. It was proposed that K+ waves alter the PMF of P aeruginosa. However, the mechanism of this behaviour remained elusive. In this study, Zhang et al demonstrated that motile E coli cells accumulate in regions of high potassium levels. They found that this behavior likely results from the chemotaxis signalling pathway, mediated by an elevation of intracellular pH. Overall, a solid body of evidence is provided to support the claims. However, the impacts of pH on the fluorescent proteins need to be better evaluated. In its current form, the evidence is insufficient to say that the fluorescence intensity ratio results from FRET. It may well be an artefact of pH change.

      The authors now carefully evaluated the impact of pH on their FRET sensor by examining the YFP and CFP fluorescence with no-receptor mutant. The authors used this data to correct the impact of pH on their FRET sensor. This is an improvement, but the mathematical operation of this correction needs clarification. This is particularly important because, looking at the data, it is not fully convincing if the correction was done properly. For instance, 3mM KCl gives 0.98 FRET signal both in Fig3 and FigS4, but there is almost no difference between blue and red lines in Fig 3. FigS4 is very informative, but it does not address the concern raised by both reviewers that FRET reporter may not be a reliable tool here due to pH change.

      We apologize for not making the correction process clear. We corrected the impact of pH on the original signals for both CFP and YFP channels by

      S_corr = S_orig / CpH,

      where S_corr and S_orig represent the pH-corrected and original PMT signal (CFP or YFP channel) from the moment of addition of L mM KCl to the moment of its removal, respectively, and CpH is the correction factor, which is the ratio of PMT signal post- to pre-KCl addition for the no-receptor mutant at L mM KCl, for the CFP or YFP channel as shown in Fig. S5. The pH-corrected FRET response is then calculated as the ratio of the pH-corrected YFP to the pH-corrected CFP signals, normalized by the pre-stimulus ratio.

      As shown in Author response image 1, which represents the same data as Fig. 3A and Fig. S5A, the original normalized FRET responses to 3 mM KCl are 0.967 for the wild-type strain (Fig. 3) and 0.981 for the no-receptor strain (Fig. S5). The standard deviation of the FRET values under steady-state conditions is 0.003. Thus, the difference in responses between the wild-type and no-receptor strains is significant and clearly exceeds the standard deviation. The pH correction factors CpH at 3 mM KCl are 1.004 for the YFP signal and 1.016 for the CFP signal. Consequently, the pH-corrected FRET responses are 0.967 × 1.016/1.004 = 0.979 for the wild-type and 0.981 × 1.016/1.004 = 0.993 for the no-receptor strain. The reason the pH-corrected FRET response for the no-receptor strain is 0.993 instead of the expected 1.000 is that this value represents the lowest observed response rather than the average value for the FRET response.
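
      The arithmetic in the preceding paragraph can be checked with a small sketch. It reads the correction as dividing each channel's signal by its own correction factor, which is an interpretation inferred from the quoted numbers rather than a statement of the authors' exact procedure; the function names are hypothetical.

```python
def correct_channel(signal, c_ph):
    """Divide one channel's PMT signal by its pH correction factor (assumed form)."""
    return signal / c_ph


def ph_corrected_fret(fret_orig, c_yfp, c_cfp):
    """Corrected YFP/CFP ratio, given the original normalized ratio.

    (YFP / c_yfp) / (CFP / c_cfp) = (YFP / CFP) * c_cfp / c_yfp
    """
    return fret_orig * c_cfp / c_yfp


if __name__ == "__main__":
    # Correction factors quoted for 3 mM KCl: 1.004 (YFP) and 1.016 (CFP).
    print(round(ph_corrected_fret(0.967, 1.004, 1.016), 3))  # wild-type strain
    print(round(ph_corrected_fret(0.981, 1.004, 1.016), 3))  # no-receptor strain
```

      Under this reading, the two calls reproduce the quoted corrected responses of 0.979 and 0.993.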

      The detailed mathematical operation for correcting the pH impact has now been included in the “FRET assay” section of Materials and Methods.

      Author response image 1.

      Chemotactic response of the wild-type strain (A, HCB1288-pVS88) and the no-receptor strain (B, HCB1414-pVS88) to stepwise addition and removal of KCl. The blue solid line denotes the original normalized signal. Downward and upward arrows indicate the time points of addition and removal of 3 mM KCl, respectively. The horizontal red dashed line denotes the original normalized FRET response value to 3 mM KCl.

      The authors show the FRET data with both KCl and K2SO4, concluding that the chemotactic response mainly resulted from potassium ions. However, this was only measured by FRET. It would be more convincing if the motility assay in Fig1 is also performed with K2SO4. The authors did not address this point. In light of complications associated with the use of the FRET sensor, this experiment is more important.

      We thank the reviewer for the suggestion. We agree that additional confirmation with a motility assay is important. To address this, we have now measured the response of the motor rotational signal to 15 mM K2SO4 using the bead assay and compared it with the response to 30 mM KCl. The results are shown in Fig. S2. The response of motor CW bias to 15 mM K2SO4 exhibited an attractant response, characterized by a decreased CW bias upon the addition of K2SO4, followed by an over-adaptation that is qualitatively similar to the response to 30 mM KCl. However, there were notable differences in the adaptation time and the presence of an overshoot. Specifically, the adaptation time to K2SO4 was shorter compared to that for KCl, and there was a notable overshoot in the CW bias during the adaptation phase. These differences may have resulted from the weaker response to K2SO4 (Fig. S1B) and additional modifications due to CysZ-mediated cellular uptake of sulfate (Zhang et al., Biochimica et Biophysica Acta 1838,1809–1816 (2014)). The faster adaptation and overshoot complicated the chemotactic drift in the microfluidic assay as in Fig. 1, such that we were unable to observe a noticeable drift in a K2SO4 gradient under the same experimental conditions used for the KCl gradient.

      The response of motor rotational signal to 15 mM K2SO4 has been added to Fig. S2.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The response curve and adaptation level/time in the main text (Fig. 4) should be replaced by the corrected counterparts (currently in Fig. S5). The current version is especially confusing because Fig. 6 shows the corrected response, but the difference from Fig. 4 is not mentioned.

      We thank the reviewer for the suggestion. We have now merged the results of the original Fig. S5 into Fig. 4.

      a. The discussion of the uncorrected response with a small Hill coefficient and potentially negative cooperativity was left in the text (lines 223-234), but the new measurements show this is not true for the actual response. This should be removed or significantly rephrased.

      We thank the reviewer for the suggestion. We have now removed the statement about potentially negative cooperativity and added the corrected results for the actual response.

      (2) It may be helpful to restate the definition of f_m in the methods (near Eq. 3-4).

      Thank you for the suggestion. We have now restated the definition of fm and fL below Eq. 3-4: “In the denominator on the right-hand side of Eq. 3, the two terms within the parentheses of exponential expression represent the methylation-dependent (fm) and ligand-dependent (fL) free energy, respectively.”

    3. Reviewer #1 (Public Review):

      Summary:

      This paper shows that E. coli exhibits a chemotactic response to potassium by measuring both the motor response (using a bead assay) and the intracellular signaling response (CheY phosphorylation level via FRET) to step changes in potassium concentration. They find that an increase in potassium concentration induces a considerable attractant response, with amplitude comparable to aspartate, and cells can quickly adapt (and generally over-adapt). The authors propose that the mechanism for potassium response is through modifying intracellular pH; they find both that potassium modifies pH and that other pH modifiers induce similar attractant responses. It is also shown, using Tar- and Tsr-only mutants, that these two chemoreceptors respond to potassium differently. Tsr has a standard attractant response, while Tar has a biphasic response (repellent-like then attractant-like). Finally, the authors use computer simulations to study the swimming response of cells to a periodic potassium signal secreted from a biofilm and find a phase delay that depends on the period of oscillation.

      Strengths:

      The finding that E. coli can sense and adapt to potassium signals and the connection to intracellular pH is quite interesting, and this work should stimulate future experimental and theoretical studies regarding the microscopic mechanisms governing this response. The evidence (from both the bead assay and FRET) that potassium induces an attractant response is convincing, as is the proposed mechanism involving modification of intracellular pH. The updated manuscript controls for the impact of pH on the fluorescent protein brightness that can bias the measured FRET signal. After correction, the response amplitude and sharpness (Hill coefficient) are comparable to conventional chemoattractants (e.g. aspartate), indicating the general mechanisms underlying the response may be similar. The authors suggest that the biphasic response of Tar mutants may be due to pH influencing the activity of other enzymes (CheA, CheR or CheB), which will be an interesting direction for future study.

      Weaknesses:

      The measured response may be biased by adaptation, especially for weak potassium signals. For other attractant stimuli, the response typically shows a low plateau before it recovers (adapts). In the case of potassium, the FRET signal does not have an obvious plateau following the stimuli of small potassium concentrations, perhaps due to the faster adaptation compared to other chemoattractants. It is possible cells have already partially adapted when the response reaches its minimum, so the measured response may be a slight underestimate of the true response. Mutants without adaptation enzymes appear to be sensitive to potassium only at much larger concentrations, where the pH significantly disrupts the FRET signal; more accurate measurements would require the development of new mutants and/or measurement techniques.

      Note added after the second revision: The authors made a reasonable argument regarding the effects of adaptation, which were estimated to be small.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript "Self-inhibiting percolation and viral spreading in epithelial tissue" describes a model based on 5-state cellular automata of development of an infection. The model is motivated and qualitatively justified by time-resolved measurements of expression levels of viral, interferon-producing, and antiviral genes. The model is set up in such a way that the crucial difference in outcomes (infection spreading vs. confinement) depends on the initial fraction of special virus-sensing cells. Those cells (denoted as 'type a') cannot be infected and do not support the propagation of infection, but rather inhibit it in a somewhat autocatalytic way. Presumably, such feedback makes the transition between two outcomes very sharp: a minor variation in concentration of 'a' cells results in a qualitative change from one outcome to another. As in any percolation-like system, the transition between propagation and inhibition of infection goes through a critical state with all its attributes, including a power-law distribution of the cluster size (corresponding to the fraction of infected cells) with a fairly universal exponent and a cutoff at the upper limit of this distribution.

      Strengths:

      The proposed model suggests an explanation for the apparent diversity of outcomes of viral infections such as COVID.

      Author response: We thank the referee for the concise and accurate summary of our work.

      Weaknesses:

      Those are not real points of weakness, though I think addressing them would substantially improve the manuscript.

      Author response: Below we will address these point by point.

      The key point in the manuscript is the reduction of actual biochemical processes to the NOVAa rules. I think more could be said about it, be it referring to a set of well-known connections between expression states of cells and their reaction to infection or justifying it as an educated guess.

      Author response: We have now improved this part in the model section. We have added a few sentences explaining how the cell state transitions are motivated by the UMAP results:

      “The cell state transitions triggered by IFN signaling or viral replication are known in viral infection, but how exactly the transitions are orchestrated for specific infections is poorly understood. The UMAP cell state distribution hints at possible preferred transitions between states. The closer two cell states are on the UMAP, the more likely transitions between them are, all else being equal. For instance, the antiviral state (𝐴) is easily established from a susceptible cell (𝑂), but not from the fully virus-hijacked cell (𝑉 ). The IFN-secreting cell state (𝑁) requires the co-presence of the viral and antiviral genes and thus the cell cluster is located between the antiviral state (𝐴) and virus-infected state (𝑉 ) but distant from the susceptible cells (𝑂).

      Inspired by the UMAP data visualization (Fig. 1a), we propose the following transitions between five main discrete cell states”

      Another aspect where the manuscript could be improved would be to look a little beyond the strange and 'not-so-relevant for a biomedical audience' focus on the percolation critical state. While the presented calculation of the precise percolation threshold and the critical exponent confirm the numerical skills of the authors, the probability that an actual infected tissue is right at the threshold is negligible. So in addition to the critical properties, it would be interesting to learn about the system not exactly at the threshold: For example, how the speed of propagation of infection depends on subcritical p_a and what is the cluster size distribution for supercritical p_a.

      Author response: We agree that further exploring the model away from the critical threshold is worthwhile. While our main focus has been on explaining the large degree of heterogeneity in outcomes – readily explained as a consequence of the sharp threshold-like behavior – we now include plots of the time evolution of the infection (as well as the remaining states) for subcritical values of p_a. The plots can be found in Figure S4 of the supplement.

      Reviewer #2 (Public Review):

      Xu et al. introduce a cellular automaton model to investigate the spatiotemporal spreading of viral infection. In this study, the authors first analyze the single-cell RNA sequencing data from experiments and identify four clusters of cells at 48 hours post-viral infection, including susceptible cells (O), infected cells (V), IFN-secreting cells (N), and antiviral cells (A). Next, a cellular automaton model (NOVAa model) is introduced by assuming the existence of a transient pre-antiviral state (a). The model consists of an LxL lattice; each site represents one cell. The cells change their state following the rules depending on the interaction of neighboring cells. The model introduces a key parameter, p_a, representing the fraction of pre-antiviral state cells. Cell apoptosis is omitted in the model. Model simulations show a threshold-like behavior of the final attack rate of the virus when p_a changes continuously. There is a critical value p_c, so that when p_a < p_c, infections typically spread to the entire system, while at a higher p_a > p_c, the propagation of the infected state is inhibited. Moreover, the radius R that quantifies the diffusion range of N cells may affect the critical value p_c; a larger R yields a smaller value of the critical value p_c. The structure of clusters is different for different values of R; greater R leads to a different microscopic structure with fewer A and N cells in the final state. Compared with the single-cell RNA seq data, which implies a low fraction of IFN-positive cells - around 1.7% - the model simulation suggests R=5. The authors also explored a simplified version of the model, the OVA model, with only three states. The OVA model also has an outbreak size. The OVA model shows dynamics similar to the NOVAa model. However, the change in microstructure as a function of the IFN range R observed in the NOVAa model is not observed in the OVA model.

      Author response: We thank the referee for the comprehensive summary of our work.

      Data and model simulation mainly support the conclusions of this paper, but some weaknesses should be considered or clarified.

      Author response: Thank you - we will address these point by point below.

      (1) In the automaton model, the authors introduce a parameter p_a, representing the fraction of pre-antiviral state cells. The authors wrote: "The parameter p_a can also be understood as the probability that an O cell will switch to the N or A state when exposed to the virus or IFNs, respectively." Nevertheless, biologically, the fraction of pre-antiviral state cells does not mean the same value as the probability that an O cell switches to the N or A state. Moreover, in the numerical scheme, the cell state changes according to the deterministic rules N(O)=a and N(a)=A. Hence, the probability p_a is not applied in the model simulation. The exact meaning of the parameter p_a may need to be clarified.

      Author response: We acknowledge that this was an imprecise formulation, and have now changed it.

      What we tried to convey with that comment was that, alternatively to having a certain fraction of cells be in the a state initially, one could instead have devised a model in which each O-state cell simply had a probability to act as an a-state cell upon exposure to the virus or to interferons, i.e. to switch to an N state (if exposed to virus) or to the A state (if exposed to interferons). In this simplified model, there would be no functional difference, since it would simply amount to whether each cell had a probability to be designated an a-cell initially (as in our model), or upon exposure. So our remark mainly served to explain that the role of the p_a parameter is simply to encode that a certain fraction of virus-naive cells behave this way (whether predetermined or not).

      (2) The current model is deterministic. However, biologically, considering the probabilistic model may be more realistic. Are the results valid when the probability update strategy is considered? By the probability model, the cells change their state randomly to the state of the neighbor cells. The probability of cell state changes may be relevant for the threshold of p_a. It is interesting to know how the random response of cells may affect the main results and the critical value of p_a.

      Author response: This is a good point - we are firm believers in the importance of stochasticity. We should note that even the current model has a level of stochasticity, since we choose the cells to be updated with a constant probability rate - we choose N cells to update in each timestep, with replacement.

      However, based on your suggestion, we simulated a version of the dynamics which included stochastic conversion, i.e. each action of a cell on a nearby cell happens only with a probability p_conv (and the original model is recovered as the p_conv=1 scenario). Of course, this slows down the dynamics (or effectively rescales time by a factor p_conv), but crucially we find that it does not appreciably affect the location of the threshold p_c. Below we include a parameter scan across p_a values for R=1 and p_conv=0.5, which shows that the threshold continues to appear at around p_a=27%.
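      For readers who want to experiment with this idea, the following is a minimal Python sketch of a NOVAa-like automaton with stochastic conversion. It is not the authors' code. The transition rules V(O)->V, V(a)->N, N(O)->a and N(a)->A are inferred from the review text above, and the nearest-neighbour viral spread, the Euclidean IFN neighbourhood of radius R, and all function and variable names are assumptions made for illustration.

```python
import random

def simulate(L, p_a, R, p_conv, sweeps, seed=0):
    """Run a NOVAa-like automaton (illustrative rules) and return the final lattice."""
    rng = random.Random(seed)
    # Each cell starts as pre-antiviral 'a' with probability p_a, else susceptible 'O'.
    grid = [['a' if rng.random() < p_a else 'O' for _ in range(L)] for _ in range(L)]
    grid[L // 2][L // 2] = 'V'  # seed the infection at a single lattice point

    # Assumption: the virus spreads to the four nearest neighbours only.
    nearest = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    # Assumption: IFN reaches every site within Euclidean distance R.
    ifn_range = [(dx, dy) for dx in range(-R, R + 1) for dy in range(-R, R + 1)
                 if 0 < dx * dx + dy * dy <= R * R]
    virus_action = {'O': 'V', 'a': 'N'}  # V(O) -> V, V(a) -> N
    ifn_action = {'O': 'a', 'a': 'A'}    # N(O) -> a, N(a) -> A

    for _ in range(sweeps * L * L):
        # Random sequential update: cells are drawn with replacement.
        x, y = rng.randrange(L), rng.randrange(L)
        state = grid[x][y]
        if state == 'V':
            offsets, action = nearest, virus_action
        elif state == 'N':
            offsets, action = ifn_range, ifn_action
        else:
            continue  # O, a and A cells do not act on their neighbours
        for dx, dy in offsets:
            i, j = (x + dx) % L, (y + dy) % L  # periodic boundary conditions
            target = grid[i][j]
            # Stochastic conversion: each action succeeds with probability p_conv.
            if target in action and rng.random() < p_conv:
                grid[i][j] = action[target]
    return grid

def count(grid, state):
    """Number of cells in the given state."""
    return sum(row.count(state) for row in grid)
```

      Setting p_conv=1 recovers deterministic conversion in this sketch. Scanning p_a at fixed R and p_conv and recording the final fraction of V and N cells is one way to look for the threshold discussed above.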

      We now discuss these findings in the supplement and include the figure below as Fig. S5.

      Author response image 1.

      (3) Figure 2 shows a critical value p_c = 27.8% following a simulation on a lattice with dimension L = 1000. However, it is unclear if dimension changes may affect the critical value.

      Author response: Re-running the simulations on a lattice 4x as large (i.e. L=2000) yields a similar critical value of 27-28% for R=1, so we are confident that finite size effects do not play a major role at L=1000 and beyond. For R=5, however, we find that a minimum lattice size greater than L=1000 is necessary to determine the critical threshold. Concretely, we find that the threshold value p_c for R=5 changes somewhat when the lattice size is increased from 1000 to 2000, but is invariant under a change from 2000 to 3000, so we conclude that L=2000 is sufficient for R=5. The p_c value for R=5 cited in the manuscript (~0.4%) was determined from simulations at L=2000.

      Reviewer #3 (Public Review):

      Summary:

      This study considers how to model distinct host cell states that correspond to different stages of a viral infection: from naïve and susceptible cells to infected cells and a minority of important interferon-secreting cells that are the first line of defense against viral spread. The study first considers the distinct host cell states by analyzing previously published single-cell RNAseq data. Then an agent-based model on a square lattice is used to probe the dependence of the system on various parameters. Finally, a simplified version of the model is explored, and shown to have some similarity with the more complex model, yet lacks the dependence on the interferon range. By exploring these models one gains an intuitive understanding of the system, and the model may be used to generate hypotheses that could be tested experimentally, telling us "when to be surprised" if the biological system deviates from the model predictions.

      Author response: Thank you for the summary! We agree with the role that you describe for a model such as this one.

      Strengths:

      -  Clear presentation of the experimental findings and a clear logical progression from these experimental findings to the modeling.

      -  The modeling results are easy to understand, revealing interesting behavior and percolation-like features.

      -  The scaling results presented span several decades and are therefore compelling.

      -  The results presented suggest several interesting directions for theoretical follow-up work, as well as possible experiments to probe the system (e.g. by stimulating or blocking IFN secretion).

      Weaknesses:

      -  Since the "range" of IFN is an important parameter, it makes sense to consider lattice geometries other than the square lattice, which is somewhat pathological. Perhaps a hexagonal lattice would generalize better.

      -  Tissues are typically three-dimensional, not two-dimensional. (Epithelium is an exception). It would be interesting to see how the modeling translates to the three-dimensional case. Percolation transitions are known to be very sensitive to the dimensionality of the system.

      Author response: We agree that probing different lattice geometries (2- and 3-dimensional alike) would be interesting and worthwhile. However, for this manuscript, we prefer to confine the analysis to the current, simple case. We do agree, however, that an extensive exploration of the role of geometry is an interesting future possibility.

      -  The fixed time-step of the agent-based modeling may introduce biases. I would consider simulating the system with Gillespie dynamics where the reaction rates depend on the ambient system parameters.

      -  Single-cell RNAseq data typically involves data imputation due to the high sparsity of the measured gene expression. More information could be provided on this crucial data processing step since it may significantly alter the experimental findings.

      Justification of claims and conclusions:

      The claims and conclusions are well justified.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      It is necessary to explain what UMAP does. Is clustering done in the space of twenty-something original dimensions or 2D? How UMAP1 and UMAP2 are selected and are those the same in all plots?

      Author response: We have now added a few sentences to clarify the point raised above - the second snippet explains how clustering is performed:

      “As a dimension reduction algorithm, UMAP is a manifold learning technique that favors the preservation of local distances over global distances (McInnes et al., 2018; Becht et al., 2019). It constructs a weighted graph from the data points and optimizes the graph layout in the low-dimensional space.”

      “We cluster the cells with the principal components analysis (PCA) results from their gene expression. With the first 16 principal components, we calculate k-nearest neighbors and construct the shared nearest neighbor graph of the cells then optimize the modularity function to determine clusters. We present the cluster information on the UMAP plane and use the same UMAP coordinates for all the plots in this paper hereafter.”

      Figure 1, what do bars in the upper right corners of panels d,e,f, and g indicate? "Averaged" refers to time average? Something is missing in "Cell proportions are labeled with corresponding colors in a)".

      Author response: Thank you - we have now modified the figure caption. The bars in the upper right corners of panels d, e, f are color keys for gene expression: the brighter the color, the higher the gene expression.

      “Averaged” gene expression refers to the mean expression of that particular gene across the cells within each indicated cluster.

      The lines in c) correspond to cell proportions in different states at different time points. The same state in a) and c) is shown in the same color.

      Line 46, "However" does not sound right in this context. Would "Also" be better?

      Author response: We agree and have corrected it in the revised manuscript.

      Line 96: "The viral genes are also partially expressed in these cells, but different from the 𝑁 cluster, the antiviral genes are fully expressed (Fig. S1 and S2)." The sentence needs to be rephrased.

      Author response: We have rephrased the sentence: “As in the N cluster, the viral gene E is barely detected in these cells, indicating incomplete viral replication. However, in contrast to the N cluster, the antiviral genes are expressed to their full extent (Fig. S1 and S2).”

      Line 126, missing "be", "large" -> "larger".

      Author response: Thank you, we have now corrected these typos.

      Line 139-140 The logical link between ignoring apoptosis and the diffusion of IFN is unclear.

      Author response: We modified the sentence as “Here, we assume that the secretion of IFNs by the 𝑁 cells is a faster process than possible apoptosis (Wen et al., 1997; Tesfaigzi, 2006) of these cells and that the diffusion of IFNs to the neighborhood is not significantly affected by apoptosis.”

      Fig. 2a Do the yellow arrows show the effect of IFN and the purple arrows the propagation of viral infection?

      Author response: That is correct. We have added this information to the figure caption: “The straight black arrows indicate transitions between cell states. The curved yellow arrows indicate the effects of IFNs on activating antiviral states. The curved purple arrows indicate viral spread to cells with 𝑂 and 𝑎 states.”

      Fig. 3, n(s) as the axis label vs P(s) in the text? How do the curves in panel a) look when the p_a is well above or below p_c?

      Author response: Thank you. We have edited the labels in the figure to reflect the symbols used in the text.

      Boundary conditions? From Fig. 4, apparently periodic?

      Author response: Yes, we use periodic boundary conditions in the model. We clarify it in the model section now (last sentence).

      It will be good to see a plot with time dependences of all cell types for a couple of values of p_a, illustrating propagation and cessation of the infection.

      Author response: We agree, and have added a Figure S4 in the supplement which explores exactly that. Thank you for the suggestion.

      A verbal qualitative description of why p_a has such importance and how the infection is terminated for large p_a would help.

      Reviewer #2 (Recommendations For The Authors):

      Below are two minor comments:

      (1) In the single-cell RNA sequencing data analysis, the authors describe the cell clusters O, V, A, and N. However, showing how the clusters are identified from the data might be more straightforward.

      Author response: Technically, we cluster the cells using principal components analysis (PCA) results of their gene expression. With the first 16 principal components, we calculate k-nearest neighbors and construct the shared nearest neighbor graph of the cells and then optimize the modularity function to determine clusters. We manually annotate the clusters with O, V, A, and N based on the detected abundance of viral genes, antiviral genes, and IFNs.
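      As an illustration of the graph-construction step described here, the following Python sketch builds k-nearest-neighbour sets and a shared nearest neighbour (SNN) graph from a list of points. It is a sketch under stated assumptions, not the authors' pipeline: the input points stand in for per-cell principal component scores, the Jaccard overlap used as edge weight is an assumed (though common) choice, the function names are hypothetical, and the modularity optimization that would turn the graph into clusters is omitted.

```python
import math

def knn(points, k):
    """For each point, return the set of indices of its k nearest neighbours (self excluded)."""
    neighbours = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(order[:k]))
    return neighbours

def snn_graph(points, k):
    """Weight each pair of points by the Jaccard overlap of their neighbour sets.

    Each neighbour set includes the point itself; pairs with no shared
    neighbour get no edge.
    """
    nbrs = [s | {i} for i, s in enumerate(knn(points, k))]
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(nbrs[i] & nbrs[j])
            if shared:
                edges[(i, j)] = shared / len(nbrs[i] | nbrs[j])
    return edges
```

      Cells that sit in the same dense region share most of their neighbours and receive heavy edges, while cells from well-separated groups share none, which is what makes the resulting graph suitable for community detection.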

      (2) In Figure 3, what does n(s) mean in Figure 3a? And what is the meaning of the distribution P(s) of infection clusters? It may be stated clearly.

      Author response: The use of n(s) was inconsistent, and we have now edited the figure to instead say P(s), to harmonize it with the text. P(s) is the distribution of cluster sizes, s, expressed as a fraction of the whole system. In other words, once a cluster has reached its final size, we record s=(N+V)/L^2 where N and V are the number of N and V state cells in the cluster (note that, by design, each simulation leads to a single cluster, since we seed the infection in one lattice point). We now indicate more clearly in the caption and the main text what exactly P(s) and s refer to.
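      The definition s=(N+V)/L^2 can be written out as a short Python sketch. The lattice representation (a list of rows holding one-letter state labels) and the function name are assumptions made for illustration only.

```python
def cluster_size(grid):
    """Fraction of an L x L lattice in the N or V state, i.e. s = (N + V) / L^2."""
    L = len(grid)
    infected = sum(row.count('N') + row.count('V') for row in grid)
    return infected / (L * L)

# A 3 x 3 toy lattice with two V cells and one N cell.
example = [
    ['V', 'N', 'O'],
    ['A', 'a', 'O'],
    ['O', 'O', 'V'],
]
s = cluster_size(example)
```

      Collecting s over many independent runs, each seeded at a single lattice point, and histogramming the values would give an empirical estimate of P(s).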

      Reviewer #3 (Recommendations For The Authors):

      - Would the authors kindly share the simulation code with the community? Also, the data analysis code should be shared to follow current best practices. This needs to be standard practice in all publications. I would go as far as to say that in 2024 publishing a data analysis / simulation study without sharing the relevant code should be ostracized by the community.

      Author response: We absolutely agree and have created a GitHub repository in which we share the C++ source code for the simulations and a Python notebook for plotting. The public repository can be found at https://github.com/BjarkeFN/ViralPercolation. We add this information in supplement under section “Code availability”.

      - I would avoid the use of the wording "critical" threshold since this is almost guaranteed to infuriate a certain type of reader.

      - Line 265 has a curious use of " ... " which should be replaced with something more appropriate.

      Author response: Thank you for pointing it out! We have checked the typos.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript describes a model based on 5-state cellular automata of development of an infection. The model is motivated and qualitatively justified by time-resolved measurements of expression levels of viral, interferon-producing, and antiviral genes. The model is set up in such a way that the crucial difference in outcomes (infection spreading vs. confinement) depends on the initial fraction of special virus-sensing cells. Those cells (denoted as 'type a') cannot be infected and do not support the propagation of infection, but rather inhibit it in a somewhat autocatalytic way. Presumably, such feedback makes the transition between two outcomes very sharp: a minor variation in concentration of 'a' cells results in qualitative change from one outcome to another. As in any percolation-like system, the transition between propagation and inhibition of infection goes through a critical state with all its attributes, including a power-law distribution of the cluster size (corresponding to the fraction of infected cells) with a fairly universal exponent and a cutoff at the upper limit of this distribution.

      Strengths:

      The proposed model suggests a well-justified explanation for the frequently observed yet puzzling diversity of outcomes of viral infections such as COVID.

      Weaknesses:

      None.

    3. eLife assessment

      This study presents a cellular automaton model to study the dynamics of virus-induced signalling and innate host defense against viruses such as SARS-CoV-2 in epithelial tissue. The simulations and data analysis are convincing and represent a valuable contribution that would be of interest to researchers studying the dynamics of viral propagation.

    4. Reviewer #2 (Public Review):

      Xu et al. introduce a cellular automaton model to investigate the spatiotemporal spreading of viral infection. In this study, the authors first analyze the single-cell RNA sequencing data from experiments and identify four clusters of cells at 48 hours post-viral infection, including susceptible cells (O), infected cells (V), IFN-secreting cells (N), and antiviral cells (A). Next, a cellular automaton model (NOVAa model) is introduced by assuming the existence of a transient pre-antiviral state (a). The model consists of an LxL lattice; each site represents one cell. The cells change their state following the rules depending on the interaction of neighboring cells. The model introduces a key parameter, p_a, representing the fraction of pre-antiviral state cells. Cell apoptosis is omitted in the model. Model simulations show a threshold-like behavior of the final attack rate of the virus when p_a changes continuously. There is a critical value p_c, so that when p_a < p_c, infections typically spread to the entire system, while at a higher p_a > p_c, the propagation of the infected state is inhibited. Moreover, the radius R that quantifies the diffusion range of N cells may affect the critical value p_c; a larger R yields a smaller value of the critical value p_c. The authors further examine the result with stochastic version dynamics, and the main findings are unchanged upon stochastic dynamics. The structure of clusters is different for different values of R; greater R leads to a different microscopic structure with fewer A and N cells in the final state. Compared with the single-cell RNA seq data, which implies a low fraction of IFN-positive cells of around 1.7%, the model simulation suggests R=5. The authors also explored a simplified version of the model, the OVA model, with only three states. The OVA model also has an outbreak size. The OVA model shows dynamics similar to the NOVAa model. However, the change in microstructure as a function of the IFN range R observed in the NOVAa model is not observed in the OVA model.

    5. Reviewer #3 (Public Review):

      Summary:

      This study considers how to model distinct host cell states that correspond to different stages of a viral infection: from naïve and susceptible cells to infected cells and a minority of important interferon-secreting cells that are the first line of defense against viral spread. The study first considers the distinct host cell states by analyzing previously published single-cell RNAseq data. Then an agent-based model on a square lattice is used to probe the dependence of the system on various parameters. Finally, a simplified version of the model is explored, and shown to have some similarity with the more complex model, yet lacks the dependence on the interferon range. By exploring these models one gains an intuitive understanding of the system, and the model may be used to generate hypotheses that could be tested experimentally, telling us "when to be surprised" if the biological system deviates from the model predictions.

      Strengths:

      - Clear presentation of the experimental findings and a clear logical progression from these experimental findings to the modeling.

      - The modeling results are easy to understand, revealing interesting behavior and percolation-like features.

      - The scaling results presented span several decades and are therefore compelling.

      - The results presented suggest several interesting directions for theoretical follow-up work, as well as possible experiments to probe the system (e.g. by stimulating or blocking IFN secretion).

      Weaknesses:

      - The fixed time-step of the agent-based modeling may introduce biases. I would consider simulating the system with Gillespie dynamics where the reaction rates depend on the ambient system parameters.

      - Single-cell RNAseq data requires careful handling or it may generate false leads. The strength of the RNAseq evidence presented is not clear.

      Two places where the manuscript could be extended:

      - Since the "range" of IFN is an important parameter, it makes sense to consider lattice geometries other than the square lattice, which is somewhat pathological. Perhaps a hexagonal lattice would generalize better.<br /> - Tissues are typically three-dimensional, not two-dimensional (epithelium is an exception). It would be interesting to see how the modeling translates to the three-dimensional case. Percolation transitions are known to be very sensitive to the dimensionality of the system.

      Justification of claims and conclusions:

      The claims and conclusions are well justified.

    1. eLife assessment

      This valuable study reports that actin-related proteins may be involved in transcriptional regulation during spermatogenesis. The supporting data remain incomplete, and more extensive disentanglement from the canonical role of these actin-related proteins and the experimental validation of in silico predictions are required. This work will be of interest to reproductive biologists and other researchers working on non-canonical roles of actin and actin-related proteins.

    2. Reviewer #1 (Public Review):

      Summary:

      This study offers a new perspective. ACTL7A and ACTL7B play roles in epigenetic regulation in spermiogenesis. Actin-like 7 A (ACTL7A) is essential for acrosome formation, fertilization, and early embryo development. ACTL7A variants cause acrosome detachment responsible for male infertility and early embryonic arrest. It has been reported that ACTL7A is localized on the acrosome in mouse sperm (Boëda et al., 2011). Previous studies have identified ACTL7A mutations (c.1118G>A:p.R373H; c.1204G>A:p.G402S; c.1117C>T:p.R373C). All these variants were located in the actin domain and were predicted to be pathogenic, affecting the number of hydrogen bonds or the arrangement of nearby protein structures (Wang et al., 2023; Xin et al., 2020; Zhao et al., 2023; Zhou et al., 2023). This work used AI to model the role of ACTL7A/B in the nucleosome remodeling complex and proposed a testis-specific conformation of the SRCAP complex. This is different from previous studies.

      Strengths:

      This study provides a new perspective to reveal the additional roles of these proteins.

      Weaknesses:

      The results section contains a substantial background description. However, the results and discussion sections require streamlining. There is a lack of mutual support for data between the sections, and direct data to support the authors' conclusions are missing.

    3. Reviewer #2 (Public Review):

      Summary:

      How the dynamics of gene expression accompany cell fate and cellular morphological changes is important for our understanding of the molecular mechanisms that govern development and diseases. The phenomenon is particularly prominent during spermatogenesis, the process by which spermatogonial stem cells develop into sperm through a series of steps of cell division, differentiation, meiosis, and cellular morphogenesis. The intricacy of various aspects of cellular processes and gene expression during spermatogenesis remains to be fully understood. In this study, the authors found that testis-specific actin-related proteins (which usually participate in modifying cells' cytoskeletal systems) ACTL7A and ACTL7B were expressed and localized in the nuclei of mouse spermatocytes and spermatids. Based on this observation, the authors analyzed protein sequence conservation of ACTL7B across dozens of species and identified a putative nuclear localization sequence (NLS), a type of sequence that is often responsible for the nuclear import of the proteins that carry it. Using molecular biology experiments in a heterologous cell system, the authors verified the potential role of this internal NLS and found it indeed could facilitate the nuclear localization of marker proteins when expressed in cells. Using gene-deleted mouse models they generated previously, the authors showed that deletion of Actl7b caused changes in gene expression and mis-localization of nucleosomal histone H3 and chromatin regulator histone deacetylase HDAC1 and 2, supporting their proposed roles of ACTL7B in regulating gene expression. The authors further used AlphaFold 2 to model the potential protein complexes that could be formed between the ARPs (ACTL7A and ACTL7B) and known chromatin modifiers, such as INO80 and SWI/SNF complexes, and found that, consistent with previous findings, it is likely that ACTL7A and ACTL7B interact with the chromatin-modifying complexes through binding to their alpha-helical HSA domain cooperatively.
These results suggest that ACTL7B possesses novel functions in regulating chromatin structure and thus gene expression beyond conventional roles of cytoskeleton regulation, providing alternative pathways for understanding how gene expression is regulated during spermatogenesis and the etiology of relevant infertility diseases.

      Strengths:

      The authors provided sufficient background to the study and discussions of the results. Based on their previous research, this study utilized numerous methods, including the protein complex structural modeling method AlphaFold 2 Multimer, to further investigate the functional roles of ACTL7B. The results presented here are in general of good quality. The identification of a potential internal NLS in ACTL7B is mostly convincing, in line with the phenotypes presented in the gene deletion model.

      Weaknesses:

      While the study offered an interesting new look at the functions of ARP proteins during spermatogenesis, parts of the study are mainly theoretical speculation, including the protein complex formation. Some of the results may need further experimental verification, for example, the differentially expressed genes that were found in potentially spermatogenic cells at different developmental stages, in order to support the conclusions and avoid undermining the significance of the study.

    4. Reviewer #3 (Public Review):

      In this manuscript, Pierre Ferrer and colleagues explore the exciting possibility that, in the male germ line, the composition and function of deeply conserved chromatin remodeling complexes is fine-tuned by the addition of testis-specific actin-related proteins (ARPs). In this regard, the Authors aim to extend previously reported non-canonical (transcriptional) roles of ARPs in somatic cells to the unique developmental context of the germ line. The manuscript is focused on the potential regulatory role in post-meiotic transcription of two ARPs: ACTL7A and ACTL7B (particularly the latter). The canonical function of both testis-specific ARPs in spermatogenesis is well established, as they have been previously shown to be required for the extensive cellular morphogenesis program driving post-meiotic development (spermiogenesis). Disentangling the actual functions of ACTL7A and ACTL7B as transcriptional regulators from their canonical role in the profound morphological reshaping of post-meiotic cells (a process that also deeply impacts nuclear architecture and regulation) represents a key challenge in terms of interpreting the reported findings (see below).

      The authors begin by documenting, via fluorescence microscopy, the intranuclear localization of ACTL7B. This ARP is convincingly shown to accumulate in the nucleus of spermatocytes and spermatids. Using a series of elegant reporter-based experiments in a somatic cell line, the authors map the driver of this nuclear accumulation to a potential NLS sequence in the ACTL7B actin-like body domain. Ferrer and colleagues then performed a testicular RNA-seq analysis in ACTL7B KO mice to define the putative role of ACTL7B in male germ cell transcription. They report substantial changes to the testicular transcriptome - particularly the upregulation of several classes of genes - in ACTL7B KO mice. However, wild-type testes were used as controls for this experiment, thus introducing a clear confounding effect to the analysis (ACTL7B KO testes have extensive post-meiotic defects due to the canonical role of ACTL7B in spermatid development). Then, the authors employ cutting-edge AI-driven approaches to predict that both ACTL7A and ACTL7B are likely to bind to four key chromatin remodeling complexes. Although these predictions are based on a robust methodology, they would certainly benefit from experimental validation. Finally, the authors associate the loss of ACTL7B with decreased lysine acetylation and lower levels of the HDAC1 and HDAC3 chromatin remodelers in the nucleus of developing spermatids.

      Globally, these data may provide important insight into the unique processes male germ cells employ to sustain their extraordinarily complex transcriptional program. Furthermore, the concept that (comparably younger) testis-specific proteins can be incorporated into ancient chromatin remodeling complexes to modulate their function in the germ line is timely and exciting.

      It is my opinion that the manuscript would benefit from additional experimental validation to better support the authors' conclusions. In particular, I believe that addressing two critical points would substantially strengthen the message of the manuscript:

      (1) The proposed role of ACTL7B in post-meiotic transcriptional regulation temporally overlaps with the protein's previously reported canonical functions in spermiogenesis (PMID: 36617158 and 37800308). Indeed, the canonical functions of ACTL7B have been shown to have a profound effect at the level of spermatid morphology and to impact nuclear organization. This potentially renders the observed transcriptional deregulation in ACTL7B KO testes an indirect consequence of spermatid morphology defects. I acknowledge that it is experimentally difficult to disentangle the proposed intranuclear roles of ACTL7B from the protein's well-documented cytoplasmic function. Perhaps the generation of a NLS-scrambled ACTL7B variant could offer some insight. In light of the substantial investment this approach would represent, I would suggest, as an alternative, that instead of using wild-type testes as controls for the transcriptome and chromatin localization assays, the authors consider the possibility of using testicular tissue from a mutant with similarly abnormal spermiogenesis but due to transcription-independent defects. This would, in my opinion, offer a more suitable baseline to compare ACTL7B KO testes with.

      (2) The manuscript would greatly benefit if experimental validation of the AI-driven predictions were to be provided (in terms of the binding capacity of ACTL7A and ACTL7B to key chromatin remodeling complexes). All the more so as it seems that the authors have the technical expertise / available mass spectrometry data required for this purpose (lines 664-665). Still on this topic, given the predicted interactions of ACTL7A and ACTL7B with the SRCAP, EP400, SMARCA2 and SMARCA4 complexes (Figure 7), it is rather counter-intuitive that the Authors chose for their immunofluorescence assays, in ACTL7B KO testes, to determine the chromatin localization of HDAC1 and HDAC3, rather than that of any of the above four complexes.

    1. eLife assessment

      The authors develop a novel genetic strategy for specific and comprehensive labeling of axo-axonic cells, also referred to as chandelier cells, in the mouse brain. The approach and analysis are rigorous such that the data convincingly support the key conclusions, including the expanded distribution of axo-axonic cells throughout the brain. This study provides important new information about the distribution of a significant neuronal cell type, as well as new tools for future studies. This work will be of broad interest to neuroscientists who work on the anatomical and functional organization of neural circuits.

    2. Reviewer #2 (Public Review):

      Summary:

      The goals of this study were to develop a genetic approach that would specifically and comprehensively target axo-axonic cells (AACs) throughout the brain and then to describe the patterns and characteristics of the targeted AACs in multiple, selected brain regions. The investigators have been successful in providing the most complete description of the regional distribution of putative AACs (pAACs) throughout the brain to date. The supporting evidence is convincing, and the findings should serve as a guide for more detailed studies of AACs within each brain region and lead to new insights into the connectivity and functional organization of this important group of GABAergic interneurons.

      Strengths:

      The study has numerous strengths. A major strength is the development of a unique intersectional genetic strategy that uses cell lineage (Nkx2.1) and molecular (Unc5b or Pthlh) markers to identify AACs specifically and, apparently, nearly completely throughout the mouse brain. While AACs have been described previously in the cerebral cortex, hippocampus and amygdala, there has been no specific genetic marker that selectively identifies all AACs in these regions.

      Importantly, the current genetic strategy labels pAACs in additional brain regions, including the claustrum-insular complex, extended amygdala, and several olfactory centers in which AACs have not been previously recognized. In general, the findings provide support for the specificity of the methods for targeting AACs and include several examples of labeling near markers of axon initial segments, providing validation of their AAC identity.

      The descriptions and numerous low magnification images of the brain provide a roadmap for subsequent, detailed studies of AACs in numerous brain regions. The overview and summaries of the findings in the Abstract, Introduction and Discussion are particularly clear and helpful in placing the extensive regional descriptions of AACs in context.

      Weaknesses:

      Considering the unique and striking characteristics of AACs, it would have been ideal to include a clear, high-resolution confocal image of an AAC from the Unc5b;Nkx2.1 mouse that would display the beauty of these cells with their numerous cartridges of axon terminals, emanating from a single AAC. While several cells are illustrated, the processes are often obscured by other labeling or the background created by the blue DAPI labeling. A high-resolution image of an isolated cell would not only support the identity of the cells as AACs but also demonstrate the potential advantages of their labeling for more detailed anatomical and neurophysiological studies. High magnification views of the axon terminals adjacent to AnkG-labeled axon initial segments are included and provide strong support for the identity of the cells. However, they cannot convey the extensiveness and patterns of the axonal arborizations of these cells.

      The intersectional genetic methods included use of the lineage marker Nkx2.1 with either Unc5b or Pthlh as the molecular marker. As described, the mice with intersectional targeting of Nkx2.1 and Unc5b appear to show the most specific brain-wide labeling for AACs, and the majority of the descriptions are from these mice. The targeting with Nkx2.1 and Pthlh is less convincing and there appears to be a disconnect between the descriptions and the images. While the descriptions emphasize that the labeling is very similar in the two types of mice, the images suggest distinct differences, including labeling of non AACs in striatum and layer 4 of the cortex in the Pthlh;Nkx2.1 mouse, as described in the manuscript. In addition, the Pthlh;Nkx2.1 mouse has higher cell targeting in some regions and fewer labeled cells in others. Perhaps it would be more accurate to present the Pthlh;Nkx2.1 mouse as differing from the Unc5b;Nkx2.1 mouse, but useful for AAC labeling in select regions and under some conditions, such as following tamoxifen administration at specific ages. As currently presented, the inclusion of the Pthlh;Nkx2.1 detracts from the otherwise convincing argument that the Unc5b;Nkx2.1 mouse provides a specific and comprehensive way to identify AACs.

    3. Reviewer #3 (Public Review):

      Summary:

      Raudales et al. aimed at providing an insight into the brain-wide distribution and synaptic connectivity of bona fide GABAergic inhibitory interneuron subtypes focusing on the axo-axonic cell (AAC), one of the most distinctive interneuron subtypes, which innervates the axon initial segments of glutamatergic projection neurons. They establish intersectional genetic strategies that enable them to specifically and comprehensively capture AACs based on their lineage (Nkx2.1) and marker expression (Unc5b, Pthlh). They find that AACs are deployed across essentially all the pallium-derived brain structures as well as anterior olfactory nucleus, taenia tecta, and lateral septum. They show that AACs in distinct areas and layers of the neocortex as well as different subregions of the hippocampal formation display unique soma and synaptic density and morphological variations. Rabies virus-based retrograde monosynaptic input tracing reveals that AACs in the neocortex, the hippocampus, and the basolateral amygdala receive synaptic inputs from common as well as specific brain regions and supports the utility of this novel genetic approach. This study elucidates brain-wide neuroanatomical features and morphological variations of AACs with solid techniques and analysis. Their novel AAC-targeting strategies will facilitate the study of their development and function in different brain regions. The conclusions in this paper are well supported by the data. However, there are a few minor comments.

      (1) The authors added a description about validation of ChCs in the method section: "Validation was conducted with high-magnification confocal microscopy and defined by a cell exhibiting at least two RFP-labelled axons colocalized with AIS labelled by AnkyrinG or Phospho-IκBα". However, this does not clearly define pAACs themselves. If they follow this criterion, an RFP-labeled cell exhibiting only one synaptic cartridge that is colocalized with an AIS should be a pAAC. Is this what the authors are trying to say?

      On the other hand, in the response to reviewers, the authors apparently define pAACs in a different way, in which they focus more on the number of cells exhibiting cartridges that are associated with AISs in a certain anatomical region rather than the number of cartridges per cell.

      "For BNST we did not positively identify more than a few exhibiting overlap with AnkyrinG/IκBα, so we currently leave them as pAACs"<br /> "Putative AAC (pAACs) refers to populations in which relatively few single cell examples of AACs exhibiting co-localized cartridges were found"

      The authors need to directly define pAACs.

      (2) In the response to reviewers, the authors claimed that both Pthlh and Unc5b mice are useful for studying developing AACs. It would be nice if they included this content in the text (e.g. Discussion).

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors set out to develop genetic tools that can specifically and comprehensively label Axo-Axonic Cells (AACs), also known as Chandelier cells. These AACs possess unique morphological and connectivity features, making them an ideal subject for studying various aspects of cell types across different experimental methods. To achieve both specificity and comprehensiveness in AAC labeling, the authors employ an intersectional strategy that combines lineage origin and molecular markers. This approach successfully targets AACs across the mouse brain and reveals their widespread distribution in various brain structures beyond the previously known regions. Additionally, the authors utilize rabies transneuronal labeling to provide a comprehensive overview of AACs, their variations, and input sources throughout the brain. This experimental approach offers a powerful model system for investigating the role of AACs in circuit development and function across diverse brain regions.

      Strengths:

      Genetic Tools and Specificity: The authors' genetic tools show qualitative evidence of specificity for AACs, opening new avenues for targeted research on these cells. The use of intersectional strategies enhances the precision of AAC labeling.

      Widespread Distribution: The study significantly broadens our understanding of AAC distribution, revealing their presence in brain regions beyond what was previously documented. This expanded knowledge is a valuable contribution to the field.

      Transneuronal Labeling: The inclusion of rabies transneuronal labeling provides a comprehensive view of AACs, their variations, and input sources, allowing for a more holistic understanding of their role in neural circuits.

      Weaknesses:

      Quantitative Analysis: While the claim of specificity appears qualitatively convincing, the manuscript could be improved with more quantitative analysis.

      We are glad that the reviewers appreciated our multimodal and brain-wide characterizations of the AAC population. We include many qualitative AAC examples and would like to highlight the quantitative nature of our whole brain cell body and cartridge analyses, made possible by transgenic targeting and our serial two-photon tomography imaging platform (STP). In addition to providing this brain wide AAC atlas, we also propose AACs as perhaps one of the best case examples for a bona fide cell type, which may inspire further in-depth anatomical and functional studies of AACs, and efforts to capture other ground truth cell types.

      Comprehensiveness Claim: The assertion of comprehensiveness, implying labeling "almost all" AACs in all brain regions, is challenging to substantiate conclusively. Acknowledging the limitations of proving complete comprehensiveness and discussing them in the discussion section would be more appropriate than asserting it in the results section.

      We thank the reviewer for this suggestion and have revised the results and discussion sections accordingly. The issue of how to assess comprehensiveness in AAC labeling is a fair and important point, as dense brain-wide AAC labeling has not been achieved and assessed before. Previous studies had used less efficient and specific methods for capturing AACs, primarily in select areas of cortex, hippocampus, and amygdala. These AAC populations are recapitulated by our genetic strategies with higher density and specificity. It does not seem that we have missed any previously-reported AAC populations; in fact, we discovered multiple previously unreported populations. Further evidence supporting our “comprehensive” labeling of AACs is that two independent Unc5b and Pthlh transgenic strategies showed very similar AAC distribution patterns (Fig. 1 Suppl. 3). However, we recognize that probably the only way to fully assess “completeness” of labeling may be to compare with anatomical ground truth, such as by dense EM reconstruction of all AACs across the brain volume. This is currently not technically possible but may become feasible in the future.

      Local Inputs: While the manuscript focuses on inter-areal inputs to AACs, it would benefit from exploring local inputs as well. Identifying the local neurons that target AACs and analyzing their patterns could provide valuable insights into AAC function within specific brain regions.

      This is a good suggestion. However, our serial two-photon tomography imaging platform does not have the capability for reliably preserving tissue sections for immunohistochemical processing afterward. Additionally, though our starter AAV injections were limited to 100-150 nL, there were far too many input cells labelled at the injection site to resolve individual input cells and correlate with their synaptic partners (e.g. a rabies-labelled pyramidal cell within the injection site may still project to a starter cell a few hundred microns away). Thus, our rabies input mapping was best suited for characterizing long-range inputs and was the focus here. For studying local inputs to AACs, future studies could combine very dilute starter AAV injections with multi-marker characterization of cell types by immunohistochemistry or FISH.

      Discussion Focus: The discussion section should delve deeper into the biological implications of the findings, moving beyond technical significance. Exploring similarities and differences in input patterns between AACs and other cell types, and linking them to the locations of starter cells or specific connectivity patterns in the brain, would enrich the discussion. For instance, investigating whether input patterns can be predicted based on the locations of starter cells or connectivity specificity could provide valuable insights.

      We thank the reviewer for this suggestion. We have expanded the discussion to include more on the relevance and implications of our input mapping results to different starter populations of AACs.

      Reviewer #2 (Public Review):

      Summary:

      The goals of this study were to develop a genetic approach that would specifically and comprehensively target axo-axonic cells (AACs) throughout the brain and then to describe the patterns and characteristics of the targeted AACs in multiple, selected brain regions. The investigators have been successful in providing the most complete description of the regional distribution of putative AACs (pAACs) throughout the brain to date. The supporting evidence is convincing, even though incomplete in some brain regions. The findings should serve as a guide for more detailed studies of AACs within each brain region and lead to new insights into the connectivity and functional organization of this important group of GABAergic interneurons.

      Strengths:

      The study has numerous strengths. A major strength is the development of a unique intersectional genetic strategy that uses cell lineage (Nkx2.1) and molecular (Unc5b or Pthlh) markers to identify AACs specifically and, apparently, nearly completely throughout the mouse brain. While AACs have been described previously in the cerebral cortex, hippocampus, and amygdala, there has been no specific genetic marker that selectively identifies all AACs in these regions.

      The current genetic strategy has labeled pAACs in a large number of additional brain regions, including the claustrum-insular complex, extended amygdala, and several olfactory centers. In general, the findings provide support for the specificity of the methods for targeting AACs, and include some examples of labeling near markers of axon initial segments. However, the Investigators are careful to refer to labeled neurons as "putative AACs" as they have not been fully characterized and their identity verified.

      The descriptions and numerous low-magnification images of the brain provide a roadmap for subsequent, detailed studies of AACs in numerous brain regions. The overview and summaries of the findings in the Abstract, Introduction, and Discussion are particularly clear and helpful in placing the extensive regional descriptions of AACs in context.

      Weaknesses:

      One weakness of the study is the lack of an illustration of the high-resolution cell labeling that can be achieved with the methods, including labeling of numerous rows of axon terminals in contact with axon initial segments. The initial images of the brain-wide distribution of putative AACs are necessarily presented at low magnification. Although the authors indicate that the cells have "highly characteristic AAC labeling patterns throughout the neocortex, hippocampus and BLA", these morphological details cannot be visualized by the reader at the current magnification, even when the images are enlarged on the computer screen. Some of the details become evident in later Figures, but an initial illustration of single cell labeling with confocal microscopy, or tracing of their characteristic axonal arbors, would support the specificity of the labeling in the low magnification images.

      We thank the reviewer for the suggestion. We have now added high-resolution images showing the colocalization of AAC axon boutons (cartridges) along AnkG positive postsynaptic axon initial segments in Fig. 2 Suppl. 1, Figure 1 panels a, d, e, and Fig. 4 panels b, c. These images unequivocally demonstrate AAC identity and specificity.

      Table 1 indicates that the AAC identity of the cells has been validated in many brain regions but not in all. The methods used for validation have not been described and should be included for completeness. The authors are careful to acknowledge that labeled cells in some regions have not been validated and refer to such cells as pAACs.

      Validation was defined by colocalization of RFP-labelled AAC cartridges and AnkyrinG- or Phospho-IκBα-labelled axon initial segments, imaged by confocal microscopy. We provide high-magnification examples throughout Figures 2-6 and supplements. We have also tried to clarify this better in the methods section entitled “Immunohistochemistry.” Putative AAC (pAACs) refers to populations in which relatively few single cell examples of AACs exhibiting co-localized cartridges were found, largely due to the sparsity of the low tamoxifen dosage used (see response above).

      The intersectional genetic methods included the use of the lineage marker Nkx2.1 with either Unc5b or Pthlh as the molecular marker. As described, the mice with intersectional targeting of Nkx2.1 and Unc5b appear to show the most specific brain-wide labeling for AACs, and the majority of the descriptions are from these mice. The targeting with Nkx2.1 and Pthlh is less convincing. The title for Figure 1 Supplemental Figure 3 suggests a similar AAC distribution in the Pthlh;Nkx2.1 mouse compared to the Unc5b;Nkx2.1 mouse. However, the descriptions of the individual panels suggest a number of inconsistencies and non-AAC labeling. The heavy labeling in the caudate and cells in layer 4 is particularly problematic. Based on the data presented, it appears that heavy labeling achieved in these mice could not be relied on for specific labeling of all AACs, although specific labeling could be achieved under some conditions, such as following tamoxifen administration at select ages.

      The reviewer is correct about Pthlh being less specific for AACs than Unc5b when crossed to a constitutive Nkx2.1 recombinase driver line. The Pthlh/Nkx2.1 intersection labeled a set of layer 4 cells in somatosensory cortex and dense cells in striatum, which are clearly not AACs. But these are the only main differences compared to the Unc5b/Nkx2.1 intersection. As the reviewer points out, it is only when Pthlh is crossed to an inducible Nkx2.1-CreER line and induced embryonically with tamoxifen that there is more specific AAC labeling (at least in cortex). We included this data as well as the intersection with VIP-Cre in case either of these are useful to researchers studying fate-mapping of AACs or bipolar cell interneurons. We have also revised the title of Fig. 1 Suppl. 3 to better convey this.

      The methods for dense labeling and single-cell labeling are described only briefly in the Methods. Some discussion of the development of the methods would be useful, including how it was determined that methods for heavy labeling identified AACs specifically and completely.

      We have added a description of the development of these methods to the methods section entitled “Animals.”

      Reviewer #3 (Public Review):

      Summary:

      Raudales et al. aimed at providing an insight into the brain-wide distribution and synaptic connectivity of bona fide GABAergic inhibitory interneuron subtypes focusing on the axo-axonic cell (AAC), one of the most distinctive interneuron subtypes, which innervates the axon initial segments of glutamatergic projection neurons. They establish intersectional genetic strategies that enable them to specifically and comprehensively capture AACs based on their lineage (Nkx2.1) and marker expression (Unc5b, Pthlh). They find that AACs are deployed across essentially all the pallium-derived brain structures as well as the anterior olfactory nucleus, taenia tecta, and lateral septum. They show that AACs in distinct areas and layers of the neocortex as well as different subregions of the hippocampal formation display unique soma and synaptic density and morphological variations. Rabies virus-based retrograde monosynaptic input tracing reveals that AACs in the neocortex, the hippocampus, and the basolateral amygdala receive synaptic inputs from common as well as specific brain regions and supports the utility of this novel genetic approach. This study elucidates brain-wide neuroanatomical features and morphological variations of AACs with solid techniques and analysis. Their novel AAC-targeting strategies will facilitate the study of their development and function in different brain regions. The conclusions in this paper are well supported by the data. However, there are a few comments to strengthen this study.

      (1) The definition of putative AAC (pAAC) is unclear and Table 1 may not be accurate. Although the authors find synaptic cartridges of RFP-labeled cells in the claustro-insular complex and the dorsal endopiriform nuclei, they still consider these cells as pAACs (not validated). The authors claim that without examining the presence of synaptic cartridges, RFP-labeled cells in the hypothalamus and the bed nuclei of the stria terminalis (BNST) are pAACs while those in the L4 of the somatosensory cortex in Pthlh;Nkx2.1;Ai65 mice are non-AACs. In Table 1, the BNST is supposed to contain AACs (validated), but in the text, the authors claim that RFP-labeled cells in the BNST are pAACs. Could the authors clarify how AACs, pAACs, and non-AACs are defined?

      We thank the reviewer for their interest and comments on our work. Please see our response to reviewer 2 for clarification on putative AACs (pAACs). Additionally, we have clarified in the methods under “Immunohistochemistry” how we defined AACs, pAACs, and non-AACs. For BNST we did not positively identify more than a few cells exhibiting overlap with AnkyrinG/IκBα, so we currently leave them as pAACs; Table 1 has been corrected to reflect this.

      (2) The intersectional strategies presented in this study could also specifically capture developing AACs. If so, how early are AACs labeled in the brain? It would also be nice if the authors could add a simple schematic like Fig. 1a showing the time course of Pthlh expression.

      We thank the reviewer for suggesting the application of our method in studying AAC development. As the onset of Unc5b is in early postnatal time, tamoxifen induction of Unc5b-CreER in early postnatal days can enable studies of AAC neurite and synapse development, maturation, and plasticity. Similarly, Pthlh expression in the brain is relatively low/absent at P4 and present at P14 and later timepoints. Pthlh-Flp;Nkx2.1-Cre intersection can be used to study postnatal AAC development and plasticity.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      While the claim of specificity appears qualitatively convincing, additional quantitative analysis would make the authors' claim much stronger. For example in Figure 4 (f-h), where the authors show an overlap of AAC axons with AnkG labeling, there also appears to be a region of AAC axon lacking adjacent AnkG labeling. The authors could quantify the fraction of cartridges that overlap with AnkG labeling in different brain regions, potentially strengthening their claim that pAACs are AACs as well as providing important documentation of the diversity or homogeneity of compartment targeting across the brain.

      As mentioned previously, we only performed AnkG co-labeling analysis on low-dose tamoxifen/sparsely labelled samples in which we could readily differentiate individual cells. This was performed on samples with the Ai65 cytoplasmic reporter—for validation purposes we could positively identify co-labelled cartridges, but it would be more difficult to accurately identify any cartridges not co-labeled (since the entire axon was labelled with RFP). For precisely identifying and mapping AAC cartridge locations we found the intersectional synaptophysin-EGFP reporter (Fig. 2k-n) to be a more precise method for specifically labeling the “cartridge” segment of AAC axons. However, we did not try AnkG staining on samples from this reporter line, as they were set aside for STP imaging.

      Regarding the claim of comprehensiveness, labeling "almost all" AACs in all brain regions is a high standard and challenging to demonstrate conclusively. The study already significantly expands our understanding of AAC distribution, and the authors might consider discussing the limitations of proving complete comprehensiveness in the discussion rather than claiming it in the results section.

      We again thank the reviewer for this critique. As mentioned above, we have revised the results and discussion sections to better convey this point.

      Furthermore, the manuscript connectivity section primarily focuses on inter-areal inputs to AACs, but it could benefit from exploring local inputs as well. By identifying the local neurons that target AACs, the authors could ask if there is any general property or rule of the local projections to AACs across the brain, or at least within the cortex. Moreover, a clear indication of the injection site would be helpful, particularly in Figure 7, where there seems to be some discrepancy between the histograms and fluorescent images regarding local projections. The histograms of Figure 7, seem to indicate that the local projection to AACs is a small fraction of all the presynaptic neurons, however, the fluorescent image for the SSp seems to suggest otherwise with many fluorescent cells in the injected area.

      We thank the reviewer for these comments. Regarding the local inputs in the rabies tracing datasets, this is a limitation (as mentioned above) arising from our STP platform’s inability to preserve tissue for immunohistochemistry labeling, as well as from our relatively dense starter cell labeling. Instead, our focus here was on long-range inputs (i.e. outside the ipsilateral ARA area of injection), which were simply not known for these AAC populations. We have revised the Figure 7 legend and added a description in the methods section to more clearly indicate that we only included long-range input projections in the Figure 7 histograms.

      In the discussion, the authors should delve more into the biological implications of their findings rather than solely emphasizing the technical significance. They could explore the similarities and differences in input patterns between AACs and other cell types, potentially linking them to the locations of their starter cells or specific connectivity patterns in the brain. For example, the authors could check if the input patterns could be predicted from the projections to the layers where their starter cells are located (either from an Atlas like the Allen Connectivity Atlas, or from retrograde rabies injections in the same locations). Can the differences between the input patterns to PVC and AAC be predicted for their location versus some specificity of connections?

      Thank you for the extensive comment. We address this point above, and have revised our discussion accordingly.

      Reviewer #2 (Recommendations For The Authors):

      The Figure legends vary in completeness and quality.

      (1) The legend for Figure 1 is very informative, and section e-g serves as a useful guide, as the legend includes the names of the brain regions related to the abbreviations and also indicates the specific panels that show the identified structures. Because of the large number of structures and the number of panels in each Figure, it would be ideal to follow the same pattern in the remaining figures.

      (2) Several edits are needed in the legend for Figure 1 Supplement Figure 1. The descriptions of a-f could be improved by providing general terms to describe the brain regions associated with the latter list of abbreviations (as has been done with the identification of the cerebral cortex, hippocampus, and olfactory centers and their related panels). One suggestion would be to write out insula, claustrum, and endopiriform prior to listing the abbreviations (AI, CLA, EP) (b-c) and adding amygdaloid complex and extended amygdala before the abbreviations (COA, BLA, MeA) (d-f) and (BST) (d).

      We thank the reviewer, as the suggestion of further expanding the abbreviations is a good one. As such, we have revised/reorganized the anatomical abbreviations in the figure legends for Figure 1 Supplement Figures 1, 2, and 3.

      Descriptions for Panels g-j require editing to link the appropriate panels and the descriptions. Panels for BSTpr appear to be g-h (rather than f-g) and i,j (rather than h-i).

      We have fixed this typo in the legend for Figure 1 Supplement Figure 1.

      Descriptions for Panels k-n could be edited to include abbreviations for the identified brain regions. For example, include the abbreviation ARHP after arcuate nuclei and indicate panels m-n (rather than j-l); include PVP after paraventricular and indicate panel n (rather than m); include DMPH after dorsomedial nuclei and indicate k-m (rather than j-l).

      Thank you for the suggestion. We have expanded the abbreviations in Figure 1 Supplement 1 accordingly.

      Reviewer #3 (Recommendations For The Authors):

      (1) Please clarify if tdTomato, EGFP (from helper AAVs), and RFP (from rabies virus) are native signals or IHC signals in legends.

      We have added the descriptors “native” or “stained” to all figure legends containing fluorescent images.

      (2) Fig. 4b and c: Please add insets of high-magnification images showing AAC boutons along AnkG-labeled AISs.

      We have added these insets to Fig. 4b and c.

      (3) Fig. 7S1: It appears that d and e are reversed. Judging from the positions of starter cells, d is for PV-Cre? Please make sure. It is also better to draw the laminar border in d and e.

      The original genotype labels are correct for Fig. 7S1 d and e. We have added the laminar borders as suggested.

      (4) Fig. 9b: Just for consistency, please label with the name of the helper AAV.

      Added.

      (5) Line 617: intragranular>>>infragranular?

      Corrected, thank you.

      (6) It may be unclear to some readers if the images in the figures are from confocal or STP. The authors may want to clarify that all images in the figures are generated by confocal microscopy in the method section.

      We have clarified this better in the methods section, “Microscopy and image analysis.”

      (7) The authors should clarify that STP was used to map input cells to the brain in the result section.

      We have added this description in the results section.

    1. eLife assessment

      This useful study provides a novel method to detect sleep cycles based on variations in the slope of the power spectrum from electroencephalography signals. The method, dispensing with time-consuming and potentially subjective manual identification of sleep cycles, is supported by solid evidence and analyses but some aspects could be better illustrated and the source of the discrepancies between classical and fractal cycles should be identified. This study will be of interest to researchers and clinicians working on sleep and brain dynamics.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, Rosenblum et al introduce a novel and automatic way of calculating sleep cycles from human EEG. Previous results have shown that the slope of the non-oscillatory component of the power spectrum (called the aperiodic or fractal component) changes with the sleep stage. Building on this, the authors present an algorithm that extracts the continuous-time fluctuations in the fractal slope and propose that peaks in this variable can be used to identify sleep cycle limits. Cycles defined in this way are termed "fractal cycles". The main focus of the article is a comparison of fractal and classical, manually defined sleep cycles in numerous datasets.
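
      The peak-based idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the centered moving-average smoother, the simplified prominence rule, the synthetic cosine input, and the parameter values are all assumptions chosen for illustration. The sketch smooths a slope time series, keeps local maxima whose prominence exceeds a threshold, and reports the spacing between successive peaks as cycle durations.

```python
import math


def moving_average(values, window):
    """Centered moving average; near the edges only the available samples are used."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def prominence(values, i):
    """Height of values[i] above the higher of the two lowest points reached
    when walking left and right until a strictly higher sample (or an edge)."""
    left_min = values[i]
    j = i - 1
    while j >= 0 and values[j] <= values[i]:
        left_min = min(left_min, values[j])
        j -= 1
    right_min = values[i]
    k = i + 1
    while k < len(values) and values[k] <= values[i]:
        right_min = min(right_min, values[k])
        k += 1
    return values[i] - max(left_min, right_min)


def find_cycle_peaks(slopes, window, min_prominence):
    """Indices of sufficiently prominent local maxima of the smoothed series."""
    smooth = moving_average(slopes, window)
    return [
        i
        for i in range(1, len(smooth) - 1)
        if smooth[i] > smooth[i - 1]
        and smooth[i] >= smooth[i + 1]
        and prominence(smooth, i) >= min_prominence
    ]


# Hypothetical input: a slope series (arbitrary units) oscillating every 90 samples.
slopes = [math.cos(2 * math.pi * t / 90) for t in range(300)]
peaks = find_cycle_peaks(slopes, window=5, min_prominence=0.5)
durations = [b - a for a, b in zip(peaks, peaks[1:])]
print(peaks, durations)
```

      On this synthetic input the peak spacing recovers the 90-sample period that was built into the series. With real recordings, the smoothing window and the prominence threshold would both change which peaks survive, which is why sensitivity to these two settings matters.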

      Strengths:

      The manuscript amply illustrates through examples the strong overlap between fractal and classical cycle identification. Accordingly, a high percentage (81%) can be matched one-to-one between methods and sleep cycle duration is well correlated (around R = 0.5). Moreover, the methods track certain global changes in sleep structure in different populations: shorter cycles in children and longer cycles in patients medicated with REM-suppressing anti-depressants. Finally, a major strength of the results is that they show similar agreement between fractal and classical sleep cycle length in 5 different data sets, showing that it is robust to changes in recording settings and methods.

      These results suggest that the fractal cycle methodology could provide a valuable new method to study sleep architecture and avoid the time-consuming steps of manual cycle identification. Moreover, it has the potential to be applied to animal studies which rarely deal with sleep cycle structure.

      Weaknesses:

      The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method. This raises the question as to whether differences are due to one method being more reliable than another or whether they are also identifying different underlying biological differences. It is not clear for example whether the agreement between the two methods is better or worse than between two human scorers, which generally serve as a gold standard to validate novel methods. The authors provide some insight into differences between the methods that could account for differences in results. However, given that the fractal method is automatic it would be important to clearly identify criteria for recordings in which it will produce similar results to the classical method.

    3. Reviewer #2 (Public Review):

      Summary:

      This study focused on using strictly the slope of the power spectral density (PSD) to perform automated sleep scoring and evaluation of the durations of sleep cycles. The method appears to work well because the slope of the PSD is highest during slow-wave sleep, and lowest during waking and REM sleep. Therefore, when smoothed and analyzed across time, there are cyclical variations in the slope of the PSD, fit using an IRASA (Irregularly resampled auto-spectral analysis) algorithm proposed by Wen & Liu (2016).

      Strengths:

      The main novelty of the study is that the non-fractal (oscillatory) components of the PSD that are more typically used during sleep scoring can be essentially ignored because the key information is already contained within the fractal (slope) component. The authors show that for the most part, results are fairly consistent between this and conventional sleep scoring, but in some cases show disagreements that may be scientifically interesting.

      Weaknesses:

      One weakness of the study, from my perspective, was that the IRASA fits to the data (e.g. the PSD, such as in Figure 1B), were not illustrated. One cannot get a sense of whether or not the algorithm is based entirely on the fractal component or whether the oscillatory component of the PSD also influences the slope calculations. This should be better illustrated, but I assume the fits are quite good.

      The cycles detected using IRASA are called fractal cycles. I appreciate the use of a simple term for this, but I am also concerned whether it could be potentially misleading? The term suggests there is something fractal about the cycle, whereas it's really just that the fractal component of the PSD is used to detect the cycle. A more appropriate term could be "fractal-detected cycles" or "fractal-based cycle" perhaps?

      The study performs various comparisons of the durations of sleep cycles evaluated by the IRASA-based algorithm vs. conventional sleep scoring. One concern I had was that it appears cycles were simply identified by their order (first, second, etc.) but were not otherwise matched. This is problematic because, as evident from examples such as Figure 3B, sometimes one cycle conventionally scored is matched onto two fractal-based cycles. In the case of the Figure 3B example, it would be more appropriate to compare the duration of conventional cycle 5 vs. fractal cycle 7, rather than 5 vs. 5, as it appears is currently being performed.
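
      The matching concern above can be made concrete by pairing cycles on temporal overlap instead of ordinal position. The following is a minimal Python sketch under stated assumptions: the function names, the maximum-overlap criterion, and the onset/offset values (in minutes) are hypothetical and are not taken from the manuscript.

```python
def overlap(a, b):
    """Length of the intersection of two (onset, offset) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def match_by_overlap(classical, fractal):
    """For each classical cycle, return the index of the fractal cycle that
    shares the most time with it, or None if no fractal cycle overlaps."""
    matches = []
    for cycle in classical:
        overlaps = [overlap(cycle, candidate) for candidate in fractal]
        if not overlaps or max(overlaps) == 0:
            matches.append(None)
        else:
            matches.append(overlaps.index(max(overlaps)))
    return matches


# Hypothetical night: the second classical cycle spans two fractal cycles.
classical = [(0, 90), (90, 200), (200, 300)]
fractal = [(0, 85), (85, 140), (140, 205), (205, 300)]

by_overlap = match_by_overlap(classical, fractal)
by_order = list(range(len(classical)))
print(by_overlap, by_order)
```

      In this toy example ordinal pairing would compare the third classical cycle with the third fractal cycle, although the two share only five minutes, whereas overlap-based pairing assigns it to the fourth. Other criteria, such as nearest onset, would be equally defensible.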

      There are a few statements in the discussion that I felt were not well-supported. L629: about the "little biological foundation" of categorical definitions, e.g. for REM sleep or wake? I cannot agree with this statement as written. Also about "the gradual nature of typical biological processes". Surely the action potential is not gradual and there are many other examples of all-or-none biological events.

      The authors appear to acknowledge a key point, which is that their methods do not discriminate between awake and REM periods. Thus their algorithm essentially detected cycles of slow-wave sleep alternating with wake/REM. Judging by the examples provided this appears to account for both the correspondence between fractal-based and conventional cycles, as well as their disagreements during the early part of the sleep cycle. While this point is acknowledged in the discussion section around L686, I am surprised that the authors then argue against this correspondence on L695. I did not find the "not-a-number" controls to be convincing. No examples were provided of such cycles, and it's hard to understand how positive z-values of the slopes are possible without the presence of some wake unless N1 stages are sufficient to provide a detected cycle (in which case, then the argument still holds except that it's alternations between slow-wave sleep and N1 that could be what drives the detection).

      To me, it seems important to make clear whether the paper is proposing a different definition of cycles that could be easily detected without considering fractals or spectral slopes, but simply adjusting what one calls the onset/offset of a cycle, or whether there is something fundamentally important about measuring the PSD slope. The paper seems to be suggesting the latter but my sense from the results is that it's rather the former.

    4. Author response:

      We thank the reviewers and editors for their review and assessment of our manuscript and comprehensive feedback. The manuscript will be revised to address all the reviewers’ comments. Specifically, to address the comment of Reviewer 1 and the editor regarding the lack of quantitative comparison between the classical and fractal cycle approaches and identification of the source of the discrepancies between classical and fractal cycles, we plan to perform and report the following analyses and comparisons:

      (1) Intra-method reliability

      a) Classical cycles. An additional scorer will independently define onsets and offsets of all classical sleep cycles for all datasets and mark sleep cycles with skipped REM sleep. Likewise, we will perform automatic sleep cycle detection. We will add a new Supplementary table showing the averaged cycle durations obtained by the two scorers and automatic algorithm as well as the inter-scorer agreement rate and update the Supplemental Excel file with corresponding information for each cycle for each participant for each dataset.

      b) Fractal cycles. We will correlate the durations of fractal cycles calculated using the parameters defined in the Main text with those calculated using different parameters, namely, the longer and shorter smoothing window lengths, higher and lower minimum peak prominence. Likewise, we will correlate the durations of fractal cycles calculated using frontal vs other available electrodes.

      (2) Origin of method differences

      In the current version of our Manuscript, we describe a few possible sources of discrepancies between classical and fractal cycle durations and numbers. Following the suggestion of one of the reviewers, in the revised Manuscript, we will quantify the sources of discrepancies between the two methods in order to identify the “criteria for recordings in which fractal cycles will produce similar results to the classical method”. Specifically, we will calculate the correlation between the difference in classical vs fractal sleep cycle durations on one side, and either the amplitudes of fractal descent/ascent, relative durations of cycles with skipped REM sleep and wake after sleep onset, or peak flatness on the other side.

      In addition, we will include a new figure, illustrating the goodness of fit of the data as assessed by the IRASA method. Likewise, we will update Supplementary File 1 (that shows classical and fractal sleep cycles for each participant) with marks that highlight the onsets and offsets of sleep cycles as well as the cycles with skipped REM sleep.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this paper, the authors evaluate the utility of brain age derived metrics for predicting cognitive decline by performing a 'commonality' analysis in a downstream regression that enables the different contribution of different predictors to be assessed. The main conclusion is that brain age derived metrics do not explain much additional variation in cognition over and above what is already explained by age. The authors propose to use a regression model trained to predict cognition ('brain cognition') as an alternative suited to applications of cognitive decline. While this is less accurate overall than brain age, it explains more unique variance in the downstream regression.  

      Importantly, in this revision, we clarified that we did not intend to use Brain Cognition as an alternative approach. This is because, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Here we made this point more explicit and further stated that the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. By examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. 

      REVISED VERSION: while the authors have partially addressed my concerns, I do not feel they have addressed them all. I do not feel they have addressed the weight instability and concerns about the stacked regression models satisfactorily.

      Please see our responses to Reviewer #1 Public Review #3 below

      I also must say that I agree with Reviewer 3 about the limitations of the brain age and brain cognition methods conceptually. In particular that the regression model used to predict fluid cognition will by construction explain more variance in cognition than a brain age model that is trained to predict age. This suffers from the same problem the authors raise with brain age and would indeed disappear if the authors had a separate measure of cognition against which to validate and were then to regress this out as they do for age correction. I am aware that these conceptual problems are more widespread than this paper alone (in fact throughout the brain age literature), so I do not believe the authors should be penalised for that. However, I do think they can make these concerns more explicit and further tone down the comments they make about the utility of brain cognition. I have indicated the main considerations about these points in the recommendations section below. 

      Thank you so much for raising this point. We now have the following statement in the introduction and discussion to address this concern (see below). 

      Briefly, we made it explicit that, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. That is, the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. More importantly, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. And this is the third goal of this present study. 

      From Introduction:

      “Third and finally, certain variation in fluid cognition is related to brain MRI, but to what extent does Brain Age not capture this variation? To estimate the variation in fluid cognition that is related to the brain MRI, we could build prediction models that directly predict fluid cognition (i.e., as opposed to chronological age) from brain MRI data. Previous studies found reasonable predictive performances of these cognition-prediction models, built from certain MRI modalities (Dubois et al., 2018; Pat, Wang, Anney, et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). Analogous to Brain Age, we called the predicted values from these cognition-prediction models, Brain Cognition. The strength of an out-of-sample relationship between Brain Cognition and fluid cognition reflects variation in fluid cognition that is related to the brain MRI and, therefore, indicates the upper limit of Brain Age’s capability in capturing fluid cognition. That is, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Consequently, if we included Brain Cognition, Brain Age and chronological age in the same model to explain fluid cognition, we would be able to examine the unique effects of Brain Cognition that explain fluid cognition beyond Brain Age and chronological age. These unique effects of Brain Cognition, in turn, would indicate the amount of co-variation between brain MRI and fluid cognition that is missed by Brain Age.”

      From Discussion:

      “Third, by introducing Brain Cognition, we showed the extent to which Brain Age indices were not able to capture the variation in fluid cognition that is related to brain MRI. More specifically, using Brain Cognition allowed us to gauge the variation in fluid cognition that is related to the brain MRI, and thereby, to estimate the upper limit of what Brain Age can do. Moreover, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age.

      From our results, Brain Cognition, especially from certain cognition-prediction models such as the stacked models, has relatively good predictive performance, consistent with previous studies (Dubois et al., 2018; Pat, Wang, Anney, et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). We then examined Brain Cognition using commonality analyses (Nimon et al., 2008) in multiple regression models having a Brain Age index, chronological age and Brain Cognition as regressors to explain fluid cognition. Similar to Brain Age indices, Brain Cognition exhibited large common effects with chronological age. But more importantly, unlike Brain Age indices, Brain Cognition showed large unique effects, up to around 11%. As explained above, the unique effects of Brain Cognition indicated the amount of co-variation between brain MRI and fluid cognition that was missed by a Brain Age index and chronological age. This missing amount was relatively high, considering that Brain Age and chronological age together explained around 32% of the total variation in fluid cognition. Accordingly, if a Brain Age index was used as a biomarker along with chronological age, we would have missed an opportunity to improve the performance of the model by around one-third of the variation explained.” 
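
      The unique effects described in the quoted passage can be illustrated with a minimal Python sketch. It is an assumption-laden toy, not the authors' analysis: the helper names, the simulated data, and the coefficients are all hypothetical, and the sketch lumps every shared component into a single "common" remainder rather than splitting it as a full commonality analysis would. A predictor's unique effect is taken as the drop in R² when that predictor is removed from the full regression.

```python
import random


def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(p):
            if r != c:
                factor = a[r][c] / a[c][c]
                a[r] = [x - factor * z for x, z in zip(a[r], a[c])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def unique_effects(y, named_predictors):
    """Full-model R^2, each predictor's unique effect, and the shared remainder."""
    full = r_squared(y, list(named_predictors.values()))
    unique = {}
    for name in named_predictors:
        others = [col for other, col in named_predictors.items() if other != name]
        unique[name] = full - r_squared(y, others)
    common = full - sum(unique.values())
    return full, unique, common


# Simulated data (hypothetical): cognition depends on age and on a brain-related
# signal that Brain Cognition tracks but Brain Age does not.
rng = random.Random(0)
n = 400
age = [rng.gauss(0, 1) for _ in range(n)]
signal = [rng.gauss(0, 1) for _ in range(n)]
cognition = [-0.6 * a + 0.5 * s + 0.6 * rng.gauss(0, 1) for a, s in zip(age, signal)]
brain_age = [a + 0.5 * rng.gauss(0, 1) for a in age]
brain_cognition = [-0.6 * a + 0.5 * s + 0.3 * rng.gauss(0, 1) for a, s in zip(age, signal)]

full, unique, common = unique_effects(
    cognition,
    {"age": age, "brain_age": brain_age, "brain_cognition": brain_cognition},
)
print(round(full, 3), {k: round(v, 3) for k, v in unique.items()}, round(common, 3))
```

      The same arithmetic underlies the "around one-third" statement in the quoted Discussion: a unique effect of about 11% set against the roughly 32% explained by Brain Age and chronological age together gives 11/32 ≈ 0.34.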

      This is a reasonably good paper and the use of a commonality analysis is a nice contribution to understanding variance partitioning across different covariates. I have some comments that I believe the authors ought to address, which mostly relate to clarity and interpretation 

      Reviewer #1 Public Review #1

      First, from a conceptual point of view, the authors focus exclusively on cognition as a downstream outcome. I would suggest the authors nuance their discussion to provide broader considerations of the utility of their method and on the limits of interpretation of brain age models more generally. 

      Thank you for your comments on this issue. 

      We now discussed the broader consideration in detail:

      (1) the consistency between our findings on fluid cognition and other recent works on brain disorders, 

      (2) the difference between studies investigating the utility of Brain Age in explaining cognitive functioning, including ours and others (e.g., Butler et al., 2021; Cole, 2020, 2020; Jirsaraie, Kaufmann, et al., 2023) and those explaining neurological/psychological disorders (e.g., Bashyam et al., 2020; Rokicki et al., 2021)

      and 

      (3) suggested solutions we and others made to optimise the utility of Brain Age for both cognitive functioning and brain disorders.

      From Discussion:

      “This discrepancy between the predictive performance of age-prediction models and the utility of Brain Age indices as a biomarker is consistent with recent findings (for review, see Jirsaraie, Gorelik, et al., 2023), both in the context of cognitive functioning (Jirsaraie, Kaufmann, et al., 2023) and neurological/psychological disorders (Bashyam et al., 2020; Rokicki et al., 2021). For instance, combining different MRI modalities into the prediction models, similar to our stacked models, often leads to the highest performance of age prediction models, but does not likely explain the highest variance across different phenotypes, including cognitive functioning and beyond (Jirsaraie, Gorelik, et al., 2023).”

      “There is a notable difference between studies investigating the utility of Brain Age in explaining cognitive functioning, including ours and others (e.g., Butler et al., 2021; Cole, 2020, 2020; Jirsaraie, Kaufmann, et al., 2023) and those explaining neurological/psychological disorders (e.g., Bashyam et al., 2020; Rokicki et al., 2021). We consider the former as a normative type of study and the latter as a case-control type of study (Insel et al., 2010; Marquand et al., 2016). Those case-control Brain Age studies focusing on neurological/psychological disorders often build age-prediction models from MRI data of largely healthy participants (e.g., controls in a case-control design or large samples in a population-based design), apply the built age-prediction models to participants without vs. with neurological/psychological disorders and compare Brain Age indices between the two groups. On the one hand, this means that case-control studies treat Brain Age as a method to detect anomalies in the neurological/psychological group (Hahn et al., 2021). On the other hand, this also means that case-control studies have to ignore under-fitted models when applying prediction models built from largely healthy participants to participants with neurological/psychological disorders (i.e., Brain Age may predict chronological age well for the controls, but not for those with a disorder). On the contrary, our study and other normative studies focusing on cognitive functioning often build age prediction models from MRI data of largely healthy participants and apply the built age prediction models to participants who are also largely healthy. Accordingly, the age prediction models for explaining cognitive functioning in normative studies, while not allowing us to detect group-level anomalies, do not suffer from being under-fitted. This unfortunately might limit the generalisability of our study to just the normative type of study. Future work is still needed to test the utility of brain age in the case-control case.”

      “Next, researchers should not select age-prediction models based solely on age-prediction performance. Instead, researchers could select age-prediction models that explained phenotypes of interest the best. Here we selected age-prediction models based on a set of features (i.e., modalities) of brain MRI. This strategy was found effective not only for fluid cognition as we demonstrated here, but also for neurological and psychological disorders as shown elsewhere (Jirsaraie, Gorelik, et al., 2023; Rokicki et al., 2021). Rokicki and colleagues (2021), for instance, found that, while integrating across MRI modalities led to age prediction models with the highest age-prediction performance, using only T1 structural MRI gave age-prediction models that were better at classifying Alzheimer’s disease. Similarly, using only cerebral blood flow gave age-prediction models that were better at classifying mild/subjective cognitive impairment, schizophrenia and bipolar disorder. 

      As opposed to selecting age-prediction models based on a set of features, researchers could also select age-prediction models based on modelling methods. For instance, Jirsaraie and colleagues (2023) compared gradient tree boosting (GTB) and deep-learning brain network (DBN) algorithms in building age-prediction models. They found GTB to have higher age prediction performance but DBN to have better utility in explaining cognitive functioning. In this case, an algorithm with better utility (e.g., DBN) should be used for explaining a phenotype of interest. Similarly, Bashyam and colleagues (2020) built different DBN-based age-prediction models, varying in age-prediction performance. The DBN models with a higher number of epochs corresponded to higher age-prediction performance. However, DBN-based age-prediction models with a moderate (as opposed to higher or lower) number of epochs were better at classifying Alzheimer’s disease, mild cognitive impairment and schizophrenia. In this case, a model from the same algorithm with better utility (e.g., those DBN with a moderate epoch number) should be used for explaining a phenotype of interest.

      Accordingly, this calls for a change in research practice, as recently pointed out by Jirsaraie and colleagues (2023, p. 7), “Despite mounting evidence, there is a persisting assumption across several studies that the most accurate brain age models will have the most potential for detecting differences in a given phenotype of interest”. Future neuroimaging research should aim to build age-prediction models that are not necessarily good at predicting age, but at capturing phenotypes of interest.”

      Reviewer #1 Public Review #2

      Second, from a methods perspective, there is not a sufficient explanation of the methodological procedures in the current manuscript to fully understand how the stacked regression models were constructed. I would request that the authors provide more information to enable the reader to better understand the stacked regression models used to ensure that these models are not overfit. 

      Thank you for allowing us an opportunity to clarify our stacked model. We made additional clarification to make this clearer (see below). We wanted to confirm that we did not use test sets to build a stacked model in both lower and higher levels of the Elastic Net models. Test sets were there just for testing the performance of the models.  

      From Methods:

      “We used nested cross-validation (CV) to build these prediction models (see Figure 7). We first split the data into five outer folds, leaving each outer fold with around 100 participants. This number of participants in each fold is to ensure the stability of the test performance across folds. In each outer-fold CV loop, one of the outer folds was treated as an outer-fold test set, and the rest was treated as an outer-fold training set. Ultimately, looping through the nested CV resulted in a) prediction models from each of the 18 sets of features as well as b) prediction models that drew information across different combinations of the 18 separate sets, known as “stacked models.” We specified eight stacked models: “All” (i.e., including all 18 sets of features),  “All excluding Task FC”, “All excluding Task Contrast”, “Non-Task” (i.e., including only Rest FC and sMRI), “Resting and Task FC”, “Task Contrast and FC”, “Task Contrast” and “Task FC”. Accordingly, there were 26 prediction models in total for both Brain Age and Brain Cognition.

      To create these 26 prediction models, we applied three steps for each outer-fold loop. The first step aimed at tuning prediction models for each of the 18 sets of features. This step only involved the outer-fold training set and did not involve the outer-fold test set. Here, we divided the outer-fold training set into five inner folds and applied inner-fold CV to tune hyperparameters with grid search. Specifically, in each inner-fold CV, one of the inner folds was treated as an inner-fold validation set, and the rest was treated as an inner-fold training set. Within each inner-fold CV loop, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters and applied the estimated model to the inner-fold validation set. After looping through the inner-fold CV, we then chose the prediction models that led to the highest performance, reflected by the coefficient of determination (R2), on average across the inner-fold validation sets. This led to 18 tuned models, one for each of the 18 sets of features, for each outer fold.

      The second step aimed at tuning stacked models. As in the first step, the second step only involved the outer-fold training set and did not involve the outer-fold test set. Here, using the same outer-fold training set as the first step, we applied the tuned models created in the first step, one from each of the 18 sets of features, resulting in 18 predicted values for each participant. We then re-divided this outer-fold training set into five new inner folds. In each inner fold, we treated different combinations of the 18 predicted values from separate sets of features as features to predict the targets in separate “stacked” models. As in the first step, in each inner-fold CV loop, we treated one out of five inner folds as an inner-fold validation set, and the rest as an inner-fold training set. Also as in the first step, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters from our grid. We tuned the hyperparameters of stacked models using grid search by selecting the models with the highest R2 on average across the inner-fold validation sets. This led to eight tuned stacked models.

      The third step aimed at testing the predictive performance of the 18 tuned prediction models from each of the set of features, built from the first step, and eight tuned stacked models, built from the second step. Unlike the first two steps, here we applied the already tuned models to the outer-fold test set. We started by applying the 18 tuned prediction models from each of the sets of features to each observation in the outer-fold test set, resulting in 18 predicted values. We then applied the tuned stacked models to these predicted values from separate sets of features, resulting in eight predicted values. 

      To demonstrate the predictive performance, we assessed the similarity between the observed values and the predicted values of each model across outer-fold test sets, using Pearson’s r, coefficient of determination (R2) and mean absolute error (MAE). Note that for R2, we used the sum of squares definition (i.e., R2 = 1 – (sum of squares residuals/total sum of squares)) per a previous recommendation (Poldrack et al., 2020). We considered the predicted values from the outer-fold test sets of models predicting age or fluid cognition, as Brain Age and Brain Cognition, respectively.”
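The three steps described above can be sketched in scikit-learn as follows. This is a minimal illustration, not the authors' code: the three random "sets of features", the toy target, the fold seeds, the pooled computation of the metrics across outer folds and the deliberately small hyperparameter grid are all assumptions made to keep the example short and fast.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold

rng = np.random.default_rng(0)
n = 200
# Hypothetical stand-ins for sets of features: three noisy blocks sharing one signal.
signal = rng.normal(size=n)
feature_sets = {
    name: signal[:, None] * rng.normal(size=(1, p)) + rng.normal(size=(n, p))
    for name, p in [("set_a", 10), ("set_b", 20), ("set_c", 5)]
}
y = 50 + 10 * signal + rng.normal(scale=5, size=n)  # toy "age" target

# Small illustrative grid (an assumption; much smaller than a full grid search).
grid = {"alpha": np.logspace(-1, 2, 5), "l1_ratio": [0.1, 0.5, 1.0]}


def tune(X, y, seed):
    """Tune Elastic Net with 5-fold inner CV, selecting on mean validation R2."""
    inner = KFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(ElasticNet(max_iter=10000), grid, cv=inner, scoring="r2")
    return search.fit(X, y).best_estimator_


observed, predicted = [], []
outer = KFold(n_splits=5, shuffle=True, random_state=0)
for train, test in outer.split(y):
    # Step 1: tune one model per set of features, using the outer-fold training set only.
    base = {k: tune(X[train], y[train], seed=1) for k, X in feature_sets.items()}
    # Step 2: predictions on the same training set become features of the stacked model,
    # which is tuned on freshly divided inner folds (different seed).
    Z_train = np.column_stack([base[k].predict(feature_sets[k][train]) for k in base])
    stacked = tune(Z_train, y[train], seed=2)
    # Step 3: apply the already tuned models to the untouched outer-fold test set.
    Z_test = np.column_stack([base[k].predict(feature_sets[k][test]) for k in base])
    observed.append(y[test])
    predicted.append(stacked.predict(Z_test))

observed = np.concatenate(observed)
predicted = np.concatenate(predicted)
r = pearsonr(observed, predicted)[0]
r2 = r2_score(observed, predicted)  # sum-of-squares definition: 1 - SS_res / SS_tot
mae = mean_absolute_error(observed, predicted)
print(f"r = {r:.2f}, R2 = {r2:.2f}, MAE = {mae:.2f}")
```

The key property is that the outer-fold test indices are never passed to `tune`, so neither the single-set models nor the stacked model see the test data during fitting or hyperparameter selection.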

      Author response image 1.

      Diagram of the nested cross-validation used for creating predictions for models of each set of features as well as predictions for stacked models. 

      Note some previous research, including ours (Tetereva et al., 2022), splits the observations in the outer-fold training set into layer 1 and layer 2 and applies the first and second steps to layers 1 and 2, respectively. Here we decided against this approach and used the same outer-fold training set for both first and second steps in order to avoid potential bias toward the stacked models. This is because, when the data are split into two layers, predictive models built for each separate set of features only use the data from layer 1, while the stacked models use the data from both layers 1 and 2. In practice with large enough data, these two approaches might not differ much, as we demonstrated previously (Tetereva et al., 2022).

      Reviewer #1 Public Review #3

      Please also provide an indication of the different regression strengths that were estimated across the different models and cross-validation splits. Also, how stable were the weights across splits? 

      The focus of this article is on the predictions. Still, it is informative for readers to understand how stable the feature importance (i.e., Elastic Net coefficients) is. To demonstrate the stability of feature importance, we now examined the rank stability of feature importance using Spearman’s ρ (see Figure 4). Specifically, we correlated the feature importance between two prediction models of the same features, used in two different outer-fold test sets. Given that there were five outer-fold test sets, we computed 10 Spearman’s ρ for each prediction model of the same features. We found that Spearman’s ρ varied dramatically in both age-prediction (range = .31–.94) and fluid cognition-prediction (range = .16–.84) models. This means that some prediction models were much more stable in their feature importance than others. This is probably due to various factors such as a) the collinearity of features in the model, b) the number of features (e.g., 71,631 features in functional connectivity, which were further reduced to 75 principal components, as compared to 19 features in subcortical volume based on the ASEG atlas), c) the penalisation of coefficients either with ‘Ridge’ or ‘Lasso’ methods, which resulted in reduction as a group of features or selection of a feature among correlated features, respectively, and d) the predictive performance of the models. Understanding the stability of feature importance is beyond the scope of the current article. As mentioned by Reviewer 1, “The predictions can be stable when the coefficients are not,” and we chose to focus on the prediction in the current article.

      Author response image 2.

      Stability of feature importance (i.e., Elastic Net Coefficients) of prediction models. Each dot represents rank stability (reflected by Spearman’s ρ) in the feature importance between two prediction models of the same features, used in two different outer-fold test sets. Given that there were five outer-fold test sets, there were 10 Spearman’s ρs for each prediction model.  The numbers to the right of the plots indicate the mean of Spearman’s ρ for each prediction model.  
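As a minimal sketch of the rank-stability computation described above, the snippet below correlates coefficient vectors from every pair of five outer folds, which gives the 10 Spearman's ρ values per model. The coefficient matrix is simulated, and its size (19 features) is only an illustrative assumption.

```python
from itertools import combinations

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
# Hypothetical Elastic Net coefficients: 5 outer folds x 19 features,
# simulated as a shared pattern plus fold-specific noise.
shared = rng.normal(size=19)
coefs = shared + rng.normal(scale=0.5, size=(5, 19))

# One Spearman's rho per pair of folds: C(5, 2) = 10 pairs.
rhos = [spearmanr(coefs[i], coefs[j])[0] for i, j in combinations(range(5), 2)]
print(len(rhos), round(float(np.mean(rhos)), 2))
```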

      Reviewer #1 Public Review #4

      Please provide more details about the task designs, MRI processing procedures that were employed on this sample in addition to the regression methods and bias correction methods used. For example, there are several different parameterisations of the elastic net, please provide equations to describe the method used here so that readers can easily determine how the regularisation parameters should be interpreted.  

      Thank you for the opportunity to provide more methodological details.

      First, for the task design, we included the following statements:

      From Methods:

      “HCP-A collected fMRI data from three tasks: Face Name (Sperling et al., 2001), Conditioned Approach Response Inhibition Task (CARIT) (Somerville et al., 2018) and VISual MOTOR (VISMOTOR) (Ances et al., 2009). 

      First, the Face Name task (Sperling et al., 2001) taps into episodic memory. The task had three blocks. In the encoding block [Encoding], participants were asked to memorise the names of faces shown. These faces were then shown again in the recall block [Recall] when the participants were asked if they could remember the names of the previously shown faces. There was also the distractor block [Distractor] occurring between the encoding and recall blocks. Here participants were distracted by a Go/NoGo task. We computed six contrasts for this Face Name task: [Encode], [Recall], [Distractor], [Encode vs. Distractor], [Recall vs. Distractor] and [Encode vs. Recall].

      Second, the CARIT task (Somerville et al., 2018) was adapted from the classic Go/NoGo task and taps into inhibitory control. Participants were asked to press a button for all [Go] shapes but not for two [NoGo] shapes. We computed three contrasts for the CARIT task: [NoGo], [Go] and [NoGo vs. Go]. 

      Third, the VISMOTOR task (Ances et al., 2009) was designed to test simple activation of the motor and visual cortices. Participants saw a checkerboard with a red square either on the left or right. They needed to press a corresponding key to indicate the location of the red square. We computed just one contrast for the VISMOTOR task: [Vismotor], which indicates the presence of the checkerboard vs. baseline.” 

      Second, for MRI processing procedures, we included the following statements.

      From Methods:

      “HCP-A provides details of parameters for brain MRI elsewhere (Bookheimer et al., 2019; Harms et al., 2018). Here we used MRI data that were pre-processed by the HCP-A with recommended methods, including the MSMALL alignment (Glasser et al., 2016; Robinson et al., 2018) and ICA-FIX (Glasser et al., 2016) for functional MRI. We used multiple brain MRI modalities, covering task functional MRI (task fMRI), resting-state functional MRI (rsfMRI) and structural MRI (sMRI), and organised them into 18 sets of features.”

      “Sets of Features 1-10: Task fMRI contrast (Task Contrast)

      Task contrasts reflect fMRI activation relevant to events in each task. Bookheimer and colleagues (2019) provided detailed information about the fMRI in HCP-A. Here we focused on the pre-processed task fMRI Connectivity Informatics Technology Initiative (CIFTI) files with a suffix, “_PA_Atlas_MSMAll_hp0_clean.dtseries.nii.” These CIFTI files encompassed both the cortical mesh surface and subcortical volume (Glasser et al., 2013). Collected using the posterior-to-anterior (PA) phase, these files were aligned using MSMALL (Glasser et al., 2016; Robinson et al., 2018), linearly detrended (see https://groups.google.com/a/humanconnectome.org/g/hcp-users/c/ZLJc092h980/m/GiihzQAUAwAJ) and cleaned from potential artifacts using ICA-FIX (Glasser et al., 2016). 

      To extract Task Contrasts, we regressed the fMRI time series on the convolved task events using a double-gamma canonical hemodynamic response function via FMRIB Software Library (FSL)’s FMRI Expert Analysis Tool (FEAT) (Woolrich et al., 2001). We kept FSL’s default high pass cutoff at 200s (i.e., .005 Hz). We then parcellated the contrast ‘cope’ files, using the Glasser atlas (Gordon et al., 2016) for cortical surface regions and the Freesurfer’s automatic segmentation (aseg) (Fischl et al., 2002) for subcortical regions. This resulted in 379 regions, whose number was, in turn, the number of features for each Task Contrast set of features.”

      “Sets of Features 11-13: Task fMRI functional connectivity (Task FC)

      Task FC reflects functional connectivity (FC) among the brain regions during each task, which is considered an important source of individual differences (Elliott et al., 2019; Fair et al., 2007; Gratton et al., 2018). We used the same CIFTI file “_PA_Atlas_MSMAll_hp0_clean.dtseries.nii.” as the task contrasts. Unlike Task Contrasts, here we treated the double-gamma, convolved task events as regressors of no interest and focused on the residuals of the regression from each task (Fair et al., 2007). We computed these regressors in FSL and regressed them out in nilearn (Abraham et al., 2014). Following previous work on task FC (Elliott et al., 2019), we applied a highpass filter at .008 Hz. For parcellation, we used the same atlases as Task Contrast (Fischl et al., 2002; Glasser et al., 2016). We computed Pearson’s correlations of each pair of 379 regions, resulting in a table of 71,631 non-overlapping FC indices for each task. We then applied r-to-z transformation and principal component analysis (PCA) of 75 components (Rasero et al., 2021; Sripada et al., 2019, 2020). Note that, to avoid data leakage, we conducted the PCA on each training set and applied its definition to the corresponding test set. Accordingly, there were three sets of 75 features for Task FC, one for each task. 
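The FC feature construction described above (pairwise correlations, r-to-z transformation, then PCA fitted on the training set only) can be sketched as below. This is an illustration under stated assumptions: the time series are random noise, the participant count, frame count, train/test split and number of components (5 rather than 75) are toy values, and `fc_vector` is a hypothetical helper name.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_subjects, n_frames, n_regions = 12, 60, 379  # toy sizes apart from 379 regions
n_components = 5                               # the text uses 75 components


def fc_vector(timeseries):
    """Upper-triangle Pearson correlations of a (frames x regions) array, Fisher z-transformed."""
    r = np.corrcoef(timeseries, rowvar=False)   # regions x regions
    upper = np.triu_indices_from(r, k=1)        # each region pair once, no diagonal
    return np.arctanh(r[upper])                 # r-to-z transformation


# Hypothetical parcellated time series, one array per participant.
fc = np.stack([fc_vector(rng.normal(size=(n_frames, n_regions))) for _ in range(n_subjects)])

train, test = np.arange(8), np.arange(8, 12)
# Fit the PCA on the training participants only, then reuse its definition on the test set.
pca = PCA(n_components=n_components).fit(fc[train])
fc_train = pca.transform(fc[train])
fc_test = pca.transform(fc[test])
print(fc.shape, fc_train.shape, fc_test.shape)
```

With 379 regions, the upper triangle holds 379 × 378 / 2 = 71,631 region pairs, which matches the number of FC indices reported in the text.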

      Set of Features 14: Resting-state functional MRI functional connectivity (Rest FC)

      Similar to Task FC, Rest FC reflects functional connectivity (FC) among the brain regions, except that Rest FC occurred during the resting (as opposed to task-performing) period. HCP-A collected Rest FC from four 6.42-min (488 frames) runs across two days, leading to 26-min long data (Harms et al., 2018). On each day, the study scanned two runs of Rest FC, starting with anterior-to-posterior (AP) and then with posterior-to-anterior (PA) phase encoding polarity. We used the “rfMRI_REST_Atlas_MSMAll_hp0_clean.dscalar.nii” file that was preprocessed and concatenated across the four runs. We applied the same computations (i.e., highpass filter, parcellation, Pearson’s correlations, r-to-z transformation and PCA) as for the Task FC. 

      Sets of Features 15-18: Structural MRI (sMRI)

      sMRI reflects individual differences in brain anatomy. The HCP-A used an established preprocessing pipeline for sMRI (Glasser et al., 2013). We focused on four sets of features: cortical thickness, cortical surface area, subcortical volume and total brain volume. For cortical thickness and cortical surface area, we used Destrieux’s atlas (Destrieux et al., 2010; Fischl, 2012) from FreeSurfer’s “aparc.stats” file, resulting in 148 regions for each set of features. For subcortical volume, we used the aseg atlas (Fischl et al., 2002) from FreeSurfer’s “aseg.stats” file, resulting in 19 regions. For total brain volume, we had five FreeSurfer-based features: “FS_IntraCranial_Vol” or estimated intra-cranial volume, “FS_TotCort_GM_Vol” or total cortical grey matter volume, “FS_Tot_WM_Vol” or total cortical white matter volume, “FS_SubCort_GM_Vol” or total subcortical grey matter volume and “FS_BrainSegVol_eTIV_Ratio” or ratio of brain segmentation volume to estimated total intracranial volume.”

      Third, for regression methods and bias correction methods used, we included the following statements:

      From Methods:

      “For the machine learning algorithm, we used Elastic Net (Zou & Hastie, 2005). Elastic Net is a general form of penalised regressions (including Lasso and Ridge regression), allowing us to simultaneously draw information across different brain indices to predict one target variable. Penalised regressions are commonly used for building age-prediction models (Jirsaraie, Gorelik, et al., 2023). Previously we showed that the performance of Elastic Net in predicting cognitive abilities is on par with, if not better than, many non-linear and more complicated algorithms (Pat, Wang, Bartonicek, et al., 2022; Tetereva et al., 2022). Moreover, Elastic Net coefficients are readily explainable, allowing us to explain how our age-prediction and cognition-prediction models made the prediction from each brain feature (Molnar, 2019; Pat, Wang, Bartonicek, et al., 2022) (see below). 

      Elastic Net simultaneously minimises the weighted sum of the features’ coefficients. The degree of penalty to the sum of the features’ coefficients is determined by a shrinkage hyperparameter ‘α’: the greater the α, the more the coefficients shrink, and the more regularised the model becomes. Elastic Net also includes another hyperparameter, ‘ℓ1 ratio’, which determines the degree to which the sum of either the squared (known as ‘Ridge’; ℓ1 ratio = 0) or absolute (known as ‘Lasso’; ℓ1 ratio = 1) coefficients is penalised (Zou & Hastie, 2005). The objective function of Elastic Net as implemented by sklearn (Pedregosa et al., 2011) is defined as:

      $$\min_{\beta} \; \frac{1}{2 n_{\text{samples}}} \lVert y - X\beta \rVert_2^2 + \alpha \, \ell_1\text{ratio} \, \lVert \beta \rVert_1 + \frac{1}{2} \, \alpha \left(1 - \ell_1\text{ratio}\right) \lVert \beta \rVert_2^2$$

      where X is the features, y is the target, and β is the coefficient. In our grid search, we tuned two Elastic Net hyperparameters: α using 70 numbers in log space, ranging from .1 to 100, and ℓ1 ratio using 25 numbers in linear space, ranging from 0 to 1.
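To make the parameterisation concrete, the sketch below builds the hyperparameter grid described in the text and evaluates scikit-learn's documented Elastic Net objective, (1 / (2 n)) ‖y − Xβ‖² + α · ℓ1 ratio · ‖β‖₁ + 0.5 · α · (1 − ℓ1 ratio) · ‖β‖², for a fitted model. The toy data, the single grid point chosen for the demonstration and the helper name `elastic_net_objective` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Grid as described in the text: 70 alphas in log space, 25 l1_ratio values in linear space.
alphas = np.logspace(-1, 2, 70)     # from 0.1 to 100
l1_ratios = np.linspace(0, 1, 25)   # 0 is the Ridge end, 1 is the Lasso end


def elastic_net_objective(X, y, coef, intercept, alpha, l1_ratio):
    """Value of scikit-learn's ElasticNet objective for the given coefficients."""
    residual = y - (X @ coef + intercept)
    loss = residual @ residual / (2 * X.shape[0])
    l1_penalty = alpha * l1_ratio * np.abs(coef).sum()
    l2_penalty = 0.5 * alpha * (1 - l1_ratio) * (coef @ coef)
    return loss + l1_penalty + l2_penalty


# Hypothetical toy data: 100 participants, 20 features, 3 of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=100)

alpha, l1_ratio = alphas[10], l1_ratios[12]  # one arbitrary grid point
model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio).fit(X, y)

fitted = elastic_net_objective(X, y, model.coef_, model.intercept_, alpha, l1_ratio)
# Reference point: all coefficients at zero, intercept at the mean of y.
null = elastic_net_objective(X, y, np.zeros(X.shape[1]), y.mean(), alpha, l1_ratio)
print(fitted <= null)
```

Because the intercept is not penalised, the all-zero-coefficient model is a valid point of comparison: the fitted solution should never have a larger objective value than it.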

      To understand how Elastic Net made a prediction based on different brain features, we examined the coefficients of the tuned model. Elastic Net coefficients can be considered as feature importance, such that more positive Elastic Net coefficients lead to more positive predicted values and, similarly, more negative Elastic Net coefficients lead to more negative predicted values (Molnar, 2019; Pat, Wang, Bartonicek, et al., 2022). While the magnitude of Elastic Net coefficients is regularised (thus making it difficult for us to interpret the magnitude itself directly), we could still indicate that a brain feature with a higher magnitude weighs relatively more strongly in making a prediction. Another benefit of Elastic Net as a penalised regression is that the coefficients are less susceptible to collinearity among features as they have already been regularised (Dormann et al., 2013; Pat, Wang, Bartonicek, et al., 2022).

      Given that we used five-fold nested cross-validation, different outer folds may have different degrees of ‘α’ and ‘ℓ1 ratio’, so the final coefficients may differ across folds. For instance, for certain sets of features, penalisation may not play a big part (i.e., higher or lower ‘α’ leads to similar predictive performance), resulting in different ‘α’ for different folds. To remedy this in the visualisation of Elastic Net feature importance, we refitted the Elastic Net model to the full dataset without splitting it into five folds and visualised the coefficients on brain images using the Brainspace (Vos De Wael et al., 2020) and Nilearn (Abraham et al., 2014) packages. Note that, unlike other sets of features, Task FC and Rest FC were modelled after data reduction via PCA. Thus, for Task FC and Rest FC, we first multiplied the absolute PCA scores (extracted from the ‘components_’ attribute of ‘sklearn.decomposition.PCA’) with Elastic Net coefficients and then summed the multiplied values across the 75 components, leaving 71,631 ROI-pair indices.”
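The back-projection of Elastic Net coefficients from PCA space to ROI pairs, as described above, might look like the following. The data are simulated, and the sizes (300 ROI pairs, 10 components) and hyperparameter values are toy assumptions standing in for 71,631 pairs and 75 components.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
# Hypothetical FC data: 40 participants x 300 ROI pairs.
X = rng.normal(size=(40, 300))
y = X[:, :5].sum(axis=1) + rng.normal(size=40)

pca = PCA(n_components=10).fit(X)
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(pca.transform(X), y)

# components_ has shape (n_components, n_features); coef_ has shape (n_components,).
# Multiply the absolute loadings by each component's coefficient, then sum over components.
importance = (np.abs(pca.components_) * model.coef_[:, None]).sum(axis=0)
print(importance.shape)
```

The result is one value per original ROI pair, which is what allows the coefficients of a model trained on principal components to be drawn on brain images.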

      References

      Abraham, A., Pedregosa, F., Eickenberg, M., Gervais, P., Mueller, A., Kossaifi, J., Gramfort, A., Thirion, B., & Varoquaux, G. (2014). Machine learning for neuroimaging with scikit-learn. Frontiers in Neuroinformatics, 8, 14. https://doi.org/10.3389/fninf.2014.00014

      Ances, B. M., Liang, C. L., Leontiev, O., Perthen, J. E., Fleisher, A. S., Lansing, A. E., & Buxton, R. B. (2009). Effects of aging on cerebral blood flow, oxygen metabolism, and blood oxygenation level dependent responses to visual stimulation. Human Brain Mapping, 30(4), 1120–1132. https://doi.org/10.1002/hbm.20574

      Bashyam, V. M., Erus, G., Doshi, J., Habes, M., Nasrallah, I. M., Truelove-Hill, M., Srinivasan, D., Mamourian, L., Pomponio, R., Fan, Y., Launer, L. J., Masters, C. L., Maruff, P., Zhuo, C., Völzke, H., Johnson, S. C., Fripp, J., Koutsouleris, N., Satterthwaite, T. D., … on behalf of the ISTAGING Consortium, the P. A. disease C., ADNI, and CARDIA studies. (2020). MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide. Brain, 143(7), 2312–2324. https://doi.org/10.1093/brain/awaa160

      Bookheimer, S. Y., Salat, D. H., Terpstra, M., Ances, B. M., Barch, D. M., Buckner, R. L., Burgess, G. C., Curtiss, S. W., Diaz-Santos, M., Elam, J. S., Fischl, B., Greve, D. N., Hagy, H. A., Harms, M. P., Hatch, O. M., Hedden, T., Hodge, C., Japardi, K. C., Kuhn, T. P., … Yacoub, E. (2019). The Lifespan Human Connectome Project in Aging: An overview. NeuroImage, 185, 335–348. https://doi.org/10.1016/j.neuroimage.2018.10.009

      Butler, E. R., Chen, A., Ramadan, R., Le, T. T., Ruparel, K., Moore, T. M., Satterthwaite, T. D., Zhang, F., Shou, H., Gur, R. C., Nichols, T. E., & Shinohara, R. T. (2021). Pitfalls in brain age analyses. Human Brain Mapping, 42(13), 4092–4101. https://doi.org/10.1002/hbm.25533

      Cole, J. H. (2020). Multimodality neuroimaging brain-age in UK biobank: Relationship to biomedical, lifestyle, and cognitive factors. Neurobiology of Aging, 92, 34–42. https://doi.org/10.1016/j.neurobiolaging.2020.03.014

      Destrieux, C., Fischl, B., Dale, A., & Halgren, E. (2010). Automatic parcellation of human cortical gyri and sulci using standard anatomical nomenclature. NeuroImage, 53(1), 1–15. https://doi.org/10.1016/j.neuroimage.2010.06.010

      Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., Marquéz, J. R. G., Gruber, B., Lafourcade, B., Leitão, P. J., Münkemüller, T., McClean, C., Osborne, P. E., Reineking, B., Schröder, B., Skidmore, A. K., Zurell, D., & Lautenbach, S. (2013). Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography, 36(1), 27–46. https://doi.org/10.1111/j.1600-0587.2012.07348.x

      Dubois, J., Galdi, P., Paul, L. K., & Adolphs, R. (2018). A distributed brain network predicts general intelligence from resting-state human neuroimaging data. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1756), 20170284. https://doi.org/10.1098/rstb.2017.0284

      Elliott, M. L., Knodt, A. R., Cooke, M., Kim, M. J., Melzer, T. R., Keenan, R., Ireland, D., Ramrakha, S., Poulton, R., Caspi, A., Moffitt, T. E., & Hariri, A. R. (2019). General functional connectivity: Shared features of resting-state and task fMRI drive reliable and heritable individual differences in functional brain networks. NeuroImage, 189, 516–532. https://doi.org/10.1016/j.neuroimage.2019.01.068

      Fair, D. A., Schlaggar, B. L., Cohen, A. L., Miezin, F. M., Dosenbach, N. U. F., Wenger, K. K., Fox, M. D., Snyder, A. Z., Raichle, M. E., & Petersen, S. E. (2007). A method for using blocked and event-related fMRI data to study “resting state” functional connectivity. NeuroImage, 35(1), 396–405. https://doi.org/10.1016/j.neuroimage.2006.11.051

      Fischl, B. (2012). FreeSurfer. NeuroImage, 62(2), 774–781. https://doi.org/10.1016/j.neuroimage.2012.01.021

      Fischl, B., Salat, D. H., Busa, E., Albert, M., Dieterich, M., Haselgrove, C., van der Kouwe, A., Killiany, R., Kennedy, D., Klaveness, S., Montillo, A., Makris, N., Rosen, B., & Dale, A. M. (2002). Whole Brain Segmentation. Neuron, 33(3), 341–355. https://doi.org/10.1016/S0896-6273(02)00569-X

      Glasser, M. F., Smith, S. M., Marcus, D. S., Andersson, J. L. R., Auerbach, E. J., Behrens, T. E. J., Coalson, T. S., Harms, M. P., Jenkinson, M., Moeller, S., Robinson, E. C., Sotiropoulos, S. N., Xu, J., Yacoub, E., Ugurbil, K., & Van Essen, D. C. (2016). The Human Connectome Project’s neuroimaging approach. Nature Neuroscience, 19(9), 1175–1187. https://doi.org/10.1038/nn.4361

      Glasser, M. F., Sotiropoulos, S. N., Wilson, J. A., Coalson, T. S., Fischl, B., Andersson, J. L., Xu, J., Jbabdi, S., Webster, M., Polimeni, J. R., Van Essen, D. C., & Jenkinson, M. (2013). The minimal preprocessing pipelines for the Human Connectome Project. NeuroImage, 80, 105–124. https://doi.org/10.1016/j.neuroimage.2013.04.127

      Gordon, E. M., Laumann, T. O., Adeyemo, B., Huckins, J. F., Kelley, W. M., & Petersen, S. E. (2016). Generation and Evaluation of a Cortical Area Parcellation from Resting-State Correlations. Cerebral Cortex, 26(1), 288–303. https://doi.org/10.1093/cercor/bhu239

      Gratton, C., Laumann, T. O., Nielsen, A. N., Greene, D. J., Gordon, E. M., Gilmore, A. W., Nelson, S. M., Coalson, R. S., Snyder, A. Z., Schlaggar, B. L., Dosenbach, N. U. F., & Petersen, S. E. (2018). Functional Brain Networks Are Dominated by Stable Group and Individual Factors, Not Cognitive or Daily Variation. Neuron, 98(2), 439-452.e5. https://doi.org/10.1016/j.neuron.2018.03.035

      Hahn, T., Fisch, L., Ernsting, J., Winter, N. R., Leenings, R., Sarink, K., Emden, D., Kircher, T., Berger, K., & Dannlowski, U. (2021). From ‘loose fitting’ to high-performance, uncertainty-aware brain-age modelling. Brain, 144(3), e31–e31. https://doi.org/10.1093/brain/awaa454

      Harms, M. P., Somerville, L. H., Ances, B. M., Andersson, J., Barch, D. M., Bastiani, M., Bookheimer, S. Y., Brown, T. B., Buckner, R. L., Burgess, G. C., Coalson, T. S., Chappell, M. A., Dapretto, M., Douaud, G., Fischl, B., Glasser, M. F., Greve, D. N., Hodge, C., Jamison, K. W., … Yacoub, E. (2018). Extending the Human Connectome Project across ages: Imaging protocols for the Lifespan Development and Aging projects. NeuroImage, 183, 972–984. https://doi.org/10.1016/j.neuroimage.2018.09.060

      Insel, T., Cuthbert, B., Garvey, M., Heinssen, R., Pine, D. S., Quinn, K., Sanislow, C., & Wang, P. (2010). Research Domain Criteria (RDoC): Toward a New Classification Framework for Research on Mental Disorders. American Journal of Psychiatry, 167(7), 748–751. https://doi.org/10.1176/appi.ajp.2010.09091379

      Jirsaraie, R. J., Gorelik, A. J., Gatavins, M. M., Engemann, D. A., Bogdan, R., Barch, D. M., & Sotiras, A. (2023). A systematic review of multimodal brain age studies: Uncovering a divergence between model accuracy and utility. Patterns, 4(4), 100712. https://doi.org/10.1016/j.patter.2023.100712

      Jirsaraie, R. J., Kaufmann, T., Bashyam, V., Erus, G., Luby, J. L., Westlye, L. T., Davatzikos, C., Barch, D. M., & Sotiras, A. (2023). Benchmarking the generalizability of brain age models: Challenges posed by scanner variance and prediction bias. Human Brain Mapping, 44(3), 1118–1128. https://doi.org/10.1002/hbm.26144

      Marquand, A. F., Rezek, I., Buitelaar, J., & Beckmann, C. F. (2016). Understanding Heterogeneity in Clinical Cohorts Using Normative Models: Beyond Case-Control Studies. Biological Psychiatry, 80(7), 552–561. https://doi.org/10.1016/j.biopsych.2015.12.023

      Molnar, C. (2019). Interpretable Machine Learning. A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/

      Nimon, K., Lewis, M., Kane, R., & Haynes, R. M. (2008). An R package to compute commonality coefficients in the multiple regression case: An introduction to the package and a practical example. Behavior Research Methods, 40(2), 457–466. https://doi.org/10.3758/BRM.40.2.457

      Pat, N., Wang, Y., Anney, R., Riglin, L., Thapar, A., & Stringaris, A. (2022). Longitudinally stable, brain-based predictive models mediate the relationships between childhood cognition and socio-demographic, psychological and genetic factors. Human Brain Mapping, hbm.26027. https://doi.org/10.1002/hbm.26027

      Pat, N., Wang, Y., Bartonicek, A., Candia, J., & Stringaris, A. (2022). Explainable machine learning approach to predict and explain the relationship between task-based fMRI and individual differences in cognition. Cerebral Cortex, bhac235. https://doi.org/10.1093/cercor/bhac235

      Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12(85), 2825–2830.

      Poldrack, R. A., Huckins, G., & Varoquaux, G. (2020). Establishment of Best Practices for Evidence for Prediction: A Review. JAMA Psychiatry, 77(5), 534–540. hdps://doi.org/10.1001/jamapsychiatry.2019.3671

      Rasero, J., Sentis, A. I., Yeh, F.-C., & Verstynen, T. (2021). Integrating across neuroimaging modalities boosts prediction accuracy of cognitive ability. PLOS Computational Biology, 17(3), e1008347. hdps://doi.org/10.1371/journal.pcbi.1008347

      Robinson, E. C., Garcia, K., Glasser, M. F., Chen, Z., Coalson, T. S., Makropoulos, A., Bozek, J., Wright, R., Schuh, A., Webster, M., Huder, J., Price, A., Cordero Grande, L., Hughes, E., Tusor, N., Bayly, P. V., Van Essen, D. C., Smith, S. M., Edwards, A. D., … Rueckert, D. (2018). Multimodal surface matching with higher-order smoothness constraints. NeuroImage, 167, 453–465. hdps://doi.org/10.1016/j.neuroimage.2017.10.037

      Rokicki, J., Wolfers, T., Nordhøy, W., Tesli, N., Quintana, D. S., Alnæs, D., Richard, G., de Lange, A.-M. G., Lund, M. J., Norbom, L., Agartz, I., Melle, I., Nærland, T., Selbæk, G., Persson, K., Nordvik, J. E., Schwarz, E., Andreassen, O. A., Kaufmann, T., & Westlye, L. T. (2021). Multimodal imaging improves brain age prediction and reveals distinct abnormalities in patients with psychiatric and neurological disorders. Human Brain Mapping, 42(6), 1714–1726. hdps://doi.org/10.1002/hbm.25323

      Somerville, L. H., Bookheimer, S. Y., Buckner, R. L., Burgess, G. C., Curtiss, S. W., Dapretto, M., Elam, J. S., Gaffrey, M. S., Harms, M. P., Hodge, C., Kandala, S., Kastman, E. K., Nichols, T. E., Schlaggar, B. L., Smith, S. M., Thomas, K. M., Yacoub, E., Van Essen, D. C., & Barch, D. M. (2018). The Lifespan Human Connectome Project in Development: A large-scale study of brain connectivity development in 5–21 year olds. NeuroImage, 183, 456–468. https://doi.org/10.1016/j.neuroimage.2018.08.050

      Sperling, R. A., Bates, J. F., Cocchiarella, A. J., Schacter, D. L., Rosen, B. R., & Albert, M. S. (2001). Encoding novel face-name associations: A functional MRI study. Human Brain Mapping, 14(3), 129–139. https://doi.org/10.1002/hbm.1047

      Sripada, C., Angstadt, M., Rutherford, S., Kessler, D., Kim, Y., Yee, M., & Levina, E. (2019). Basic Units of Inter-Individual Variation in Resting State Connectomes. Scientific Reports, 9(1), Article 1. https://doi.org/10.1038/s41598-018-38406-5

      Sripada, C., Angstadt, M., Rutherford, S., Taxali, A., & Shedden, K. (2020). Toward a “treadmill test” for cognition: Improved prediction of general cognitive ability from the task activated brain. Human Brain Mapping, 41(12), 3186–3197. https://doi.org/10.1002/hbm.25007

      Tetereva, A., Li, J., Deng, J. D., Stringaris, A., & Pat, N. (2022). Capturing brain-cognition relationship: Integrating task-based fMRI across tasks markedly boosts prediction and test-retest reliability. NeuroImage, 263, 119588. https://doi.org/10.1016/j.neuroimage.2022.119588

      Vieira, B. H., Pamplona, G. S. P., Fachinello, K., Silva, A. K., Foss, M. P., & Salmon, C. E. G. (2022). On the prediction of human intelligence from neuroimaging: A systematic review of methods and reporting. Intelligence, 93, 101654. https://doi.org/10.1016/j.intell.2022.101654

      Vos De Wael, R., Benkarim, O., Paquola, C., Lariviere, S., Royer, J., Tavakol, S., Xu, T., Hong, S.J., Langs, G., Valk, S., Misic, B., Milham, M., Margulies, D., Smallwood, J., & Bernhardt, B. C. (2020). BrainSpace: A toolbox for the analysis of macroscale gradients in neuroimaging and connectomics datasets. Communications Biology, 3(1), 103. https://doi.org/10.1038/s42003-020-0794-7

      Woolrich, M. W., Ripley, B. D., Brady, M., & Smith, S. M. (2001). Temporal Autocorrelation in Univariate Linear Modeling of FMRI Data. NeuroImage, 14(6), 1370–1386. https://doi.org/10.1006/nimg.2001.0931

      Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x

    2. eLife assessment

      This useful manuscript challenges the utility of current paradigms for estimating brain-age with magnetic resonance imaging measures. It presents solid evidence to support the suggestion that an alternative approach focused on predicting cognition may be more beneficial. This work will be of interest to researchers working on brain-age and related models.

    3. Reviewer #1 (Public Review):

      In this paper, the authors evaluate the utility of brain-age derived metrics for predicting cognitive decline by performing a 'commonality' analysis in a downstream regression that enables the different contribution of different predictors to be assessed. The main conclusion is that brain-age derived metrics do not explain much additional variation in cognition over and above what is already explained by age. The authors propose to use a regression model trained to predict cognition ('brain cognition') as an alternative suited to applications of cognitive decline. While this is less accurate overall than brain age, it explains more unique variance in the downstream regression.
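
      As a rough illustration of the 'commonality' analysis summarised above, the sketch below partitions the variance in an outcome into the parts explained uniquely by each of two predictors and the part they share. It is a minimal two-predictor sketch on simulated data, not the authors' pipeline: the function names (`r_squared`, `commonality`), the variable names, and the simulated effect sizes are all assumptions made for illustration.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X, with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1.0 - resid.var() / y.var()


def commonality(a, b, y):
    """Two-predictor commonality decomposition of the full-model R^2.

    unique_a: variance explained by a over and above b
    unique_b: variance explained by b over and above a
    common:   variance explained jointly by a and b
    """
    r2_a = r_squared(a, y)
    r2_b = r_squared(b, y)
    r2_ab = r_squared(np.column_stack([a, b]), y)
    return {
        "unique_a": r2_ab - r2_b,
        "unique_b": r2_ab - r2_a,
        "common": r2_a + r2_b - r2_ab,
        "total": r2_ab,
    }


# Simulated (hypothetical) data: a brain-derived age estimate that tracks
# chronological age, and a cognition score that depends on age alone.
rng = np.random.default_rng(0)
age = rng.normal(size=500)
brain_age = age + rng.normal(scale=0.5, size=500)
cognition = -0.6 * age + rng.normal(size=500)

parts = commonality(age, brain_age, cognition)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

      By construction the three components sum to the full-model R², so a predictor with a small unique component adds little once the other predictor is in the model, which is the kind of comparison the review describes.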

      Comments on revised version:

      I thank the authors for the revision of the manuscript and for being more explicit about the inherent conceptual limitations of Brain Age / Brain Cognition. I have no further comments.

    1. eLife assessment

      This work explores the role of one of the most abundant circRNAs, circHIPK3, in bladder cancer cells, showing with convincing data that circHIPK3 depletion affects thousands of genes and that those downregulated (including STAT3) share an 11-mer motif with circHIPK3, corresponding to a binding site for IGF2BP2. The experiments demonstrate that circHIPK3 can compete with the downregulated mRNA targets for IGF2BP2 binding and that IGF2BP2 depletion antagonizes the effect of circHIPK3 depletion by upregulating the genes containing the 11-mer. These important findings contribute to the growing recognition of the complexity of cancer signaling regulation and highlight the intricate interplay between circRNAs and protein-coding genes in tumorigenesis.

    2. Reviewer #1 (Public Review):

      In this work the authors propose a new regulatory role for one of the most abundant circRNAs, circHIPK3. They demonstrate that circHIPK3 interacts with an RNA binding protein (IGF2BP2), sequestering it away from its target mRNAs. This interaction is shown to regulate the expression of hundreds of genes that share a specific sequence motif (11-mer motif) in their untranslated regions (3'-UTR), identical to one present in circHIPK3 where IGF2BP2 binds. The study further focuses on the specific case of the STAT3 gene, whose mRNA product is found to be downregulated upon circHIPK3 depletion. This suggests that circHIPK3 sequesters IGF2BP2, preventing it from binding to and destabilizing STAT3 mRNA. The study presents evidence supporting this mechanism and discusses its potential role in tumor cell progression. These findings contribute to the growing complexity of understanding cancer regulation and highlight the intricate interplay between circRNAs and protein-coding genes in tumorigenesis.

      Strengths:

      The authors show mechanistic insight into a proposed novel "sponging" function of circHIPK3 which is not mediated by sequestering miRNAs but rather a specific RNA binding protein (IGF2BP2). They address the stoichiometry of the molecules involved in the interaction, which is a critical aspect that is frequently overlooked in this type of study. They provide both genome-wide analysis and a specific case (STAT3) which is relevant for cancer progression. Overall, the authors have significantly improved their manuscript in their revised version.

      Weaknesses:

      There are seemingly contradictory effects of circHIPK3 and STAT3 depletion in cancer progression. However, the authors have addressed these issues in their revised manuscript, incorporating potential reasons that might explain such complexity.

    3. Reviewer #2 (Public Review):

      The manuscript by Okholm and colleagues identified an interesting new instance of ceRNA involving a circular RNA. The data are clearly presented and support the conclusions. Quantification of the copy number of circRNA and quantification of the protein were performed, and this is important to support the ceRNA mechanism.

      This is the second rebuttal and the authors further improved the manuscript. The data are of interest for the large spectrum of readers of the journal.

    4. Reviewer #3 (Public Review):

      Summary:

      In Okholm et al., the authors evaluate the functional impact of circHIPK3 in bladder cancer cells. By knocking it down and performing an RNA-seq analysis, the authors found thousands of deregulated genes which look unaffected by the miRNA-sponging function and that are, instead, enriched for an 11-mer motif. Further investigations showed that the 11-mer motif is shared with circHIPK3 and is able to bind the IGF2BP2 protein. The authors validated the binding of IGF2BP2 and demonstrated that IGF2BP2 KD antagonizes the effect of circHIPK3 KD and leads to the upregulation of genes containing the 11-mer. Among the genes affected by circHIPK3 KD and IGF2BP2 KD, resulting in downregulation and upregulation respectively, the authors found the STAT3 gene, which also consistently leads to the concomitant upregulation of one of its targets, TP53. The authors propose a mechanism of competition between circHIPK3 and IGF2BP2 triggered by IGF2BP2 nucleation, potentially via phase separation.

      Strengths:

      The number of circRNAs continues to grow drastically; however, the field lacks detailed molecular investigations. The presented work critically addresses some of the major pitfalls in the field of circRNAs, and there has been a careful analysis of aspects frequently poorly investigated. The time-point KD followed by RNA-seq, investigation of the miRNA-sponge function of circHIPK3, identification of the 11-mer motif, identification and validation of IGF2BP2, and the analysis of copy number ratio between circHIPK3 and IGF2BP2 in assessing the potential ceRNA mode of action have been extensively explored and are, taken together, convincing.

      Weaknesses:

      The authors addressed the majority of the weak points raised initially. However, the role played by circHIPK3 in cancer remains elusive and is not fully elucidated in this study.

      Overall, the presented study surely adds some further knowledge in describing circHIPK3 function, its capability to regulate some downstream genes, and its interaction and competition for IGF2BP2. However, whereas the experimental part is technically sound, the overall goal of this study and its final conclusions remain unclear.

      This study is a promising step forward in the comprehension of the functional role of circHIPK3. These data could possibly help to better understand the role of circHIPK3 in cancer.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment 

      This study explores the role of one of the most abundant circRNAs, circHIPK3, in bladder cancer cells, providing convincing data that circHIPK3 depletion affects thousands of genes and that those downregulated (including STAT3) share an 11-mer motif with circHIPK3, corresponding to a binding site for IGF2BP2. The experiments demonstrate that circHIPK3 can compete with the downregulated mRNA targets for IGF2BP2 binding and that IGF2BP2 depletion antagonizes the effect of circHIPK3 depletion by upregulating the genes containing the 11-mer motif. These valuable findings contribute to the growing recognition of the complexity of cancer signaling regulation and highlight the intricate interplay between circRNAs and protein-coding genes in tumorigenesis.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In this work the authors propose a new regulatory role for one of the most abundant circRNAs, circHIPK3. They demonstrate that circHIPK3 interacts with an RNA binding protein (IGF2BP2), sequestering it away from its target mRNAs. This interaction is shown to regulate the expression of hundreds of genes that share a specific sequence motif (11-mer motif) in their untranslated regions (3'-UTR), identical to one present in circHIPK3 where IGF2BP2 binds. The study further focuses on the specific case of the STAT3 gene, whose mRNA product is found to be downregulated upon circHIPK3 depletion. This suggests that circHIPK3 sequesters IGF2BP2, preventing it from binding to and destabilizing STAT3 mRNA. The study presents evidence supporting this mechanism and discusses its potential role in tumor cell progression. These findings contribute to the growing complexity of understanding cancer regulation and highlight the intricate interplay between circRNAs and protein-coding genes in tumorigenesis.

      Strengths:

      The authors show mechanistic insight into a proposed novel "sponging" function of circHIPK3 which is not mediated by sequestering miRNAs but rather a specific RNA binding protein (IGF2BP2). They address the stoichiometry of the molecules involved in the interaction, which is a critical aspect that is frequently overlooked in this type of study. They provide both genome-wide analysis and a specific case (STAT3) which is relevant for cancer progression. Overall, the authors have significantly improved their manuscript in their revised version.

      Weaknesses:

      While the authors have performed northern blots to measure circRNA levels, an estimation of the circRNA overexpression efficiency, namely the circular-to-linear expression ratio, would be desired. The seemingly contradictory effects of circHIPK3 and STAT3 depletion in cancer progression are now addressed by the authors in their revised manuscript, incorporating potential reasons that might explain such complexity.

      We have now included a full version of the northern blot, where no discernible linear precursor can be detected, supporting efficient circHIPK3 WT and circHIPK3 MUT production (please see the detailed description in the specific comments below). We agree that extrapolating from the observations on STAT3 homeostasis to cancer progression is not straightforward, as discussed.

      Reviewer #2 (Public Review):

      Summary: 

      The authors have diligently addressed most of the points raised during the review process (except the important point of "additional in vitro experiments [...] needed to investigate the implication of circHIPK3 in bladder cancer cell phenotype" for which no additional experiments were performed), resulting in an improvement in the study. The data are now described with clarity and conciseness, enhancing the overall quality of the manuscript. 

      Strengths: 

      New, well-defined molecular mechanism of circRNAs involvement in bladder cancer. 

      Weaknesses: 

      Lack of solid translational significance data. 

      The focus of this study has been to disclose molecular mechanisms of action by circHIPK3, with implications for cancer. We agree that further studies are needed to fully understand the impact of circHIPK3 in bladder cancer.  

      Reviewer #3 (Public Review):

      In Okholm et al., the authors evaluate the functional impact of circHIPK3 in bladder cancer cells. By knocking down circHIPK3 and performing an RNA-seq analysis, the authors found thousands of deregulated genes which look unaffected by the miRNA-sponging function and that are, instead, enriched for an 11-mer motif. Further investigations showed that the 11-mer motif is shared with circHIPK3 and is able to bind the IGF2BP2 protein. The authors validated the binding of IGF2BP2 and demonstrated that IGF2BP2 KD antagonizes the effect of circHIPK3 KD and leads to the upregulation of genes containing the 11-mer. Among the genes affected by circHIPK3 KD and IGF2BP2 KD, resulting in downregulation and upregulation respectively, the authors found the STAT3 gene, which also consistently has concomitant upregulation of one of its targets, TP53. The authors propose a mechanism of competition between circHIPK3 and IGF2BP2 triggered by IGF2BP2 nucleation, potentially via phase separation.

      Strengths: 

      Although the number of circRNAs continues to grow, this field lacks many instances of detailed molecular investigations. The presented work critically addresses some of the major pitfalls in the field of circRNAs, and there has been a careful analysis of aspects frequently poorly investigated. Experiments involving use of time-point knockdown followed by RNA-seq, investigation of miRNA-sponge function of circHIPK3, identification of 11-mer motif, identification and validation of IGF2BP2, and the analysis of copy number ratio between circHIPK3 and IGF2BP2 in assessing the potential ceRNA mode of action are thorough and convincing.

      Weaknesses: 

      It is unclear why the authors used certain bladder cancer cells versus non-bladder cells in some experiments. The efficacy of certain experiments (specifically rescue experiments) and some control conditions is still questionable. Overall, the presented study adds some further knowledge in describing circHIPK3 function, its capability to regulate some downstream genes, and its interaction and competition for IGF2BP2. 

      We have provided a discussion and argumentation of how certain bladder cancer cells (and non-bladder cancer cells) have been used in this study in our previous rebuttal letter and also clarified this further in the materials and methods section in the first revision. Regarding control conditions for experiments, we believe we have included all necessary controls and explanations for these in the revised version (please see the detailed description in the specific comments below). 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major points about revised manuscript

      (1) In Supplementary Figure S5H, the membrane may have been trimmed too closely to the circRNA band, potentially resulting in the absence of the linear RNA band. Could the authors provide a full image of the membrane that includes the loading points? Having access to the complete image would allow for a more comprehensive evaluation of the results, including the presence or absence of expected linear and circular RNA bands.

      I have taken the liberty to move this “major point” from the public review section, as I believe it would be too detailed for this section. We have included the full section of the northern blot, according to the reviewer's recommendation.

      As described in the previous rebuttal letter, our northern blots suffered from heavy background signal arising from the rRNA bands, which was the reason for cutting the northern blot in the previous version of Supplementary Figure S5H. We have now shown the entire blot as suggested by the reviewer, so that the reader can more clearly inspect any potential linear precursor band. We previously stated that we could not assess the circular-to-linear ratio due to background signal, since a potential linear HIPK3 precursor RNA could be masked by the rRNA signal. However, the theoretical size of a linear precursor is ~2.9 kb, a region where we do not detect any distinct bands (just above the 18S band), making a rather efficient circularization very likely. In support of this claim, we are using the Laccase2 vector described in Kramer et al., 2015 (Genes Dev), which is proven to produce high levels of circHIPK3 compared to negligible amounts of linear precursor (although in a different cell line). We have also included a 5.8S rRNA probe to control for loading and RNase R activity (which can also be ascertained by the disappearance of the 18S/28S bands). Since we do not have the option to use another probe (limited by the BSJ-specific probe) and it is not practical to deplete rRNA from 20 µg samples of total RNA prior to running the northern blot, we find that these data sufficiently prove that our vector constructs produce a decent amount of RNase R-resistant circHIPK3, with no visible/discernible linear precursor.

      Minor points about revised manuscript

      (1) In Supplementary Figure S3B, the authors offer no explanation as to why genes that become upregulated upon circHIPK3 knockdown generally contain more circHIPK3-RBP binding sites other than for IGF2BP2. A clarification would be of help.

      Again, this issue has been addressed in the previous rebuttal letter. Our response is repeated below:

      We do not have any evidence to explain this observation. One possibility is that other RBPs elicit mRNA-stabilizing effects on average, whereas abundant IGF2BP2 (~120,000-200,000 copies per cell) is now able to bind more target mRNAs and elicit destabilization. This remains highly speculative though.

      (2) In Supplementary Figure S3D, the authors' claim that the 11-mer motif is found more bound to IGF2BP2 than to other circHIPK3-RBPs should be referred to the corresponding dataset/reference.

      Again, this issue has been addressed in the previous rebuttal letter. Our response is repeated below:

      This information is stated in the figure legend (K562) and we have now included it in the main text as well: “We evaluated how often binding sites of circHIPK3-RBPs overlap the 11-mer motif and found that this is more often the case for IGF2BP2 binding sites than binding sites of the other circHIPK3-RBPs when scrutinizing K562 datasets (Supplementary Figure S3D)”.

      (3) In the rescue experiment where both circHIPK3 and IGF2BP2 are downregulated, using the term "normalization" to mean reestablishing normal levels of gene expression can lead to confusion with the concept of normalization as it is commonly understood in the context of data processing (i.e. the mathematical process of adjusting data to account for various factors that might affect measurements). I would recommend the authors to use a term that more specifically describes the biological process they are referring to, such as "restoration of normal expression levels" or simply "return to normal levels".

      We agree that this term could be misunderstood. This has now been changed as recommended.

      (4) The figure legend of Supplementary Figure 5F is wrongly labeled. The legend for panel F actually corresponds to panel G and vice versa. 

      This has now been corrected.  

      Reviewer #2 (Recommendations For The Authors): 

      The authors have diligently addressed most of the points raised during the review process (except the important point of "additional in vitro experiments [...] needed to investigate the implication of circHIPK3 in bladder cancer cell phenotype" for which no additional experiments were performed), resulting in an improvement in the study. The data are now described with clarity and conciseness, enhancing the overall quality of the manuscript. Therefore, I support the publication of this work. 

      We thank the reviewer for the positive comments.

      Reviewer #3 (Recommendations For The Authors): 

      Please ensure that when the changes are made (especially for major points) by addressing the reviewer's comments, these are all appropriately incorporated in the text (for example, the use of Act B as a low affinity positive control (now in Fig 4A) is explained neither in the text nor in the legends/methods).

      This has now been included.

      Please ensure that all the legends correspond to the right figures (eg: Supplementary Figure with rescue experiment is 5F, but the corresponding legend in the manuscript is the S5G). 

      This has now been corrected.

      For future reviewing processes, please ensure that the new parts are properly highlighted or coloured differently in the manuscript.

      This has now been done more thoroughly.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Below, we provide a detailed account of the changes we made. For clarity and ease of review:

      • Original reviewers' comments are included and highlighted in grey

      • Our responses to each comment are written in black text

      • Print screens illustrating the specific changes made to the manuscript are enclosed within black squares

      eLife assessment

      The authors aim to develop a CRISPR system that can be activated upon sensing an RNA. As an initial step to this goal, they describe RNA-sensing guide RNAs for controlled activation of CRISPR modification. Many of the data look convincing and while several steps remain to achieve the stated goal in an in vivo setting and for robust activation by endogenous RNAs, the current work will be important for many in the field.  

      The eLife assessment summarises our ambition to create a CRISPR system controlled by RNA sensing. The synopsis provided encapsulates the essence of our research, emphasising both the progress we have made and the challenges that lie ahead. This assessment fully resonates with our views.

      Public Reviews:

      Reviewer #1 (Public Review):

      This paper describes RNA-sensing guide RNAs for controlled activation of CRISPR modification. This works by having an extended guide RNA with a sequence that folds back onto the targeting sequence such that the guide RNA cannot hybridise to its genomic target. The CRISPR is "activated" by the introduction of another RNA, referred to as a trigger, that competes with this "back folding" to make the guide RNA available for genome targeting. The authors first confirm the efficacy of the approach using several RNA triggers and a GFP reporter that is activated by dCas9 fused to transcriptional activators. A major potential application of this technique is the activation of CRISPR in response to endogenous biomarkers. As these will typically be longer than the first generation triggers employed by the authors they test some extended triggers, which also work though not always to the same extent. They then introduce MODesign which may enable the design of bespoke or improved triggers. After that, they determine that the mode of activation by the RNA trigger involves cleavage of the RNA complexes. Finally, they test the potential for their system to work in a developmental setting - specifically zebrafish embryos. There is some encouraging evidence, though the effects appear more subtle than those originally obtained in cell culture. 

      Overall, the potential of a CRISPR system that can be activated upon sensing an RNA is high and there are a myriad of opportunities and applications for it. This paper represents a reasonable starting point having developed such a system in principle. 

      The weakness of the study is that it does not demonstrate that the system can be used in a completely natural setting. This would require an endogenous transcript as the RNA trigger with a clear readout. Such an experiment would clearly strengthen the paper and provide strong confidence that the method could be employed for one of the major applications discussed by the authors. The zebrafish data relied on exogenous RNA triggers whereas the major applications (as I understood them) would use endogenous triggers. 

      Related, most endogenous RNAs are longer than the various triggers tested and may require extensive modification of the system to be detected or utilised effectively. 

      While additional data would clearly be beneficial, there should nevertheless be a more detailed discussion of these caveats and/or the strengths and applications of the system as it is presented (i.e. utility with synthetic triggers).  

      We agree with the observation regarding the subtler effects in the zebrafish embryos and the reliance on exogenous RNA triggers. Indeed, the utilisation of endogenous transcripts as triggers in a natural setting is a logical next step. We further acknowledge the need to delve deeper into the complexities and challenges of our system, particularly concerning the detection of endogenous RNA, thus offering valuable insights for researchers looking to adapt our system for various applications. To clarify these limitations, we made some changes in the final version of our paper. The following paragraphs have therefore been included in the manuscript discussion:

      “In their current iteration, iSBH-sgRNAs show considerable promise for mammalian synthetic biology applications. Specifically, their ability to detect synthetic triggers could be pivotal in the development of complex synthetic RNA circuits and logic gates, thereby advancing the field of cellular reprogramming. However, further work is required to achieve better ON/OFF activation ratios in vivo and more homogeneous activity across tissues in the presence of RNA triggers. Additional chemical modifications could improve iSBH-sgRNA properties, and we believe that chemical modification strategies adopted for siRNA drugs or antisense oligos (Khvorova and Watts (2017)) could also be essential for further iSBH-sgRNA technology development. As iSBH-sgRNAs might be targeted by endogenous nucleases, leading to their degradation, a strategy for preventing this could involve additional chemical modifications. When inserted at certain key positions, such modifications could prevent interaction between iSBH-sgRNAs and cellular enzymes by introducing steric clashes or inhibiting RNA hydrolysis.

      Once achieving superior dynamic ranges of iSBH-sgRNA activation in vivo, the next steps would involve understanding the classes of endogenous RNAs that could act as triggers. The chances that an iSBH-sgRNA encounters an endogenous RNA trigger inside a cell would depend on the relative concentrations of the two RNA species. Therefore, a first step towards determining potential endogenous RNA triggers will involve identifying RNA species with comparable expression levels as iSBH-sgRNAs. Then, iSBH-sgRNAs could be designed against these RNA species, followed by experimental validation. It is important to note that eukaryotic cells express a wide range of transcripts of varying sizes, expression levels, and subcellular localisations, all of which could greatly affect iSBH-sgRNA activation levels. Based on the data presented here, we speculate that RNA species up to 300nt that are also highly expressed might act as good triggers. Furthermore, as sgRNAs are involved in targeting Cas9 to genomic DNA in the nucleus, attempting to detect transcripts that are sequestered in the nucleus might also provide additional benefit.”

      Reviewer #3 (Public Review):

      In this work, the authors describe engineering of sgRNAs that render Cas9 DNA binding controllable by a second RNA trigger. The authors introduce several iterations of their engineered sgRNAs, as well as a computational pipeline to identify designs for user-specified RNA triggers which offers a helpful alternative to purely rational design. Also included is an investigation of the fate of the engineered sgRNAs when introduced into cells, and the use of this information to inform installation of modified nucleotides to improve engineered sgRNA stability. Engineered sgRNAs are demonstrated to be activated by trigger RNAs in both cultured mammalian cells and zebrafish. 

      The conclusions made by the authors in this work are predominantly supported by the data provided. However, some claims are not consistent with the data shown and some of the figures would benefit from revision or further clarification. 

      Strengths: 

      - The sgRNA engineering in this paper is performed and presented in a systematic and logical fashion.

      - Inclusion of a computational method to predict iSBH-sgRNAs adds to the strength of the engineering. 

      - Investigation into the cellular fate of the engineered sgRNAs and the use of this information to guide inclusion of chemically modified nucleotides is also a strength. 

      - Demonstration of activity in both cultured mammalian cells and in zebrafish embryos increases the impact and utility of the technology reported in this work. 

      Weaknesses: 

      - While the methods here represent an important step forward in advancing the technology, they still fall short of the dynamic range and selectivity likely required for robust activation by endogenous RNA.

      - While the iSBH-sgRNAs where the RNA trigger overlaps with the spacer appear to function robustly, the modular iSBH-sgRNAs seem to perform quite a bit less well. The authors state that modular iSBH-sgRNAs show better activity without increasing background when the SAM system is added, but this is not supported by the data shown in Figure 3D, where in 3 out of 4 cases CRISPR activation in the absence of the RNA trigger is substantially increased.

      - There is very little discussion of how the performance of the technology reported in this work compares to previous iterations of RNA-triggered CRISPR systems, of which there are many examples.  

      Concerning the methods falling short of the dynamic range and selectivity required for robust activation by endogenous RNA, we acknowledge this limitation and recognise the need for improvement in this area. In the resubmitted version of the manuscript, we provided a detailed discussion on how the selection of appropriate triggers might partially improve dynamic ranges and selectivity. This includes an exploration of various strategies and considerations that may enhance the robustness of our system (print screen above, also used for addressing Reviewer #1 comments). 

      Regarding the inconsistent performance of the modular iSBH-sgRNAs, we acknowledge that modular iSBH-sgRNAs seem to perform slightly less well than first- and second-generation designs. To illustrate this, we modified the corresponding bar graphs to include fold turn-on of iSBH-sgRNA activation in addition to significance (Figures 1, 2 and 3 of the manuscript). We also acknowledge this in the text, recognise the discrepancy in Figure 3.D, and provide further clarification. To help convey this message, we introduced a new figure (Figure 3 - figure supplement 2) with bar graphs that accompany the heat map shown in Figure 3.D. These changes are documented below:

      “…promoters. We ran 11 MODesign simulations for each trigger, incrementally extending the loop size while keeping the sgRNA 2 spacer input constant. HEK293T validation experiments showed that choosing modular iSBH-sgRNAs that detect the 4 U6-expressed triggers is possible (Figure 3.D, Figure 3 - figure supplement 1.C). Despite not performing quite as well as second-generation designs (Figure 2.A, Figure 3.D), modular iSBH-sgRNA still enable efficient RNA detection, especially for smaller RNAs such as triggers A and D. For highly efficient designs such as modular iSBH-sgRNA (D), addition of the SAM effector system (Konermann et al. (2015)) boosted ON-state activation with only a negligible increase in the OFF-state non-specific activation. Orthogonality tests suggested that activation of modular iSBH-sgRNA designs was specifically conditioned by complementary RNA triggers (Figure 3.E, Figure 3 - figure supplement 2), showing the exquisite specificity of the system.”

      Author response image 1.

      This supplementary figure reinterprets the data presented in Figure 3.E. using bar plots for enhanced clarity and comparison. It depicts the results of cotransfecting HEK293T cells with four modular iSBH-sgRNAs (A, B, C, and D) and examines all combinations of iSBH-sgRNA: RNA trigger pairings. The bar plots provide a visual representation of mean values with error bars indicating the standard deviation, based on three biological replicates.
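
      The summary statistics behind such bar plots can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that "fold turn-on" means the ratio of the mean ON-state signal (trigger present) to the mean OFF-state signal (trigger absent), and the function names and reporter values are hypothetical.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, sample standard deviation) of replicate measurements."""
    return mean(replicates), stdev(replicates)


def fold_turn_on(on_values, off_values):
    """Assumed definition: mean ON-state signal divided by mean OFF-state signal."""
    return mean(on_values) / mean(off_values)


# Hypothetical reporter readouts (arbitrary units), three biological replicates each.
off_state = [1.0, 1.5, 0.5]    # iSBH-sgRNA without its RNA trigger
on_state = [9.0, 11.0, 10.0]   # iSBH-sgRNA with its complementary RNA trigger

on_mean, on_sd = summarize(on_state)     # bar height and error bar, ON state
off_mean, off_sd = summarize(off_state)  # bar height and error bar, OFF state
fold = fold_turn_on(on_state, off_state)
```

      Reporting the ratio alongside the OFF-state mean makes it visible when a larger fold change comes with a higher background, which is the concern raised about Figure 3.D.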

      Regarding the concern about the lack of comparison with previous iterations of RNA-triggered CRISPR systems, we also acknowledged other similar technologies within the discussion. We also point readers to a literature review we recently published (doi/full/10.1089/crispr.2022.0052) where we describe other similar technologies in more detail.

      “To date, a variety of RNA-inducible gRNA designs have been developed (Hanewich-Hollatz et al. (2019); Hochrein et al. (2021); Jakimo et al. (2018); Jiao et al. (2021); Jin et al. (2019); Li et al. (2019); Liu et al. (2022); Lin et al. (2020); Siu and Chen (2019); Galizi et al. (2020); Hunt and Chen (2022b,a); Ying et al. (2020); Choi et al. (2023)). Nevertheless, there is a lack of direct, head-to-head comparisons of these designs under standardised experimental conditions. Some designs were evaluated in vitro, others in bacterial systems, and some in mammalian cells. Consequently, it is challenging to conclusively determine which design exhibits superior properties (Pelea et al. (2022)). Notably, to the best of our knowledge, the iSBH-sgRNA system is the first RNA-inducible gRNA design tested in vivo and characterising the iSBH-sgRNA activation mechanism was essential for implementing iSBH-sgRNA technology in zebrafish embryos. In vivo, chemical modifications in the spacer sequence were vital for iSBH-sgRNA stability and function.”

    2. eLife assessment

      The authors aim to develop a CRISPR system that can be activated upon sensing an RNA. As an initial step to this goal, they describe RNA-sensing guide RNAs for controlled activation of CRISPR modification. Many of the data look convincing and while several steps remain to achieve the stated goal in an in vivo setting and for robust activation by endogenous RNAs, the current work will be important for many in the field.

    3. Reviewer #1 (Public Review):

      This paper describes RNA-sensing guide RNAs for controlled activation of CRISPR modification. This works by having an extended guide RNA with a sequence that folds back onto the targeting sequence such that the guide RNA cannot hybridise to its genomic target. The CRISPR is "activated" by the introduction of another RNA, referred to as a trigger, that competes with this "back folding" to make the guide RNA available for genome targeting. The authors first confirm the efficacy of the approach using several RNA triggers and a GFP reporter that is activated by dCas9 fused to transcriptional activators. A major potential application of this technique is the activation of CRISPR in response to endogenous biomarkers. As these will typically be longer than the first generation triggers employed by the authors they test some extended triggers, which also work though not always to the same extent. They then introduce MODesign which may enable the design of bespoke or improved triggers. After that, they determine that the mode of activation by the RNA trigger involves cleavage of the RNA complexes. Finally, they test the potential for their system to work in a developmental setting - specifically zebrafish embryos. There is some encouraging evidence, though the effects appear more subtle than those originally obtained in cell culture.

      Overall, the potential of a CRISPR system that can be activated upon sensing an RNA is high and there are a myriad of opportunities and applications for it. This paper represents a reasonable starting point having developed such a system in principle. The weakness of the study is that it does not demonstrate that the system can be used in a completely natural setting. This would require an endogenous transcript as the RNA trigger with a clear readout. The authors now acknowledge this limitation in their revised manuscript. Future studies and experiments should focus on these aspects in order for the system to be employed to its full and intended potential.

    1. eLife assessment

      This study presents valuable findings describing how two brain regions, the midbrain periaqueductal gray matter and basolateral amygdala, communicate when a predator threat is detected. Though the periaqueductal gray is usually viewed as a downstream effector, this work contributes to a growing body of literature from this lab showing that the periaqueductal gray produces effects by acting on the basolateral amygdala. The experimental design, data collection and analysis methods provide solid evidence for the main claims. Although anatomical and immediate early gene results suggest the paraventricular nucleus of the thalamus may serve as a mediator of dorsolateral periaqueductal gray to basolateral amygdala neurotransmission, this finding would benefit from a functional assessment. This study will appeal to a broad audience, including basic scientists interested in neural circuits, basic and clinical researchers interested in fear, and behavioral ecologists interested in foraging.

    2. Reviewer #1 (Public Review):

      In the presence of predators, animals display attenuated foraging responses and increased defensive behaviors that serve to protect them from potential predatory attacks. Previous studies have shown that the basolateral nucleus of the amygdala (BLA) and the periaqueductal gray matter (PAG) are necessary for the acquisition and expression of conditioned fear responses. However, it remains unclear how BLA and PAG neurons respond to predatory threats when animals are foraging for food. To address this question, Kim and colleagues conducted in vivo electrophysiological recordings from BLA and PAG neurons and assessed approach-avoidance responses while rats search for food in the presence of a robotic predator.

      The authors observed that rats exhibited a significant increase in the latency to obtain the food pellets and a reduction in the pellet success rate when the predator robot was activated. A subpopulation of PAG neurons showing increased firing rate in response to the robot activation didn't change their activity in response to food pellet retrieval during the pre- or post-robot sessions. Optogenetic stimulation of PAG neurons increased the latency to procure the food pellet in a frequency- and intensity-dependent manner, similar to what was observed during the robot test. Combining optogenetics with single-unit recordings, the authors demonstrated that photoactivation of PAG neurons increased the firing rate of 10% of BLA cells. A subsequent behavioral test in 3 of these same rats demonstrated that BLA neurons responsive to PAG stimulation displayed higher firing rates to the robot than BLA neurons nonresponsive to PAG stimulation. Next, because the PAG does not project monosynaptically to the BLA, the authors used a combination of retrograde and anterograde neural tracing to identify possible regions that could convey robot-related information from PAG to the BLA. They observed that neurons in specific areas of the paraventricular nucleus of the thalamus (PVT) that are innervated by PAG fibers contained neurons that were retrogradely labeled by the injection of CTB in the BLA. In addition, PVT neurons showed increased expression of the neural activity marker cFos after the robot test, suggesting that PVT may be a mediator of PAG signals to the BLA.

      Overall, the idea that the PAG interacts with the BLA via the midline thalamus during a predator vs. foraging test is new and quite interesting. The authors have used appropriate tools to address their questions.

      In this revised version of the manuscript, the authors have made important modifications in the text, inserted new data analyses, and incorporated additional references, as recommended by the reviewers. These modifications have significantly improved the quality of the manuscript.

    3. Reviewer #2 (Public Review):

      The authors characterized activity of the dorsal periaqueductal gray (dPAG) - basolateral amygdala (BLA) circuit. They show that BLA cells that are activated by dPAG stimulation are also more likely to be activated by a robot predator. These same cells are also more likely to display synchronous firing.

      The authors also replicate prior results showing that dPAG stimulation evokes fear and the dPAG is activated by a predator.

      Lastly, the report performs anatomical tracing to show that the dPAG may act on the BLA via the paraventricular thalamus (PVT). Indeed, the PVT receives dPAG projections and also projects to the BLA. However, the authors do not show if the PVT mediates dPAG to BLA communication with any functional behavioral assay. Furthermore, the authors also do not thoroughly characterize the activity of BLA cells during the predatory assay.

      The major impact in the field would be to add evidence to their prior work, strengthening the view that the BLA can be downstream of the dPAG.

    4. Reviewer #3 (Public Review):

      In the present study, the authors examined how dPAG neurons respond to predatory threats and how dPAG and BLA communicate threat signals. The authors employed single-unit recording and optogenetics tools to address these issues in an 'approach food-avoid predator' paradigm. They characterized dPAG and BLA neurons responsive to a looming robot predator and found that dPAG opto-stimulation elicited fleeing and increased BLA activity. Importantly, they found that dPAG stimulation produces activity changes in subpopulations of BLA neurons related to predator detection, thus supporting the idea that dPAG conveys innate fear signals to the amygdala. In addition, injections of anterograde and retrograde tracers into the dPAG and BLA, respectively, along with the examination of c-FOS activity in midline thalamic relay stations, suggest that the paraventricular nucleus of the thalamus (PVT) may serve as a mediator of dPAG to BLA neurotransmission. Of relevance, the study helps to validate an important concept that dPAG mediates primal fear emotion and may engage upstream amygdalar targets to evoke defensive responses. The series of experiments provide a compelling case for supporting their conclusions. The study brings important concepts revealing dynamics of fear-related circuits particularly attractive to a broad audience, from basic scientists interested in neural circuits to psychiatrists.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      We sincerely value the insightful and constructive feedback (italicized) provided by the reviewers, which has been instrumental in identifying areas of our manuscript that required further clarification or amendment. In response to these valuable comments, we have significantly revised the manuscript to enhance clarity and accuracy. Specifically, we have corrected an oversight related to the robot’s velocity and secondary antibody ratios, and addressed previously missing values in Figs. 3E and 4E. Importantly, these corrections did not alter the outcomes of our results. Additionally, we have enriched our manuscript with new data analyses, as reflected in Figures 1B, 1F, 2H-J, 4D, 4F-H, S1A, S1C-E, S3H, S5, and Table 1, ensuring a more comprehensive presentation of our findings. Below are our responses detailing each comment and explaining the modifications integrated into the revised manuscript.

      Reviewer 1:

      (1) To address the question of whether PAG photostimulation biases the cells that respond to the robot, a counterbalanced experiment, in which the BLA activity is initially recorded during the foraging vs. robot test and the PAG stimulation happens at the end of the session, should have been performed.

      In our study, we investigated fear behavior and BLA cell responses to intrinsic dPAG photostimulation (320 pulses) in naïve animals, followed by their reactions to an extrinsic predatory robot. We recognize the reviewer's concern regarding the potential  influence of initial dPAG photostimulation on BLA neuron responses to the robot. We address this issue in our discussion (pg. 13) as follows: “However, it is crucial to consider the recent discovery that optogenetic stimulation of CA3 neurons (3000 pulses) leads to gain-of-function changes in CA3-CA3 recurrent (monosynaptic) excitatory synapses (Oishi et al., 2019). Although there is no direct connection between dPAG neurons and the BLA (Vianna and Brandao 2003, McNally, Johansen, and Blair 2011, Cameron et al. 1995), and no studies have yet demonstrated gain-of-function changes in polysynaptic pathways to our knowledge, the potential for our dPAG photostimulation (320 pulses) to induce similar changes in amygdalar neurons, thereby enhancing their sensitivity to predatory threats, cannot be dismissed.”

      (2) In Figure 3, it is unclear which criteria (e.g. response latency, minimum Z score, spike fidelity) were used to identify the BLA neurons that were indirectly activated by PAG stimulation. A graphic containing at least the distribution of the response latencies for each BLA neuron after PAG laser activation is needed.

      We have specified the criteria for determining the responsiveness of BLA neurons to dPAG stimulation on page 22. This involves analyzing the first 500-ms post-stimulation across five 0.1-s bins. Units were classified as ‘stim cells’ if they showed z-scores greater than 3 (z > 3) in any of the bins during the initial 500-ms period post-stimulation. Neurons activated by both pellet procurement and dPAG stimulation were not included in the 'stim cell' category. Additionally, we have included a graphic in the revised manuscript (Fig. S3C) that presents the distribution of response latencies of BLA neurons to dPAG stimulation.
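
      The classification rule described above can be sketched in a few lines. This is an illustrative reading of the stated criterion (z > 3 in any of the five 0.1-s bins after stimulation, excluding units also activated by pellet procurement), not the authors' code. The function names are hypothetical, and the latency definition (onset of the first supra-threshold bin) is an assumption, since the response does not say how latency was measured.

```python
def is_stim_cell(z_scores, pellet_responsive=False, threshold=3.0, n_bins=5):
    """Classify a unit from its post-stimulation z-scores (one value per 0.1-s bin).

    A unit counts as a 'stim cell' if any of the first n_bins z-scores
    exceeds the threshold. Units that also respond to pellet procurement
    are excluded, as stated in the response.
    """
    if pellet_responsive:
        return False
    return any(z > threshold for z in z_scores[:n_bins])


def response_latency(z_scores, threshold=3.0, n_bins=5, bin_width=0.1):
    """Assumed latency measure: start time (s) of the first supra-threshold bin.

    Returns None if no bin in the first n_bins exceeds the threshold.
    """
    for i, z in enumerate(z_scores[:n_bins]):
        if z > threshold:
            return i * bin_width
    return None


# Hypothetical z-scores for one unit, 0.1-s bins starting at stimulation onset.
unit = [0.5, 3.2, 1.0, 0.2, 0.1, 4.0]
stim = is_stim_cell(unit)
latency = response_latency(unit)
```

      With 0.1-s bins, a latency defined this way has a resolution of 100 ms, so it can only indicate which bin first crossed the threshold, not a precise synaptic delay.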

      (3) To strengthen the claim that it is a BLA-PVT-PAG circuit that carries information about predatory threat, a new experiment using CTB and cFos could be used to demonstrate that PAG neurons that project to PVT are recruited during the robot exposure.

      Our study primarily aimed to explore the transmission of threat signals between the dPAG and BLA. We acknowledge that our evidence for the PVT’s intermediary role, derived from CTB injections in the BLA and subsequent CTB+cFos co-labeling analysis in the PVT (Fig. 4G and 4H), is limited. Accordingly, we have moderated the emphasis on the PVT’s involvement in both the abstract and introduction. We now present the PVT’s role as a promising direction for future research in the discussion section of our revised manuscript.

      (4) In Fig 2, the authors' interpretation is that photostimulation of PAG neurons elicits fleeing responses in the rats. However, there is a vast literature demonstrating that the PAG is also involved in nociception. Although this is recognized by the authors in the first part of the introduction and briefly described in the discussion, the authors should more explicitly explain that PAG stimulation produces analgesia and thus is unlikely to underlie the escaping responses observed. This may not be intuitive for a broader audience.

      We appreciate the reviewer's insightful suggestion to elaborate on the PAG involvement in nociception and analgesia, as supported by the literature. While our initial manuscript acknowledged these functions, we have now expanded our discussion to address the PAG’s multifaceted roles (pg. 12): “As mentioned in the introduction, the dPAG is recognized as part of the ascending nociceptive pathway to the BLA (De Oca et al. 1998, Gross and Canteras 2012, Herry and Johansen 2014, Kim, Rison, and Fanselow 1993, Ressler and Maren 2019, Walker and Davis 1997). The dPAG is also implicated in non-opioid analgesia (e.g., Bagley and Ingram 2020, Cannon et al. 1982, Fields 2000). However, it is essential to emphasize that, despite its roles in pain modulation, the primary behavior observed in dPAG-stimulated, naive rats foraging for food in an open arena was goal-directed escape to the safe nest, underscoring the dPAG’s critical function in survival behaviors.” Note that this aligns with human studies on PAG stimulation (e.g., Carrive and Morgan 2012, Magierek et al. 2003), particularly those by Amano et al. (Amano et al. 1982), which reported patients feeling an urge to escape, similar to being chased, upon PAG stimulation.

      (5) To truly demonstrate the functional links between the PAG and BLA, more experiments are needed. For example, one could record from BLA neurons during the robot surge while performing optogenetic inhibition of the PAG neurons. There is also no evidence that activity in the indirect pathway that connects the PAG to the BLA is indispensable for the expression of defensive responses towards the robot (e.g., causality tests using chemogenetic or optogenetic inactivation).

      We agree that incorporating optogenetic inhibition of PAG neurons while simultaneously recording from BLA neurons during a robot surge would strengthen the evidence for the functional connectivity between the PAG and BLA. Such an experiment would necessitate the transfection and photoinhibition of a wide array of dPAG neurons responsive to predatory threats. This procedure is technically more viable in transgenic mouse models, given their suitability for genetic manipulation. In light of this, and in response to the suggestions in the Joint Public Review, we have revised the abstract, introduction, and discussion to offer a more cautious interpretation of our findings. This revision reflects a careful consideration of both the evidence and the limitations inherent in our study (pg. 13): “While our findings demonstrate that opto-stimulation of the dPAG is sufficient to trigger both fleeing behavior and increased BLA activity, we have not established that the dPAG is necessary for the BLA’s response to predatory threats. To establish causality, it is essential to conduct experiments such as optogenetic inhibition to determine whether the dPAG is indispensable for activating BLA neurons and initiating escape behavior in the face of threats. The complexity of targeting the dPAG, which includes its dorsomedial, dorsolateral, lateral, and ventrolateral subdivisions (e.g., Bandler, Carrive, and Zhang 1991, Bandler and Keay 1996, Carrive 1993), suggests the need for future studies using transgenic mouse models. Should inactivation of the dPAG negate the BLA's response to predatory threats, it would underscore the dPAG's central role in this defensive mechanism. Conversely, if BLA responses remain unaffected by dPAG inactivation, this could indicate the existence of multiple pathways for antipredatory defense mechanisms.”

      (6) The manuscript lacks information about the number of rats and trials that were used across the experiments (e.g. Fig 2G-J). On some occasions, the authors start the experiments with a specific number of animals and then reduce the N by half without providing a rationale (e.g. Fig. 3). Equally confusing is the experimental timeline. For example: a) Were the pre-robot, robot, and post-robot sessions always performed within the same day? b) It was described that microdrivable arrays were used, but did the same rats experience the robot test more than one time? c) How many bins were used for normalization during the Z-score calculation and when were the data binned at 100 ms versus 1 s? d) How many trials were used for each analysis? For example, to identify robot cells, did the authors establish a minimum number of trials per animal to calculate the peristimulus time histograms? Having a significant number of trials is critical to make sure that the observed neuronal responses are replicable across the trials. e) How was the neuronal activity related to "pellet retrieval" aligned during robot sessions? Was the activity aligned with the moment in which the rat touches the pellet or when the animal returns to the nest with the pellet? f) How did the authors control for trials in which the rat consumed the pellets in the same location vs. those in which they returned to the nest to eat it? All these points are extremely important for future replicability.

      We apologize for any confusion caused by the initial lack of detail in our experimental procedures. The revised manuscript has been updated with comprehensive methodological details:  

      (i) The study involved thirteen rats (ChR2, n = 9; EYFP, n = 4), subjected to dPAG stimulation using fixed light parameters (473 nm, 20 Hz, 10-ms pulse width, 2 s duration) during Long and Short pellet distance trials (refer to Fig. 2E-G). The stimulation intensity was adjusted to each animal's response (fleeing behavior), ranging from 1-3 mW. Additional testing occurred over multiple days, with incremental adjustments to stimulation parameters (intensity, frequency, duration) after confirming normal baseline foraging behavior (Fig. 2H-J, at x = 0). These details are now clearly depicted in the manuscript.

      (ii) The primary objective was to investigate BLA neuron responses to dPAG opto-stimulation. Six rats were initially tested, with three later assessed for their reactions to dPAG stimulation in the presence of an actual predator, to gauge behavioral effects.

      (iii) Regarding the experimental timeline:

      a) Pre-robot, robot, and post-robot sessions were conducted successively on the same day.

      b) Sessions with the robot predator were repeated until habituation occurred or when unit recordings were deemed invalid due to microdrive limitations or the absence of unit detection. Throughout these sessions, the success rate for pellet retrieval remained consistently low. Specifically, the mean success rate for the dPAG recordings was 2.803% ± 1.311. For the BLA recordings, animals did not succeed in retrieving pellets during any of the robot trials. To provide a more detailed account of the methodology, the manuscript has been updated to include the number of recording days and the units recorded in the "Behavioral Procedures" section.

      c) As described in Materials and Methods, unit recording data were binned at 0.1-s intervals and normalized against a 5-s pre-event baseline (50 bins). For statistical analyses in Figure 1F’s rightmost column, 1-s bins were used to simplify post-hoc analysis corrections.

      d) Each recording session consisted of 5-15 trials. Trials were excluded if rats attempted to procure the pellet within 10 s post-dPAG stimulation or robot activation, ensuring accurate characterization of unit responsiveness. Consequently, the number of trials varied among subjects.

      e) Pellet retrieval was indicated by the animal entering a designated zone 19 cm from the pellet, driven by hunger.

      f) Animals were trained to retrieve pellets and return to their nest for consumption prior to robot testing sessions, as elaborated in the “Baseline foraging” section.
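
      The binning and normalization described in point c) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes spike times are expressed in seconds relative to the event, that z-scores are computed on the trial-averaged histogram, and that the baseline standard deviation is the sample standard deviation (the response does not specify these). The function names are hypothetical.

```python
import math
from statistics import mean, stdev

BIN_WIDTH = 0.1      # s, as stated in the response
BASELINE_BINS = 50   # 5-s pre-event baseline, as stated in the response


def peth(trials, t_start, t_stop, bin_width=BIN_WIDTH):
    """Trial-averaged spike count per bin.

    trials: list of trials, each a list of spike times (s) relative to the event.
    """
    n_bins = round((t_stop - t_start) / bin_width)
    totals = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = math.floor((t - t_start) / bin_width)
            if 0 <= idx < n_bins:
                totals[idx] += 1
    return [count / len(trials) for count in totals]


def zscore_against_baseline(counts, baseline_bins=BASELINE_BINS):
    """Z-score the post-event bins against the pre-event baseline bins."""
    baseline = counts[:baseline_bins]
    mu = mean(baseline)
    sd = stdev(baseline)  # assumption: sample standard deviation
    if sd == 0:
        raise ValueError("baseline has zero variance; z-score is undefined")
    return [(c - mu) / sd for c in counts[baseline_bins:]]
```

      For example, `zscore_against_baseline(peth(trials, -5.0, 0.5))` would yield five z-scores covering the first 500 ms after the event. Trials excluded under point d) would simply be left out of `trials` before averaging.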

      (7) In the abstract, the authors mention that predictive cues are ambiguous during naturalistic predatory threats, but it is not clear what they mean by ambiguous. In addition, in the introduction section, the authors describe that the present study will investigate how the dPAG and BLA communicate threat signals. However, the authors should clarify right in the beginning that these two regions are not monosynaptically connected with each other and cite the proper references.

      The abstract’s original sentence, “…where predictive cues are ambiguous and do not afford reiterative trial-and-error learning…” has been refined to “…characterized by less explicit cues and the absence of reiterative trial-and-error learning events …” This adjustment more accurately reflects that cues in natural settings often lack the clear and consistent quality of those in controlled experimental settings, which is necessary for the straightforward process of trial-and-error learning.

      Regarding the dPAG and BLA connectivity, the revised introduction (pg. 5) now states: “Considering the lack of direct monosynaptic projections between dPAG and BLA neurons (Vianna and Brandao 2003, McNally, Johansen, and Blair 2011, Cameron et al. 1995), we utilized anterograde and retrograde tracers in the dPAG and BLA, respectively. This was complemented by c-Fos expression analysis following exposure to predatory threats. Our anatomical findings suggest that the paraventricular nucleus of the thalamus (PVT) may be part of a network that conveys predatory threat information from the dPAG to the BLA.”

      (8) In the introduction section, the authors should clarify that the US information is conveyed from the PAG to BLA via the lateral thalamus (posterior intralaminar nucleus, medial geniculate nucleus) or dorsal midline thalamus (paraventricular nucleus of the thalamus). The statement regarding how "the PAG functions as part of the ascending pain transmission pathway, providing footshock US information to the BLA" is misleading because the PAG does not send monosynaptic projections directly to the BLA.

      The revised text (pg. 3) now reads: “…suggest that the dPAG is part of the ascending US pain transmission pathway to the BLA, the presumed site for CS-US association formation (De Oca et al. 1998, Gross and Canteras 2012, Herry and Johansen 2014, Kim, Rison, and Fanselow 1993, Ressler and Maren 2019, Walker and Davis 1997). This pathway is thought to be mediated through the lateral and dorsal-midline thalamus regions, including the posterior intralaminar nucleus and paraventricular nucleus of the thalamus (Krout and Loewy, 2000; McNally, Johansen, and Blair, 2011; Yeh, Ozawa, and Johansen, 2021; but see Brunzell and Kim, 2001).”

      (9) The authors' assumption that threat information flows from the PAG to the BLA, rather than BLA to PAG, based on electrical stimulation and lesion experiments performed in previous studies is problematic for at least three reasons: a) Electrical stimulation can activate fibers of passage as well as presynaptic neurons antidromically. b) The lesion approach may not have targeted 100% of the neurons in PAG, which extends anatomically along the antero-posterior axis of the midbrain for several millimeters in rats. This observation also disagrees with more recent studies using optogenetics and imaging tools demonstrating that the PAG is the downstream target of the BLA-CeA pathway. c) The authors cited prior reports describing the role of the amygdala-PAG pathway in dampening the US response and providing a negative signal to the PAG. However, a series of previous studies demonstrating that the PAG serves as the downstream target of the central nucleus of the amygdala for the expression of defensive responses are completely ignored by the authors (e.g., Massi et al, 2023, PMID: 36652513; Tovote et al 2016, PMID: 27279213; Penzo et al, 2014, PMID: 24523533).

      We recognize the complexities in interpreting findings from electrical stimulation and lesion studies. Our prior work (Kim et al. 2013) supports the conclusion that predatory threat information directionally flows from the dPAG to the BLA, as evidenced by distinct behavioral outcomes from experimental manipulations of dPAG and BLA. Specifically, dPAG stimulation-induced fleeing behavior was blocked by BLA lesions (as well as muscimol inactivation), whereas BLA stimulation-induced fleeing was unaffected by dPAG or combined dPAG+vPAG lesions (refer to Fig. 5A), suggesting a flow from dPAG to BLA. Our manuscript further clarifies that dPAG optostimulation results confirmed that escape behavior in foraging rats, induced by dPAG electrical stimulation (Kim et al. 2013), was activated by intrinsic dPAG neurons rather than by fibers of passage or current spread to other brain regions.

      Furthermore, the PAG’s anatomical and functional diversity, with distinct segments along its longitudinal axis associated with different defensive behaviors, reinforces our conclusions. The dPAG is implicated in flight responses, while the vPAG is associated with freezing behavior (e.g., Bandler and Shipley 1994, Kim, Rison, and Fanselow 1993, Lefler, Campagner, and Branco 2020, Morgan, Whitney, and Gold 1998). The studies referenced in the critique primarily focus on the BLA-CeA-vPAG circuit's role in freezing during Pavlovian fear conditioning, contrasting with our emphasis on the dPAG-PVT-BLA circuit and its mediation of escape behavior in response to naturalistic predatory threats.

      We also note that different invasive procedures can yield varying behavioral outcomes. For example, both acute (e.g., optogenetic and muscimol inactivation) and chronic (e.g., surgical ablation) manipulations within the same brain circuit have shown diverse effects across species (Otchy et al. 2015). Moreover, optogenetics comes with its own set of conceptual and technical challenges (Adamantidis et al. 2015), including the difficulty of targeting, quantifying and photo-inhibiting 100% of PAG neurons. Despite the limitations of each technique, our collective evidence from lesions, inactivation, electrical stimulation (Kim et al. 2013), optostimulation, and single-unit recordings (the present study) supports the premise that the dPAG acts upstream of the BLA in processing predatory threat information.

      (10) In the discussion, the authors suggest that the PVT may be the interface between the PAG and the BLA for the expression of antipredatory defensive behavior during their foraging vs. robot test, but previous studies looking at the role of PVT in antipredator defensive behavior and/or approach-avoidance conflict tasks are not cited and discussed in the manuscript (Engelke et al, 2021, PMID: 33947849; Choi et al 2019, PMID: 30979815; Choi and McNally 2017, PMID: 28193686).

      We thank the reviewer for pointing out these pivotal studies, which we have carefully reviewed and integrated into the revised manuscript (pg. 14): “These results, in conjunction with previous research on the roles of the dPAG, PVT, and BLA in producing flight behaviors in naïve rats (Choi and Kim 2010, Daviu et al. 2020, Deng, Xiao, and Wang 2016, Kim et al. 2013, Kim et al. 2018, Kong et al. 2021, Ma et al. 2021, Reis et al. 2021), the anterior PVT’s involvement in cat odor-induced avoidance behavior (Engelke et al. 2021), and the PVT’s regulation of behaviors motivated by both appetitive and aversive stimuli (Choi and McNally 2017, Choi et al. 2019), suggest the involvement of the dPAG→PVT→BLA pathways in antipredatory defensive mechanisms, particularly as rats leave the safety of the nest to forage in an open arena (Figure 4I) (Reis et al. 2023).”

      (11) The authors use the expression "looming robot predator" in many cases throughout the manuscript. However, it is unclear whether the defensive responses observed in the rats are elicited by the looming stimulus produced by the movement of the robot towards the rats. The authors describe that rats do not respond to a stationary robot, but would the sound produced by the movement of the robot elicit defensive responses? Would non-approaching lateral or dorsoventral movements (not associated with looming) be sufficient to induce defensive behavior in the rats? There is a vast literature in the field about defensive behaviors induced by looming stimuli. The authors should empirically demonstrate that the escape responses induced by the robot are mediated by looming, or refrain from using the looming terminology to avoid confusion.

      Our use of "looming robot predator" is based on empirical evidence from a prior parametric study, which identified the forward, or 'looming,' motion of the Robogator as the key stimulus eliciting a flight response in rats (Kim, Choi, and Lee 2016). This reaction significantly decreased when the robot moved backward from the same starting position, producing a similar sound, and was absent when the robot remained stationary. This suggests that neither sound alone nor the mere presence of a novel object provokes goal-directed escape behavior (Kong et al. 2021). This aligns with studies indicating that simulated looming stimuli, like an expanding disk, induce flight or freezing responses in mice (De Franceschi et al. 2016, Yilmaz and Meister 2013).

      It should be noted that Yilmaz and Meister (2013), using the looming disk paradigm, showed that not all mice responded to the stimuli (e.g., Figs. 2A and 3A), and those that did habituated rapidly by the second exposure. This contrasts with our predatory robot paradigm (Choi and Kim 2010), where all rats consistently fled from the looming robotic predator across multiple trials, underscoring the critical role of looming motion in simulating predator attacks that trigger flight behavior in rats.

      Thus, the term "looming" accurately captures the nature of the robot's movement and its effect on eliciting defensive responses in rats. Nonetheless, should the editors agree with the reviewer's suggestion to minimize potential confusion, we are willing to substitute "looming" with "approaching," although we consider the terms to be synonymous in the context of our study.

      (12) If the authors are citing the Rescorla-Wagner model, they should include at least one additional sentence to explain it, as many people in the field are not familiar with this model.

      In response to the request for clarification on the Rescorla-Wagner model, we have added an explanatory sentence (pg. 4): “Fundamentally, the negative feedback circuit between the amygdala and the dPAG serves as a biological implementation of the Rescorla–Wagner (1972) model, a foundational theory of associative learning that emphasizes the importance of prediction errors in reinforcement (i.e., US), as applied to FC (Fanselow 1998).”
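
      As an illustration of the prediction-error idea behind the Rescorla–Wagner model, the following minimal sketch updates the associative strength of every cue present on a trial by a fraction of the difference between the outcome received and the summed prediction. The learning rate `alpha`, the asymptote `lam`, the cue names, and the trial counts are illustrative assumptions, not values taken from the manuscript.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Run Rescorla-Wagner updates over a sequence of trials.

    trials: iterable of (cues, us_present) pairs, where cues is a tuple of
            cue names present on that trial and us_present is a bool.
    alpha:  learning rate (assumed value, for illustration only).
    lam:    maximum associative strength supported by the US (assumed).
    """
    strengths = {}
    for cues, us_present in trials:
        # Summed prediction from all cues present on this trial
        predicted = sum(strengths.get(c, 0.0) for c in cues)
        target = lam if us_present else 0.0
        # Prediction error: outcome received minus outcome expected
        error = target - predicted
        for c in cues:
            strengths[c] = strengths.get(c, 0.0) + alpha * error
    return strengths


# Hypothetical two-phase design: tone alone, then tone + light compound.
phase1 = [(("tone",), True)] * 50
phase2 = [(("tone", "light"), True)] * 50
result = rescorla_wagner(phase1 + phase2)
print(result)
```

      In this sketch the tone already predicts the US by the end of the first phase, so the prediction error in the second phase is close to zero and the added light acquires very little strength. This is the negative-feedback logic the quoted sentence refers to.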

      (13) The authors need to include the normality test used to determine whether a parametric or non-parametric statistical analysis was the most appropriate test for each experiment.

      We have included the outcomes of the normality tests, detailed in Table S1.

      (14) In Fig. 1F, the authors show a representative PAG neuron with peristimulus-time histogram and rasters reaching frequencies higher than 100 Hz and sustained firing rates of >50 Hz following robot activation. The authors should include a firing rate analysis (e.g., average firing rate and maximum firing rate before and after robot activation) of the 22 robot-responsive PAG neurons recorded during the session to clarify whether this high firing rate, which is atypical in other brain regions, is commonly observed in the PAG. Showing the isolated waveforms of some representative neurons would help to clarify whether the activity is being recorded from a single-isolated unit instead of multiple units within the same channel.

      In response to the critique, we have expanded our analysis to include both average and maximum firing rates before and after robot activation for the 22 robot-responsive PAG neurons. This detailed firing rate analysis, illustrating their distribution, has been incorporated into the revised manuscript (refer to Figure S1C and S1D). Furthermore, to alleviate concerns regarding the identification of single-unit activity versus potential multi-unit recordings, we have included peri-event raster plots and waveforms for two additional representative neurons in Figure 1F.

      (15) In Figure 2, the authors should indicate when the recordings are performed on anesthetized vs. freely-moving awake animals.

      In the original manuscript, we specified that the optrode recordings depicted in Figure 2B were conducted on anesthetized rats. To enhance clarity and directly address the critique, we have now clearly indicated this condition in Figure 2A as well.

      (16) The optogenetic stimulation parameters used in Fig 2H indicate that 0.5 mW was sufficient to induce behavioral changes. This is surprising because most optogenetic experiments in the field use much higher intensities (> 5 mW). If much lower intensities are sufficient to drive PAG-mediated behaviors, this may be a very important observation that should be conveyed to the field. I recommend the reviewers clarify if they in fact used 0.5 mW and then discuss that the laser intensity used in the experiments was 10X lower than that required for other brain regions.

      In our study, we indeed observed that 0.5 mW of dPAG stimulation increased the latency to procure the pellet without completely preventing the action. Notably, at 1 mW, more than half of the animals (n = 5/9 rats; Fig. 2H) and at 3 mW, all rats (9/9) failed to procure the pellet and fled from the foraging area to the nest (Fig. 2G). These results indicate that even lower intensities were sufficient to elicit behavioral changes through dPAG stimulation in a large foraging arena, highlighting the dPAG's sensitivity to optogenetic manipulation. This finding is consistent with our earlier research on dPAG electrical stimulation, which required significantly lower intensities to provoke defensive behaviors compared to the BLA. Specifically, the stimulation intensity needed for aversive behavior in the dPAG was substantially lower (dPAG: 65.0 ± 6.85 µA) than for the BLA (BLA: 275.0 ± 24.44 µA) (Kim et al. 2013). Furthermore, Deng et al. (Deng, Xiao, and Wang 2016) showed that 1 mW of blue light could elicit a 60% freezing response, with 2 mW triggering flight behavior within a latency of 0.6 seconds.

      (17) In Fig 2 G-J, how many animals are being used per group and how was the sequence of the experiments performed? This is very important for replicability.

      A total of three rats were utilized for the robot testing experiments depicted in Fig. 2 G-J. The experimental sequence for these animals consisted of successive pre-stimulation, stimulation, post-stimulation, and robot sessions. We have updated the manuscript to include this information.

      (18) For the photostimulation of PAG neurons in Figs. 2 and 3, the authors need to clarify if the same parameters of laser stimulation used during the anesthetized recordings were also used during the behavioral tests. Also, the wavelength corresponding to the blue laser should be 473 nm instead of 437 nm.

      We thank the reviewer for identifying the error. We confirm that the opto-stimulation parameters (473 nm, 10-ms pulse width, 2 s duration) were consistently applied across both anesthetized recordings and behavioral tests. This consistency has been explicitly stated in the revised manuscript to ensure clarity regarding our experimental approach.

      (19) In Fig. 3I, how were the representative trials selected? Instead of picking the most representative trials, the authors should demonstrate the response of the cell during the entire session.

      In response to the critique, we clarify that the color-coded PETH shown in Fig. 3I represents averaged BLA activity across a comprehensive set of trials. This includes 8 pre-stimulation, 10 stimulation, and 8 post-stimulation trials for the robot-activated sessions, with a similar distribution for non-stimulated sessions. This approach was chosen to provide a representative overview of the cell's response throughout the entire session. To address the request for more detailed data, we have added traditional PETHs to the revised manuscript (see Fig. S3H), which depict the cell's response across all trials.

      (20) Fig 4 D should demonstrate a colabeling between the anterograde PAG fibers in the PVT and the retrogradely labeled neurons from BLA instead of PAG fibers only.

      We wish to clarify that Fig. 4D is intended to show the distribution of dPAG terminals within the midline thalamic nuclei, as noted in prior research (Krout and Loewy 2000). Although dPAG terminals are distributed throughout the midline thalamus, our observations have specifically highlighted a notable increase in c-Fos expression within the paraventricular nucleus of the thalamus (PVT) in rats subjected to the robotic predator stimulus, in contrast to those in the foraging-only control condition (Fig. 4E). Addressing the reviewer's point, we direct attention to Fig. 4G, which includes images labeled "Robot-experienced" and "Merge." This figure demonstrates a subset of PVT neurons that were retrogradely labeled with CTB injected into the BLA, anterogradely labeled with AAV injected into the dPAG, and activated (as indicated by c-Fos expression) in response to the robotic predator. This provides specific colabeling evidence between anterograde PAG fibers in the PVT and retrogradely labeled neurons from the BLA, directly addressing the critique.

      (21) The resolution of the cFos images is very low and makes it hard to appreciate.

      We have updated Figs. 4F and 4G with high-resolution versions to ensure the details are more clearly visible. Furthermore, should there be a need for even greater clarity, we are prepared to supply the images as TIFF files, which are known for preserving high image quality.

      Reviewer 2:

      (1) The text is clearly written, and I appreciated the inclusion of interesting citations, such as the one about paintings by cavemen. The authors also do a good job of discussing the underlying theoretical framework and the figures are easy to understand. Although the topic is very interesting, the amount of novel work is somewhat low. Figure 1 shows that dPAG cells are activated by the predator, and this has been shown by many prior reports. Similarly, Figure 2 shows that dPAG activation creates defensive responses, and this too has been shown by many prior reports.

      We appreciate the reviewer’s positive remarks. We acknowledge the rich body of research documenting dPAG neuronal activation by various predator cues such as odors (e.g., fox urine) (Lu et al. 2023), and scenarios involving anesthetized or spontaneously moving rat/cat predators, either physically partitioned or harness-restrained (Bindi et al. 2022, Deng, Xiao, and Wang 2016, Esteban Masferrer et al. 2020). Nevertheless, our study distinguishes itself by examining dPAG neuronal responses to a robotic predator, uniquely designed to replicate consistent looming motions across multiple trials and subjects within an environment that simulates natural foraging conditions, inclusive of a safe nest (cf. Choi and Kim, 2010). This approach allowed us to not only reveal the immediate activation of dPAG neurons in response to a rapidly approaching predator but also to explore the consequent fleeing behavior towards safety, thereby providing new insights into the dPAG's role in mediating goal-directed defensive responses in a more ecologically-relevant setting. Furthermore, our investigation extends beyond these findings to assess the impact of dPAG activation on BLA neuronal responses and their functional connectivity during predator-prey interactions, offering a fresh perspective on the neural circuits that support survival behaviors in animals when confronted with naturalistic threats.

      (2) The results in Figure 3 are novel and interesting, but the characterization of BLA activity is incomplete. For example, what are the percentages of BLA cells that are inhibited or activated by all major behaviors observed? These behaviors include approach to pellet, escape from robot, freezing, stretch-attend postures, etc. These same analyses should also be added to dPAG activity in Figure 1. How does BLA single cell encoding of these behaviors relate to their responsivity to dPAG stimulation? And, finally, it is unclear what is the significance of BLA correlated synchronous firing. Is the animal more or less likely to be performing certain behaviors when correlated BLA firing occurs?

      Our analysis, as presented in Figs. 3I, 3K, and S3D-F, selectively focused on BLA cell responses during distinct behaviors such as approaching a pellet and escaping from the robot. These behaviors were selected because their precise temporal markers allow for accurate correlation with BLA cell activity, building on the findings of our previous research (Kim et al. 2018, Kong et al. 2021).

      The robot's motion, programmed to advance a fixed distance before retreating to its starting position, is designed to repeatedly elicit foraging, thus facilitating analysis of neural changes during conflict situations involving food approach and predator avoidance. However, this also leads to the rapid diminution of freezing and stretch-attend postures inside the nest as animals quickly adapt to the robot's movement pattern, rendering a time-stamped analysis of these behaviors unfeasible under our experimental conditions. While the inclusion of these behaviors in our analysis would be insightful, especially in extended interaction scenarios where the robot advances to the nest opening and remains before returning in a less predictable manner, such conditions would likely reduce foraging behavior due to increased fear, deviating from our study's primary objective of elucidating the interactions between the dorsal periaqueductal gray (dPAG) and the basolateral amygdala (BLA) functions.

      Regarding the significance of BLA correlated synchronous firing, our findings, particularly in Figures 3M-O and S4, demonstrate significant synchronous activity among BLA neuronal pairs during encounters with the robot, as opposed to pre-stim, stim, and post-stim sessions. This synchrony is notably prominent among neurons responsive to dPAG stimulation, indicating that BLA neurons involved in processing dPAG signals may play a crucial role in enhancing BLA network coherence to effectively manage predatory threat information (pg. 13).
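
      For readers unfamiliar with how pairwise synchrony is commonly quantified, the sketch below builds a spike-train cross-correlogram, in which a peak near zero lag indicates that two units tend to fire together. This is a generic illustration under stated assumptions: the function name, the ±100 ms lag range, and the 5 ms bin width are hypothetical choices, and the response above does not specify the exact procedure used in the manuscript.

```python
import numpy as np


def cross_correlogram(train_a, train_b, max_lag=0.1, bin_size=0.005):
    """Histogram of spike-time differences (b - a) within +/- max_lag seconds.

    train_a, train_b: spike times in seconds for two simultaneously
    recorded units. Lag range and bin width are assumed values.
    """
    a = np.sort(np.asarray(train_a, dtype=float))
    b = np.sort(np.asarray(train_b, dtype=float))
    n_bins = int(round(2 * max_lag / bin_size))
    edges = np.linspace(-max_lag, max_lag, n_bins + 1)
    diffs = []
    for t in a:
        # Restrict to spikes of b that fall inside the lag window around t
        lo = np.searchsorted(b, t - max_lag, side="left")
        hi = np.searchsorted(b, t + max_lag, side="right")
        diffs.append(b[lo:hi] - t)
    diffs = np.concatenate(diffs) if diffs else np.array([])
    counts, _ = np.histogram(diffs, bins=edges)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers, counts


# Synthetic example: unit B fires 1 ms after every spike of unit A.
rng = np.random.default_rng(0)
unit_a = np.sort(rng.uniform(0.0, 100.0, 500))
unit_b = unit_a + 0.001
lags, counts = cross_correlogram(unit_a, unit_b)
peak_lag = lags[np.argmax(counts)]
```

      In practice the raw correlogram would be compared against a shuffled or shift-predictor control before a pair is called significantly synchronous; that step is omitted here for brevity.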

      (3) In Figure 4, the authors identify the PVT as a potential region that can mediate dPAG to BLA communication via anatomical tracing. However, functional assays are missing. For example, if the PVT is inhibited chemogenetically, does this result in a smaller number of BLA cells that are activated by dPAG stimulation? Does activation of the dPAG-PVT or the PVT-BLA projections cause defensive behaviors? Functionally showing that the dPAG-PVT-BLA circuit controls defensive actions would be a major advance in the field and would greatly enhance the significance of this paper. It would also provide an anatomical substrate to support the view that the BLA is downstream of the dPAG, which was first demonstrated by the authors in their elegant 2013 PNAS paper.

      We appreciate the reviewer’s constructive critique and valuable suggestions on the necessity for functional validation of the dPAG-PVT-BLA circuit's involvement in mediating defensive behaviors. In light of these comments, we have carefully considered and included a discussion on the importance of these proposed experiments as a direction for future research in our manuscript revision (also see response to Reviewer 1’s critique #5).

      Our initial work in 2013 (Kim et al. 2013) laid the groundwork for identifying BLA neurons responsive to dPAG stimulation and suggested the PVT as a potential relay in this neural circuit. Recognizing the limitations of our current study, which does not include direct functional assays, we have adjusted our manuscript to convey the speculative aspect of the dPAG-PVT-BLA circuit’s role more accurately. Moreover, we have enriched our discussion by citing relevant studies that lend support to our proposed circuit mechanism. These references serve to place our findings within the broader context of existing research and highlight the imperative for subsequent studies to empirically confirm the functional significance of the dPAG-PVT-BLA pathway in driving defensive behaviors.

      Reviewer 3:

      (1) The Introduction refers to a negative feedback amygdala-dPAG from a study of the Johansen group, but in this case, the authors were referring to the ventrolateral and not the dorsal PAG.

      We thank the reviewer for pointing out the need to distinguish between the dPAG and vPAG regions in our introduction. While Johansen et al. (2010) investigated the roles of PAG (including both dPAG and vPAG regions; see their Supplementary Figs. 4, 5, and 10), the differentiation between their specific contributions to the amygdala's negative feedback mechanism was not explicitly detailed in their initial publication. This distinction was further elaborated upon in later work by the same group (Yeh, Ozawa, and Johansen 2021), which specifically illuminated the dPAG's role in conditioned fear memory formation and its neural pathways to the PVT that influence fear learning. To reflect this nuanced understanding, we have revised our introduction (pg. 3): “In parallel, Johansen et al. (2010) found that pharmacological inhibition of the PAG, encompassing both dPAG and vPAG regions, diminishes the behavioral and neural responses in the amygdala elicited by periorbital shock US, thereby impairing the acquisition of auditory FC.”

      (2) In the experiments recording dPAG in response to the predator threat, the authors mentioned cells activated by the predator threat, referred to as "robot cells." Were these cells inhibited in response to threat?

      In the Results and Materials and Methods sections, we report that 23.4% (22 out of 94) of dPAG neurons, termed “robot cells,” showed a significant increase in firing rates (z > 3) within a latency of less than 500 ms during exposure to the looming robot threat, but not during the pre- and post-robot sessions. These cells are highlighted in Figures 1E-G. In contrast, we identified only a single unit exhibiting a decrease in activity (z-score < -3) in response to the robot threat. Given the overwhelming prevalence of cells with excitatory responses to the threat, our discussions and analyses have primarily centered on these excited cells. Nevertheless, to ensure a full depiction of our observations, we have included data on the inhibited unit in the revised manuscript, specifically in Figure S1E.
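
      The z-score criterion described above can be sketched as follows. The thresholds (|z| > 3) and the 500 ms response window come from the text; the 5 s pre-event baseline, the 100 ms bin width, the function names, and the synthetic spike train are assumptions made purely for illustration and may differ from the procedure in the manuscript.

```python
import numpy as np


def event_locked_rates(spike_times, event_times, start, stop, bin_size):
    """Trial-averaged firing rate (Hz) in bins spanning [start, stop] around events."""
    spikes = np.asarray(spike_times, dtype=float)
    n_bins = int(round((stop - start) / bin_size))
    edges = np.linspace(start, stop, n_bins + 1)
    counts = np.zeros(n_bins)
    for ev in event_times:
        counts += np.histogram(spikes - ev, bins=edges)[0]
    return counts / (len(event_times) * bin_size)


def classify_unit(spike_times, event_times, baseline=(-5.0, 0.0),
                  response=(0.0, 0.5), bin_size=0.1, z_crit=3.0):
    """Label a unit by z-scoring response bins against the pre-event baseline.

    The baseline window and bin size are assumed values.
    """
    base = event_locked_rates(spike_times, event_times, *baseline, bin_size)
    resp = event_locked_rates(spike_times, event_times, *response, bin_size)
    sd = base.std(ddof=1)
    if sd == 0:
        raise ValueError("baseline firing has zero variance; z-score undefined")
    z = (resp - base.mean()) / sd
    if z.max() > z_crit:
        return "excited", z
    if z.min() < -z_crit:
        return "inhibited", z
    return "unresponsive", z


# Synthetic unit: ~5 Hz background plus a 50 Hz burst for 300 ms after each event.
rng = np.random.default_rng(0)
events = np.arange(10.0, 200.0, 20.0)
background = rng.uniform(0.0, 200.0, 1000)
bursts = np.concatenate([ev + np.linspace(0.01, 0.29, 15) for ev in events])
label, z = classify_unit(np.concatenate([background, bursts]), events)
```

      A unit labeled "excited" by this rule would correspond to a "robot cell" in the sense used above, provided it does not also meet the criterion in the pre- and post-robot sessions.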

      (3) The authors claim that tetrodes were implanted in the dorsal PAG; however, the electrodes' tips shown in the figures are positioned more ventrally in the lateral PAG (see Figures 1B, S5A).

      The PAG is anatomically organized into dorsomedial (dmPAG), dorsolateral (dlPAG), lateral (lPAG), and ventrolateral (vlPAG) columns along the rostro-caudal axis of the aqueduct. The designation "dorsal PAG" (dPAG) traditionally encompasses the dmPAG, dlPAG, and lPAG regions, a classification supported by extensive tract-tracing, neurochemical, and immunohistochemical evidence (e.g., Bandler, Carrive, and Zhang 1991, Bandler and Keay 1996, Carrive 1993). As Bandler and Shipley (1994) summarized, “These findings suggest that what has been traditionally called the 'dorsal PAG' (a collective term for regions dorsal and lateral to the aqueduct), consists of three anatomically distinct longitudinal columns: dorsomedial and lateral columns…and a dorsolateral column…” Similarly, Schenberg et al. (2005) clarified in their review that, “According to this parcellation...the defensive behaviors (freezing, flight or fight) and aversion-related responses (switch-off behavior) were ascribed to the DMPAG, DLPAG, and LPAG (usually named the ‘dorsal’ PAG).” In our study, electrode placements were strictly within these specified dPAG regions. The electrode tip locations depicted in Figures 1B and S5A correspond with the -6.04 mm template (left panel below) from Paxinos & Watson’s atlas (Paxinos and Watson 1998), situated anteriorly to the emergence of the vlPAG (right panel below). To clarify this in the manuscript, we now provide a detailed definition of the dPAG that includes the dmPAG, dlPAG, and lPAG, and support our electrode placement rationale with references to established literature (pg. 5).

      Author response image 1.

      (4) It would be nice to include a series of observations applying inhibitory tools (i.e., optogenetic photo inhibition) in the dPAG and BLA and see how they affect the behavioral responses in the 'approach food-avoid predator' paradigm. Moreover, it would be interesting to explore how inhibiting the dPAG to PVT pathway influences the flee response during the robot surge.

      We appreciate the suggestion to explore the effects of optogenetic inhibition in the dPAG and BLA on behavioral responses within the 'approach food-avoid predator' paradigm, as well as the potential impact of inhibiting the dPAG to PVT pathway on flee responses during robot surge incidents. As mentioned in our response to Reviewer 1’s critique #5, the application of optogenetic inhibition necessitates transfecting, quantifying, and photoinhibiting a comprehensive set of dPAG neurons activated by predatory threats. This approach is more viable in future studies that can leverage transgenic mouse models for their genetic tractability. Following the Joint Public Review’s recommendations, we have revised our manuscript to ensure a more measured interpretation of our data, carefully balancing the evidence from tracer studies against the limitations of our current methodology.

      Furthermore, referencing Reviewer 1’s critique #9, it is important to consider that various invasive techniques can yield different behavioral outcomes. For instance, research by Olveczky and colleagues (Otchy et al. 2015) demonstrated that acute manipulations (i.e., optogenetic and muscimol inactivation) and chronic surgical ablation of the same brain circuit can produce distinct effects in rats and finches. Despite these methodological constraints, our collective results from lesion, inactivation, electrical stimulation (Kim et al. 2013), optostimulation, and single-unit recording (present) studies cohesively suggest that the dPAG functions upstream of the BLA in processing predatory threat signals.

      (5) The authors should also examine whether 'synaptic' appositions exist between the anterogradely labeled terminals from the dPAG and the double labeled CTB and cFOS neurons in the PVT.

      We appreciate the suggestion to investigate the presence of synaptic appositions, which could potentially offer valuable insights into the synaptic connections and functional interactions within this neural circuit. However, due to the specialized nature of electron microscopy required for these examinations and the extensive resources it entails, this line of inquiry falls beyond the scope of our current study. We hope to address this aspect in future studies, where we can dedicate the necessary resources and expertise to conducting these intricate analyses.

      (6) It is odd to see the projection fields shown in Fig. 4D, where the projection to the PVT looks much sparser compared to other targets in the thalamus and hypothalamus. If the projection to the PVT has such an important function, why does it seem so weak? This should be discussed. Also, because the projection to the PVT seems sparse, the authors should consider alternative paths like the one involving the cuneiform nucleus. The cuneiform nucleus is an important region responding to looming shadows with strong bidirectional links to the dorsolateral periaqueductal gray, providing strong projections to the rostral PVT.

      The perceived scarcity of the dPAG-PVT pathway might not reflect its functional significance accurately. The PVT's small size could make its projections appear less dense in broad anatomical studies. To address this, we have updated Figure 4D with a high-resolution image that offers a detailed view of the PVT region. This enhancement (refer to the updated Fig. 4, bottom) more accurately depicts the projection density within the PVT. It is also critical to consider that the functional impact of neural pathways is not solely dependent on the quantity of projecting neurons. For instance, work by Deisseroth and colleagues (Rajasethupathy et al. 2015) has shown that even relatively sparse monosynaptic projections from the anterior cingulate cortex to the hippocampus can exert significant effects on neural circuit dynamics. Additionally, we have expanded our discussion to consider the potential roles of other circuits, such as the cuneiform nucleus, in driving the behavioral responses observed in our study (pg. 15): “Given the recent significance attributed to the superior colliculus in detecting innate visual threats (Lischinsky and Lin 2019, Wei et al. 2015, Zhou et al. 2019) and the cuneiform nucleus in the directed flight behavior of mice (Bindi et al. 2023, Tsang et al. 2023), further exploration into the communication between these structures and the dPAG-BLA circuitry is warranted.”

      (7) Finally, in the Discussion, it would be nice to comment on how the BLA mediates flee responses. Which pathways are likely involved?

      This excellent suggestion has been incorporated in the discussion (pg. 15): “Future studies will also need to delineate the downstream pathways emanating from the BLA that orchestrate goal-directed flight responses to external predatory threats as well as internal stimulations from the dPAG/BLA circuit. Potential key structures include the dorsal/posterior striatum, which has been associated with avoidance behaviors in response to airpuff in head-fixed mice (Menegas et al. 2018) and flight reactions triggered by auditory looming cues (Li et al. 2021). Additionally, the ventromedial hypothalamus (VMH) has been implicated in flight behaviors in mice, evidenced by responses to the presence of a rat predator (Silva et al. 2013) and upon optogenetic activation of VMH Steroidogenic factor 1 (Kunwar et al. 2015) or the VMH-anterior hypothalamic nucleus pathway (Wang, Chen, and Lin 2015). Investigating the indispensable role of these structures in flight behavior could involve lesion or inactivation studies. Such interventions are anticipated to inhibit flight behaviors elicited by amygdala stimulation and predatory threats, confirming their critical involvement. Conversely, activating these structures in subjects with an inactivated or lesioned amygdala, which would typically inhibit fear responses to external threats (Choi and Kim 2010), is expected to induce fleeing behavior, further elucidating their functional significance.”

      Adamantidis, A., S. Arber, J. S. Bains, E. Bamberg, A. Bonci, G. Buzsaki, J. A. Cardin, R. M. Costa, Y. Dan, Y. Goda, A. M. Graybiel, M. Hausser, P. Hegemann, J. R. Huguenard, T. R. Insel, P. H. Janak, D. Johnston, S. A. Josselyn, C. Koch, A. C. Kreitzer, C. Luscher, R. C. Malenka, G. Miesenbock, G. Nagel, B. Roska, M. J. Schnitzer, K. V. Shenoy, I. Soltesz, S. M. Sternson, R. W. Tsien, R. Y. Tsien, G. G. Turrigiano, K. M. Tye, and R. I. Wilson. 2015. "Optogenetics: 10 years after ChR2 in neurons--views from the community."  Nat Neurosci 18 (9):1202-12. doi: 10.1038/nn.4106.

      Amano, K., T. Tanikawa, H. Kawamura, H. Iseki, M. Notani, H. Kawabatake, T. Shiwaku, T. Suda, H. Demura, and K. Kitamura. 1982. "Endorphins and pain relief. Further observations on electrical stimulation of the lateral part of the periaqueductal gray matter during rostral mesencephalic reticulotomy for pain relief."  Appl Neurophysiol 45 (1-2):123-35.

      Bagley, E. E., and S. L. Ingram. 2020. "Endogenous opioid peptides in the descending pain modulatory circuit."  Neuropharmacology 173:108131. doi: 10.1016/j.neuropharm.2020.108131.

      Bandler, R., P. Carrive, and S. P. Zhang. 1991. "Integration of somatic and autonomic reactions within the midbrain periaqueductal grey: viscerotopic, somatotopic and functional organization."  Prog Brain Res 87:269-305. doi: 10.1016/s0079-6123(08)63056-3.

      Bandler, R., and K. A. Keay. 1996. "Columnar organization in the midbrain periaqueductal gray and the integration of emotional expression."  Prog Brain Res 107:285-300. doi: 10.1016/s0079-6123(08)61871-3.

      Bandler, R., and M. T. Shipley. 1994. "Columnar organization in the midbrain periaqueductal gray: modules for emotional expression?"  Trends Neurosci 17 (9):379-89. doi: 10.1016/0166-2236(94)90047-7.

      Bindi, R. P., C. C. Guimaraes, A. R. de Oliveira, F. F. Melleu, M. A. X. de Lima, M. V. C. Baldo, S. C. Motta, and N. S. Canteras. 2023. "Anatomical and functional study of the cuneiform nucleus: A critical site to organize innate defensive behaviors."  Ann N Y Acad Sci 1521 (1):79-95. doi: 10.1111/nyas.14954.

      Bindi, R. P., R. G. O. Maia, F. Pibiri, M. V. C. Baldo, S. L. Poulter, C. Lever, and N. S. Canteras. 2022. "Neural correlates of distinct levels of predatory threat in dorsal periaqueductal grey neurons."  Eur J Neurosci 55 (6):1504-1518. doi: 10.1111/ejn.15633.

      Cameron, A. A., I. A. Khan, K. N. Westlund, and W. D. Willis. 1995. "The efferent projections of the periaqueductal gray in the rat: a Phaseolus vulgaris-leucoagglutinin study. II. Descending projections."  J Comp Neurol 351 (4):585-601. doi: 10.1002/cne.903510408.

      Cannon, J. T., G. J. Prieto, A. Lee, and J. C. Liebeskind. 1982. "Evidence for opioid and non-opioid forms of stimulation-produced analgesia in the rat."  Brain Res 243 (2):315-21. doi: 10.1016/0006-8993(82)90255-4.

      Carrive, P., and M. M. Morgan. 2012. "Periaqueductal Gray." In The Human Nervous System, edited by J. K. Mai and G. Paxinos, 367-400. London: Academic Press.

      Carrive, P. 1993. "The periaqueductal gray and defensive behavior: functional representation and neuronal organization."  Behav Brain Res 58 (1-2):27-47. doi: 10.1016/0166-4328(93)90088-8.

      Choi, E. A., P. Jean-Richard-Dit-Bressel, C. W. G. Clifford, and G. P. McNally. 2019. "Paraventricular Thalamus Controls Behavior during Motivational Conflict."  J Neurosci 39 (25):4945-4958. doi: 10.1523/JNEUROSCI.2480-18.2019.

      Choi, E. A., and G. P. McNally. 2017. "Paraventricular Thalamus Balances Danger and Reward."  J Neurosci 37 (11):3018-3029. doi: 10.1523/JNEUROSCI.3320-16.2017.

      Choi, J. S., and J. J. Kim. 2010. "Amygdala regulates risk of predation in rats foraging in a dynamic fear environment."  Proc Natl Acad Sci U S A 107 (50):21773-7. doi: 10.1073/pnas.1010079108.

      De Franceschi, G., T. Vivattanasarn, A. B. Saleem, and S. G. Solomon. 2016. "Vision Guides Selection of Freeze or Flight Defense Strategies in Mice."  Curr Biol 26 (16):2150-4. doi: 10.1016/j.cub.2016.06.006.

      De Oca, B. M., J. P. DeCola, S. Maren, and M. S. Fanselow. 1998. "Distinct regions of the periaqueductal gray are involved in the acquisition and expression of defensive responses."  J Neurosci 18 (9):3426-32. doi: 10.1523/JNEUROSCI.18-09-03426.1998.

      Deng, H., X. Xiao, and Z. Wang. 2016. "Periaqueductal Gray Neuronal Activities Underlie Different Aspects of Defensive Behaviors."  J Neurosci 36 (29):7580-8. doi: 10.1523/JNEUROSCI.4425-15.2016.

      Engelke, D. S., X. O. Zhang, J. J. O'Malley, J. A. Fernandez-Leon, S. Li, G. J. Kirouac, M. Beierlein, and F. H. Do-Monte. 2021. "A hypothalamic-thalamostriatal circuit that controls approach-avoidance conflict in rats."  Nat Commun 12 (1):2517. doi: 10.1038/s41467-021-22730-y.

      Esteban Masferrer, M., B. A. Silva, K. Nomoto, S. Q. Lima, and C. T. Gross. 2020. "Differential Encoding of Predator Fear in the Ventromedial Hypothalamus and Periaqueductal Grey."  J Neurosci 40 (48):9283-9292. doi: 10.1523/JNEUROSCI.0761-18.2020.

      Fanselow, M. S. 1998. "Pavlovian conditioning, negative feedback, and blocking: mechanisms that regulate association formation."  Neuron 20 (4):625-7. doi: 10.1016/s0896-6273(00)81002-8.

      Fields, H. L. 2000. "Pain modulation: expectation, opioid analgesia and virtual pain."  Prog Brain Res 122:245-53. doi: 10.1016/s0079-6123(08)62143-3.

      Gross, C. T., and N. S. Canteras. 2012. "The many paths to fear."  Nat Rev Neurosci 13 (9):651-8. doi: 10.1038/nrn3301.

      Herry, C., and J. P. Johansen. 2014. "Encoding of fear learning and memory in distributed neuronal circuits."  Nat Neurosci 17 (12):1644-54. doi: 10.1038/nn.3869.

      Kim, E. J., O. Horovitz, B. A. Pellman, L. M. Tan, Q. Li, G. Richter-Levin, and J. J. Kim. 2013. "Dorsal periaqueductal gray-amygdala pathway conveys both innate and learned fear responses in rats."  Proc Natl Acad Sci U S A 110 (36):14795-800. doi: 10.1073/pnas.1310845110.

      Kim, E. J., M. S. Kong, S. G. Park, S. J. Y. Mizumori, J. Cho, and J. J. Kim. 2018. "Dynamic coding of predatory information between the prelimbic cortex and lateral amygdala in foraging rats."  Sci Adv 4 (4):eaar7328. doi: 10.1126/sciadv.aar7328.

      Kim, J. J., J. S. Choi, and H. J. Lee. 2016. "Foraging in the face of fear: Novel strategies for evaluating amygdala functions in rats." In Living without an amygdala, edited by D. G. Amaral and R. Adolphs, 129-148. The Guilford Press.

      Kim, J. J., R. A. Rison, and M. S. Fanselow. 1993. "Effects of amygdala, hippocampus, and periaqueductal gray lesions on short- and long-term contextual fear."  Behav Neurosci 107 (6):1093-8. doi: 10.1037//0735-7044.107.6.1093.

      Kong, M. S., E. J. Kim, S. Park, L. S. Zweifel, Y. Huh, J. Cho, and J. J. Kim. 2021. "'Fearful-place' coding in the amygdala-hippocampal network."  Elife 10. doi: 10.7554/eLife.72040.

      Krout, K. E., and A. D. Loewy. 2000. "Periaqueductal gray matter projections to midline and intralaminar thalamic nuclei of the rat."  J Comp Neurol 424 (1):111-41. doi: 10.1002/1096-9861(20000814)424:1<111::aid-cne9>3.0.co;2-3.

      Kunwar, P. S., M. Zelikowsky, R. Remedios, H. Cai, M. Yilmaz, M. Meister, and D. J. Anderson. 2015. "Ventromedial hypothalamic neurons control a defensive emotion state."  Elife 4. doi: 10.7554/eLife.06633.

      Lefler, Y., D. Campagner, and T. Branco. 2020. "The role of the periaqueductal gray in escape behavior."  Curr Opin Neurobiol 60:115-121. doi: 10.1016/j.conb.2019.11.014.

      Li, Z., J. X. Wei, G. W. Zhang, J. J. Huang, B. Zingg, X. Wang, H. W. Tao, and L. I. Zhang. 2021. "Corticostriatal control of defense behavior in mice induced by auditory looming cues."  Nat Commun 12 (1):1040. doi: 10.1038/s41467-021-21248-7.

      Lischinsky, J. E., and D. Lin. 2019. "Looming Danger: Unraveling the Circuitry for Predator Threats."  Trends Neurosci 42 (12):841-842. doi: 10.1016/j.tins.2019.10.004.

      Lu, B., P. Fan, M. Li, Y. Wang, W. Liang, G. Yang, F. Mo, Z. Xu, J. Shan, Y. Song, J. Liu, Y. Wu, and X. Cai. 2023. "Detection of neuronal defensive discharge information transmission and characteristics in periaqueductal gray double-subregions using PtNP/PEDOT:PSS modified microelectrode arrays."  Microsyst Nanoeng 9:70. doi: 10.1038/s41378-023-00546-8.

      Magierek, V., P. L. Ramos, N. G. da Silveira-Filho, R. L. Nogueira, and J. Landeira-Fernandez. 2003. "Context fear conditioning inhibits panic-like behavior elicited by electrical stimulation of dorsal periaqueductal gray."  Neuroreport 14 (12):1641-4. doi: 10.1097/00001756-200308260-00020.

      McNally, G. P., J. P. Johansen, and H. T. Blair. 2011. "Placing prediction into the fear circuit."  Trends Neurosci 34 (6):283-92. doi: 10.1016/j.tins.2011.03.005.

      Menegas, W., K. Akiti, R. Amo, N. Uchida, and M. Watabe-Uchida. 2018. "Dopamine neurons projecting to the posterior striatum reinforce avoidance of threatening stimuli."  Nat Neurosci 21 (10):1421-1430. doi: 10.1038/s41593-018-0222-1.

      Morgan, M. M., P. K. Whitney, and M. S. Gold. 1998. "Immobility and flight associated with antinociception produced by activation of the ventral and lateral/dorsal regions of the rat periaqueductal gray."  Brain Res 804 (1):159-66. doi: 10.1016/s0006-8993(98)00669-6.

      Otchy, T. M., S. B. Wolff, J. Y. Rhee, C. Pehlevan, R. Kawai, A. Kempf, S. M. Gobes, and B. P. Olveczky. 2015. "Acute off-target effects of neural circuit manipulations."  Nature 528 (7582):358-63. doi: 10.1038/nature16442.

      Paxinos, G., and C. Watson. 1998. The Rat Brain in Stereotaxic Coordinates. San Diego: Academic Press.

      Rajasethupathy, P., S. Sankaran, J. H. Marshel, C. K. Kim, E. Ferenczi, S. Y. Lee, A. Berndt, C. Ramakrishnan, A. Jaffe, M. Lo, C. Liston, and K. Deisseroth. 2015. "Projections from neocortex mediate top-down control of memory retrieval."  Nature 526 (7575):653-9. doi: 10.1038/nature15389.

      Ressler, R. L., and S. Maren. 2019. "Synaptic encoding of fear memories in the amygdala."  Curr Opin Neurobiol 54:54-59. doi: 10.1016/j.conb.2018.08.012.

      Schenberg, L. C., R. M. Povoa, A. L. Costa, A. V. Caldellas, S. Tufik, and A. S. Bittencourt. 2005. "Functional specializations within the tectum defense systems of the rat."  Neurosci Biobehav Rev 29 (8):1279-98. doi: 10.1016/j.neubiorev.2005.05.006.

      Silva, B. A., C. Mattucci, P. Krzywkowski, E. Murana, A. Illarionova, V. Grinevich, N. S. Canteras, D. Ragozzino, and C. T. Gross. 2013. "Independent hypothalamic circuits for social and predator fear."  Nat Neurosci 16 (12):1731-3. doi: 10.1038/nn.3573.

      Tsang, E., C. Orlandini, R. Sureka, A. H. Crevenna, E. Perlas, I. Prankerd, M. E. Masferrer, and C. T. Gross. 2023. "Induction of flight via midbrain projections to the cuneiform nucleus."  PLoS One 18 (2):e0281464. doi: 10.1371/journal.pone.0281464.

      Vianna, D. M., and M. L. Brandao. 2003. "Anatomical connections of the periaqueductal gray: specific neural substrates for different kinds of fear."  Braz J Med Biol Res 36 (5):557-66. doi: 10.1590/s0100-879x2003000500002.

      Walker, D. L., and M. Davis. 1997. "Involvement of the dorsal periaqueductal gray in the loss of fear-potentiated startle accompanying high footshock training."  Behav Neurosci 111 (4):692-702. doi: 10.1037//0735-7044.111.4.692.

      Wang, L., I. Z. Chen, and D. Lin. 2015. "Collateral pathways from the ventromedial hypothalamus mediate defensive behaviors."  Neuron 85 (6):1344-58. doi: 10.1016/j.neuron.2014.12.025.

      Wei, P., N. Liu, Z. Zhang, X. Liu, Y. Tang, X. He, B. Wu, Z. Zhou, Y. Liu, J. Li, Y. Zhang, X. Zhou, L. Xu, L. Chen, G. Bi, X. Hu, F. Xu, and L. Wang. 2015. "Processing of visually evoked innate fear by a non-canonical thalamic pathway."  Nat Commun 6:6756. doi: 10.1038/ncomms7756.

      Yeh, L. F., T. Ozawa, and J. P. Johansen. 2021. "Functional organization of the midbrain periaqueductal gray for regulating aversive memory formation."  Mol Brain 14 (1):136. doi: 10.1186/s13041-021-00844-0.

      Yilmaz, M., and M. Meister. 2013. "Rapid innate defensive responses of mice to looming visual stimuli."  Curr Biol 23 (20):2011-5. doi: 10.1016/j.cub.2013.08.015.

      Zhou, Z., X. Liu, S. Chen, Z. Zhang, Y. Liu, Q. Montardy, Y. Tang, P. Wei, N. Liu, L. Li, R. Song, J. Lai, X. He, C. Chen, G. Bi, G. Feng, F. Xu, and L. Wang. 2019. "A VTA GABAergic Neural Circuit Mediates Visually Evoked Innate Defensive Responses."  Neuron 103 (3):473-488 e6. doi: 10.1016/j.neuron.2019.05.027.

    1. eLife assessment

      This valuable work describes a new protein factor that is required for filamentous phage assembly. Convincing evidence is provided for the binding of PSB15 to the packaging signal of the single-stranded DNA, Trx, and cardiolipin, and a mechanism for how the phage DNA is targeted to the assembly site in the bacterial inner membrane is presented. The work will be of interest to microbiologists.

    2. Reviewer #1 (Public Review):

      Summary:

      This work describes a new protein factor required for filamentous phage assembly. The protein PSB15 binds to the packaging signal of the ssDNA, to Trx, and to cardiolipin. A mechanism for how the phage DNA is targeted to the assembly site in the bacterial inner membrane is discussed.

      Strengths:

      The work describes a clever way to detect factors required for phage propagation by looking at the plaque size of pseudorevertants that arise after infection of a phage with a directed mutation in the packaging signal. This led to the detection of a phage protein expressed from ORF9, the PSB15.

      The authors convincingly show that PSB15 is expressed in infected cells and can complement a phage with a mutated orf9.

      Weaknesses:

      Given that the phage LF-UK is not well explored, many open questions should be mentioned in the introduction. For the study, it is important to know whether the phage LF-UK has a mimic or homolog of gV and gXI, and if not, whether PSB15 could take their role.

      I am not convinced of the proposition of their term "checkpoint". The truth is that the authors do not know the real purpose of PSB15. I do not see an advantage for a checkpoint that only adds an additional step to enter the phage assembly site. There must be a biochemical reason for the action of PSB15. Looking at Figure 7, the step from "Release" to "Loading" is just adding many unknowns, e.g. how to transfer the DNA, how to dispose of PSB15 and Trx? Also, in the previous step are three question marks that do not add any solid information.

      The in vivo study of subcellular localization is very questionable. Why is there a single fluorescent dot if there are thousands of PSB15 molecules expressed in the cell? I have my doubts that the conclusions the authors make here are correct and meaningful. The movies do not add anything significant.

    3. Reviewer #2 (Public Review):

      Secretion of the prototypical F-associated filamentous phage (Ff) of E. coli depends on the selective binding of a hairpin (the packaging signal, PS) by two phage-encoded proteins, pVII and pIX. pVII and pIX target the PS to IM channels formed by pI and pIV. However, integrative filamentous phages lack a homologue of pIX and pIV, and many of them also lack a homologue of pVII, raising questions about the assembly and secretion of new phages. In the manuscript, Yueh et al. present the identification of a phage-encoded protein, PSB15, which binds to the PS signal of a Xanthomonas integrative filamentous phage, ΦLf-UK. They showed that PSB15 is required for viral assembly and is conserved in several other integrative filamentous phages. They further analyzed how PSB15 binds to PS and demonstrated that it associates with the IM, thereby targeting phage DNA to it. Finally, they show that thioredoxin, the only host protein that was found to be essential for Ff secretion, interacts with PSB15 and releases the PSB15-PS complex from the IM. These results are important because they elucidate a major step in the secretion of integrative filamentous phages, and the role of thioredoxin in filamentous phage secretion in general.

      I found the data and interpretation convincing. However, the presentation and description are confusing in places because the reader has to juggle between figures. A scheme in the introduction depicting what is known and unknown in the integration of Ff phages and integrative filamentous phages would be useful to the general reader.

    1. eLife assessment

      This study presents important data describing cell states of olfactory ensheathing cells, and how these cell states may relate to repair after spinal cord injury. While the overall framework used for characterizing these cells is solid, the quantification and contextualization of results are incomplete, given that measurements, significance statistics, and discussion of both previous work and experimental methods that would be necessary to support several claims are not provided. With more thorough quantification and discussion, this work will be of interest to stem cell biologists and spinal cord injury researchers.

    2. Joint Public Review:

      Summary

      This manuscript explores the transcriptomic identities of olfactory ensheathing cells (OECs), glial cells that support life-long axonal growth in olfactory neurons, as they relate to spinal cord injury repair. The authors show that transplantation of cultured, immunopurified rodent OECs at a spinal cord injury site can promote injury-bridging axonal regrowth. They then characterize these OECs using single-cell RNA sequencing, identifying five subtypes and proposing functional roles that include regeneration, wound healing, and cell-cell communication. They identify one progenitor OEC subpopulation and also report several other functionally relevant findings, notably, that OEC marker genes contain mixtures of other glial cell type markers (such as for Schwann cells and astrocytes), and that these cultured OECs produce and secrete Reelin, a regrowth-promoting protein that has been disputed as a gene product of OECs.

      This manuscript offers an extensive, cell-level characterization of OECs, supporting their potential therapeutic value for spinal cord injury and suggesting potential underlying repair mechanisms. The authors use various approaches to validate their findings, providing interesting images that show the overlap between sprouting axons and transplanted OECs, and showing that OEC marker genes identified using single-cell RNA sequencing are present in vivo, in both olfactory bulb tissue and spinal cord after OEC transplantation.

      Despite the breadth of information presented, however, further quantification of results and explanation of experimental approaches would be needed to support some of the authors' claims. Additionally, a more thorough discussion is needed to contextualize their findings relative to previous work.

      (1) Important quantification is lacking for the data presented. For example, multiple figures include immunohistochemistry or immunocytochemistry data (Figures 1, 5, 6), but they are presented without accompanying measures like fractions of cells labeled or comparisons against controls. As a result, for axons projecting via OEC bridges in Figure 1, it is unclear how common these bridges are in the presence or absence of OECs. For Figure 6, it is unclear whether cells having an alternative OEC morphology coincide with progenitor OEC subtype marker genes to a statistically significant degree. Similar quantification is missing in other types of data such as Western blot images (Fig. 9) and OEC marker gene data (for which p-values are not reported; Table S2).

      The addition of quantitative measures and, where appropriate, statistical comparisons with p-values or other significance measures, would be important for supporting the authors' claims and more rigorously conveying the results.

      (2) Some aspects of the experimental design that are relevant to the interpretation of the results are not explained. For example, OECs appear to be collected from only female rats, but the potential implications of this factor are not discussed.

      Additionally, it is unclear from the manuscript to what degree immunopurified cells are OECs as opposed to other cell types. The antibody used to retain OECs, nerve growth factor receptor p75 (Ngfr-p75), can also be expressed by non-OEC olfactory bulb cell types including astrocytes [1-3]. The possible inclusion of Ngfr-p75-positive but non-OEC cell types in the OEC culture is not sufficiently addressed. Such non-OEC cell types are also not distinguished in the analysis of single-cell RNA sequencing data (only microglia, fibroblasts, and OECs are identified; Figure 2). Thus, it is currently unclear whether results related to the OEC subtype may have been impacted by these experimental factors.

      (3) The introduction, while well written, does not discuss studies showing no significant effect of OEC implantation after spinal cord injury. The discussion also fails to sufficiently acknowledge this variability in the efficacy of OEC implantation. This omission amplifies bias in the text, suggesting that OECs have significant effects that are not fully reflected in the literature. The introduction would need to be expanded to properly address the nuance suggested by the literature regarding the benefits of OECs after spinal cord injury. Additionally, in the discussion, relating the current study to previous work would help clarify how varying observations may relate to experimental or biological factors.

      (a) Cragnolini, A.B. et al., Glia, (2009), doi: 10.1002/glia.20857.
      (b) Vickland H. et al., Brain Res., (1991), doi: 10.1016/0006-8993(91)91659-O.
      (c) Ung K. et al., Nat Commun., (2021), doi: 10.1038/s41467-021-25444-3.

    1. eLife assessment

      This study presents valuable research comparing three different species of extant cartilaginous fishes and describes new data on ratfish. The methods are convincing although the reviewers noted that standardized methods are essential when comparing numerical datasets. This study would be of interest to skeletal biologists working on the evolution of chondrichthyan skeletons.

    2. Reviewer #1 (Public Review):

      Summary:

      It seems that the main point of the paper is the new data on ratfish, although your title refers to extant cartilaginous fishes and you bounce between the little skate and the ratfish. Here is an opportunity to adjust the title to emphasize ratfish, given that you later describe this as your significant new data contribution. Either way, the organization of the paper can be adjusted so that all sections follow the same order, making the new data and their meaning clear for comparative purposes. In my opinion, each subheading in the results should cover the ratfish first, because this is your most interesting novel data. Then I want to know about any confirmation of morphology in the little skate, and then about any gaps you fill with the catshark. (It is fine to keep the order "skate, ratfish, then shark", but I think it undersells the new data.)

      Strengths:

      The imagery and new data availability for ratfish are valuable and may help to determine new phylogenetically informative characters for understanding the evolution of cartilaginous fishes. You also allude to the fossil record.

      Opportunities:

      I am concerned about the statement of ratfish paedomorphism because stages 32 and 33 were not statistically significantly different from one another (figure and prior sentences). So, these ratfish TMDs overlap the range of both 32 and 33. I think you need more specimens and stages to state this definitively based on TMD. What else leads you to think these are paedomorphic? Right now they are different, but it's unclear why. You need more outgroups.

      Your headings for the results subsection and figures are nice snapshots of your interpretations of the results and I think they would be better repurposed in your abstract, which needs more depth.

      Historical literature is more abundant than what you've listed. Your first sentence describes a long fascination and only goes back to 1990. But there are authors that have had this fascination for centuries and so I think you'll benefit from looking back. Especially because several of them have looked into histology and development of these fishes.

      I agree that in the past 15 years or so a lot more work has been done because it can be done using newer technologies, but I don't think your list is exhaustive. You need to expand this list and history, which will help with your ultimate comparative analysis without your needing to sample too many new data yourself.

      I'd like to see modifications to figure 7 so that you can add more continuity between the characters illustrated in figure 7 and the body of the text. Generally, holocephalans are the outgroup to elasmobranchs; right now they are presented as sister taxa with no ability to indicate derivation. Why isn't the catshark included in this diagram?

      In the last paragraph of the introduction, you say that "the data argue" and I admit, I am confused. Whose data? Is this a prediction, a result, or a summary of other people's work? Either way, this could be clarified to emphasize the contribution you are about to present.

    3. Reviewer #2 (Public Review):

      General comment:

      This is a very valuable and unique comparative study. An excellent combination of scanning and histological data from three different species is presented. Obtaining the material for such a comparative study is never trivial. The study presents new data and thus provides the basis for an in-depth discussion about chondrichthyan mineralised skeletal tissues. I have, however, some comments. Some information is lacking and should be added to the manuscript text. I also suggest changes in the result and the discussion section of the manuscript.

      Introduction:

      The reader gets the impression that almost no research on chondrichthyan skeletal tissues was done before 2010 ("last 15 years", L45). I suggest correcting that and also citing previous studies on chondrichthyan skeletal tissues, including studies from before 1900.

      Material and Methods:

      Please complete L473-492: Three different micro-CT scanners were used for three different species? SkyScan 117 for the skate samples. Catshark: different scanner, please provide full details. Chimera: synchrotron scan? Please provide full details for all scanning protocols.

      TMD is established in the same way in all three scanners? Actually not possible. Or, all specimens were scanned with the same scanner to establish TMD? If so please provide the protocol.

      Please complete L494 ff: Tissue embedding medium and embedding protocol are missing. Have specimens been decalcified, and if so, how? Have specimens been sectioned non-decalcified or decalcified?

      Please complete L506 ff: Tissue embedding medium and embedding protocol are missing. A description of controls is missing.

      Results:

      L147: It is valuable and interesting to compare the degree of mineralisation in individuals from the three different species. It appears, however, not possible to provide numerical data for Tissue Mineral Density (TMD). As a first requirement, all specimens must be scanned with the same scanner and the same calibration values. This is not stated in the M&M section. But even if this was the case, all specimens derive from different sample locations and have been preserved differently. Type of fixation, extension of fixation time in formalin, frozen, unfrozen, conditions of sample storage, age of the samples, and many more parameters all influence TMD values. Likewise, the relative age of the animals (adult is not the same as adult) influences TMD. One must assume different sampling and storage conditions and different types of progression into adulthood. Thus, the observation of different degrees of mineralisation is very interesting, but I suggest not linking this observation to numerical values.

      Parts of the results are mixed with discussion. Sometimes, a result chapter also needs a few references but this result chapter is full of references.

      Based on different protocols, the staining characteristics of the tissue are analysed. This is very good and provides valuable additional data. The authors should inform the reader not only about the staining (positive or negative) but also about the histochemical characteristics of the staining. L218: "fast green positive" means what? L234: "marked by Trichrome acid fuchsin" means what? And so on, see also L237, L289, L291.

      Discussion:

      Please completely remove figure 7, and adjust and severely downsize the discussion related to figure 7. It is very interesting and valuable to compare three species from three different groups of chondrichthyans. Results of this comparison also validate an interesting discussion about possible phylogenetic aspects. This is, however, not the basis for claims about the skeletal tissue organisation of all extinct and extant members of the groups to which the three species belong. The discussion refers to "selected representatives" (L364), but how representative are the selected species? Can there be an extant species that represents an entire large group, all sharks, rays or chimeras? Are the three selected species basal representatives with a generalist lifestyle?

      Please completely remove the discussion about paedomorphosis in chimeras (already in the results section). This discussion is based on a wrong idea about the definition of paedomorphosis. Paedomorphosis can occur in members of the same group. Humans have paedomorphic characters within the primates; Ambystoma mexicanum is paedomorphic within the urodeles. Paedomorphosis does not extend to members of different vertebrate branches. That elasmobranchs have a developmental stage that resembles chimera vertebra mineralisation does not define chimera vertebral centra as paedomorphic. Teleosts have a heterocercal caudal fin anlage during development; that does not mean the heterocercal fins in sturgeons or elasmobranchs are paedomorphic characters.

      L432-435: In the times of Gadow & Abbott (1895), science had completely wrong ideas about the phylogenetic position of chondrichthyans within the gnathostomes. It is curious that Gadow & Abbott (1895) are being cited in support of the paedomorphosis claim.

      The SCPP part of the discussion is unrelated to the data obtained by this study. Kawasaki & Weiss (2003) describe a gene family (called SCPP) that controls Ca-binding extracellular phosphoproteins in enamel, in bone and dentine, in saliva and in milk. It evolved by gene duplication and differentiation. They date it back to a first enamel matrix protein in conodonts (Reif 2006). Conodonts, a group of enigmatic invertebrates, have mineralised structures, but these structures are neither bone nor mineralised cartilage. Catfish (6 % of all vertebrate species), on the other hand, have bone but do not have SCPP genes (Liu et al. 2016). Other calcium-binding proteins, such as osteocalcin, were initially believed to be required for mineralisation. It turned out that osteocalcin is rather a mineralisation inhibitor; at best it regulates the arrangement of collagen fiber bundles. The osteocalcin -/- mouse has fully mineralised bone. As the function of the SCPP gene product for bone formation is unknown, there is no need to discuss SCPP genes. It would perhaps be better to finish the manuscript with a summary that focuses on the subject and the methodology of this nice study.

    1. eLife assessment

      This useful study reports that epididymal proteins are required for embryogenesis after fertilization. The data presented are generally convincing, but the study is incomplete because it does not investigate in detail how those proteins cause DNA fragmentation and compromised embryonic development. This work will be of interest to reproductive biologists and andrologists.

    2. Reviewer #1 (Public Review):

      Summary:

      The main observation, that sperm from the CRISP1 and CRISP3 KO lines are less developmentally competent post-fertilization, is convincing. However, the molecular characterization of the mechanism that leads to these defects and the temporal appearance of the defects requires additional studies.

      Strengths:

      The generation of these double mutant mice is valuable for the field. Moreover, the fact that the double mutant line of Crisp 1 and 3 is phenotypically different from the Crisp 1 and 4 line suggests different functions of these epididymal proteins. The methods used to demonstrate that developmental defects are largely due to post-fertilization defects are also a considerable strength. The initial characterization of these sperm, showing altered intracellular Ca2+ levels and increased rates of DNA fragmentation, is also valuable.

      Weaknesses:

      The study is mechanistically incomplete because there is no direct demonstration that the absence of these proteins alters the epididymal environment and fluid, wherein during the passage through the epididymis the sperm become affected. Also, a direct demonstration of how the proteins in question cause or lead to DNA damage and increased Ca2+ requires further characterization.

    3. Reviewer #2 (Public Review):

      The authors showed that CRISP1 and CRISP3, secreted proteins in the epididymis, are required for early embryogenesis after fertilization through DNA integrity in cauda epididymal sperm. This paper is the first report showing that epididymal proteins are required for embryogenesis after fertilization. However, some data in this paper (Table 1 and Figure 2A) overlap with a published paper (Curci et al., FASEB J, 34,15718-15733, 2020; PMID: 33037689). Furthermore, the authors did not address with direct evidence why the disruption of CRISP1/3 leads to these phenomena (the increased intracellular Ca2+ level and impaired DNA integrity in sperm). Therefore, if the authors can address the following comments to improve the paper's novelty and clarity, this paper may be worthwhile to readers.

    1. Author response

      Reviewer #1 (Public Review):

      The authors aimed to investigate whether 2-hydroxybutyrate (2HB), a metabolite induced by exercise, influences physiological changes, particularly metabolic alterations post-exercise training. They treated young mice and cultured myoblasts with 2HB, conducted exercise tests, metabolomic profiling, gene expression analysis, and knockdown experiments to understand 2HB's mechanisms. Their findings indicate that 2HB enhances exercise tolerance, boosts branched-chain amino acid (BCAA) enzyme gene expression in skeletal muscles, and increases oxidative capacity. They also highlight the role of SIRT4 in these effects. This study establishes 2HB, once considered a waste product, as a regulator of exercise-induced metabolic processes. The study's strength lies in its consistent results across in vitro, in vivo, and ex vivo analyses.

      The authors propose a mechanism in which 2HB inhibits BCAA breakdown, raises NAD+/NADH ratio, activates SIRT4, increases ADP ribosylation, and controls gene expression.

      However, some questions remain unclear based on these findings:

      This study focused on the effects of short-term exercise (1 or 5 bouts of treadmill running) and short-term 2HB treatment (1 or 4 days of treatment). Adaptations to exercise training typically occur progressively over an extended period. It's important to investigate the effects of long-term 2HB treatment and whether extended combined 2HB treatment and exercise training have independent, synergistic, or antagonistic effects.

      We agree with the reviewer that investigation of longer-term 2HB treatment may potentially yield interesting findings with more implications to exercise physiology. To investigate the effects of 2HB treatment against or in combination with a progressive exercise training protocol would require an experiment duration between 4 to 12 weeks, based on previous studies (Systematic Review by Massett et al., Frontiers in Physiology, 2021, 10.3389/fphys.2021.782695). However, our experience with these types of experiments is that such a pursuit would require a breadth of work beyond the scope of this current study. For instance, if there were evidence of weakened effect of 2HB over time, one may be compelled to investigate other organs such as the liver to find signs of metabolic adaptation to the exogenous metabolite. If there were additive or synergistic effects on exercise performance, one may be compelled to investigate changes to the cardiovascular system in addition to the skeletal muscle. Additional questions would be raised around the skeletal muscle as well, including assessment of structural and fibre-type changes. Further, these additional mechanisms would need to be characterized in a time course fashion. Rather, we view the scope of the current study to be the acute response to 2HB as an initial report on mechanistic effects of 2HB.

      Exercise training leads to significant mitochondrial changes, including increased mitochondrial biogenesis in skeletal muscle. It would be valuable to compare the impact of 2HB treatment on mitochondrial content and oxidative capacity in treated mice to that in exercised mice.

      We agree with the reviewer that it is of interest to investigate how 2HB may affect mitochondrial biogenesis. However, our preliminary findings were that 2HB-treated MEFs, C2C12s, and mouse soleus muscles showed no change in PGC1α gene expression after four days of treatment (data not shown). As a follow-up assessment of mitochondrial protein expression, although not specific to mtDNA-derived genes, we quantified the expression of the respiratory chain proteins in cells and soleus muscle and found no effect of 2HB treatment (SFig. 5,6). At this stage we conclude that there is no evidence of 2HB modifying mitochondrial biogenesis in this time frame and that further investigation would be best suited to a follow-up study such as one interested in long-term exercise training.

      The authors demonstrate that 2-ketobutyrate (2KB) can serve as an oxidative fuel, suggesting a role for the intact BCAA catabolic pathway. However, it's puzzling that the knockout of BCKDHA, a subunit crucial for the second step of BCAA catabolism, did not result in changes in oxidative capacity in cultured myoblasts.

      While we report the BCKDH complex to be dispensable for 2KB oxidation, it is important to note that previous studies have reported the following: (1) that 2KB is a viable substrate for BCKDH, (2) that 2KB is a viable substrate for pyruvate dehydrogenase, and (3) that pyruvate dehydrogenase is also dispensable for 2KB oxidation (see Steele et al., J Nutr., 114: 701-710, and Paxton et al., Biochem J., 234: 295-303). Collectively, these data have led previous studies to conclude that BCKDH and pyruvate dehydrogenase are redundant for the first step of 2KB oxidation, with a preference for BCKDH. The flux through either may depend upon the metabolic environment. The aim for Figure 3C was to determine whether the BCAA degradation pathway was required for 2KB oxidation. We conclude that this pathway is required, first at the step of PCC.

      While these past studies were mentioned in paragraph 2 of the discussion, in light of the reviewer’s comment we have expanded this paragraph. We have added language to explain that future research interested in the presented 2HB mechanism should carefully consider BCKDH and PDH expression in the cell or tissue of interest, as the metabolism of 2KB is quite central to the presented mechanism.

      Nevertheless, this innovative model of metabolic signaling during exercise will serve as a valuable reference for informing future.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript entitled "A 2-HB-mediated feedback loop regulates muscular fatigue" by the Johnson group reports interesting findings with implications for the health benefits of exercise. The authors use a combination of metabolic/biochemical in vivo and in vitro assays to delineate a metabolic route triggered by 2-HB (a relatively stable metabolite induced by exercise in humans and mice) that controls branched-chain aminotransferase enzymes and mitochondrial oxidative capacity. Mechanistically, the authors show that 2-HB is a direct inhibitor of BCAT enzymes that in turn control levels of SIRT4 activity and ADP-ribosylation in the nucleus targeting C/EBP transcription factor, affecting BCAA oxidation genes (see Fig 4i in the paper). Overall, these are interesting and novel observations and findings with relevance to human exercise, with the potential implication of using these metabolites to mimic exercise benefits, or conditions of muscular fatigue that occur in different human chronic diseases including rheumatic diseases or long COVID.

      Weaknesses:

      There are several experiments/comments that will strengthen the manuscript:

      (1) A final model in Figure 6 integrating the exercise/mechanistic findings (expanding on Fig 4i) will clarify the findings.

      We appreciate the reviewer’s suggestion to incorporate the exercise findings into a summary figure. However, upon internal review we find that such a figure is too similar to Fig 4i to warrant a new diagram.

      (2) In some of the graphs, statistics are missing (e.g., Fig 6G).

      Some figures are included primarily for the reader to visualize the data while statistical comparison is conducted in a separate figure, for example Fig 2D-G. However, we have revised the figure legends to ensure that statistical comparisons are described for all appropriate figures, including Fig 6G identified by the reviewer.

      (3) The conclusions on SIRT4 dependency should be carefully written, as it is likely that this is only one potential mechanism, further validation with mouse models would be necessary.

      We appreciate the reviewer's feedback and take the point well that a NAD-dependent mechanism will likely stimulate other sirtuins, which are often in fact expressed at greater levels than SIRT4. To reflect this comment in the manuscript we have altered paragraph 5 of the discussion to now focus on sirtuins. We briefly discuss SIRT4 and highlight the need for future consideration of other sirtuins, perhaps particularly mitochondrial sirtuins.

      (4) One experiment needed to support the oxidative capacity effects, which could be done in cultured cells, is the use of radioisotope-labelled metabolites, including BCAAs, to determine the ability to produce CO2. Alternatively, or in combination, metabolite flux analysis using isotopes would be useful to strengthen the current results.

      We appreciate the suggestion from the reviewer and we will look to conduct such an experiment in our follow-up work.

      We sincerely thank the reviewers for their input on this study as their suggestions have led to an improved manuscript for the version of record. The reviewer comments are well taken and we are glad that they will be present alongside the final manuscript to provide an important perspective on the work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Using a cross-modal sensory selection task in head-fixed mice, the authors attempted to characterize how different rules reconfigured representations of sensory stimuli and behavioral reports in sensory (S1, S2) and premotor cortical areas (medial motor cortex or MM, and ALM). They used silicon probe recordings during behavior, a combination of single-cell and population-level analyses of neural data, and optogenetic inhibition during the task.

      Strengths:

      A major strength of the manuscript was the clarity of the writing and motivation for experiments and analyses. The behavioral paradigm is somewhat simple but well-designed and well-controlled. The neural analyses were sophisticated, clearly presented, and generally supported the authors' interpretations. The statistics are clearly reported and easy to interpret. In general, my view is that the authors achieved their aims. They found that different rules affected preparatory activity in premotor areas, but not sensory areas, consistent with dynamical systems perspectives in the field that hold that initial conditions are important for determining trial-based dynamics.

      Weaknesses:

      The manuscript was generally strong. The main weakness in my view was in interpreting the optogenetic results. While the simplicity of the task was helpful for analyzing the neural data, I think it limited the informativeness of the perturbation experiments. The behavioral read-out was low dimensional (a change in hit rate or false alarm rate), but it was unclear what perceptual or cognitive process was disrupted that led to changes in these read-outs. This is a challenge for the field, and not just this paper, but was the main weakness in my view. I have some minor technical comments in the recommendations for authors that might address other minor weaknesses.

      I think this is a well-performed, well-written, and interesting study that shows differences in rule representations in sensory and premotor areas and finds that rules reconfigure preparatory activity in the motor cortex to support flexible behavior.

      Reviewer #2 (Public Review):

      Summary:

      Chang et al. investigate neuronal activity firing patterns across various cortical regions in an interesting context-dependent tactile vs visual detection task, developed previously by the authors (Chevee et al., 2021; doi: 10.1016/j.neuron.2021.11.013). The authors report the important involvement of a medial frontal cortical region (MM, probably a similar location to wM2 as described in Esmaeili et al., 2021 & 2022; doi: 10.1016/j.neuron.2021.05.005; doi: 10.1371/journal.pbio.3001667) in mice for determining task rules.

      Strengths:

      The experiments appear to have been well carried out and the data well analysed. The manuscript clearly describes the motivation for the analyses and reaches clear and well-justified conclusions. I find the manuscript interesting and exciting!

      Weaknesses:

      I did not find any major weaknesses.

      Reviewer #3 (Public Review):

      This study examines context-dependent stimulus selection by recording neural activity from several sensory and motor cortical areas along a sensorimotor pathway, including S1, S2, MM, and ALM. Mice are trained to either withhold licking or perform directional licking in response to visual or tactile stimulus. Depending on the task rule, the mice have to respond to one stimulus modality while ignoring the other. Neural activity to the same tactile stimulus is modulated by task in all the areas recorded, with significant activity changes in a subset of neurons and population activity occupying distinct activity subspaces. Recordings further reveal a contextual signal in the pre-stimulus baseline activity that differentiates task context. This signal is correlated with subsequent task modulation of stimulus activity. Comparison across brain areas shows that this contextual signal is stronger in frontal cortical regions than in sensory regions. Analyses link this signal to behavior by showing that it tracks the behavioral performance switch during task rule transitions. Silencing activity in frontal cortical regions during the baseline period impairs behavioral performance.

      Overall, this is a superb study with solid results and thorough controls. The results are relevant for context-specific neural computation and provide a neural substrate that will surely inspire follow-up mechanistic investigations. We only have a couple of suggestions to help the authors further improve the paper.

      (1) We have a comment regarding the calculation of the choice CD in Fig S3. The text on page 7 concludes that "Choice coding dimensions change with task rule". However, the motor choice response is different across blocks, i.e. lick right vs. no lick for one task and lick left vs. no lick for the other task. Therefore, the differences in the choice CD may be simply due to the motor response being different across the tasks and not due to the task rule per se. The authors may consider adding this caveat in their interpretation. This should not affect their main conclusion.

      We thank the Reviewer for the suggestion. We have discussed this caveat and performed a new analysis to calculate the choice coding dimensions using right-lick and left-lick trials (Fig. S3h) on page 8. 

      “Choice coding dimensions were obtained from left-lick and no-lick trials in respond-to-touch blocks and right-lick and no-lick trials in respond-to-light blocks. Because the required lick directions differed between the block types, the difference in choice CDs across task rules (Fig. S4f) could have been affected by the different motor responses. To rule out this possibility, we did a new version of this analysis using right-lick and left-lick trials to calculate the choice coding dimensions for both task rules. We found that the orientation of the choice coding dimension in a respond-to-touch block was still not aligned well with that in a respond-to-light block (Fig. S4h;  magnitude of dot product between the respond-to-touch choice CD and the respond-to-light choice CD, mean ± 95% CI for true vs shuffled data: S1: 0.39 ± [0.23, 0.55] vs 0.2 ± [0.1, 0.31], 10 sessions; S2: 0.32 ± [0.18, 0.46] vs 0.2 ± [0.11, 0.3], 8 sessions; MM: 0.35 ± [0.21, 0.48] vs 0.18 ± [0.11, 0.26], 9 sessions; ALM: 0.28 ± [0.17, 0.39] vs 0.21 ± [0.12, 0.31], 13 sessions).”

      We also have included the caveats for using right-lick and left-lick trials to calculate choice coding dimensions on page 13.

      “However, we also calculated choice coding dimensions using only right- and left-lick trials. In S1, S2, MM and ALM, the choice CDs calculated this way were also not aligned well across task rules (Fig. S4h), consistent with the results calculated from lick and no-lick trials (Fig. S4f). Data were limited for this analysis, however, because mice rarely licked to the unrewarded water port (# of licks_{unrewarded port} / # of licks_{total}, respond-to-touch: 0.13, respond-to-light: 0.11). These trials usually came from rule transitions (Fig. 5a) and, in some cases, were potentially caused by exploratory behaviors. These factors could affect choice CDs.”
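
      The alignment measure reported in these passages, the magnitude of the dot product between two choice coding dimensions, can be sketched as follows. This is a minimal illustration that assumes each coding dimension is the unit-normalized difference between trial-averaged population activity for the two choices; that definition, the function names, and the toy firing rates are assumptions made for this example and are not the authors' implementation.

```python
import math


def unit_vector(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def coding_dimension(mean_choice_a, mean_choice_b):
    """Assumed definition: normalized difference of trial-averaged activity."""
    return unit_vector([a - b for a, b in zip(mean_choice_a, mean_choice_b)])


def alignment(cd_1, cd_2):
    """Magnitude of the dot product between two unit coding dimensions (0 to 1)."""
    return abs(sum(a * b for a, b in zip(cd_1, cd_2)))


# Toy mean firing rates (Hz) for three hypothetical neurons.
touch_cd = coding_dimension([6.0, 2.0, 1.0], [2.0, 2.0, 1.0])
light_cd = coding_dimension([3.0, 5.0, 1.0], [3.0, 2.0, 1.0])
flipped_cd = coding_dimension([2.0, 2.0, 1.0], [6.0, 2.0, 1.0])

orthogonal_alignment = alignment(touch_cd, light_cd)
same_axis_alignment = alignment(touch_cd, flipped_cd)
```

      Taking the magnitude makes the measure insensitive to the sign convention of each coding dimension, so two dimensions that lie on the same axis score 1 regardless of which choice is subtracted from which.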

      (2) We have a couple of questions about the effect size on single neurons vs. population dynamics. From Fig 1, about 20% of neurons in frontal cortical regions show task rule modulation in their stimulus activity. This seems like a small effect in terms of population dynamics. There is somewhat of a disconnect from Figs 4 and S3 (for stimulus CD), which show remarkably low subspace overlap in population activity across tasks. Can the authors help bridge this disconnect? Is this because the neurons showing a difference in Fig 1 are disproportionally stimulus selective neurons?

      We thank the Reviewer for the insightful comment and agree that it is important to link the single-unit and population results. We have addressed these questions by (1) improving our analysis of task modulation of single neurons  (tHit-tCR selectivity) and (2) examining the relationship between tHit-tCR selective neurons and tHit-tCR subspace overlaps.  

      Previously, we averaged the AUC values of time bins within the stimulus window (0-150 ms, 10 ms bins). If the 95% CI on this averaged AUC value did not include 0.5, this unit was considered to show significant selectivity. This approach was highly conservative and may have underestimated the percentage of units showing significant selectivity, particularly any units showing transient selectivity. In the revised manuscript, we now define a unit as showing significant tHit-tCR selectivity when three consecutive time bins (>30 ms, 10 ms bins) of AUC values were significant. Using this new criterion, the percentage of tHit-tCR selective neurons increased compared with the previous analysis. We have updated Figure 1h and the results on page 4:

      “We found that 18-33% of neurons in these cortical areas had area under the receiver-operating curve (AUC) values significantly different from 0.5, and therefore discriminated between tHit and tCR trials (Fig. 1h; S1: 28.8%, 177 neurons; S2: 17.9%, 162 neurons; MM: 32.9%, 140 neurons; ALM: 23.4%, 256 neurons; criterion to be considered significant: Bonferroni corrected 95% CI on AUC did not include 0.5 for at least 3 consecutive 10-ms time bins).”
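
      As an illustration of the revised criterion, the following minimal Python sketch flags a unit as selective when the confidence interval on its AUC excludes 0.5 in at least three consecutive bins. The function name, argument names, and synthetic CI values are assumptions made for this example and are not taken from the authors' analysis code. The sketch also does not require the consecutive bins to deviate from 0.5 in the same direction, which the text does not specify.

```python
def has_consecutive_significant_bins(ci_low, ci_high, chance=0.5, min_run=3):
    """Return True if at least `min_run` consecutive bins have a CI excluding `chance`."""
    run = 0
    for low, high in zip(ci_low, ci_high):
        significant = low > chance or high < chance
        run = run + 1 if significant else 0
        if run >= min_run:
            return True
    return False


# Synthetic example: 15 bins of 10 ms covering 0-150 ms after stimulus onset.
ci_low = [0.45] * 15
ci_high = [0.55] * 15
# Bins 4-6 have a CI entirely above 0.5, mimicking a transient response.
for i in (4, 5, 6):
    ci_low[i], ci_high[i] = 0.58, 0.72
transient_unit = has_consecutive_significant_bins(ci_low, ci_high)

# Two isolated significant bins do not satisfy the criterion.
isolated_low = [0.45] * 15
isolated_high = [0.55] * 15
for i in (2, 9):
    isolated_low[i], isolated_high[i] = 0.58, 0.72
isolated_unit = has_consecutive_significant_bins(isolated_low, isolated_high)
```

      A run-length rule of this kind picks up brief selectivity that averaging over the whole stimulus window would dilute, while still rejecting single-bin fluctuations.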

      Next, we have checked how tHit-tCR selective neurons were distributed across sessions. We found that the percentage of tHit-tCR selective neurons in each session varied (S1: 9-46%, S2: 0-36%, MM: 25-55%, ALM: 0-50%). We examined the relationship between the numbers of tHit-tCR selective neurons and tHit-tCR subspace overlaps. Sessions with more neurons showing task rule modulation tended to show lower subspace overlap, but this correlation was modest and only marginally significant (r = -0.32, p = 0.08, Pearson correlation, n = 31 sessions). While we report the percentage of neurons showing significant selectivity as a simple way to summarize single-neuron effects, this does neglect the magnitude of task rule modulation of individual neurons, which may also be relevant.

      In summary, the apparent disconnect between the effect sizes of task modulation of single neurons and of population dynamics could be explained by three factors: (1) the percentages of tHit-tCR selective neurons were underestimated in our old analysis, (2) tHit-tCR selective neurons were not uniformly distributed among sessions, and (3) the percentages of tHit-tCR selective neurons were weakly correlated with tHit-tCR subspace overlaps.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      For the analysis of choice coding dimensions, it seems that the authors are somewhat data limited in that they cannot compare lick-right/lick-left within a block. So instead, they compare lick/no lick trials. But given that the mice are unable to initiate trials, the interpretation of the no lick trials is a bit complicated. It is not clear that the no lick trials reflect a perceptual judgment about the stimulus (i.e., a choice), or that the mice are just zoning out and not paying attention. If it's the latter case, what the authors are calling choice coding is more of an attentional or task engagement signal, which may still be interesting, but has a somewhat different interpretation than a choice coding dimension. It might be worth clarifying this point somewhere, or if I'm totally off-base, then being more clear about why lick/no lick is more consistent with choice than task engagement.

      We thank the Reviewer for raising this point. We have added a new paragraph on page 13 to clarify why we used lick/no-lick trials to calculate choice coding dimensions, and we now discuss the caveat regarding task engagement.  

      “No-lick trials included misses, which could be caused by mice not being engaged in the task. While the majority of no-lick trials were correct rejections (respond-to-touch: 75%; respond-to-light: 76%), we treated no-licks as one of the available choices in our task and included them to calculate choice coding dimensions (Fig. S4c,d,f). To ensure stable and balanced task engagement across task rules, we removed the last 20 trials of each session and used stimulus parameters that achieved similar behavioral performance for both task rules (Fig. 1d; ~75% correct for both rules).”

      In addition, to address a point made by Reviewer 3 as well as this point, we performed a new analysis to calculate choice coding dimensions using right-lick vs left-lick trials. We report this new analysis on page 8:

      “Choice coding dimensions were obtained from left-lick and no-lick trials in respond-to-touch blocks and right-lick and no-lick trials in respond-to-light blocks. Because the required lick directions differed between the block types, the difference in choice CDs across task rules (Fig. S4f) could have been affected by the different motor responses. To rule out this possibility, we did a new version of this analysis using right-lick and left-lick trials to calculate the choice coding dimensions for both task rules. We found that the orientation of the choice coding dimension in a respond-to-touch block was still not aligned well with that in a respond-to-light block (Fig. S4h;  magnitude of dot product between the respond-to-touch choice CD and the respond-to-light choice CD, mean ± 95% CI for true vs shuffled data: S1: 0.39 ± [0.23, 0.55] vs 0.2 ± [0.1, 0.31], 10 sessions; S2: 0.32 ± [0.18, 0.46] vs 0.2 ± [0.11, 0.3], 8 sessions; MM: 0.35 ± [0.21, 0.48] vs 0.18 ± [0.11, 0.26], 9 sessions; ALM: 0.28 ± [0.17, 0.39] vs 0.21 ± [0.12, 0.31], 13 sessions).” 

      We added discussion of the limitations of this new analysis on page 13:

      “However, we also calculated choice coding dimensions using only right- and left-lick trials. In S1, S2, MM and ALM, the choice CDs calculated this way were also not aligned well across task rules (Fig. S4h), consistent with the results calculated from lick and no-lick trials (Fig. S4f). Data were limited for this analysis, however, because mice rarely licked to the unrewarded water port (# of licks_{unrewarded port} / # of licks_{total}, respond-to-touch: 0.13, respond-to-light: 0.11). These trials usually came from rule transitions (Fig. 5a) and, in some cases, were potentially caused by exploratory behaviors. These factors could affect choice CDs.”

      The authors find that the stimulus coding direction in most areas (S1, S2, and MM) was significantly aligned between the block types. How do the authors interpret that finding? That there is no major change in stimulus coding dimension, despite the change in subspace? I think I'm missing the big picture interpretation of this result.

      That there is no significant change in stimulus coding dimensions but a change in subspace suggests that the subspace change largely reflects a change in the choice coding dimensions.

      As I mentioned in the public review, I thought there was a weakness with interpretation of the optogenetic experiments, which the authors generally interpret as reflecting rule sensitivity. However, given that they are inhibiting premotor areas including ALM, one might imagine that there might also be an effect on lick production or kinematics. To rule this out, the authors compare the change in lick rate relative to licks during the ITI. What is the ITI lick rate? I assume pretty low, once the animal is well-trained, in which case there may be a floor effect that could obscure meaningful effects on lick production. In addition, based on the reported CI on delta p(lick), it looks like MM and ALM did suppress lick rate. I think in the future, a task with richer behavioral read-outs (or including other measurements of behavior like video), or perhaps something like a psychological process model with parameters that reflect different perceptual or cognitive processes could help resolve the effects of perturbations more precisely.

      Eighteen and ten percent of trials had at least one lick in the ITI in respond-to-touch and respond-to-light blocks, respectively. These relatively low rates of ITI licking could indeed make an effect of optogenetics on lick production harder to observe. We agree that future work would benefit from more complex tasks and measurements, and have added the following to make this point (page 14):

      “To more precisely dissect the effects of perturbations on different cognitive processes in rule-dependent sensory detection, more complex behavioral tasks and richer behavioral measurements are needed in the future.”

      Reviewer #2 (Recommendations For The Authors):

      I have the following minor suggestions that the authors might consider in revising this already excellent manuscript:

      (1) In addition to showing normalised z-score firing rates (e.g. Fig 1g), I think it is important to show the grand-average mean firing rates in Hz.

      We thank the Reviewer for the suggestion and have added the grand-average mean firing rates as a new supplementary figure (Fig. S2a). To provide more details about the firing rates of individual neurons, we have also added to this new figure the distribution of peak responses during the tactile stimulus period (Fig. S2b).

      (2) I think the authors could report more quantitative data in the main text. As a very basic example, I could not easily find how many neurons, sessions, and mice were used in various analyses.

      We have added relevant numbers at various points throughout the Results, including within the following examples:

      Page 3: “To examine how the task rules influenced the sensorimotor transformation occurring in the tactile processing stream, we performed single-unit recordings from sensory and motor cortical areas including S1, S2, MM and ALM (Fig. 1e-g, Fig. S1a-h, and Fig. S2a; S1: 6 mice, 10 sessions, 177 neurons, S2: 5 mice, 8 sessions, 162 neurons, MM: 7 mice, 9 sessions, 140 neurons, ALM: 8 mice, 13 sessions, 256 neurons).”

      Page 5: “As expected, single-unit activity before stimulus onset did not discriminate between tactile and visual trials (Fig. 2d; S1: 0%, 177 neurons; S2: 0%, 162 neurons; MM: 0%, 140 neurons; ALM: 0.8%, 256 neurons). After stimulus onset, more than 35% of neurons in the sensory cortical areas and approximately 15% of neurons in the motor cortical areas showed significant stimulus discriminability (Fig. 2e; S1: 37.3%, 177 neurons; S2: 35.2%, 162 neurons; MM: 15%, 140 neurons; ALM: 14.1%, 256 neurons).”

      Page 6: “Support vector machine (SVM) and Random Forest classifiers showed similar decoding abilities (Fig. S3a,b; medians of classification accuracy [true vs shuffled]; SVM: S1 [0.6 vs 0.53], 10 sessions, S2 [0.61 vs 0.51], 8 sessions, MM [0.71 vs 0.51], 9 sessions, ALM [0.65 vs 0.52], 13 sessions; Random Forests: S1 [0.59 vs 0.52], 10 sessions, S2 [0.6 vs 0.52], 8 sessions, MM [0.65 vs 0.49], 9 sessions, ALM [0.7 vs 0.5], 13 sessions).”

      Page 6: “To assess this for the four cortical areas, we quantified how the tHit and tCR trajectories diverged from each other by calculating the Euclidean distance between matching time points for all possible pairs of tHit and tCR trajectories for a given session and then averaging these for the session (Fig. 4a,b; S1: 10 sessions, S2: 8 sessions, MM: 9 sessions, ALM: 13 sessions, individual sessions in gray and averages across sessions in black; window of analysis: -100 to 150 ms relative to stimulus onset; 10 ms bins; using the top 3 PCs; Methods).” 

      Page 8: “In contrast, we found that S1, S2 and MM had stimulus CDs that were significantly aligned between the two block types (Fig. S4e; magnitude of dot product between the respond-to-touch stimulus CDs and the respond-to-light stimulus CDs, mean ± 95% CI for true vs shuffled data: S1: 0.5 ± [0.34, 0.66] vs 0.21 ± [0.12, 0.34], 10 sessions; S2: 0.62 ± [0.43, 0.78] vs 0.22 ± [0.13, 0.31], 8 sessions; MM: 0.48 ± [0.38, 0.59] vs 0.24 ± [0.16, 0.33], 9 sessions; ALM: 0.33 ± [0.2, 0.47] vs 0.21 ± [0.13, 0.31], 13 sessions).”

      Page 9: “For respond-to-touch to respond-to-light block transitions, the fractions of trials classified as respond-to-touch for MM and ALM decreased progressively over the course of the transition (Fig. 5d; rank correlation of the fractions calculated for each of the separate periods spanning the transition, Kendall’s tau, mean ± 95% CI: MM: -0.39 ± [-0.67, -0.11], 9 sessions, ALM: -0.29 ± [-0.54, -0.04], 13 sessions; criterion to be considered significant: 95% CI on Kendall’s tau did not include 0).”

      Page 11: “Lick probability was unaffected during S1, S2, MM and ALM experiments for both tasks, indicating that the behavioral effects were not due to an inability to lick (Fig. 6i, j; 95% CI on Δ lick probability for cross-modal selection task: S1/S2 [-0.18, 0.24], 4 mice, 10 sessions; MM [-0.31, 0.03], 4 mice, 11 sessions; ALM [-0.24, 0.16], 4 mice, 10 sessions; Δ lick probability for simple tactile detection task: S1/S2 [-0.13, 0.31], 3 mice, 3 sessions; MM [-0.06, 0.45], 3 mice, 5 sessions; ALM [-0.18, 0.34], 3 mice, 4 sessions).”

      (3) Please include a clearer description of trial timing. Perhaps a schematic timeline of when stimuli are delivered and when licking would be rewarded. I may have missed it, but I did not find explicit mention of the timing of the reward window or if there was any delay period.

      We have added the following (page 3): 

      “For each trial, the stimulus duration was 0.15 s and an answer period extended from 0.1 to 2 s from stimulus onset.”

      (4) Please include a clear description of statistical tests in each figure legend as needed (for example please check Fig 4e legend).

      We have added details about statistical tests in the figure legends:

      Fig. 2f: “Relationship between block-type discriminability before stimulus onset and tHit-tCR discriminability after stimulus onset for units showing significant block-type discriminability prior to the stimulus. Pearson correlation: S1: r = 0.69, p = 0.056, 8 neurons; S2: r = 0.91, p = 0.093, 4 neurons; MM: r = 0.93, p < 0.001, 30 neurons; ALM: r = 0.83, p < 0.001, 26 neurons.” 

      Fig. 4e: “Subspace overlap for control tHit (gray) and tCR (purple) trials in the somatosensory and motor cortical areas. Each circle is a subspace overlap of a session. Paired t-test, tCR – control tHit: S1: -0.23, 8 sessions, p = 0.0016; S2: -0.23, 7 sessions, p = 0.0086; MM: -0.36, 5 sessions, p < 0.001; ALM: -0.35, 11 sessions, p < 0.001; significance: ** for p<0.01, *** for p<0.001.”

      Fig. 5d,e: “Fraction of trials classified as coming from a respond-to-touch block based on the pre-stimulus population state, for trials occurring in different periods (see c) relative to respond-to-touch → respond-to-light transitions. For MM (top row) and ALM (bottom row), progressively fewer trials were classified as coming from the respond-to-touch block as analysis windows shifted later relative to the rule transition. Kendall’s tau (rank correlation): MM: -0.39, 9 sessions; ALM: -0.29, 13 sessions. Left panels: individual sessions, right panels: mean ± 95% CI. Dashed lines are chance levels (0.5). e, Same as d but for respond-to-light → respond-to-touch transitions. Kendall’s tau: MM: 0.37, 9 sessions; ALM: 0.27, 13 sessions.”

      Fig. 6: “Error bars show bootstrap 95% CI. Criterion to be considered significant: 95% CI did not include 0.”
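
      The rank correlation quoted in the Fig. 5d,e legend above can be illustrated with a short sketch. It computes Kendall's tau between the order of the analysis periods and the fraction of trials classified as respond-to-touch in each period. The tau-a variant (no tie correction), the number of periods, and the example fractions are all assumptions made for illustration; the legend does not state which variant or how many periods were used.

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical session: five successive periods spanning a rule transition and
# the fraction of trials in each that a decoder labels as respond-to-touch.
periods = [1, 2, 3, 4, 5]
fraction_touch = [0.9, 0.8, 0.85, 0.6, 0.4]

tau = kendall_tau_a(periods, fraction_touch)
```

      A negative tau indicates that the classified fraction falls as the periods move later in the transition, which is the direction described for respond-to-touch → respond-to-light transitions.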

      (5) P. 3 - "To examine how the task rules influenced the sensorimotor transformation occurring in the tactile processing stream, we performed single-unit recordings from sensory and motor cortical areas including S1, S2, MM, and ALM using 64-channel silicon probes (Fig. 1e-g and Fig. S1a-h)." Please specify if these areas were recorded simultaneously or not.

      We have added “We recorded from one of these cortical areas per session, using 64-channel silicon probes.”  on page 3.  

      (6) Figure 4b - Please describe what gray and black lines show.

      The gray traces are the distance between tHit and tCR trajectories in individual sessions and the black traces are the averages across sessions in different cortical areas. We have added this information on page 6 and in the Figure 4b legend. 

      Page 6: “To assess this for the four cortical areas, we quantified how the tHit and tCR trajectories diverged from each other by calculating the Euclidean distance between matching time points for all possible pairs of tHit and tCR trajectories for a given session and then averaging these for the session (Fig. 4a,b; S1: 10 sessions, S2: 8 sessions, MM: 9 sessions, ALM: 13 sessions, individual sessions in gray and averages across sessions in black; window of analysis: -100 to 150 ms relative to stimulus onset; 10 ms bins; using the top 3 PCs; Methods).”

      Fig. 4b: “Distance between tHit and tCR trajectories in S1, S2, MM and ALM. Gray traces show the time-varying tHit-tCR distance in individual sessions and black traces are session-averaged tHit-tCR distance (S1: 10 sessions; S2: 8 sessions; MM: 9 sessions; ALM: 13 sessions).”
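
      The distance measure described for Fig. 4b can be sketched in a few lines. The example below takes sets of tHit and tCR trajectories, each a sequence of time points in a low-dimensional PC space, computes the Euclidean distance at matching time points for every tHit-tCR pair, and averages across pairs. The data layout, function name, and toy trajectories are assumptions for illustration; how the authors constructed the individual trajectories is described in their Methods and is not reproduced here.

```python
import math


def mean_pairwise_distance(hit_trajectories, cr_trajectories):
    """Average Euclidean distance per time point over all tHit-tCR trajectory pairs.

    Each trajectory is a list of time points; each time point is a list of
    coordinates (for example, projections onto the top 3 PCs).
    """
    n_time = len(hit_trajectories[0])
    totals = [0.0] * n_time
    n_pairs = 0
    for hit in hit_trajectories:
        for cr in cr_trajectories:
            for t in range(n_time):
                totals[t] += math.dist(hit[t], cr[t])
            n_pairs += 1
    return [total / n_pairs for total in totals]


# Toy data: two tHit trajectories and one tCR trajectory, two time bins, 3 PCs.
hit_trajectories = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]],
]
cr_trajectories = [
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]

distance_over_time = mean_pairwise_distance(hit_trajectories, cr_trajectories)
```

      In this toy case the trajectories coincide in the first bin and separate in the second, so the averaged distance rises from 0 to 2.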

      (7) In addition to the analyses shown in Figure 5a, when investigating the timing of the rule switch, I think the authors should plot the left and right lick probabilities aligned to the timing of the rule switch time on a trial-by-trial basis averaged across mice.

      We thank the Reviewer for suggesting this addition. We have added a new figure panel to show the probabilities of right- and left-licks during rule transitions (Fig. 5a).

      Page 8: “The probabilities of right-licks and left-licks showed that the mice switched their motor responses during block transitions depending on task rules (Fig. 5a, mean ± 95% CI across 12 mice).” 

      (8) P. 12 - "Moreover, in a separate study using the same task (Finkel et al., unpublished), high-speed video analysis demonstrated no significant differences in whisker motion between respond-to-touch and respond-to-light blocks in most (12 of 14) behavioral sessions.". Such behavioral data is important and ideally would be included in the current analysis. Was high-speed videography carried out during electrophysiology in the current study?

      Finkel et al. has been accepted in principle for publication and will be available online shortly. Unfortunately, we have not yet carried out simultaneous high-speed whisker video and electrophysiology in our cross-modal sensory selection task.

      Reviewer #3 (Recommendations For The Authors):

      (1) Minor point. For subspace overlap calculation of pre-stimulus activity in Fig 4e (light purple datapoints), please clarify whether the PCs for that condition were constructed in matched time windows. If the PCs are calculated from the stimulus period 0-150ms, the poor alignment could be due to mismatched time windows.

      We thank the Reviewer for the comment and clarify our analysis here. We previously used time-matched windows to calculate subspace overlaps. However, the pre-stimulus activity was much weaker than the activity during the stimulus period, so the subspaces of reference tHit were subject to noise and we were not able to obtain reliable PCs. This caused the subspace overlap values between the reference tHit and control tHit to be low and variable (mean ± SD, S1: 0.46 ± 0.26, n = 8 sessions, S2: 0.46 ± 0.18, n = 7 sessions, MM: 0.44 ± 0.16, n = 5 sessions, ALM: 0.38 ± 0.22, n = 11 sessions). Therefore, we used the tHit activity during the stimulus window to obtain PCs and projected pre-stimulus and stimulus activity in tCR trials onto these PCs. We have now added a more detailed description of this analysis in the Methods (page 32).

      “To calculate the separation of subspaces prior to stimulus delivery, pre-stimulus activity in tCR trials (-100 to 0 ms from stimulus onset) was projected to the PC space of the tHit reference group and the subspace overlap was calculated. In this analysis, we used tHit activity during stimulus delivery (0 to 150 ms from stimulus onset) to obtain reliable PCs.”

      We acknowledge this time alignment issue and have now removed the reported subspace overlap between tHit and tCR during the pre-stimulus period from Figure 4e (light purple). However, we think the correlation between pre- and post-stimulus-onset subspace overlaps should remain similar regardless of the time windows that we used for calculating the PCs. For the PCs calculated from the pre-stimulus period (-100 to 0 ms), the correlation coefficient was 0.55 (Pearson correlation, p < 0.01, n = 31 sessions). For the PCs calculated from the stimulus period (0-150 ms), the correlation coefficient was 0.68 (Figure 4f, Pearson correlation, p < 0.001, n = 31 sessions). Therefore, we keep Figure 4f.
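
      To make the projection step concrete, the following is a minimal sketch of one common way to quantify subspace overlap. The exact definition used in the manuscript is given in its Methods and is not reproduced here, so the metric below is an assumption: the variance of the test activity captured by the reference PCs, divided by the variance captured by the test activity's own top PCs. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def top_pcs(activity, k):
    """Top-k principal axes of activity, shape (n_timepoints, n_neurons).

    Returns an (n_neurons, k) matrix with orthonormal columns.
    """
    centered = activity - activity.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def captured_variance(activity, axes):
    """Summed squared projection of mean-centered activity onto the axes."""
    centered = activity - activity.mean(axis=0, keepdims=True)
    return float(np.sum((centered @ axes) ** 2))

def subspace_overlap(reference, test, k=3):
    """Assumed overlap metric, between 0 and 1.

    1 means the reference PCs capture as much test variance as the
    test activity's own top-k PCs; 0 means they capture none.
    """
    ref_axes = top_pcs(reference, k)
    own_axes = top_pcs(test, k)
    return captured_variance(test, ref_axes) / captured_variance(test, own_axes)

# Synthetic stand-in: tHit activity in the stimulus window defines the PCs,
# then tCR activity is projected onto them (sizes are hypothetical).
rng = np.random.default_rng(1)
thit_stim = rng.normal(size=(15, 40))  # 15 bins x 40 neurons
tcr_stim = rng.normal(size=(15, 40))
overlap = subspace_overlap(thit_stim, tcr_stim, k=3)
```

      With a metric of this kind, weak pre-stimulus activity makes the reference axes noisy, which is consistent with the low and variable control values reported in the response above.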

      (2) Minor point. To help the readers follow the logic of the experiments, please explain why PPC and AMM were added in the later optogenetic experiment since these are not part of the electrophysiology experiment.

      We have added the following rationale on page 9.

      “We recorded from AMM in our cross-modal sensory selection task and observed visually-evoked activity (Fig. S1i-k), suggesting that AMM may play an important role in rule-dependent visual processing. PPC contributes to multisensory processing [51–53] and sensory-motor integration [50,54–58]. Therefore, we wanted to test the roles of these areas in our cross-modal sensory selection task.”

      (3) Minor point. We are somewhat confused about the timing of some of the example neurons shown in figure S1. For example, many neurons show visually evoked signals only after stimulus offset, unlike tactile evoked signals (e.g. Fig S1b and f). In addition, the reaction time for visual stimulus is systematically slower than tactile stimuli for many example neurons (e.g. Fig S1b) but somehow not other neurons (e.g. Fig S1g). Are these observations correct?

      These observations are all correct. We have a manuscript from a separate study using this same behavioral task (Finkel et al., accepted in principle) that examines and compares (1) the onsets of tactile- and visually-evoked activity and (2) the reaction times to tactile and visual stimuli. The reaction times to tactile stimuli were slightly but significantly shorter than the reaction times to visual stimuli (tactile vs visual, 397 ± 145 vs 521 ± 163 ms, median ± interquartile range [IQR], Tukey HSD test, p = 0.001, n = 155 sessions). We examined how well activity of individual neurons in S1 could be used to discriminate the presence of the stimulus or the response of the mouse. For discriminability for the presence of the stimulus, S1 neurons could signal the presence of the tactile stimulus but not the visual stimulus. For discriminability for the response of the mouse, the onsets for significant discriminability occurred earlier for tactile compared with visual trials (two-sided Kolmogorov-Smirnov test, p = 1 × 10^-16, n = 865 neurons with DP onset in tactile trials, n = 719 neurons with DP onset in visual trials).

    2. eLife assessment

      This important work advances our understanding of how brains flexibly gate actions in different contexts, a topic of great interest to the broader field of systems neuroscience. Recording neural activity from several sensory and motor cortical areas along a sensorimotor pathway, the authors found that preparatory activity in motor cortical areas of the mouse depends on the context in which an action will be carried out, consistent with previous theoretical and experimental work. Furthermore, the authors provide causal evidence that these changes support flexible gating of actions. The carefully carried out experiments were analyzed using state-of-the-art methodology and provide convincing conclusions.

    3. Reviewer #1 (Public Review):

      Summary:

      Using a cross-modal sensory selection task in head-fixed mice, the authors attempted to characterize how different rules reconfigured representations of sensory stimuli and behavioral reports in sensory (S1, S2) and premotor cortical areas (medial motor cortex or MM, and ALM). They used silicon probe recordings during behavior, a combination of single-cell and population-level analyses of neural data, and optogenetic inhibition during the task.

      Strengths:

      A major strength of the manuscript was the clarity of the writing and motivation for experiments and analyses. The behavioral paradigm is somewhat simple but well-designed and well-controlled. The neural analyses were sophisticated, clearly presented, and generally supported the authors' interpretations. The statistics are clearly reported and easy to interpret. In general, my view is that the authors achieved their aims. They found that different rules affected preparatory activity in premotor areas, but not sensory areas, consistent with dynamical systems perspectives in the field that hold that initial conditions are important for determining trial-based dynamics.

      I think this is a well-performed, well-written and interesting study that shows differences in rule representations in sensory and premotor areas, and finds that rules reconfigure preparatory activity in motor cortex to support flexible behavior.

    4. Reviewer #2 (Public Review):

      Summary:

      Chang et al. investigated neuronal activity firing patterns across various cortical regions in an interesting context-dependent tactile vs visual detection task, developed previously by the authors (Chevee et al., 2021; doi: 10.1016/j.neuron.2021.11.013). The authors report the important involvement of a medial frontal cortical region (MM, probably a similar location to wM2 as described in Esmaeili et al., 2021 & 2022; doi: 10.1016/j.neuron.2021.05.005; doi: 10.1371/journal.pbio.3001667) in mice for determining task rules.

      Strengths:

      The experiments appear to have been well carried out and the data well analysed. The manuscript clearly describes the motivation for the analyses and reaches clear and well-justified conclusions. I find the manuscript interesting and exciting!

      Weaknesses:

      I did not find any major weaknesses.

    5. Reviewer #3 (Public Review):

      Summary:

      This study examines context-dependent stimulus selection by recording neural activity from several sensory and motor cortical areas along a sensorimotor pathway, including S1, S2, MM, and ALM. Mice are trained to either withhold licking or perform directional licking in response to visual or tactile stimulus. Depending on the task rule, the mice must respond to one stimulus modality while ignoring the other. Neural activity to the same tactile stimulus is modulated by task in all the areas recorded, with significant activity changes in a subset of neurons and population activity occupying distinct activity subspaces. Recordings further reveal a contextual signal in the pre-stimulus baseline activity that differentiates task context. This signal is correlated with subsequent task modulation of neural activity. Comparison across brain areas shows that this contextual signal is stronger in frontal cortical regions than sensory regions. Analyses link this signal to behavior by showing that it tracks the behavioral performance switch during task rule transitions. Silencing activity in frontal cortical regions during the baseline period impairs behavioral performance.

      Strengths:

      This is a carefully done study with solid results and thorough controls. The authors identify a contextual signal in baseline neural activity that predicts rule-dependent decision-related activity. The comprehensive characterization across a sensorimotor pathway is another strength. Analyses and perturbation experiments link this contextual signal to animals' behavior. The results provide a neural substrate that will surely inspire follow-up mechanistic investigations.

      Weaknesses:

      None. The authors have further improved the manuscript during the revision with additional analyses.

      Impact:

      This study reports an important neural signature for context-dependent decision-making that has important implications for mechanisms of context-dependent neural computation in general.

    1. eLife assessment

      This fundamental study provides insights into the interplay of endogenous orienting and the planning of goal-directed gaze shifts (saccades). Using an elegant experimental protocol and detailed analyses of the time course of saccadic choices, the authors provide compelling evidence for independent mechanisms that guide early, reflexive eye movements and later, voluntary gaze shifts. This work will be of interest to neuroscientists and psychologists working on vision and motor control and to those researching decision-making across disciplines.

    2. Reviewer #1 (Public Review):

      Summary:

      The classical pro/antisaccade task has become a valuable diagnostic tool in neurology and psychiatry (Antoniades et al., 2013, Vision Res). Although it is well-established that antisaccades require substantially longer latencies than prosaccades, the exact attentional mechanisms underlying these differences are not yet fully elucidated. This study investigates the separate influences of exogenous and endogenous attention on saccade generation. These two mechanisms are often confounded in classical pro/antisaccade tasks. In the current study, the authors build on their previous work using an urgent choice task (Salinas et al., 2019, eLife) to time-resolve the influences of exogenous and endogenous factors on saccade execution. The key contribution of the current study is to show that, when controlling for exogenous capture, antisaccades continue to require longer processing times. This longer processing time may be explained by a coupling between endogenous attention and saccade motor plans.

      Strengths:

      In the classical pro/antisaccade task the direction of exogenous capture (caused by the presentation of the cue) is typically congruent with the direction of prosaccades and incongruent with antisaccades. A key strength of the current study is the introduction of different experimental conditions that control for the effects of exogenous capture on saccade generation. In particular, Experiments 3 and 4 provide strong evidence for two independent (exogenous and endogenous) mechanisms that guide saccadic choices, acting at different times. Differences in timing for pro and antisaccades during the endogenous phase were consistent and independent of whether the exogenous capture biased early saccades toward the correct prosaccade direction or toward the correct antisaccade directions.

      As in previous studies by the same group (Salinas et al., 2019, eLife; Goldstein et al., 2023, eLife), the detailed analysis of the time course of goal-directed saccades allowed the authors to determine the exact, additional time of 30 ms that is necessary to generate a correct antisaccade versus prosaccade.

      Overall, the manuscript is very well written, and the data are presented clearly.

      Weaknesses:

      The main research question could be defined more clearly. In the abstract and at some points throughout the manuscript, the authors indicate that the main purpose of the study was to assess whether the allocation of endogenous attention requires saccade planning [e.g., ll.3-5 or ll.247-248]. While the data show a coupling between endogenous attention and saccades, they do not point to a specific direction of this coupling (i.e., whether endogenous attention is necessary to successfully execute a saccade plan or whether a saccade plan necessarily accompanies endogenous attention).

      Some of the analyses were performed only on subgroups of the participants. The reporting of these subgroup analyses is transparent and data from all participants are reported in the supplementary figures. Still, these subgroup analyses may make the data appear more consistent, compared to when data is considered across all participants. For instance, the exogenous capture in Experiments 1 and 2 appears much weaker in Figure 2 (subgroup) than Figure S3 (all participants). Moreover, because different subgroups were used for different analyses, it is often difficult to follow and evaluate the results. For instance, the tachometric curves in Figure 2 (see also Figure 3 and 4) show no motor bias towards the cue (i.e., performance was at ~50% for rPTs <75 ms). I assume that the subsequent analyses of the motor bias were based on a very different subgroup. In fact, based on Figure S2, it seems that the motor bias was predominantly seen in the unreliable participants. Therefore, I often found the figures that were based on data across all participants (Figures 7 and S3) more informative to evaluate the overall pattern of results.

    3. Reviewer #2 (Public Review):

      Goldstein et al. provide a thorough characterization of the interaction of attention and eye movement planning. These processes have been thought to be intertwined since at least the development of the Premotor Theory of Attention in 1987, and their relationship has been a continual source of debate and research for decades. Here, Goldstein et al. capitalize on their novel urgent saccade task to dissociate the effects of endogenous and exogenous attention on saccades towards and away from the cue. They find that attention and eye movements are, to some extent, linked to one another but that this link is transient and depends on the nature of the task. A primary strength of the work is that the researchers are able to carefully measure the timecourse of the interaction between attention and eye movements in various well-controlled experimental conditions. As a result, the behavioral interplay of two forms of attention (endogenous and exogenous) is illustrated at the level of tens of milliseconds as they interact with the planning and execution of saccades towards and away from the cued location. Overall, the results allow the authors to make meaningful claims about the time course of visual behavior, attention, and the potential neural mechanisms at a timescale relevant to everyday human behavior.

    4. Reviewer #3 (Public Review):

      Summary and overall evaluation:

      Human vision is inherently limited so that only a small part of a visual scene can be perceived at a given moment. To address this limitation, the visual system has evolved a number of strategies and mechanisms that work in concert. First, humans move their eyes using saccadic eye movements. This allows us to place the high-resolution region in the center of the eye's retina (the fovea centralis) on objects of interest so that these are sampled with high acuity. Second, salient, conspicuous stimuli that appear abruptly and/or differ strongly from the other stimuli in the scene, seem to automatically attract ("exogenous") attention, so that a large share of the neuronal "resources" for visual processing is devoted to the stimuli, which improves the perception of the stimuli. Third, stimuli that are important for the current task and the current behavioral goals can be prioritized by attention mechanisms ("endogenous" attention), which also secures their allocated share of processing resources and helps them be perceived. It is well-established that eye movements are closely linked to the mechanisms of attention (for a review, see Carrasco, 2011, cited in the manuscript). However, it is still unclear what role voluntary, endogenous attention plays in the control of saccadic eye movements.

      The present study used an experimental procedure involving time-pressure for responding, in order to uncover how the control of saccades by exogenous and endogenous attention unfolds over time. The findings of the study indicate that saccade planning was indeed influenced by the locus of endogenous attention, but that this influence was short-lasting and could be overcome quickly. Taken together, the present findings reveal new dynamics between endogenous attention and eye movement control, and lead the way for studying them using experiments under time pressure.

      The results provided by the present study advance our understanding of vision, eye movements, and their control by brain mechanisms for attention. In addition, they demonstrate how tasks involving time pressure can be used to study the dynamics of cognitive processes. Therefore, the present study seems highly important not only for vision science, but also for psychology, (cognitive) neuroscience, and related research fields more generally.

      Strengths:

      The experiments of the study are performed with great care and rigor and the data is analyzed thoroughly and comprehensively. Overall, the results support the authors' conclusions, so I have only minor comments (see below). Taken together, the findings seem important for a wide community of researchers in vision science, psychology, and neuroscience.

      Weaknesses (minor points):

      (1) In this experimental paradigm, participants must decide where to saccade based on the color of the cue in the visual periphery (they should have made a prosaccade toward a green cue and an antisaccade away from a magenta cue). Thus, irrespective of whether the cue signaled that a prosaccade or an antisaccade was to be made, the identity of the cue was always essential for the task (as the authors explain on p. 5, lines 129-138). Also, the location where the cue appeared was blocked, and thus known to the participants in advance, so that endogenous attention could be directed to the cue at the beginning of a trial (e.g., p. 5, lines 129-132). These aspects of the experimental paradigm differ from the classic prosaccade/antisaccade paradigm (e.g. Antoniades et al., 2013, Vision Research). In the classic paradigm, the identity of the cues does not have to be distinguished to solve the task, since there is only one stimulus that should be looked at (prosaccade) or away from (antisaccade), and whether a prosaccade or antisaccade was required is constant across a block of trials. Thus, in contrast to the present paradigm, in the classic paradigm, the participants do not know where the cue is about to appear, but they know whether to perform a prosaccade or an antisaccade based on the location of the cue.

      The present paradigm keeps the location of the cue constant in a block of trials by intention, because this ensures that endogenous attention is allocated to its location and is not overpowered by the exogenous capture of attention that would happen when a single stimulus appeared abruptly in the visual field. Thus, the reason for keeping the location of the cue constant seems convincing. However, I wondered what consequences the constant location would have for the task representations that persist across the task and govern how attention is allocated. In the classic paradigm, there is always a single stimulus that captures attention exogenously (as it appears abruptly). In a prosaccade block, participants can prioritize the visual transient caused by the stimulus, and follow it with a saccade to its coordinates. In an antisaccade block, following the transient with a saccade would always be wrong, so that participants could try to suppress the attention capture by the transient, and base their saccade on the coordinates of the opposite location. Thus, in prosaccade and antisaccade blocks, the task representations controlling how visual transients are processed to perform the task differ. In the present task, prosaccades and antisaccades cannot be distinguished by the visual transients. Thus, such a situation could favor endogenous attention and increase its influence on saccade planning, even though saccade planning under more naturalistic conditions would be dominated by visual transients. I suggest discussing how this (and vice versa the emphasis on visual transients in the classic paradigm) could affect the generality of the presented findings (e.g., how does this relate to the interpretation that saccade plans are obligatorily coupled to endogenous attention? See, Results, p. 10, lines 306-308, see also Deubel & Schneider, 1996, Vision Research).

      (2) Discussion (p. 16, lines 472-475): The authors suppose that "It is as if the exogenous response was automatically followed by a motor bias in the opposite direction. Perhaps the oculomotor circuitry is such that an exogenous signal can rapidly trigger a saccade, but if it does not, then the corresponding motor plan is rapidly suppressed regardless of anything else.". I think this interesting point should be discussed in more detail. Could it also be that instead of suppression, other currently active motor plans were enhanced? Would this involve attention? Some attention models assume that attention works by distributing available (neuronal) processing resources (e.g., Desimone & Duncan, 1995, Annual Review of Neuroscience; Bundesen, 1990, Psychological Review; Bundesen et al., 2005, Psychological Review) so that the information receiving the largest share of resources results in perception and is used for action, but this happens without the active suppression of information.

      (3) Methods, p. 19, lines 593-596: It is reported that saccades were scored based on their direction. I think more information should be provided to understand which eye movements entered the analysis. Was there a criterion for saccade amplitude? I think it would be very helpful to provide data on the distributions of saccade amplitudes or on their accuracy (e.g. average distance from target) or reliability (e.g. standard deviation of landing points). Also, it is reported that some data was excluded from the analysis, and I suggest reporting how much of the data was excluded. Was the exclusion of the data related to whether participants were "reliable" or "unreliable" performers?

      (4) Results, p. 9, lines 262-266: Some data analyses are performed on a subset of participants that met certain performance criteria. The reasons for this data selection seem convincing (e.g. to ensure empirical curves were not flat, line 264). Nevertheless, I suggest explaining and justifying this step in more detail. In addition, if not all participants achieved an acceptable performance and data quality, this could also speak to the experimental task and its difficulty. Thus, I suggest discussing the potential implications of this, in particular, how this could affect the studied mechanisms, and whether it could limit the presented findings to a special group within the studied population.

    1. Author response:

      [The following is the authors’ response to the current reviews.]

      In response to Reviewer #2, we agree with the reviewer that it needs to be noted that not all forms of recognition are the same and have added the following: "However, we note that not all forms of recognition are the same; researchers may prefer to have their work featured instead of personal stories or critiques of the scientific environment."


      [The following is the authors’ response to the previous reviews.]

      We thank both reviewers for their detailed comments and insightful suggestions. Below we summarize our responses to each concern in addition to the edits within the manuscript.

      We would also like to add a clarification to the eLife assessment, which states: “This important bibliometric analysis shows that authors of scientific papers whose names suggest they are female or East Asian get quoted less often in news stories about their work.” We show that individuals with names predicted to be from women or East Asian name origins are less likely to be quoted or mentioned in Nature’s scientific news stories than expected by publication demographics. In this study, we did not compare the level of coverage of a scientific article by the demographics of the authors of the article.

      Reviewer #1

      The article is not so clearly structured, which makes it hard to follow. A better framing, contextualization, and conceptualization of their analysis would help the readers to better understand the results. There are some unclear definitions and wrong wording of key concepts.

      We have adapted our wording in the text and added a more detailed discussion, which we hope makes the paper easier to comprehend. These changes are described in the context of the reviewer's suggestions and addressed in the next section.

      Language use: Male/Female refers to sex, not to gender.

      We have now updated the language throughout the text. Thank you for pointing this out.

      Regional disparities are not the same as names' origin. While the first might relate to the academic origin of authors, inferred from their institutional belonging, the latter reflects the authors' inferred identity. Ethnic identities and the construction of prejudice against specific populations need proper contextualization.

      We have added better contextualization in the manuscript and reworded the section in our results and discussion to clarify that we are analyzing disparities related to perceived ethnicity and not regions. We also added the following text to the results section “In our analysis, we use name origin as an estimate for the perceived ethnicity of a primary source by a journalist. Our prediction is not intended to assign ethnicity to an individual, but to be used broadly as a tool to quantify representational differences in a journalist's sociologically constructed perception of a primary source's ethnicity.” We also added the following text to our Discussion: “Our use of name origins is a proxy for a journalist's or referring scholarly peer’s potential perceptions of the ethnicity of a primary source as signaled by an individual's name. We do not intend to assign an identity to an individual, but to generate a broad metric to measure possible bias for particular ethnicities during journalists' primary source gathering.”

      It would be helpful to have a clear definition of what are quotes, mentions, and citations. For me, it was not so clear and made understanding the results more difficult.

      We added the following text to the results section Extracted Data Used for Analysis: “Quoted names are any names that were attached to a quote within the article. Mentioned names are any names that were stated within the article. Cited names are all author names of a scientific paper that was cited in the news article.”

      The comparison against Nature published research articles is not perfect because journalists will also cover articles not published in Nature. If for example, the gender representation in the quoted articles is not the same between Nature journals and other journals, then this source of inequality would be missing (e.g. if the journalists are biased against women, but not as much when they published in Nature, because they are also biased towards Nature articles). Also, the gender representation among Nature authors could not be the same as in general. Nevertheless, this seems to be a fair benchmark, especially if the authors did not have access to other more comprehensive databases. But a statement of limitations including these potential issues would be good to have.

      To add better context to the generalizability of our work, we added the following text to our discussion: “Furthermore, the news articles present on "www.nature.com" are intended for a very specific readership that may not be reflective of more broad scientific news outlets. In a separate analysis, we took a cursory look into a comparison with The Guardian and found similar disparities in gender and name origin. However, it is not clear which publications should be used as a comparator for science-related articles in The Guardian, and difficult to compare relative rates of representation. While other science news outlets may not have a direct comparator, it would be useful to take a broad comparison across multiple science news outlets to compare against one another. Our existing pipeline could be easily applied to other science news outlets and identify if there exists a consistent pattern of disparity regardless of the intended readership.”

      "we select the highest probability origin for each name as the resultant assignment". Threshold based approaches for race/ethnicity name-based inference have been criticized by the literature as they might reproduce biases (see Kozlowski, D., Murray, D. S., Bell, A., Hulsey, W., Larivière, V., Monroe-White, T., & Sugimoto, C. R. (2022). Avoiding bias when inferring race using name-based approaches. Plos one, 17(3), e0264270.). The authors could use the full distribution of probabilities over names instead of selecting one. The formulae proposed (3-5) could be easily adapted to this change.

      We thank the reviewer for pointing this out. We have updated our analysis to use the probabilities instead of hard assignments. Figure 3 and formulae 3-5 have been updated. While we observe a slight shift in the calculated values, the overall trends are unchanged.
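
      The difference between the two approaches can be illustrated with a small sketch. It does not reproduce the manuscript's formulae 3-5; the origin labels, the probability values, and the function names are hypothetical, chosen only to show how hard (highest-probability) assignment and probability-weighted counting can diverge on the same names.

```python
# Hypothetical per-name probabilities over three made-up origin labels.
ORIGINS = ["OriginA", "OriginB", "OriginC"]
name_probs = [
    [0.50, 0.45, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
]

def hard_proportions(probs):
    """Assign each name to its single most probable origin, then count."""
    counts = [0] * len(probs[0])
    for row in probs:
        counts[row.index(max(row))] += 1
    total = sum(counts)
    return [c / total for c in counts]

def soft_proportions(probs):
    """Sum the full probability vector of every name (no thresholding)."""
    n_names = len(probs)
    n_origins = len(probs[0])
    return [sum(row[j] for row in probs) / n_names for j in range(n_origins)]

hard = hard_proportions(name_probs)  # OriginC receives no weight at all
soft = soft_proportions(name_probs)  # OriginC keeps its probability mass
```

      In this toy example the hard assignment gives the third origin a proportion of zero, whereas the probability-weighted estimate retains the mass that near-ties would otherwise discard, which is the kind of bias the reviewer's cited critique describes.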

      Is it possible to make an analysis that intersects both name origin and gender? I am not sure if the sample size would allow for this, but if some other dimensions were collapsed, it would be very important to show what happens at the intersection of these two dimensions of discrimination.

      We agree that it would be very useful to identify any differences in quotation patterns at the intersection of gender and name origin. To address this, we added supplemental table 5. This table identifies the number of quotes per predicted name origin and gender over all years and article types. In this table, we don’t see a significant difference in gender distribution across predicted name origins.

      Given a larger sample size, we would be able to better identify more subtle differences, but at this sample size, we cannot make more detailed inferences. Additionally, this also addresses a QC-issue, where predicted gender accuracy varies by name origin, specifically East Asian name origin. From our data, we don’t see a large difference in proportions across any name origin. We added the following text to the results section to incorporate this analysis:

      “However, it should be noted that the error rate varies by name origin with the largest decrease in performance on names with an Asian origin [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252]. In our analysis, we did not observe a large difference in names predicted to come from a man or woman between predicted East Asian and other name origins (Table 5).”

      The use of vocabulary should be more homogeneous. For example, in page 13 the authors start to use the concepts of over/under enrichment, which appeared before in a title but was not used.

      The text has been updated to replace all mentions of “over/under enrichment” with “over/under representation”.

      In the discussions section, it would be important to see as a statement of limitations the problems that automatic origin and gender inference have.

      We thank the reviewer for this suggestion. We have added the following paragraph to our discussion.

      Computational tools enabled us to automatically analyze thousands of articles to identify existing disparities by gender and name origin, but these tools are not without limitations. Our tools are unable to identify non-binary people and rely on gender predictors that are known to have region-specific biases, with the largest decrease in performance on names of an Asian origin [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252]. Furthermore, name origin is only a proxy for externally perceived racial or ethnic origins of a source or author and is not as accurate as self-identified race or ethnicity. Self-identification better captures the lived experience of an individual that computational estimates from a name cannot capture. This is highlighted in our inability to distinguish between Black and White people from the US by their names. As the collection of demographic data by publication outlets grows, we believe this will enable a more fine-grained and accurate analysis of disparities in scientific journalism.

      Figures 2a and 3a show that the affiliations of authors and their countries was going to be used in this analysis. Yet, this section is not present in the article. I would encourage the authors to add this to the analysis as it would show important patterns, and to intersect the dimensions of gender, name origin and country.

      We were interested in using this analysis in our work, but unfortunately the sample size of cited works in each country was too small to make inferences. If this work were extended to larger scientific outlets to include larger corpora such as The Guardian or New York Times, we think one would be able to make more robust inferences. Since our work only focuses on Nature, we decided not to include this analysis. However, we do include a section in our discussion for future work.

      “As a proxy for measuring possible geographical bias of a journalist, we attempted to identify if there was any geographical bias of cited authors. To do this, we identified the affiliation of each cited author and identified their affiliated country. Unfortunately, we could not robustly extract a large enough number of cited authors from different countries to make any conclusive statements. Expanding our work to other science journalism outlets could help identify possible ways in which geographic region, genders, and perceived ethnicity interact and affect scientific visibility of specific groups. While we are unable to identify that journalists have a specific geographical bias, having reporters explicitly focused on specific regional sources will broaden coverage of international opinions in science.”

      It is not clear at that point what column dependence means.

      The abstract has been updated to state, “Gender disparity in Nature quotes was dependent on the article type.”

      Reviewer #2

      We thank the reviewer for their very detailed and insightful suggestions regarding our analysis and the key caveats that needed better contextualization in our analysis. We went through each major point the reviewer brought up below and included any additional text that was needed.

      In some cases, the manuscript lacks consistency in terminology, and uses word choice that is strange (e.g., "enrichment" and "depletion" when discussion representation).

      We thank the reviewer for pointing this out; we have replaced all instances of depletion/enrichment with over/under-representation.

      Caveats to Claim 1. So while Claim 1 holds, it does not hold for all comparator sets and for all years. I don't think this is critical of the paper-the authors do discuss the trend in Claim 2-but interpretation of this claim should take care of these caveats, and readers should consider the important differences in first and last authorship.

      We thank the reviewer for their detailed feedback on this section. We have added the missing contextualization of our results. In the results section, we changed the figure caption to: “Speakers predicted to be men are sometimes overrepresented in quotes, but this depends on the year and article type.” We also added the following paragraph: “When considering the relative proportion of authors and speakers predicted to be men, we only find a slight over-representation of men. This overrepresentation is dependent on the authorship position and the year. Before 2010, quotes predicted as from men are overrepresented in comparison to both first and last authors, but between 2010 and 2017 quotes predicted from men are only overrepresented in comparison for first authors. In 2020, we find a slight over-representation of quotes predicted to be from women relative to first and last authors, but still severely under-represented when considering the general population. The choice of comparison between first and last authors can reveal different aspects of the current state of academia. While this does not hold in all scientific fields, first authors are typically early career scientists and last authors are more senior scientists. It has also been shown that early career scientists tend to be more diverse than senior scientists [@doi:10.7554/eLife.60829; @doi:10.1096/fj.201800639]. Since we find that quotes are only slightly more likely to come from a last author, it is reasonable to compare the relative rate of predicted quotes from men to either authorship position. Comparison with last authorships may reveal more how gender bias currently exists whereas comparison with early career scientists may reveal bias in comparison to a future, more possibly diverse academic environment. We hope that increased representation and recognition of women in science, even beyond what is observed in authorship, can increase the proportion of women first and last authors such that it better reflects the general population.”

      Generalizability to other contexts of science journalism:

      We thank the reviewer for their feedback on the generalizability of our work. We have now added the following text to our discussion to provide the reader with a better context of our results: “The articles presented on "www.nature.com" are intended for a very specific readership that may not be reflective of more broad scientific news outlets. In a separate analysis, we took a cursory look into a comparison with The Guardian and found very similar disparities in gender and name origin. However, it is not clear which publications should be used as a comparator for science-related articles in The Guardian, and difficult to compare relative rates of representation. While other science news outlets may not have a direct comparator, it would be useful to take a broad comparison across multiple science news outlets to compare against one another. Our existing pipeline could be easily applied to other science news outlets and identify if there exists a consistent pattern of disparity regardless of the intended readership.”

      Shallow discussion:

      The authors highlight gender parity in career features, but why exactly is there gender parity in this format?

      We thank the reviewer for encouraging us to better contextualize our findings in the broader discourse. We have now added several sections to our Discussion. To address gender parity, we have added the following text: “This finding, coupled with the near equal number of articles written by journalists predicted to be men or women, argues for more diversity in topical coverage. "Career Feature" articles highlight current topics relevant to working scientists and frequently highlight systemic issues with the scientific environment. This column allows space for marginalized people to critique the current state of affairs in science or share their personal stories. This type of content encourages the journalist to seek out a diverse set of primary sources. Including more content that is not primarily focused on recent publications, but all topics surrounding the practice of science, can serve as an additional tool to rapidly achieve gender parity in journalistic recognition.”

      Representation in quotations varies by first and last author, most certainly as a result of the academic division of labor in the life sciences. However, what does it say about the scientific quotation that it appears first authors are more often to be quoted? Does this mean that the division of labor is changing such that the first authors are the lead scientists? Or does it imply that senior authors are being skipped over, or giving away their chance to comment on a study to the first author?

      We thank the reviewer for bringing up these important questions. We have added better context to our first author analysis in our discussion. We have included the following two sections to address this. Also, we want to state that we find last authors to be slightly more quoted than first authors, as depicted in Fig. 2d, with first author quotation percentage largely appearing below the red line. We included this text in a response above and include it again here for convenience.

      “Before 2010, quotes predicted as from men are overrepresented in comparison to both first and last authors, but between 2010 and 2017 quotes predicted from men are only overrepresented in comparison for first authors. In 2020, we find a slight over-representation of quotes predicted to be from women relative to first and last authors, but still severely under-represented when considering the general population. The choice of comparison between first and last authors can reveal different aspects of the current state of academia. While this does not hold in all scientific fields, first authors are typically early career scientists and last authors are more senior scientists. It has also been shown that early career scientists tend to be more diverse than senior scientists [@doi:10.7554/eLife.60829; @doi:10.1096/fj.201800639]. Since we find that quotes are only slightly more likely to come from a last author, it is reasonable to compare the relative rate of predicted quotes from men to either authorship position. Comparison with last authorships may reveal more how gender bias currently exists whereas comparison with early career scientists may reveal bias in comparison to a future, more possibly diverse academic environment. We hope that increased representation and recognition of women in science, even beyond what is observed in authorship, can increase the proportion of women first and last authors such that it better reflects the general population.”

      “In our analysis, we also find that there are more first authors with predicted East Asian name origin than last authors. This is in contrast to predicted Celtic/English and European name origins. Furthermore, we see that the amount of first author people with predicted East Asian name origins is increasing at a much faster rate than quotes are increasing. If this mismatched rate of representation continues, this could lead to an increasingly large erasure of early career scientists with East Asian name origins. As noted before, focusing on increasing engagement with early career scientists can help to reduce the growing disparity of public visibility of scientists with East Asian name origins.”

      What might be the downstream impacts on the public stemming from the under-representation of scientists with East Asian names? According to Figure 3d, not only are East Asian names under-represented in quotations, but they are becoming more under-represented over time as they appear as authors in a greater number of Nature publications; Those with European names are proportionately represented in quotations given their share of authors in Nature. Why might this be, especially seeing as Anglo names are heavily over-represented?

      To address this point, we have added the following text to our discussion: “In our analysis, we also find that there are more first authors with predicted East Asian name origin than last authors. This is in contrast to predicted Celtic/English and European name origins. Furthermore, the amount of first author people with predicted East Asian name origins is increasing at a much faster rate than quotes are increasing. If this mismatched rate of representation continues, this could lead to an increasingly large erasure of early career scientists with East Asian name origins. As noted before, focusing on increasing engagement with early career scientists can help to reduce the growing disparity of public visibility of scientists with East Asian name origins.”

      I am very confused by Figure 1B. It mixes the counts of News-related items with (non-Springer) research articles in a single stacked bar plot which makes determining the quantity of either difficult. I would advise splitting them out

      Figure 1B has been updated, and the News and Research articles have been separated.

      When querying the first 2000 or so results from the SpringerNature API, are the authors certain that they are getting a random sample of papers?

      These papers were the first 200 English language "Journal" papers returned by the Springer Nature API for each month, resulting in 2400 papers per year from 2005 through 2020. These papers are the first 200 papers published each month by a Springer Nature journal, which may not be a completely random sample, but which we believe to be reasonably representative. Furthermore, the Springer Nature comparator set is being used as an additional comparator to the complete set of all Nature research papers used in our analyses.

      In all figures: the authors use capital letters to indicate panels in the caption, but lowercase letters in the figure itself and in the main text. This should be made consistent.

      This has been updated.

      In all figures: the authors should make the caption letter bold in the figure captions, which makes it much easier to find descriptions of specific panels

      This has been updated.

      In the section "coreNLP": the authors mention "co-reference resolution" but without really remarking why it is being used. This is an issue throughout the methods-the authors describe what method they are using but either they don't mention why they are using that method until later, or else not at all.

      We have added better reasoning behind our coreNLP selected methods: “We used the standard set of annotators: tokenize, ssplit, pos, lemma, ner, parse, coref, and additionally the quote annotator. These perform text tokenization, sentence splitting, part of speech recognition, lemmatization, named entity recognition, division of sentences into constituent phrases, co-reference resolution, and identification of quoted entities, respectively. We used the "statistical" algorithm to perform coreference resolution for speed. Each of these aspects is required to identify the names of quoted or mentioned speakers and identify any of their associated pronouns. All results were output to json format for further downstream processing.”
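      For readers unfamiliar with CoreNLP, the configuration described above can be sketched as a small properties object. This is a minimal sketch only: the helper name `corenlp_properties` is hypothetical, and how the properties are passed to CoreNLP (command line, properties file, or server request) is an assumption that the quoted text does not specify.

```python
import json

# Annotators named in the quoted methods text, in pipeline order.
ANNOTATORS = ["tokenize", "ssplit", "pos", "lemma", "ner", "parse", "coref", "quote"]


def corenlp_properties(annotators=ANNOTATORS, coref_algorithm="statistical"):
    """Build a CoreNLP-style properties mapping (hypothetical helper)."""
    return {
        "annotators": ",".join(annotators),
        "coref.algorithm": coref_algorithm,  # "statistical" was chosen for speed
        "outputFormat": "json",              # results are written as JSON
    }


# Serialised form, e.g. for a properties payload.
props_json = json.dumps(corenlp_properties())
```

      Keeping the annotator list in one place makes the dependency explicit: the quote and coref annotators are what link a quotation to a named speaker and to that speaker's pronouns.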

      We included a better description of scrapy: “Scrapy is a tool that applies user-defined rules to follow hyperlinks on webpages and return the information contained on each webpage. We used Scrapy to extract all web pages containing news articles and extract the text.”

      We also included our motivation for bootstrapping: “We used the bootstrap method to construct confidence intervals for each of our calculated statistics.”
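      To make the bootstrap step concrete, the following is a minimal sketch of a percentile bootstrap confidence interval for a proportion. The percentile variant, the number of resamples, the function names, and the toy data are all assumptions for illustration; the quoted text states only that the bootstrap method was used.

```python
import random


def bootstrap_ci(values, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample with replacement, recompute, take quantiles."""
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


def proportion(flags):
    """Fraction of entries equal to 1."""
    return sum(flags) / len(flags)


# Toy data: 1 = quote predicted to be from a man, 0 = from a woman.
quotes = [1] * 70 + [0] * 30
low, high = bootstrap_ci(quotes, proportion)
```

      The interval brackets the observed proportion of 0.7, and it narrows as the number of quotes grows, which is why small per-year or per-country samples give wide intervals.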

      In the section "Name Formatting for Gender Prediction in Quotes or Mentions", genderizeR is mentioned before an introduction to the tool

      We added the following text to provide context: “Even though genderizeR, the computational method used to predict the name's gender, only uses the first name to make the gender prediction, identifying the full name gives us greater confidence that we correctly identified the first name.”

      In the section "Name Formatting for Gender Prediction of Authors", you state that you exclude papers with only one author. How many papers is this? I assume few, in Nature, but if not I can imagine gender differences based on who writes first-authored papers.

      We find that the number excluded is roughly 7% of all papers, which is consistent across Nature and Springer Nature (1113/15013 for cited Springer articles, 2899/42155 for random Springer articles, 955/12459 for Nature research articles). We have added the following text to the manuscript for better context: “Roughly 7% of all papers were estimated to be by a single author and removed from this analysis: 1113/15013 for cited Springer articles, 2899/42155 for random Springer articles, 955/12459 for Nature research articles.”

      In "Name Origin Analysis", for the in-text reference to Equation 3: include the prefix "Eq." or similar to mark this as referencing the equation and not something else

      This has been updated.

      The use of the word "enrichment" in reference to the representation of East Asian authors is strange and does not fit the colloquial definition of the term. I suggest just using a simpler term like "representation" instead

      Similarly, the authors use the word "depletion" to reflect the lower rate of quotes to scientists with East-Asian names, but I feel a simpler word would be more appropriate.

      We thank the reviewer for this suggestion; all instances of “enrichment/depletion” have been replaced with “over/under representation”.

      The authors claim in Figure 2d that there is a steady increase in the rate of first author citations, however, this graph is not convincing. It appears to show much more noise than anything resembling a steady change.

      We have reworded our figure description to state that there is a consistent bias towards quoting last authors. Our figure description now states: “Panel d shows a consistent but slight bias towards quoting the last author of a cited article rather than the first author over time.”

      Supplemental Figures 1b and 1c do not seem to be mentioned in the main text, and I struggle to see their relevance.

      We thank the reviewer for identifying this error; these subpanels have been removed.

    1. Reviewer #2 (Public Review):

      This manuscript illustrates the power of "combined" research, incorporating a range of tools, both old and new to answer a question. This thorough approach identifies a novel target in a well-established signalling pathway and characterises a new player in Drosophila CNS development.

      Largely, the experiments are carried out with precision, meeting the aims of the project, and setting new targets for future research in the field. It was particularly refreshing to see the use of multi-omics data integration and Targeted DamID (TaDa) findings to triage scRNA-seq data. Some of the TaDa methodology was unorthodox; however, this does not affect the main finding of the study. The authors (in the revised manuscript) have appropriately justified their TaDa approaches and mentioned the caveats in the main text.

      Their discovery of Spar as a neuropeptide precursor downstream of Alk is novel, as well as its ability to regulate activity and circadian clock function in the fly. Spar was just one of the downstream factors identified from this study, therefore, the potential impact goes beyond this one Alk downstream effector.

    2. Reviewer #3 (Public Review):

      Summary:

      The receptor tyrosine kinase Anaplastic Lymphoma Kinase (ALK) in humans is expressed in the nervous system and plays an important role as an oncogene. A number of groups have been studying ALK signalling in flies to gain mechanistic insight into its various roles. In flies, ALK plays a critical role in development, particularly embryonic development and axon targeting. In addition, ALK was also shown to regulate adult functions including sleep and memory. In this manuscript, Sukumar et al. used a suite of molecular techniques to identify downstream targets of ALK signalling. They first used targeted DamID, a technique that involves fusing a DNA methylase to RNA polymerase II, so that GATC sites in close proximity to PolII binding sites are marked. They performed these experiments in wild type and ALK loss of function mutants (using an Alk dominant negative, AlkDN), to identify Alk responsive loci. Comparing these loci with a larval single cell RNAseq dataset identified neuroendocrine cells as an important site of Alk action. They further combined these TaDa hits with data from RNA seq in Alk Loss and Gain of Function manipulations to identify a single novel target of Alk signalling - a neuropeptide precursor they named Sparkly (Spar) for its expression pattern. They generated a mutant allele of Spar, raised an antibody against Spar, and characterised its expression pattern and mutant behavioural phenotypes including defects in sleep and circadian function.

      Strengths:

      The molecular biology experiments using TaDa and RNAseq were elegant and very convincing. The authors identified a novel gene they named Spar. They also generated a mutant allele of Spar (using CRISPR/Cas technology) and raised an antibody against Spar. These experiments are lovely, and the reagents will be useful to the community. The paper is also well written, and the figures are very nicely laid out, making the manuscript a pleasure to read.

      Weaknesses:

      The manuscript has improved very substantially in revision. The authors have clearly taken the comments on board in good faith.

      Editors' note: The authors have satisfactorily addressed the concerns raised in the previous rounds of review. These were related to the unconventional analysis of the TaDa data, the addition of other means of down-regulating gene function, and the nature of analyses of behavioural data.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Point-by-point response to concerns raised by reviewer #3:

      The manuscript has improved very substantially in revision. The authors have clearly taken the comments on board in good faith. Yet, some small concerns remain around the behavioural analysis.

      In Fig. 8H and H' average sleep/day is ~100. Is this minutes of sleep? 100 min/day is far too low, is it a typo?

      The numbers for sleep bouts are also too low to me e.g. in Fig 9 number of sleep bouts avg around 4.

      In their response to reviewers the authors say these errors were fixed, yet the figures appear not to have been changed. Perhaps the old figures were left in inadvertently?

      Indeed this correction was somehow missed and we thank the reviewer for noticing this. We have now corrected Fig 8H-H’ and Fig 9D.  

      The circadian anticipatory activity analyses could also be improved. The standard in the field is to perform eduction analyses and quantify anticipatory activity e.g. using the method of Harrisingh et al. (PMID: 18003827). This is typically computed as the ratio of activity in the 3hrs preceding light transition to activity in the 6hrs preceding light transition.

      In their response to reviewers, the authors have revised their anticipation analyses by quantifying the mean activity in the 6 hrs preceding light transition. However, in the method of Harrisingh et al., anticipation is the ratio of activity in the 3hrs preceding light transition to activity in the 6hrs preceding light transition. Simply computing the activity in the 6hrs preceding light transition does not give a measure of anticipation, determining the ratio is key.

      We acknowledge the importance of obtaining accurate results in our analysis; therefore, we have re-evaluated the anticipation activity by measuring the ratio of the mean activity in the 3h preceding light transition over the activity in the 6h preceding light transition. We have reported the data as percentages in Fig 8F-G and modified the figure legends accordingly.
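      For clarity, the ratio described above can be sketched as follows. This is a minimal illustration, assuming activity is stored as evenly spaced bins (here 30 min, an assumed bin size) and that summed activity is used in both windows; the function name, the bin size, and the toy recordings are hypothetical and do not come from the manuscript.

```python
def anticipation_index(activity, transition, bins_per_hour=2):
    """Activity in the 3 h before a light transition divided by activity in the 6 h before it.

    `activity` is a sequence of binned counts and `transition` is the index of
    the first bin after the light transition.
    """
    six_h = 6 * bins_per_hour
    three_h = 3 * bins_per_hour
    if transition < six_h:
        raise ValueError("need at least 6 h of data before the transition")
    total_6h = sum(activity[transition - six_h:transition])
    if total_6h == 0:
        return float("nan")  # no activity at all: the ratio is undefined
    return sum(activity[transition - three_h:transition]) / total_6h


# Toy recordings: 12 half-hour bins leading up to the transition.
flat = [1] * 12              # constant activity, no anticipation
ramp = [0] * 6 + [2] * 6     # all activity falls in the final 3 h

flat_percent = 100 * anticipation_index(flat, 12)
ramp_percent = 100 * anticipation_index(ramp, 12)
```

      Constant activity gives 50%, and values above 50% indicate that activity is concentrated in the hours immediately before the transition, which is what the ratio is meant to capture and what mean activity over 6 h alone cannot show.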

    1. Reviewer #1 (Public Review):

      Olszyński and colleagues present data showing variability from canonical "aversive calls", typically described as long 22 kHz calls rodents emit in aversive situations. Similarly long but higher-frequency (44 kHz) calls are presented as a distinct call type, including analyses both of their acoustic properties and animals' responses to hearing playback of these calls. While this work adds an intriguing and important reminder, namely that animal behavior is often more variable and complex than perhaps we would like it to be, there is some caution warranted in the interpretation of these data.

      The exclusive use of males is a major concern lacking adequate justification and should be disclosed in the title and abstract to ensure readers are aware of this limitation. With several reported sex differences in rat vocal behaviors this means caution should be exercised when generalizing from these findings. The occurrence of an estrus cycle in typical female rats is not justification for their exclusion. Note also that male rodents experience great variability in hormonal states as well, distinguishing between individuals and within individuals across time. The study of endocrinological influences on behavior can be separated from the study of said behavior itself, across all sexes. Similarly, concerns about needing to increase the number of animals when including all sexes are usually unwarranted (see Shansky [2019] and Phillips et al. [2023]).

      Regarding the analysis where calls were sorted using DBSCAN based on peak frequency and duration, my comment on the originally reviewed version stands. It seems that the calls are sorted by an (unbiased) algorithm into categories based on their frequency and duration, and because 44kHz calls differ by definition on frequency and duration the fact that the algorithm sorts them as a distinct category is not evidence that they are "new calls [that] form a separate, distinct group". I appreciate that the authors have softened their language regarding the novelty and distinctness of these calls, but the manuscript contains several instances where claims of novelty and specificity (e.g. the subtitle on line 193) are emphasized beyond what the data justifies.
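      The concern above can be illustrated with a toy example. The sketch below is a minimal pure-Python DBSCAN applied to made-up (peak frequency in kHz, duration in s) pairs; the values, `eps`, and `min_pts` are illustrative assumptions, not data or parameters from the manuscript, and a real analysis would normally rescale the two features first.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point (-1 = noise, 0, 1, ... = clusters)."""
    labels = [None] * len(points)  # None means "not yet visited"

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbrs)
    return labels


# Made-up calls: four near 22 kHz and four near 44 kHz, with similar durations.
calls = [(22.0, 1.0), (22.5, 1.1), (21.8, 0.9), (22.2, 1.2),
         (44.0, 1.0), (43.5, 1.1), (44.3, 0.9), (44.1, 1.2)]
labels = dbscan(calls, eps=1.5, min_pts=3)
```

      The two groups are separated simply because they were defined by different frequencies, so the clustering result restates the selection criterion rather than providing independent evidence of a distinct call type.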

      The behavioral response to call playback is intriguing, although again more in line with the hypothesis that these are not a distinct type of call but merely represent expected variation in vocalization parameters. Across the board animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls. This does raise interesting questions about how, ethologically, animals may interpret such variation and integrate this interpretation in their responses. However, the categorical approach employed here does not address these questions fully.

      I appreciate the amendment in discussing the idea of arousal being the key determinant for the increased emission of 44kHz, and the addition of other factors. Some of the items in this list, such as annoyance/anger and disgust/boredom, don't really seem to fit the data. I'm not sure I find the idea that rats become annoyed or disgusted during fear conditioning to be a particularly compelling argument. As such the list appears to be a collection of emotion-related words, with unclear potential associations with the 44kHz calls.

      Later in the Discussion the authors argue that the 44kHz aversive calls signal an increased intensity of a negative valence emotional state. It is not clear how the presented arguments actually support this. For example, what does the elongation of fear conditioning to 10 trials have to do with increased negative emotionality? Is there data supporting this relationship between duration and emotion, outside anthropomorphism? Each of the 6 arguments presented seems quite distant from being able to support this conclusion.

      In sum, rather than describing the 44kHz long calls as a new call type, it may be more accurate to say that sometimes aversive calls can occur at frequencies above 22 kHz. Individual and situational variability in vocalization parameters seems to be expected, much more so than all members of a species strictly adhering to extremely non-variable behavioral outputs.

      [Editors' note: The reviewer agrees that the additional analysis has ruled out the possibility that the calls are due to fatigue.]

    1. Author response:

      eLife assessment 

      This important study provides evidence for a combination of the latest generation of Oxford Nanopore Technology long reads with state-of-the-art variant callers enabling bacterial variant discovery at accuracy that matches or exceeds the current "gold standard" with short reads. The evidence supporting the claims of the authors is convincing, although the inclusion of a larger number of reference genomes would further strengthen the study. The work will be of interest to anyone performing sequencing for outbreak investigations, bacterial epidemiology, or similar studies.

      We thank the editor and reviewers for the accurate summary and positive assessment. We address the comment about increasing the number of reference genomes in the response to reviewer 2.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors assess the accuracy of short variant calling (SNPs and indels) in bacterial genomes using Oxford Nanopore reads generated on R10.4 flow cells from a very similar genome (99.5% ANI), examining the impact of variant caller choice (three traditional variant callers: bcftools, freebayes, and longshot, and three deep learning based variant callers: Clair3, DeepVariant, and NanoCaller), base calling model (fast, hac and sup) and read depth (using both simplex and duplex reads).

      Strengths: 

      Given the stated goal (analysis of variant calling for reads drawn from genomes very similar to the reference), the analysis is largely complete and results are compelling. The authors make the code and data used in their analysis available for re-use using current best practices (a computational workflow and data archived in INSDC databases or Zenodo as appropriate). 

      Weaknesses: 

      While the medaka variant caller is now deprecated for diploid calling, it is still widely used for haploid variant calling and should at least be mentioned (even if the mention is only to explain its exclusion from the analysis). 

      We agree that this would be an informative addition to the study and will add it to the benchmarking.

      Appraisal: 

      The experiments the authors engaged in are well structured and the results are convincing. I expect that these results will be incorporated into "best practice" bacterial variant calling workflows in the future. 

      Thank you for the positive appraisal.

      Reviewer #2 (Public Review): 

      Summary: 

      Hall et al. describe the superiority of ONT sequencing and deep learning-based variant callers to deliver higher SNP and indel accuracy compared to previous gold-standard Illumina short-read sequencing. Furthermore, they provide recommendations for read sequencing depth and computational requirements when performing variant calling.

      Strengths: 

      The study describes compelling data showing ONT superiority when using deep learning-based variant callers, such as Clair3, compared to Illumina sequencing. This challenges the paradigm that Illumina sequencing is the gold standard for variant calling in bacterial genomes. The authors provide evidence that homopolymeric regions, a systematic and problematic issue with ONT data, are no longer a concern in ONT sequencing. 

      Weaknesses: 

      (1) The inclusion of a larger number of reference genomes would have strengthened the study to accommodate larger variability (a limitation mentioned by the authors). 

      Our strategic selection of 14 genomes—spanning a variety of bacterial genera and species, diverse GC content, and both gram-negative and gram-positive species (including M. tuberculosis, which is neither)—was designed to robustly address potential variability in our results. Moreover, all our genome assemblies underwent rigorous manual inspection as the quality of the true genome sequences is the foundation this research is built upon. Given this, the fundamental conclusions regarding the accuracy of variant calls would likely remain unchanged with the addition of more genomes. However, we do acknowledge that a substantially larger sample size, which is beyond the scope of this study, would enable more fine-grained analysis of species differences in error rates.

      (2) In Figure 2, there are clearly one or two samples that perform worse than others in all combinations (are always below the box plots). No information about species-specific variant calls is provided by the authors but one would like to know if those are recurrently associated with one or two species. Species-specific recommendations could also help the scientific community to choose the best sequencing/variant calling approaches.

      Thank you for highlighting this observation. The precision, recall, and F1 scores for each sample and condition can be found in Supplementary Table S4. We will investigate the samples that consistently perform below expectation to determine if this is associated with specific species, which may necessitate tailored recommendations for those species. Additionally, we will produce a species-segregated version of Figure 2 for a clearer interpretation and will place it in the supplementary materials.

      (3) The authors support that a read depth of 10x is sufficient to achieve variant calls that match or exceed Illumina sequencing. However, the standard here should be the optimal discriminatory power for clinical and public health utility (namely outbreak analysis). In such scenarios, the highest discriminatory power is always desirable and as such an F1 score, Recall and Precision that is as close to 100% as possible should be maintained (which changes the minimum read sequencing depth to at least 25x, which is the inflection point).

      We agree that the highest discriminatory power is always desirable for clinical or public health applications. In that case, 25x is probably a better minimum recommendation. However, we are also aware that there are resource-limited settings where parity with Illumina is sufficient. In these cases, 10x depth from ONT would provide sufficient data.

      The manuscript currently emphasises the latter scenario, but we will revise the text to clearly recommend 25x depth as a conservative aim in settings where resources are not a constraint, ensuring the highest possible discriminatory power for applications like outbreak analysis.
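
      For readers less familiar with the metrics discussed in this exchange, precision, recall and F1 can be derived from counts of true-positive, false-positive and false-negative variant calls. The following is a minimal sketch only: the function name and the counts are hypothetical and are not taken from the study or its workflow.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from TP, FP and FN counts."""
    # Precision: fraction of called variants that are real
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of real variants that were called
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not from the study)
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

      Because F1 is a harmonic mean, it is pulled towards the lower of the two values, which is why a drop in recall at low read depth can lower F1 even when precision stays high.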

      (4) The sequencing of the samples was not performed with the same Illumina and ONT method/equipment, which could have introduced specific equipment/preparation artefacts that were not considered in the study. See for example https://academic.oup.com/nargab/article/3/1/lqab019/6193612

      To our knowledge, there is no evidence that sequencing on different ONT machines or barcoding kits leads to a difference in read characteristics or accuracy. To ensure consistency and minimise potential variability, we used the same ONT flowcells for all samples and performed basecalling on the same Nvidia A100 GPU. We will update the methods to emphasise this.

      For Illumina and ONT, the exact machines used for which samples will be added as a supplementary table. We will also add a comment about possible Illumina error rate differences in the ‘Limitations’ section of the Discussion.

      In summary, while there may be specific equipment or preparation artifacts to consider, we took steps to minimise these effects and maintain consistency across our sequencing methods.

      Reviewer #3 (Public Review): 

      Hall et al. benchmarked different variant calling methods on Nanopore reads of bacterial samples and compared the performance of Nanopore to short reads produced with Illumina sequencing. To establish a common ground for comparison, the authors first generated a variant truth set for each sample and then projected this set to the reference sequence of the sample to obtain a mutated reference. Subsequently, Hall et al. called SNPs and small indels using commonly used deep learning and conventional variant callers and compared the precision and accuracy from reads produced with simplex and duplex Nanopore sequencing to Illumina data. The authors did not investigate large structural variation, which is a major limitation of the current manuscript. It will be very interesting to see a follow-up study covering this much more challenging type of variation. 

      We fully agree that investigating structural variations (SVs) would be a very interesting and important follow-up. Identifying and generating ground truth SVs is a nontrivial task and we feel it deserves its own space and study. We hope to explore this in the future.

      In their comprehensive comparison of SNPs and small indels, the authors observed superior performance of deep learning over conventional variant callers when Nanopore reads were basecalled with the most accurate (but also computationally very expensive) model, even exceeding Illumina in some cases. Not surprisingly, Nanopore underperformed compared to Illumina when basecalled with the fastest (but computationally much less demanding) method with the lowest accuracy. The authors then investigated the surprisingly higher performance of Nanopore data in some cases and identified lower recall with Illumina short read data, particularly from repetitive regions and regions with high variant density, as the driver. Combining the most accurate Nanopore basecalling method with a deep learning variant caller resulted in low error rates in homopolymer regions, similar to Illumina data. This is remarkable, as homopolymer regions are (or, were) traditionally challenging for Nanopore sequencing. 

      Lastly, Hall et al. provided useful information on the required Nanopore read depth, which is surprisingly low, and the computational resources for variant calling with deep learning callers. With that, the authors established a new state-of-the-art for Nanopore-only variant calling on bacterial sequencing data. Most likely these findings will be transferred to other organisms as well or at least provide a proof-of-concept that can be built upon.

      As the authors mention multiple times throughout the manuscript, Nanopore can provide sequencing data in nearly real-time and in remote regions, therefore opening up a ton of new possibilities, for example for infectious disease surveillance. 

      However, the high-performing variant calling method as established in this study requires the computationally very expensive sup and/or duplex Nanopore basecalling, whereas the least computationally demanding method underperforms. Here, the manuscript would greatly benefit from extending the last section on computational requirements, as the authors determine the resources for the variant calling but do not cover the entire picture. This could even be misleading for less experienced researchers who want to perform bacterial sequencing at high performance but with low resources. The authors mention it in the discussion but do not make clear enough that the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required. 

      We have provided runtime benchmarks for basecalling in Supplementary Figure S16 and detailed these times in Supplementary Table S7. In addition, we state in the Results section (P10 L228-230) “Though we do note that if the person performing the variant calling has received the raw (pod5) ONT data, basecalling also needs to be accounted for, as depending on how much sequencing was done, this step can also be resource-intensive.”

      Even with super-accuracy basecalling considered, our analysis shows that variant calling remains the most resource-intensive step for Clair3, DeepVariant, FreeBayes, and NanoCaller. Therefore, the statement “the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required” is incorrect. However, we will endeavour to make the basecalling component and considerations more prominent in the Results and Discussion.

    2. eLife assessment

      This important study provides evidence for a combination of the latest generation of Oxford Nanopore Technology long reads with state-of-the-art variant callers enabling bacterial variant discovery at accuracy that matches or exceeds the current "gold standard" with short reads. The evidence supporting the claims of the authors is convincing, although the inclusion of a larger number of reference genomes would further strengthen the study. The work will be of interest to anyone performing sequencing for outbreak investigations, bacterial epidemiology, or similar studies.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors assess the accuracy of short variant calling (SNPs and indels) in bacterial genomes using Oxford Nanopore reads generated on R10.4 flow cells from a very similar genome (99.5% ANI), examining the impact of variant caller choice (three traditional variant callers: bcftools, FreeBayes, and Longshot, and three deep learning-based variant callers: Clair3, DeepVariant, and NanoCaller), basecalling model (fast, hac and sup) and read depth (using both simplex and duplex reads).

      Strengths:

      Given the stated goal (analysis of variant calling for reads drawn from genomes very similar to the reference), the analysis is largely complete and results are compelling. The authors make the code and data used in their analysis available for re-use using current best practices (a computational workflow and data archived in INSDC databases or Zenodo as appropriate).

      Weaknesses:

      While the medaka variant caller is now deprecated for diploid calling, it is still widely used for haploid variant calling and should at least be mentioned (even if the mention is only to explain its exclusion from the analysis).

      Appraisal:

      The experiments the authors engaged in are well structured and the results are convincing. I expect that these results will be incorporated into "best practice" bacterial variant calling workflows in the future.

    4. Reviewer #2 (Public Review):

      Summary:

      Hall et al. describe the superiority of ONT sequencing and deep learning-based variant callers to deliver higher SNP and indel accuracy compared to previous gold-standard Illumina short-read sequencing. Furthermore, they provide recommendations for read sequencing depth and computational requirements when performing variant calling.

      Strengths:

      The study describes compelling data showing ONT superiority when using deep learning-based variant callers, such as Clair3, compared to Illumina sequencing. This challenges the paradigm that Illumina sequencing is the gold standard for variant calling in bacterial genomes. The authors provide evidence that homopolymeric regions, a systematic and problematic issue with ONT data, are no longer a concern in ONT sequencing.

      Weaknesses:

      (1) The inclusion of a larger number of reference genomes would have strengthened the study to accommodate larger variability (a limitation mentioned by the authors).

      (2) In Figure 2, there are clearly one or two samples that perform worse than others in all combinations (are always below the box plots). No information about species-specific variant calls is provided by the authors but one would like to know if those are recurrently associated with one or two species. Species-specific recommendations could also help the scientific community to choose the best sequencing/variant calling approaches.

      (3) The authors support that a read depth of 10x is sufficient to achieve variant calls that match or exceed Illumina sequencing. However, the standard here should be the optimal discriminatory power for clinical and public health utility (namely outbreak analysis). In such scenarios, the highest discriminatory power is always desirable and as such an F1 score, Recall and Precision that is as close to 100% as possible should be maintained (which changes the minimum read sequencing depth to at least 25x, which is the inflection point).

      (4) The sequencing of the samples was not performed with the same Illumina and ONT method/equipment, which could have introduced specific equipment/preparation artefacts that were not considered in the study. See for example https://academic.oup.com/nargab/article/3/1/lqab019/6193612.

    5. Reviewer #3 (Public Review):

      Hall et al. benchmarked different variant calling methods on Nanopore reads of bacterial samples and compared the performance of Nanopore to short reads produced with Illumina sequencing. To establish a common ground for comparison, the authors first generated a variant truth set for each sample and then projected this set to the reference sequence of the sample to obtain a mutated reference. Subsequently, Hall et al. called SNPs and small indels using commonly used deep learning and conventional variant callers and compared the precision and accuracy from reads produced with simplex and duplex Nanopore sequencing to Illumina data. The authors did not investigate large structural variation, which is a major limitation of the current manuscript. It will be very interesting to see a follow-up study covering this much more challenging type of variation.

      In their comprehensive comparison of SNPs and small indels, the authors observed superior performance of deep learning over conventional variant callers when Nanopore reads were basecalled with the most accurate (but also computationally very expensive) model, even exceeding Illumina in some cases. Not surprisingly, Nanopore underperformed compared to Illumina when basecalled with the fastest (but computationally much less demanding) method with the lowest accuracy. The authors then investigated the surprisingly higher performance of Nanopore data in some cases and identified lower recall with Illumina short read data, particularly from repetitive regions and regions with high variant density, as the driver. Combining the most accurate Nanopore basecalling method with a deep learning variant caller resulted in low error rates in homopolymer regions, similar to Illumina data. This is remarkable, as homopolymer regions are (or, were) traditionally challenging for Nanopore sequencing.

      Lastly, Hall et al. provided useful information on the required Nanopore read depth, which is surprisingly low, and the computational resources for variant calling with deep learning callers. With that, the authors established a new state-of-the-art for Nanopore-only variant calling on bacterial sequencing data. Most likely these findings will be transferred to other organisms as well or at least provide a proof-of-concept that can be built upon.

      As the authors mention multiple times throughout the manuscript, Nanopore can provide sequencing data in nearly real-time and in remote regions, therefore opening up a ton of new possibilities, for example for infectious disease surveillance.

      However, the high-performing variant calling method as established in this study requires the computationally very expensive sup and/or duplex Nanopore basecalling, whereas the least computationally demanding method underperforms. Here, the manuscript would greatly benefit from extending the last section on computational requirements, as the authors determine the resources for the variant calling but do not cover the entire picture. This could even be misleading for less experienced researchers who want to perform bacterial sequencing at high performance but with low resources. The authors mention it in the discussion but do not make clear enough that the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required.

    1. eLife assessment

      The study, from the group that pioneered migrasome research, describes a novel vaccine platform derived from this newly discovered organelle. Using these cleverly engineered migrasomes – that behave like natural migrasomes – as a novel vaccine platform has the potential to overcome obstacles such as cold chain issues for vaccines like messenger RNA. Although the findings are important with practical implications for the vaccine technology, and the evidence, based on appropriate and validated methodology, is convincing and is in line with current state-of-the-art, there are some critical issues that need to be addressed. These include a head-to-head comparison with proven vaccine platforms, for example, a SARS-CoV-2 mRNA vaccine or an adjuvanted recombinant spike protein.

    1. Reviewer #1 (Public Review):

      Summary:

      Winged seeds or ovules from the Devonian are crucial to understanding the origin and early evolutionary history of wind dispersal strategy. Based on exceptionally well-preserved fossil specimens, the present manuscript documented a new fossil plant taxon (new genus and new species) from the Famennian Series of Upper Devonian in eastern China and demonstrated that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds by using mathematical analysis.

      Strengths:

      The manuscript is well organised and well presented, with superb illustrations. The methods used in the manuscript are appropriate.

      Weaknesses:

      I would only like to suggest moving the "Mathematical analysis of wind dispersal of ovules with 1-4 wings" section from the supplementary information to the main text, leaving the supplementary figures as supplementary materials.

    2. eLife assessment

      This useful manuscript describes the second earliest known winged ovule without a cupule in the Famennian of Late Devonian. Using solid mathematical analysis, the authors demonstrate that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds. The manuscript will help the scientific community to understand the origin and early evolutionary history of wind dispersal strategy of early land plants.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript described the second earliest known winged ovule without a cupule in the Famennian of Late Devonian. Using mathematical analysis, the authors suggest that the integuments of the earliest ovules without a cupule, as in the new taxon and Guazia, evolved functions in wind dispersal.

      Strengths:

      The new ovule taxon's morphological part is convincing. It provides additional evidence for the earliest winged ovules, and the mathematical analysis helps to understand their function.

      Weaknesses:

      The discussion should be enhanced to clarify the significance of this finding. What is the new advance compared with the Guazia finding? The authors can illustrate the character transformations using a simplified cladogram. The present version of the main text looks flat.

    1. eLife assessment

      This important study reports the deep evolutionary conservation of a core genetic program regulating spermatogenesis in flies, mice, and humans. The data presented are supportive of the main conclusion and generally convincing. This work will be of interest to evolutionary and reproductive biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      By combining an analysis of the evolutionary age of the genes expressed in male germ cells, a study of genes associated with spermatocyte protein-protein interaction networks and functional experiments in Drosophila, Brattig-Correia and colleagues provide evidence for an ancient origin of the genetic program underlying metazoan spermatogenesis. This leads to identifying a relatively small core set of functional interactions between deeply conserved gene expression regulators, whose impairment is then shown to be associated with cases of human male infertility.

      Strengths:

      In my opinion, the work is important for three different reasons. First, it shows that, even though reproductive genes can evolve rapidly and male germ cells display a significant level of transcriptional noise, it is still possible to obtain convincing evidence that a conserved core of functionally interacting genes lies at the basis of the male germ transcriptome. Second, it reports an experimental strategy that could also be applied to gene networks involved in different biological problems. Third, the authors make a compelling case that, due to its effects on human spermatogenesis, disruption of the male germ cell orthoBackbone can be exploited to identify new genetic causes of infertility.

      Weaknesses:

      The main strength of the general approach followed by the authors is, inevitably, also a weakness. This is because a study rooted in comparative biology is unlikely to identify newly emerged genes that may adopt key roles in processes such as species-specific gamete recognition. Additionally, using a TPM >1 threshold for protein-coding transcripts may exclude genes, such as those encoding proteins required for gamete fusion, which are thought to be expressed at a very low level. Although these considerations raise the possibility that the chosen approach may miss information that, depending on the species, could be potentially highly functionally important, this by no means reduces its value in identifying genes belonging to the conserved genetic program of spermatogenesis.
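
      To illustrate the thresholding concern raised above, the sketch below shows how a TPM (transcripts per million) cut-off of 1 removes lowly expressed genes. It is a minimal example under stated assumptions: the gene names, read counts and transcript lengths are hypothetical, and the authors' actual quantification pipeline is not described in this review.

```python
def tpm(counts: dict[str, float], lengths_bp: dict[str, float]) -> dict[str, float]:
    """Convert raw read counts to TPM using per-gene transcript lengths."""
    # Length-normalised read rate per gene
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Scale so that all values sum to one million
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical genes, counts and lengths (illustration only)
counts = {"geneA": 100, "geneB": 900, "geneC": 0}
lengths_bp = {"geneA": 1000, "geneB": 1000, "geneC": 2000}

values = tpm(counts, lengths_bp)
# Keep only genes above the TPM > 1 threshold mentioned in the review
expressed = sorted(gene for gene, value in values.items() if value > 1)
print(expressed)
```

      Any gene whose TPM falls at or below the cut-off is dropped before downstream analysis, which is how a functionally important but weakly expressed gene could be missed.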

    3. Reviewer #2 (Public Review):

      Summary:

      This is a tour de force study that aims to understand the genetic basis of male germ cell development across three animal species (human, mouse, and flies) by performing a genetic program conservation analysis (using phylostratigraphy and network science) with a special emphasis on genes that peak or decline during mitosis-to-meiosis. This analysis, in agreement with previous findings, reveals that several genes active during and before meiosis are deeply conserved across species, suggesting ancient regulatory mechanisms. To identify critical genes in germ cell development, the investigators integrated clinical genetics data, performing gene knockdown and knockout experiments in both mice and flies. Specifically, over 900 conserved genes were investigated in flies, with three of these genes further studied in mice. Of the 900 genes in flies, ~250 RNAi knockdowns had fertility phenotypes. The fertility phenotypes for the fly data can be viewed using the following browser link: https://pages.igc.pt/meionav. The scope of target gene validation is impressive. Below are a few minor comments.

      (1) In Supplemental Figure 2, it is notable that enterocyte transcriptomes are predominantly composed of younger genes, contrasting with the genetic age profile observed in brain and muscle cells. This difference is an intriguing observation and it would be interesting to hear the authors' comments.

      (2) Regarding the document, the figures provided only include supplemental data; none of the main text figures are in the full PDF.

      (3) Lastly, it would be great to section and stain mouse testis to classify the different stages of arrest during meiosis for each of the mouse mutants in order to compare more precisely to flies.

      This paper serves as a vital resource, emphasizing that only through the analysis of hundreds of genes can we prioritize essential genes for germ cell development. It is remarkable that about 60% of conserved genes have no apparent phenotype during germ cell development.

      Strengths:

      The high-throughput screening was conducted on a conserved network of 920 genes expressed during the mitosis-to-meiosis transition. Approximately 250 of these genes were associated with fertility phenotypes. Notably, mutations in 5 of the 250 genes have been identified in human male infertility patients. Furthermore, 3 of these genes were modeled in mice, where they were also linked to infertility. This study establishes a crucial groundwork for future investigations into germ cell development genes, aiming to delineate their essential roles and functions.

      Weaknesses:

      The fertility phenotyping in this study is limited, yet dissecting the mechanistic roles of these proteins falls beyond its scope. Nevertheless, this work serves as an invaluable resource for further exploration of specific genes of interest.

    1. eLife assessment

      This important study reports the developmental dynamics and molecular markers of the rete ovarii during ovarian development. However, the data supporting the main conclusions remain incomplete. This study will be of interest to developmental and reproductive biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Anbarci et al. re-evaluates the function of the enigmatic rete ovarii (RO), a structure that forms in close association with the mammalian ovary. The RO has generally been considered a functionless structure in the adult ovary. This manuscript follows up on a previous study from the lab that analyzed ovarian morphogenesis using high-resolution microscopy (McKey et al., 2022). The present study adds finer details to RO development and possible function by (1) identifying new markers for RO sub-regions (e.g. GFR1a labels the connecting rete), suggesting that the sub-regions are functionally distinct, (2) showing that the RO sub-regions are connected by a luminal system that allows transport of material from the extraovarian rete (EOR) to the intraovarian rete (IOR), (3) identifying proteins that are secreted into the RO lumen and that may regulate ovarian homeostasis, and finally, (4) better defining how the vasculature, nervous, and immune systems integrate with the RO.

      Strengths:

      The data is beautifully presented and convincing. They show that the RO is composed of three distinct domains that have unique gene expression signatures and thus likely are functionally distinct.

      Weaknesses:

      It is not always clear what the novel findings are that this manuscript is presenting. It appears to be largely similar to the analysis done by McKey et al. (2022) but with more time points and molecular markers. The novelty of the present study's findings needs to be better articulated.

    3. Reviewer #2 (Public Review):

      A large number of ovarian experiments have been conducted - especially in morphological and molecular biology studies - specifically removing the ovarian membrane. This experiment is a good supplement to existing knowledge and plays an important role in early ovarian development and the regulation of ovarian homeostasis during the estrous cycle. There are also innovations in research ideas and methods, which will meet the requirements of experimental design and provide inspiration for other researchers.

      This reviewer did not identify any major issues with the article. However, the following points could be further clarified:

      (1) Is there any comparative data on the proteomics of RO and rete testis in early development? With some molecular markers also derived from rete testis, it would be better to provide the data or references.

      (2) Although the size of RO and its components is quite small and difficult to operate, the researchers in this article had already been able to perform intracavitary injection of EOR and extract EOR or CR for mass spectrometry analysis. Therefore, can EOR, CR, or IOR be damaged or removed, providing further strong evidence of ovarian development function?

      (3) Although IOR is shown on the schematic diagram, it cannot be observed in the immunohistochemistry pictures in Figure 1 and Figure 3. The authors should provide a detailed explanation.

    4. Reviewer #3 (Public Review):

      Summary:

      The rete ovarii (RO) has long been disregarded as a non-functional structure within the ovary. In their study, Anbarci and colleagues have delineated the markers and developmental dynamics of three distinct regions of the RO - the intraovarian rete (IOR), the extraovarian rete (EOR), and the connecting rete (CR). Notably focusing on the EOR, the authors presented evidence illustrating that the EOR forms a convoluted tubular structure culminating in a dilated tip. Intriguingly, microinjections into this tip revealed luminal flow towards the ovary containing potentially secreted functional proteins. Additionally, the EOR cells exhibit associations with vasculature, macrophages, and neuronal projections, proposing the notion that the RO may play a functional role in ovarian development during critical ovariogenesis stages. By identifying marker genes within the RO, the authors have also suggested that the RO could serve as a potential structure linking the ovary with the neuronal system.

      Strengths:

      Overall, the reviewer commends the authors for their systematic research on the RO, shedding light on this overlooked structure in developing ovaries. Furthermore, the authors have proposed a series of hypotheses that are both captivating and scientifically significant, with the potential to reshape our understanding of ovarian development through future investigations.

      Weaknesses:

      There is a lack of conclusive data supporting many conclusions in the manuscript. Therefore, the paper's overall conclusions should be moderated until functional validations are conducted.

    1. eLife assessment

      The authors combined human genetic analysis with zebrafish experiments to produce evidence that alleles that impair the function of EPHA4 cause idiopathic scoliosis (IS), a common spinal deformity. The significance of the findings is important because the cellular and molecular mechanisms that contribute to IS remain poorly understood. The human genetic data are quite convincing whereas the zebrafish data, although supportive, are incomplete.

    2. Joint Public Review:

      Summary:

      Idiopathic scoliosis (IS) is a common spinal deformity. Various studies have linked genes to IS, but underlying mechanisms are unclear such that we still lack understanding of the causes of IS. The current manuscript analyzes IS patient populations and identifies EPHA4 as a novel associated gene, finding three rare variants in EPHA4 from three patients (one disrupting splicing and two missense variants) as well as a large deletion (encompassing EPHA4) in a Waardenburg syndrome patient with scoliosis. EPHA4 is a member of the Eph receptor family. Drawing on data from zebrafish experiments, the authors argue that EPHA4 loss of function disrupts the central pattern generator (CPG) function necessary for motor coordination.

      Strengths:

      The main strength of this manuscript is the human genetic data, which provides convincing evidence linking EPHA4 variants to IS. The loss of function experiments in zebrafish strongly support the conclusion that EPHA4 variants that reduce function lead to IS.

      Weaknesses:

      The conclusion that disruption of CPG function causes spinal curves in the zebrafish model is not well supported. The authors' final model is that a disrupted CPG leads to asymmetric mechanical loading on the spine and, over time, the development of curves. This is a reasonable idea, but currently not strongly backed up by data in the manuscript. Potentially, the impaired larval movements simply coincide with, but do not cause, juvenile-onset scoliosis. Support for the authors' conclusion would require independent methods of disrupting CPG function and determining if this is accompanied by spine curvature. At a minimum, the language of the manuscript could be toned down, with the CPG defects put forward as a potential explanation for scoliosis in the discussion rather than as something this manuscript has "shown". An additional weakness of the manuscript is that the zebrafish genetic tools are not sufficiently validated to provide full confidence in the data and conclusions.

    1. eLife assessment

      This work is important because it attempts to elucidate how immune cells migrate across the blood brain barrier. The authors developed a convincing framework to visualize, recognize and track the movement of different immune cells across primary human and mouse brain microvascular endothelial cells without the need for fluorescence-based imaging using microfluidic devices. The data gathered are solid, and this work will be of interest to the cancer biology, immunology and medical therapeutics fields.

    2. Reviewer #1 (Public Review):

      Summary:

      It is evident that studying leukocyte extravasation in vitro is a challenge. One needs to include physiological flow, culture cells and isolate primary immune cells. Timing is of utmost importance and a reproducible setup essential. Extra challenges are met when extravasation kinetics in different vascular beds is required, e.g., across the blood-brain barrier. In this study, the authors describe a reliable and reproducible method to analyze leukocyte TEM under physiological flow conditions, including this analysis. That the software can also detect reverse TEM is a plus.

      Strengths:

      It is quite a challenge to get this assay reproducible and stable, in particular as there is flow included. Also for the analysis, there is currently no clear software analysis program, and many labs have their own methods. This paper gives the opportunity to unify the data and results obtained with this assay under label-free conditions. This should eventually lead to more solid and reproducible results.

      Also, the comparison between manual and software analysis is appreciated.

      Weaknesses:

      The authors stress that it can be done in BBB models, but I would argue that it is much more broadly applicable. This is not necessarily a weakness of the study but more an opportunity to strengthen the method. So I would encourage the authors to rewrite some parts and make it more broadly applicable.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper develops an under-flow migration tracker to evaluate all the steps of the extravasation cascade of immune cells across the BBB. The algorithm is useful and has important applications.

      Strengths:

      Algorithm is almost as accurate as manual tracking and importantly saves time for researchers.

      Weaknesses:

      Applicability can be questioned because the device used is 2D and physiological biology is in 3D. Comparisons to other automated tools were not performed by the authors.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors aimed to establish a faster and more efficient method of tracking steps of T-cell extravasation across the blood brain barrier. The authors developed a framework to visualize, recognize and track the movement of different immune cells across primary human and mouse brain microvascular endothelial cells without the need for fluorescence-based imaging. The authors succinctly describe the basic requirements for tracking in the introduction followed by an in-depth account of the execution.

      Weaknesses and Strengths:

      Materials & methods and results:

      (1) The methods section lacks details of the microfluidic device that the authors talk about in the paper. Under physiological shear stress, the T-cells detach from the pMBMEC monolayer and hence cannot be detected; however, this observation requires an explanation of why it occurs and of potential solutions to circumvent it, to ensure physiologically relevant experimental parameters.

      (2) The authors describe a method for debris exclusion using UFMTrack that eliminates objects of fewer than 30 pixels in size from analysis based on a mean pixel size of 400 for T lymphocytes. However, this mean pixel size appears to stem from in-vitro activated CD8 T cells, which rapidly grow and proliferate upon stimulation. In line with this, activated lymphocytes exhibit increased cytoplasmic area, making them appear less dense or "brighter" by phase microscopy compared to naïve lymphocytes, which are relatively compact and consequently appear dimmer. Given this, it is not clear whether UFMTrack is sufficiently trained to identify naïve human lymphocytes in circulating blood, or smaller murine lymphocytes. Analysis of each lymphocyte subtype in terms of pixel size and intensity would be beneficial to strengthen the claim that UFMTrack can identify each of these populations. Additionally, demonstrating that UFMTrack can correctly characterize the behavior of naïve versus activated lymphocytes isolated from murine and human sources would strengthen the claim that UFMTrack can be broadly applied to study lymphocyte dynamics in diverse models without additional training.

      (3) Average precision was compared to the analysis of UFMTrack but it is unclear how average precision was calculated. This information should have been included in the methods section.

      (4) CD4 and CD8 T cells exhibit distinct biology and interaction kinetics driven in part by their MHC molecule affinity and distinct receptor expression profiles. Thus, it is unclear why two distinct mechanisms of endothelial cell activation are needed to see differences between the populations.

      (5) The BMECs are barrier tissues but were cultured on µdishes in this study. To study the transmigration of T-cells across the endothelium, the model would have been more relevant on a semi-permeable membrane instead of a closed surface.

      (6) Methods are provided for the isolation and expansion of human effector and memory CD4+ T cells. However, there is no mention of specific CD4+ T cell populations used for analysis with UFMTrack, nor a clear breakdown of tracking efficiency for each subpopulation. Further, there is no similar method for the isolation of CD8+ T cell compartments. A clear breakdown of the performance efficiency of UFMTrack with each cell population investigated in this study would provide greater insight into the software's performance with regard to tracking the behavior and movement of distinct immune populations.

      (7) The results section is quite extensive and discusses details of establishment of the framework while highlighting both the pros and cons of the different aspects of the process, for example the limitation of the two models, 2D and 2D+T were highlighted well. However, the results section includes details which may be more fitting in the methods section.

      (8) A few statements in the results section lacked literary support, which was not provided in the discussion either, such as support for increased variance of T-cell instantaneous speed on stimulated vs non-stimulated pMBMECs. Another example is the enhancement of cytokine stimulation directed T-cell movement on the pMBMECs that the authors observed but failed to relay the physiological relevance of it. The authors don't provide enough references for developments in the field prior to their work which form the basis and need for this technology.

      (9) The rationale for use of OT-1 and 2D2-derived murine lymphocytes is unclear here. The OT-1 model has been generated to study antigen-specific CD8+ T cell responses, while the 2D2 model has been generated to recapitulate CD4 T cell-specific myelin oligodendrocyte glycoprotein (MOG) responses.
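
      To make the size-based debris exclusion discussed in point (2) concrete, the following is a minimal sketch of the general technique, not UFMTrack's actual implementation. It assumes a binary segmentation mask given as a list of lists, 4-connectivity between pixels, and a hypothetical helper name `remove_small_objects`; the toy mask and the cutoff of 3 pixels are made up for illustration only. It shows why a fixed pixel cutoff interacts with the typical cell size: any object below the cutoff is discarded regardless of what it is.

```python
from collections import deque


def remove_small_objects(mask, min_area):
    """Return a copy of a binary mask in which connected components
    with fewer than `min_area` pixels are cleared (4-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected component.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = ny in range(rows) and nx in range(cols)
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep the component only if it reaches the size cutoff.
            if len(component) >= min_area:
                for y, x in component:
                    out[y][x] = 1
    return out


# Toy mask (illustrative only): one 4-pixel object and two 1-pixel specks.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
cleaned = remove_small_objects(toy_mask, min_area=3)
```

      With a cutoff that is small relative to the expected cell area, only specks are removed; if the cell population of interest were much smaller than the population used to choose the cutoff, the margin between debris and cells would shrink, which is the concern raised above.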

      Figures and text:

      (1) There are certain discrepancies and misarrangements in the figures and text. For example, discussing the effect of shear flow on T cell attachment as part of the introduction in figure 1 and then mentioning it again in the results section as part of figure 4 is repetitive.

      (2) Section IV, subsection 1 of the results section, refers to 'data acquisition section above' in line 279, however the said section is part of materials and methods which is provided towards the end of the manuscript.

      (3) There are figures in the manuscript that have not been referenced in the results section, for example, figure 3A and B. Figure 1 hasn't been addressed until subsection 7 of materials and methods.

      (4) A lack of significance but an observed trend of increased variance of T cell instantaneous speed is reported in line 296-298; however, the graph (figure 4G) shows a significant change in instantaneous speed between non-stimulated and TNFα-stimulated systems. This is misleading to the readers.

      (5) The authors talk about three beginner experimenters testing the manual T cell tracking process but figure 5 only showcases data from two experimenters without stating the reason for excluding experimenter 1.

      Discussion:

      (1) While the discussion captures the major takeaways from the paper, it lacks relevant supporting references to relate the observation to physiological conditions and applicability.

      (2) The discussion lacks connection to the results since the figures were not referenced while discussing an observed trend.

      (3) The authors briefly looked into mouse and human BMECs and their individual interaction with T-cells, but don't discuss the differences between the two, if any, that challenged their framework.

      (4) Even though the imaging tool relies on differences in appearance for detection, the authors talk about lack of feasibility in detecting transmigration of BMDMs due to their significantly different appearance. The statement lacks a problem-solving approach to discuss how and why this was the case.

      Relevance to the field:

      Utilizing the framework provided by the authors, the application can be adapted and/or utilized for visualizing a range of different cell types, provided they are different in appearance. However, this would require extensive changes to the script and won't be adaptable in its current form.

    1. eLife assessment

      This fundamental study provides a modeling regime that provides new insight into the energy-preservation parameters among schooling fish. The strength of the evidence supporting observations such as distilled dynamics between leading and lagging schooling fish which are derived from emergent properties is convincing. Overall, the study provides exciting insights into energetic coupling with respect to group swimming dynamics. Some potential improvements to strengthen the study include clarification regarding degrees of freedom and parameter ranges in the model.

    2. Reviewer #1 (Public Review):

      Summary:

      The study seeks to establish accurate computational models to explore the role of hydrodynamic interactions on energy savings and spatial patterns in fish schools. Specifically, the authors consider a system of (one degree-of-freedom) flapping airfoils that passively position themselves with respect to the streamwise direction, while oscillating at the same frequency and amplitude, with a given phase lag and at a constant cross-stream distance. By parametrically varying the phase lag and the cross-stream distance, they systematically explore the stability and energy costs of emergent configurations. Computational findings are leveraged to distill insights into universal relationships and clarify the role of the wake of the leading foil.

      Strengths:

      (1) The use of multiple computational models (computational fluid dynamics, CFD, for full Navier-Stokes equations and computationally efficient inviscid vortex sheet, VS, model) offers an extra degree of reliability of the observed findings and backing to the use of simplified models for future research in more complex settings.

      (2) The systematic assessment of the stability and energy savings in multiple configurations of pairs and larger ensembles of flapping foils is an important addition to the literature.

      (3) The discovery of a linear phase-distance relationship in the formation attained by pairs of flapping foils is a significant contribution, which helps compare different experimental observations in the literature.

      (4) The observation of a critical size effect for larger in-line formations, above which cohesion and energetic benefits are lost at once, is a new discovery in the field.

      Weaknesses:

      (1) The extent to which observations on one-degree-of-freedom flapping foils could translate to real fish schools is presently unclear so some of the conclusions on live fish schools are likely to be overstated and would benefit from some more biological framing.

      (2) The analysis of non-reciprocal coupling is not as novel as the rest of the study and potentially not as convincing due to the chosen linear metric of interaction (that is, the flow agreement).

      Overall, this is a rigorous effort on a critical topic: findings of the research can offer important insight into the hydrodynamics of fish schooling, stimulating interdisciplinary research at the interface of computational fluid mechanics and biology.

    3. Reviewer #2 (Public Review):

      The document "Mapping spatial patterns to energetic benefits in groups of flow-coupled swimmers" by Heydari et al. uses several types of simulations and models to address aspects of stability of position and power consumption in few-body groups of pitching foils. I think the work has the potential to be a valuable and timely contribution to an important subject area. The supporting evidence is largely quite convincing, though some details could raise questions, and there is room for improvement in the presentation. My recommendations are focused on clarifying the presentation and perhaps spurring the authors to assess additional aspects:

      (1) Why do the authors choose to set the swimmers free only in the propulsion direction? I can understand constraining all the positions/orientations for investigating the resulting forces and power, and I can also understand the value of allowing the bodies to be fully free in x, y, and their orientation angle to see if possible configurations spontaneously emerge from the flow interactions. But why constrain some degrees of freedom and not others? What's the motivation, and what's the relevance to animals, which are fully free?

      (2) The model description in Eq. (1) and the surrounding text is confusing. Aren't the authors computing forces via CFD or the VS method and then simply driving the propulsive dynamics according to the net horizontal force? It seems then irrelevant to decompose things into thrust and drag, and it seems irrelevant to claim that the thrust comes from pressure and the drag from viscous effects. The latter claim may in fact be incorrect since the body has a shape and the normal and tangential components of the surface stress along the body may be complex.

      (3) The parameter taudiss in the VS simulations takes on unusual values such as 2.45T, making it seem like this value is somehow very special, and perhaps 2.44 or 2.46 would lead to significantly different results. If the value is special, the authors should discuss and assess it. Otherwise, I recommend picking a round value, like 2 or 3, which would avoid distraction.

      (4) Some of the COT plots/information were difficult to interpret because the correspondence of beneficial with the mathematical sign was changing. For example, DeltaCOT as introduced on p. 5 is such that negative indicates bad energetics as compared to a solo swimmer. But elsewhere, lower or more negative COT is good in terms of savings. Given the many plots, large amounts of data, and many quantities being assessed, the paper needs a highly uniform presentation to aid the reader.

      (5) I didn't understand the value of the "flow agreement parameter," and I didn't understand the authors' interpretation of its significance. Firstly, it would help if this and all other quantities were given explicit definitions as complete equations (including normalization). As I understand it, the quantity indicates the match of the flow velocity at some location with the flapping velocity of a "ghost swimmer" at that location. This does not seem to be exactly relevant to the equilibrium locations. In particular, if the match were perfect, then the swimmer would generate no relative flow and thus no thrust, meaning such a location could not be an equilibrium. So, some degree of mismatch seems necessary. I believe such a mismatch is indeed present, but the plots such as those in Figure 4 may disguise the effect. The color bar is saturated to the point of essentially being three tones (blue, white, red), so we cannot see that the observed equilibria are likely between the max and min values of this parameter.

      (6) More generally, and related to the above, I am favorable towards the authors' attempts to find approximate flow metrics that could be used to predict the equilibrium positions and their stability, but I think the reasoning needs to be more solid. It seems the authors are seeking a parameter that can indicate equilibrium and another that can indicate stability. Can they clearly lay out the motivation behind any proposed metrics, and clearly present complete equations for their definitions? Further, is there a related power metric that can be appropriately defined and which proves to be useful?

      (7) Why do the authors not carry out CFD simulations on the larger groups? Some explanations should be given, or some corresponding CFD simulations should be carried out. It would be interesting if CFD simulations were done and included, especially for the in-line case of many swimmers. This is because the results seem to be quite nuanced and dependent on many-body effects beyond nearest-neighbor interactions. It would certainly be comforting to see something similar happen in CFD.

      (8) Related to the above, the authors should discuss seemingly significant differences in their results for long in-line formations as compared to the CFD work of Peng et al. [48]. That work showed apparently stable groups for numbers of swimmers considerably larger than those studied here. Why such a qualitatively different result, and how should we interpret these differences regarding the more general issue of the stability of tandem groups?

      (9) The authors seem to have all the tools needed to address the general question about how dynamically stable configurations relate to those that are energetically optimal. Are stable solutions optimal, or not? This would seem to have very important implications for animal groups, and the work addresses closely related topics but seems to miss the opportunity to give a definitive answer to this big question.

      (10) Time-delay particle model: This model seems to construct a simplified wake flow. But does the constructed flow satisfy basic properties that we demand of any flow, such as being divergence-free? If not, then the formulation may be troublesome.
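
      The divergence-free question in point (10) can be checked numerically. The following is a minimal sketch under stated assumptions, not a reproduction of the authors' time-delay particle model: it assumes the constructed wake flow can be evaluated as a function returning the velocity pair (u, v) at any point, and the function names and the two test fields (`rigid_rotation`, `uniform_expansion`) are hypothetical stand-ins chosen because their divergence is known analytically.

```python
def divergence(velocity, x, y, h=1e-4):
    """Central-difference estimate of du/dx + dv/dy at the point (x, y).

    `velocity` is any callable returning the pair (u, v) at a point.
    """
    u_right, _ = velocity(x + h, y)
    u_left, _ = velocity(x - h, y)
    _, v_up = velocity(x, y + h)
    _, v_down = velocity(x, y - h)
    return (u_right - u_left) / (2 * h) + (v_up - v_down) / (2 * h)


def rigid_rotation(x, y):
    # Hypothetical test field: solid-body rotation, analytically divergence-free.
    return (-y, x)


def uniform_expansion(x, y):
    # Hypothetical test field: radial outflow with divergence equal to 2 everywhere.
    return (x, y)


def max_abs_divergence(velocity, points):
    """Largest absolute divergence estimate over a set of sample points."""
    return max(abs(divergence(velocity, px, py)) for px, py in points)


# Sample the fields on a small grid around the origin.
sample_points = [(0.1 * i, 0.1 * j) for i in range(-5, 6) for j in range(-5, 6)]
rotation_residual = max_abs_divergence(rigid_rotation, sample_points)
expansion_residual = max_abs_divergence(uniform_expansion, sample_points)
```

      Applied to a constructed wake flow, a residual that stays near zero as the step size shrinks would be consistent with an incompressible field, whereas a residual that converges to a nonzero value would indicate the kind of problem raised above.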

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is a valuable study that describes the effects of T. pallidum on neural development by applying single-cell RNA sequencing to an iPSC-derived brain organoid model. The evidence supporting the claims of the authors is solid, although further evidence to understand the differences in infection rates would strengthen the conclusions of the study. In particular, the conclusions would be strengthened by validating infection efficiency as this can impact the interpretation of single-cell sequencing results, and how these metrics affect organoid size as well as comparison with additional infectious agents. Furthermore, additional validations of downstream effectors are not adequate and could be improved. 

      Thank you very much for your valuable comments. Since we used the organoid model for the first time to investigate the effects of T. pallidum on brain development, the study design is not perfect. As you have accurately mentioned, the results of the paper lack in-depth detail, especially verification of the infection rate of T. pallidum. Your valuable comments will be very useful to us in carrying out further research. In addition, because the downstream effector validation was inadequate, we performed an analysis of single-cell sequencing data to strengthen our view in the revised manuscript (see Figure 5F in the current manuscript).

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is an interesting study by Xu et al showing the effects of infection with the Treponema pallidum virus (which causes syphilis disease) on neuronal development using iPSC-derived human brain organoids as a model and single-cell RNA sequencing. This work provides an important insight into the impact of the virus on human development, bridging the gap between the phenomena observed in studies using animal models as well as non-invasive human studies showing developmental abnormalities in fetuses infected with the virus in utero through maternal vertical transmission.

      Using single-cell RNAseq in combination with qPCR and immunofluorescence techniques, the authors show that T. pallidum infected organoids are smaller in size, in particular during later growth stages, contain a larger number of undifferentiated neuronal lineage cells, and exhibit decreased numbers of specific neuronal subcluster, which the authors have identified as undifferentiated hindbrain neurons.

      The study is an important first step in understanding how T. pallidum affects human neuronal development and provides important insight into the potential mechanisms that underlie the neurodevelopmental abnormalities observed in infected human fetuses. Several important weaknesses have also been noted, which need to be addressed to strengthen the study's conclusions.

      Strengths:

      (1) The study is well written, and the data quality is good for the most part.

      (2) The study provides an important first step in utilizing human brain organoids to study the impact of T. pallidum infection on neuronal development.

      (3) The study's conclusions may provide important insight to other researchers focused on studying how viral infections impact neuronal development. 

      Thank you very much for your positive feedback. Below, you will find our detailed responses to your concerns, addressed point-by-point. I once again sincerely appreciate your time and effort in reviewing our manuscript.

      Weaknesses:

      (1) It is unclear how T. pallidum infection was validated in the organoids. If not all cells are infected, this could have important implications for the study's conclusions, in particular the single-cell RNAseq experiments. Were only cells showing the presence of the virus selected for sequencing? A detailed description of how infection was validated and the process of selection of cells for RNAseq would strongly support the study's conclusions. 

      Thank you for your valuable comment. We completely agree with your point. The rate at which T. pallidum infects brain organoids is a key factor that must be considered. We selected pluripotent stem cell-derived brain organoids to simulate foetal brain neurodevelopment and co-cultured them with T. pallidum to mimic T. pallidum invading brain tissue. Since brain organoids are three-dimensional structures formed by nerve cell aggregation, T. pallidum invades organoids gradually from the periphery to the center. Exposing organoids to T. pallidum for long enough increases the infection rate; however, the pathogen is selective in invading human cells. If we sequenced only cells in which T. pallidum is present, the authenticity of simulating "real world" infections would be somewhat weakened. Selecting cells from intact organoids for sequencing, without eliminating cells lacking T. pallidum, better simulates the effect of T. pallidum infection on the nervous system. Of course, we should also set up a blank control group.

      (2) The authors show that T. pallidum infection results in impaired development of hindbrain neurons. How does this finding compare to what has already been shown in animal studies? Is a similar deficit in this brain region observed with this specific virus? It would be useful to strengthen the study's conclusions if the authors added a discussion about the observed deficits in hindbrain neuronal development, and prior literature on similar studies conducted in animal models or human patients. Does T. pallidum preferentially target these neurons, or is this a limitation of the current organoid model system? 

      Thank you for your valuable comments. The finding that T. pallidum infection results in impaired development of hindbrain neurons has not been verified in animal experiments. Of course, it is better to further validate the findings in organoid studies through animal experiments. Unfortunately, due to the technical challenges, mature animal models have not been developed for the study of congenital syphilis. Although our team has been working on the development of animal models of congenital neurosyphilis, the current progress is still not satisfactory. After struggling hard in this field for many years, we decided to attempt to utilize human brain organoids instead of animal models to study the impact of T. pallidum infection on neuronal development.

      We also checked prior literature on similar studies that have referred to the content in human patients. Dan Doherty et al. reported that patients with pontocerebellar hypoplasia develop microcephaly at birth or over time after birth (PMID: 23518331). Based on your constructive suggestions, we have added some content related to hindbrain to the “Discussion” section.

      Our study found that T. pallidum could inhibit the differentiation of subNPC1B in brain organoids, thereby reducing the differentiation from subNPC1B to hindbrain neurons, and ultimately affecting the development and maturation of hindbrain neurons during pregnancy. Based on our results, T. pallidum does not preferentially target hindbrain neurons. Of course, there are limitations to the current organoid model system, see the "Limitations" section.

      PMID: 23518331- Dan Doherty et al, Midbrain and hindbrain malformations: advances in clinical diagnosis, imaging, and genetics.

      Revision in the “Discussion” section, line 343-352:

      “The vertebrate hindbrain contains a complex network of dedicated neural circuits that play an essential role in controlling many physiological processes and behaviors, including those related to the cerebellum, pons, and medulla oblongata (Shoja et al., 2018). Patients with pontocerebellar hypoplasia represent the less severe end of the spectrum with early hyperreflexia, developmental delay, and feeding problems, eventually developing spasticity and involuntary movements in childhood, while some patients represent the severe end of the spectrum characterised by polyhydramnios, severe hyperreflexia, contracture, and early death from central respiratory failure. Patients with pontocerebellar hypoplasia develop microcephaly at birth or over time after birth (Doherty et al., 2013).”

      (3) The authors show that T. pallidum-infected organoids are smaller in size by measuring organoid diameter during later stages of organoid growth, with no change during early stages. Does that represent insufficient infection at the early stages? Is this due to increased cell death or lack of cell division in the infected organoids? Experiments using IHC to quantify levels of cleaved caspase and/or protein markers for cell proliferation would be able to address these questions. 

      Thank you for your valuable suggestion. The concentration of T. pallidum in patients with syphilis is generally very low (PMID: 21752804, 35315702, 33099614). In this study, a low concentration of T. pallidum was applied to brain organoids to simulate early foetal transmission of syphilis. Nerve cells form brain organoids mainly by establishing intercellular connections through adhesion, so treatment with a high concentration of T. pallidum can easily cause organoids to divide and die. Furthermore, based on your suggestions, we performed additional immunostaining analyses to verify the apoptosis of brain organoids infected by T. pallidum. Cleaved caspase 3 (clCASP3) staining showed that the number of apoptotic cells increased following T. pallidum infection; however, the proportion of apoptotic cells in both groups of brain organoids was very low (Figure supplement 2) (N=12 organoids, each group from three independent bioreactors), which would not be enough to affect the results of the experiment. This suggests that T. pallidum infection mainly inhibited neural differentiation and development of brain organoids rather than promoting organoid apoptosis.

      PMID: 21752804-- Craig Tipple et al, Getting the measure of syphilis: qPCR to better understand early infection.

      PMID: 35315702-- Cuini Wang et al, Quantified Detection of Treponema pallidum DNA by PCR Assays in Urine and Plasma of Syphilis Patients.

      PMID: 33099614-- Cuini Wang et al, A New Specimen for Syphilis Diagnosis: Evidence by High Loads of Treponema pallidum DNA in Saliva.

      Revision in the “Results” section, line 105-108:

      “… cleaved caspase 3 (clCASP3) staining showed that the number of apoptotic cells increased significantly following T. pallidum infection, but the proportion of apoptotic cells in both groups of brain organoids was very low (Figure supplement 2) (N=12 organoids, each group from three independent bioreactors) …”

      Revision in the “Materials and methods” section, line 446-447:

      “…anti-cleaved caspase 3 (rabbit, 1:100, Cell Signaling Technology, 9661S),”

      Revision in the “Supplementary File” section, line 78-81:

      Author response image 1.

      The number of clCASP3+ cells in the microscopic field of brain organoids. A nonparametric t-test was used to evaluate the statistical differences between the two groups. (**: P < 0.01).

      (4) In Figure 1D authors show differences in rosette-like structure in the infected organoids. The representative images do not appear to be different in any of the discussed components (e.g., the sox2 signal looks fairly similar between the two conditions). No quantification of these structures was presented. Authors should provide quantification or a more representative image to support their statement. 

      Thank you for your valuable suggestion. I have quantified the neural rosette structure and compared the number of intact rosette-like structures between the two groups (See Figure 1D for a description in current manuscript).

      (5) The IHC images shown in Figures 3E, G, and Figure 4E look very similar between the two conditions despite the discussed decrease in the text. A more suitable representative image should be presented, or the analysis should be amended to reflect the observed results. 

      Thank you for your valuable suggestion. We have replaced the images in Figures 3E, 3G, and 4E of the manuscript with more representative ones.

      Reviewer #2 (Public Review):

      Summary:

      This study provides an important overview of infectious etiology for neurodevelopment delay.

      Strengths:

      Strong RNA evaluation.

      Weaknesses:

      The study lacks an overview of other infectious agents. The study should address the epigenetic contributors (PMID: 36507115) and the role of supplements in improving outcomes (PMID: 27705610). 

      Addressing the above - with references included - is recommended. 

      Thank you for your valuable comment. Our research was largely inspired by work on other infectious agents, such as Zika virus, which is discussed at length in the “Discussion” section of the manuscript to support our point of view (see pages 12–13). We were unable to retrieve the article with PMID: 36507115; could you please confirm the PMID? We would be very grateful if you could provide the full text. We have carefully read the article with PMID: 27705610, which is a rich and comprehensive review, and have summarised and cited it in the appropriate places in our manuscript.

      Revision in the “Discussion- limitation” section, line 375-379:

      “First, although several recent protocols have made use of growth factors to promote further neuronal maturation and survival (Lucke-Wold et al., 2018), the organoid culture scheme needs to be further improved owing to the lower percentage of mature neurons and the challenge of cell necrosis within the organoids at this stage in day 55 organoids.”

      Reviewer #3 (Public Review): 

      This article is the first report to study the effects of T. pallidum on the neural development of an iPSC-derived brain organoid model. The study indicates that T. pallidum inhibits the differentiation of subNPC1B neurons into hindbrain neurons, hence affecting brain organoid neurodevelopment. Additionally, the TCF3 and notch signaling pathways may be involved in the inhibition of the subNPC1B-hindbrain neuron differentiation axis. While the majority of the data in this study support the conclusions, there are still some questions that need to be addressed and data quality needs to be improved. The study provides valuable insights for future investigations into the mechanisms underlying congenital neurodevelopment disability. 

      I sincerely appreciate your comments on our paper. The comments have helped us greatly improve the quality of our paper. Thank you for your time and constructive critique.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Paired t-test analysis is not appropriate if two distinct groups are compared. 

      We sincerely apologize for the unclear presentation. We used a nonparametric t-test to compare the two groups. We have checked and corrected the description of the statistical method in the manuscript (see the revisions in the “Materials and methods” section (line 553-555) and the “Figures-legend” section (line 789-790, 817-818, 829-830) of the current manuscript).
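      For readers comparing two independent groups, a rank-based test is one common nonparametric option. The sketch below computes the Mann-Whitney U statistic by hand. It is an illustration only: the response does not state which nonparametric test was applied, so the choice of Mann-Whitney U, the function name `mann_whitney_u`, and the example values are all assumptions and not the authors' analysis.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    Hypothetical helper for illustration; tied values receive midranks.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted(
        [(value, "a") for value in group_a] + [(value, "b") for value in group_b]
    )

    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Ranks i+1 .. j (1-based) are shared; each tied value gets their mean.
        midrank = (i + 1 + j) / 2.0
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == "a")
        i = j

    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Made-up measurements for two unpaired groups (not data from the study).
control = [1.0, 2.0, 3.0]
infected = [4.0, 5.0, 6.0]
u_control, u_infected = mann_whitney_u(control, infected)
```

      Unlike a paired test, this statistic uses only the ranks of the pooled values, so the two groups need not have matched observations or equal sizes. A P value would then be obtained from the exact distribution of U or from a normal approximation, which this sketch leaves out.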

      Reviewer #3 (Recommendations For The Authors): 

      (1) Can the authors explain why the mean size of organoids infected with T. pallidum is smaller?

      Thank you for your valuable comment. In our study, T. pallidum infection resulted in organisational changes in the neural rosette-like structures, which resemble the proliferative regions of the human ventricular zone, and led to fewer and incomplete rosette-like structures. The ventricular zone is also the main area where neural progenitor cells (NPCs) reside (PMID: 33838105), and our results showed that the proportion of the NPC1 population was reduced after T. pallidum infection. NPC depletion in turn changes the size of the rosette-like structures. Therefore, the mean size of organoids infected with T. pallidum is smaller.

      Revision in the “Results” section, line 101-104:

      “T. pallidum infection resulted in brain organisational changes in neural rosette-like structures resembling the proliferative regions of the human ventricular zone where NPC reside (Krenn et al., 2021), and caused fewer and incomplete rosette-like structures (P < 0.01) (Figure 1D)”

      (2) Why were HOXA5, HOXC5, and HOXA4 selected as the target genes for qRT-PCR validation?

      Thank you for your valuable comment. The qRT-PCR experiment was chosen here to verify the results of the scRNA-seq analysis. HOX family genes are key factors controlling early hindbrain development. They are expressed in the hindbrain region during the gastrulation stage of early embryonic development, persist into the nerve cell stage, and are essential for the correct induction of hindbrain development and segmentation (PMID: 2571936, 1983472, 1673098, 15930115). Therefore, we selected HOX family genes for verification.

      PMID: 2571936. WILKINSON D G, et al. Segmental expression of Hox-2 homoeobox-containing genes in the developing mouse hindbrain.

      PMID: 1983472. FROHMAN M A, et al. Isolation of the mouse Hox-2.9 gene; analysis of embryonic expression suggests that positional information along the anterior-posterior axis is specified by mesoderm.

      PMID: 1673098. MURPHY P, et al. Expression of the mouse labial-like homeobox-containing genes, Hox 2.9 and Hox 1.6, during segmentation of the hindbrain.

      PMID: 15930115. MCNULTY C L, et al. Knockdown of the complete Hox paralogous group 1 leads to dramatic hindbrain and neural crest defects.

      (3) Why was qRT-PCR not employed in other experimental validations, but solely to validate early neural-specific transcription factor changes?

      Thank you for your valuable comment. The qRT-PCR experiment was used to validate the changes in early neural-specific transcription factors and thereby to confirm the reliability of the scRNA-seq data. The validated scRNA-seq data were then used to analyse differences in other neural-specific genes, as shown in the violin plots and heatmaps of differentially expressed genes (Figure 4D and Figure 5B, C). We also tested these findings with other experiments, such as immunohistochemistry and flow cytometric screening.

      (4) The authors found that T. pallidum might reduce the differentiation from subNPC1B to hindbrain neurons by inhibiting subNPC1B differentiation in brain organoids. Why were the subNPC1B-specific markers declining?

      Thank you for your valuable comment. scRNA-seq was performed on whole brain organoids, and the cell types within the organoids were clustered according to marker genes specific to each cell type. A decrease in the expression of the marker genes of a given cell group therefore indicates that the proportion of that cell group in the whole organoid is reduced. In organoids analysed after T. pallidum infection, uniform manifold approximation and projection (UMAP) and clustering of the NPC1 population demonstrated that T. pallidum reduced the size of the subNPC1B population. This explains the decrease in the subNPC1B-specific markers.
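      The link between cluster labels and cell proportions can be made concrete with a small sketch. It assumes each cell has already been assigned a cluster label; the function name `cluster_proportions`, the label "other", and the cell counts are hypothetical and are not taken from the study.

```python
from collections import Counter


def cluster_proportions(labels):
    """Return the fraction of cells assigned to each cluster label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


# Made-up per-cell cluster labels for two conditions (illustration only).
control_labels = ["subNPC1B"] * 3 + ["other"] * 7
infected_labels = ["subNPC1B"] * 1 + ["other"] * 9

control_props = cluster_proportions(control_labels)
infected_props = cluster_proportions(infected_labels)
```

      Comparing `control_props["subNPC1B"]` with `infected_props["subNPC1B"]` gives the change in the share of that cluster between conditions, which is the quantity the response ties to the drop in marker gene expression.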

      (5) In comparison to the other figures, Figure 5E letter size is excessively small and ambiguous.

      Thank you for your valuable comment. We have enlarged the lettering in Figure 5E.

      (6) Figure 5E shows that TCF3, more than one gene, is specifically enriched in subNPC1B of the T. pallidum group. It is best to confirm the impact of the other gene. 

      Thank you for raising this key issue, which we had not addressed properly in the previous version of the manuscript; we have added further analytical data. The SCENIC analysis found that the transcriptional activity of 52 genes changed significantly after T. pallidum infection. Furthermore, GO analyses demonstrated that 27 transcription factors were significantly enriched in four key pathways of neural differentiation and development. TCF3 is the only transcription factor present in all four terms simultaneously, which suggests that TCF3 may be the key transcription factor for the inhibition of subNPC1B-hindbrain neuron differentiation caused by T. pallidum.
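      The step of finding a transcription factor shared by all four GO terms amounts to a set intersection. The sketch below illustrates it under stated assumptions: the four term names follow the revised Results text quoted in this response, TCF3 is the only gene named in the source, and the other members (TF_A to TF_D) are placeholders rather than results from the analysis.

```python
def factors_in_all_terms(term_members):
    """Return the transcription factors that appear in every GO term."""
    member_sets = [set(members) for members in term_members.values()]
    if not member_sets:
        return set()
    return set.intersection(*member_sets)


# Placeholder memberships (TF_A to TF_D are hypothetical, not study results).
go_terms = {
    "nervous system development": ["TCF3", "TF_A", "TF_B"],
    "positive regulation of sequence-specific DNA-binding transcription factor activity": ["TCF3", "TF_C"],
    "positive regulation of neuronal differentiation": ["TCF3", "TF_A"],
    "DNA templated transcription regulation": ["TCF3", "TF_B", "TF_D"],
}

shared = factors_in_all_terms(go_terms)
```

      With these placeholder memberships, `shared` contains only TCF3, mirroring the reasoning in the response: a factor enriched in every relevant term is a candidate key regulator, although shared membership alone does not establish a causal role.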

      Revision in the “Results” section, line 261-273:

      “Next, the single-cell regulatory network inference and clustering (SCENIC) analysis for the subNPC1B subcluster was performed to assess the differences in the transcriptional activity of the transcription factors between the two groups and found that the transcriptional activity of 52 genes significantly changed after T. pallidum infection (Figure 5E). Furthermore, GO analyses demonstrated that 27 transcription factors were significantly enriched in key pathways of neural differentiation and development in response to nervous system development, positive regulation of sequence-specific DNA-binding transcription factor activity, positive regulation of neuronal differentiation, and DNA templated transcription regulation. Remarkably, transcription factor 3 (TCF3) is the sole transcription factor present in all four terms simultaneously (Figure 5F), speculating that TCF3 is the key transcription factor for the inhibition of subNPC1B-hindbrain neuron differentiation caused by T. pallidum.”

      Revision in the “Materials and methods” section, line 540-543:

      “The Sankey diagram was created using SankeyMATIC (https://sankeymatic.com/) (Zhang et al., 2023), which was used to characterize the interactions between differential transcription factors and neural differentiation and development.”

      Revision in the “Figure and Figure Legend” section, line 832, 842-844:

      Author response image 2.

      Sankey diagram showing the correspondence between differential transcription factors and neural differentiation and development.

      (7) Are there other experiments demonstrating that TCF3 is a key transcription factor for the inhibition of subNPC1B-hindbrain neuron differentiation caused by T. pallidum?

      Thank you for your valuable comment. In an earlier experiment, we attempted to isolate the subNPC1B subcluster by flow sorting in order to verify the relevant molecular mechanism. Because the subNPC1B subcluster makes up only a small proportion of the whole organoid, the sorted cells were in poor condition and too few for the experiment. However, we used the scRNA-seq data to further identify TCF3 as a key transcription factor in the inhibition of subNPC1B-hindbrain neuron differentiation induced by T. pallidum. The relevant results and a description of the analysis are detailed in the revised manuscript; please see our response to point (6) above.