195 Matching Annotations
  1. Last 7 days
    1. Reviewer #3 (Public Review):

      It is established that Kinase suppressor of Ras 1 (KSR1) contributes to the oncogenic actions of Ras by promoting ERK activation. However, the downstream actions of this pathway are poorly understood. Here Rao et al. demonstrate that this KSR1-dependent pathway increases translation of Epithelial-Stromal Interaction-1 (EPSTI1) mRNA and expression of EPSTI1 protein. This is significant because EPSTI1 drives aspects of EMT, including expression of ZEB1, SLUG, and N-Cadherin. The analysis is thorough and includes both loss-of-function and gain-of-function studies. Overall, the conclusions of this study are convincing and advance our understanding of cancer development.

    2. Reviewer #2 (Public Review):

      KSR1 functions as a critical rheostat to fine-tune MAPK signalling, and identifying modes by which its over-expression promotes tumor progression is clinically important and potentially druggable. Ras is highly mutated in CRC and unfortunately inhibitors of Ras have been challenging to develop. However, small molecules that stabilize an inactive form of KSR are actively being developed in an attempt to repress Ras signaling. Thus, this study, which seeks to identify how KSR1 promotes oncogenic mRNA translation, is potentially highly clinically relevant, as it may identify novel druggable targets.

      In this manuscript the authors performed polysome profiling in colorectal cancer (CRC) cells and proposed that KSR1 and ERK regulate the translation of EPSTI1 mRNA. They go on to characterize the phenotypes associated with knock-down or knock-out of KSR1 in CRC, and show that their defects in invasion, anchorage-independent growth and switch to a less EMT-like phenotype are all EPSTI1-dependent.

      The authors succeeded in providing ample in vitro data that KSR1 and EPSTI1 are potential therapeutic targets in CRC. However, the data demonstrating that KSR1 and ERK regulate EPSTI1 mRNA translation are tenuous. Although the authors state that "EPSTI1 is necessary and sufficient for EMT in CRC cells", the data presented are consistent with a more restrained conclusion of a partial EMT and not EMT per se. Finally, without an in vivo model it is difficult to glean novel insight into the mechanism by which KSR1 and/or EPSTI1 control the invasive and metastatic behaviour of cells.

    3. Reviewer #1 (Public Review):

      In this manuscript Rao et al. describe an interesting relationship between KSR1 and the translational regulation of EPSTI1 (a regulator of EMT). They identified this relationship by polysome RNA-seq of CRC cells in the context of KSR1 knockdown (KD), which they confirm by polysome qPCR. They then go on to show that KSR1 KD and add-back influences EPSTI1 expression at the protein but not the mRNA level and impacts cell viability, anchorage-independent growth, and possibly cell migration. They focus on the cell migration phenotype and show that it is associated with changes in EMT-related genes including E-cad and N-cad. Interestingly, add-back of EPSTI1 can reverse the phenotype elicited by KSR1 deletion. Overall, this story is interesting and translational regulation by KSR1 has not been described previously. However, Rao et al. do not provide a mechanism for how KSR1 regulates the translation of EPSTI1, and it is unclear whether this occurs through eIF4E, as the authors suggest.

    4. Evaluation Summary:

      This paper demonstrates the involvement of Kinase Suppressor of Ras 1, a protein that acts as a scaffold in the mitogen activated protein kinase (MAPK) signaling cascade, in translational control of epithelial-to-mesenchymal transition. The analysis is thorough and includes both loss-of-function and gain-of-function studies. This study advances our understanding of cancer development.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #3 agreed to share their name with the authors.)

    1. Reviewer #3 (Public Review):

      The authors studied preclinical models of human small cell lung cancer (SCLC) using characterized SCLC cell lines manipulated to conditionally express mutant EGFR (L858R) or KRAS (G12V) alleles, and then assessed their morphology in cell culture, expression of neuroendocrine (NE) differentiation markers and transcription factors, and main signaling pathways such as the MAPK pathway. They focus on this because activation of ERK and the MAPK pathway is seen in nearly all non-small cell lung cancers (NSCLCs), including those with EGFR or KRAS mutations, but mutations in these driver oncogenes or an active ERK/MAPK pathway are essentially never found in SCLCs. In addition, chromatin modifications are assessed after these manipulations, and functional genomic targeting and pharmacologic inhibition of various components of the MAPK pathway are tested for their effect on NE expression. Because of the known clinical phenomenon of transformation to SCLC-like tumors by lung adenocarcinomas (LUADs) with EGFR mutations that become resistant to EGFR tyrosine kinase inhibitors, findings from the SCLC studies were applied to try to experimentally generate such LUAD-to-SCLC transformation. Overall, they found that activation of the ERK/MAPK pathway by oncogenic mutations led to loss of NE differentiation and that the "ERK-CBP/p300-ETS axis promotes a lineage shift between neuroendocrine and non-neuroendocrine lung cancer phenotypes". They conclude: "In summary, we provide the first reported biological rationale for why alterations in MAPK pathway are rarely found in SCLC and describe the molecular underpinnings of how the central node in this pathway, ERK2, suppresses the NE differentiation program."

      The authors' conclusions and claims are justified by the experiments and data they present, and they provide a mechanistic basis for what happens with MAPK/ERK activation in SCLC and for why one does not find MAPK/ERK activation, or related oncogenic driver mutations such as mutant KRAS or EGFR, in SCLC.

    2. Reviewer #2 (Public Review):

      Cell fate transitions (such as adenocarcinoma converting to small cell neuroendocrine fate) are increasingly observed during therapeutic resistance in lung cancer, prostate cancer, and possibly other cancer types. It is important to understand these mechanisms if we ultimately seek to tailor treatment to a patient's disease and/or to control the pathways that lead to treatment resistance. However, the mechanisms that underlie these cell fate changes are not well understood. It has been previously observed (Calbo et al, Cancer Cell, 2011) that activated mutant Kras (commonly associated with adenocarcinoma fate) can promote a non-neuroendocrine fate in SCLC, but the mechanisms are unknown.

      Predominantly using three human small cell lung cancer (SCLC) cell lines, Inoue and colleagues use genetic and pharmacological approaches to focus on potential mechanisms by which Egfr/Kras/Mapk signaling can repress neuroendocrine fate. They make a number of interesting observations that extend our understanding of neuroendocrine cell fate regulation including:

      1) Kras-induced NE suppression appears to depend mostly on ERK2, and not ERK1 or PI3K signaling.

      2) Kras activation induces chromatin changes including increased H3K27Ac in 2/3 cell lines; increased H3K27Ac in response to HDAC inhibition is associated with NE suppression. Pharmacological inhibition of CBP/p300 (a HAT that promotes H3K27Ac) reduces H3K27Ac and restores NE fate. Altogether, these findings are consistent with the notion that SCLC cannot tolerate high levels of H3K27Ac.

      3) Kras induces the MSK/RSK pathway consistently across cell lines, but this appears to be functionally relevant to NE fate only in H82 cells.

      4) Kras activation induces chromatin occupancy at ERG and ETS family transcription factor (Etv1, 4, 5) binding sites in 2/3 cell lines, and induces ETV4 (2/3 lines) and ETV5 (3/3 lines) protein levels. ETV1 and ETV5 overexpression are sufficient to inhibit NE fate markers in a context-dependent manner. ETS family induction appears to occur in a CIC-independent manner.

      In addition, some interesting negative data are presented. For example, SOX9 is induced upon Kras activation in 3/3 cell lines but was not functionally relevant for NE suppression. Notch1, Notch2, and HES1 (known NE fate suppressors) are induced by Kras activation in a cell context-specific manner, but they did not appear functionally relevant to NE suppression based on HES1 knockout and a pharmacological inhibitor of Notch signaling. Rb1 loss was not sufficient to promote NE fate in EGFR/p53 mutant cell lines, despite its known association with adeno-to-SCLC conversion. Overall, the conclusions in the manuscript are well justified. These findings will be of particular interest to those in lung and prostate cancer studying cell fate conversions in the context of EGFR and AR inhibitor resistance, respectively. These observations will be built upon by these fields.


      1) One recurring issue in the manuscript is that the observations are often not consistent across the three cell lines and are context-specific effects, and the potential reasons could be explained better. The cell lines chosen unfortunately (but interestingly) represent some of the major cell states of SCLC. H2107 represents the ASCL1+ NE-high subset of SCLC (and has some MYCL). H82 and H524 represent the C-Myc (MYC)-high subset of SCLC, with H82 having a MYC amplification, and both representing the NEUROD1 subtype (which tends to be associated with more MYC). Assessment of NE score using a common approach in the field (Zhang et al, TLCR) shows that H82 cells are already considerably NE-low, with H524 as NE-intermediate/high, and H2107 as NE-high. So, this may be related to why H82 seemed to be the most permissive cell line to change NE fate in multiple assays.

      In addition, H2107 and H524 appear to have EP300 mutations, which may contribute to their NE-high nature and to the refractory response to A485 treatment based on the authors' model. It is known that MYCL- and MYC-driven cell lines differ in numerous respects, including transcriptional signatures, super-enhancer usage, metabolic regulation, and therapeutic response. This information could be mentioned in the results and discussed when mentioned as a factor near line 540.

      2) Related to Figure 4, the authors show that p300 pharmacological inhibition can restore NE fate in the presence of Kras. Given that drugs can have off-target effects, it would be helpful to know if genetic knockdown/knockout of p300 phenocopies these effects. Given that CREBBP (CBP) or EP300 (p300) mutations are common in SCLC, it is also relevant whether any of these cell lines carry such mutations. It appears H2107 and H524 may have EP300 mutations, and it would be good to know whether the authors have tried to restore EP300 function.

    3. Reviewer #1 (Public Review):

      The paper investigates the mechanism of lineage switch in lung cancer. In about 10-15% of lung cancers treated with inhibitors of oncogenic drivers such as EGFR or KRAS, cancer cells emerge over time with newly acquired features of neuroendocrine differentiation. The authors propose that this is a direct result of inhibition of MAPK pathway signaling, so that reduced MAPK activity activates previously silent genes regulating neural crest differentiation. While this theory is of interest, the experiments presented herein are built on the opposite sequence: activated MAPK is introduced via oncogenic KRAS or EGFR into 3 neuroendocrine cell lines, resulting in lower expression of neuronal transcription factors. The authors propose that MAPK-activated ETS family TFs are responsible for the repression of NE lineage.

      Several principal issues presented by the authors raise some concerns:

      1) Despite presenting some evidence to the effect of suppression of NE transcription factors by overactivating MAPK signaling, the conversion of adenocarcinoma to NE (the opposite transition) is not being addressed in the paper. Therefore, it is rather illogical to investigate the process of transition that is not taking place in the real world.

      2) The authors do not consider the possibility of multi-clonality of human cancers and clonal competition as a mechanism leading to acquired resistance and the emergence of NE clones that are not suppressible by inhibitors of the MAPK pathway (e.g. EGFR inhibitors, or KRAS/RAF/MEK inhibitors). Starting the experiments with clonal populations of long-term cultured cell lines may pose an insurmountable difficulty for switching these cells between the epithelial and NE phenotypes, which proved to be frustratingly non-productive in the hands of the authors. These phenotypic transitions may be co-regulated by a combination of cell-intrinsic and extrinsic factors that are lost outside the context of the tumor microenvironment.

      3) Despite zeroing in on ETVs downstream of ERK1/2, the paper does not go as far as showing the direct effect of these TFs as repressors of NE differentiation (ASCL1, BRN2, NEUROD1 etc.).

      4) The line of evidence that Dox-activated MAPK signaling via massive overexpression of KRAS or EGFR induces a dramatic increase in marks of transcriptionally active chromatin (such as H3K27ac and others) is to be expected in this entirely artificial system. Indeed, the addition of doxycycline results in a massive burst of proliferation and overexpression of ETV1 and ETV4, the canonical MAPK targets. Again, this switch appears unrelated to the opposite process of epithelial-to-NE de-differentiation.

    4. Evaluation Summary:

      This manuscript will be of interest to cancer biologists studying cell fate transitions, particularly adenocarcinoma-to-small cell transitions that occur in prostate and lung cancer, which is a timely topic. While there is not a single linear mechanism identified that fully explains Kras-induced neuroendocrine cell fate suppression in all contexts, multiple new findings will likely be built upon by the field. Overall, the data are properly controlled and the key claims are supported.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. All reviewers agreed to share their names with the authors.)

    1. Reviewer #3 (Public Review):

      This work investigates the biochemical and cellular mechanism of neuronal damage, and the advances it makes are to be appreciated. The strength of this work on SARM1 is its success in establishing that a concentration-dependent phase change activates the enzyme to degrade NAD, an essential component of neuronal integrity. Cellular significance is demonstrated in C. elegans by neuronal damage triggered by citrate. One weakness is that high citrate is required for SARM1 effects in vitro, but low citrate is used in the C. elegans model without establishing concentration dependence in that system. Another is that the progression from enzyme activation to neuronal damage in C. elegans is missing quantitation of the change in NAD. A further strength of the work is that it provides a solid stepping-stone for the next steps in cementing the biochemical pathways that initiate cellular damage to neurons.

    2. Reviewer #2 (Public Review):

      The latest manuscript of Loring and coworkers solves a number of important problems of SARM1 structure and function at once, namely why the purified enzyme has little activity, what size is the active multimer, whether it produces cADPR on the way to ADPR, and how this enzyme may overcome autoinhibition by NAD+ in vivo. In work that is technically sound, the authors describe a phase transition that can be induced by macroviscogens and by citrate in which we are able to see cryoEM images of activated multimers and the induction of SARM1 activity in worms by citrate. Working with concentrated enzyme, the authors are further able to characterize SARM1 activity in detail and clearly show which cations are most inhibitory and that ADPR and not cADPR is the primary product of the reaction.

      There is clearly a lot of regulation in the system, with NAD+ inhibiting and NMN activating this enzyme, and with NMNAT, which controls conversion of NMN to NAD+, being localized to the outer Golgi membrane. Golgi and mitochondria are both moved along axons in processes that are totally dependent on cellular energetics. Given the broad contributions that are made by this work, I would not mind if the authors considered whether citrate, either from stressed mitochondria or from inhibition of the cytosolic enzyme ATP-citrate lyase, might be produced at high enough concentration to push SARM1 into the phase transition described herein.

    3. Reviewer #1 (Public Review):

      SARM1 is an enzyme that is present in neurons and degrades NAD+. Previous studies have shown that disrupting SARM1 inhibits axon degeneration and thus it could be a target for treating neurodegenerative diseases. NAD+ is also an important metabolite that is required for many biological pathways. Thus, SARM1 activity must be carefully regulated. Recent studies have provided structural and biochemical insights into how SARM1 activity is auto-inhibited in basal states. The manuscript by Dr. Thompson and coworkers provides a nice new model of how SARM1 could potentially be activated. They provide strong in vitro data to support that phase transition, promoted by PEG molecules and citrate, could dramatically increase the activity of the SARM1 TIR domain (which is the catalytic domain) in vitro. The authors also showed that in the worm, C. elegans, citrate promotes SARM1 puncta formation and axon degeneration, which is consistent with the in vitro data. They also generated multiple mutants of the SARM1 TIR domain and showed that many of the mutants have decreased phase transition and decreased activity in vitro. One mutant, G601P, also showed decreased puncta formation when expressed in HEK 293T cells as a GFP fusion of the SARM1 SAM-TIR domains carrying the E462A mutation (a catalytic mutation, so that expression does not cause toxicity).

      The manuscript has many strengths, including the strong and very careful in vitro characterization of the purified SARM1 TIR domain, which provides a lot of useful information regarding the kinetic parameters, substrate specificity, and inhibition profiles. The worm data with citrate are consistent with the in vitro data, which is also a strength.

      The impact of the finding lies in two aspects. First, it provides a new understanding of how SARM1 activity might be regulated in vivo by phase transition. This is especially true given that most studies so far focus on how it is inhibited under basal conditions. It also adds another example to the list of enzymes that are regulated by phase separation. Second, the finding that PEG and citrate strongly activate SARM1 in vitro also provides a much improved assay for the development of small molecule modulators of SARM1 for potential therapeutic applications.

      There are two minor weaknesses associated with the studies of the manuscript. One is that all the in vitro studies used just the TIR domain of SARM1, not the full length SARM1. Another minor weakness is associated with the data in Figure 5. Most of the mutants have dramatically lower catalytic activities (>100-fold), but the precipitate formation is only modestly affected (2-fold). Although this does not affect the overall conclusion of the manuscript, it prevents the mutants from being more useful for mechanistic dissection.

    4. Evaluation Summary:

      This manuscript describes an interesting regulatory mechanism that activates SARM1, an enzyme that degrades NAD+ and promotes axon degeneration. Previous structural and biochemical studies mostly focus on how SARM1 is auto-inhibited under basal conditions, and this manuscript provides evidence that phase transition could promote its activity, thus providing new understanding of its regulatory mechanism. The finding also enables in vitro assays to be carried out more easily and thus could facilitate the development of small molecule modulators of SARM1 for therapeutic purposes.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and reviewer #2 agreed to share their names with the authors.)

    1. Reviewer #3 (Public Review):

      The authors set out to determine the role of interleukin (IL)-33 in the host immune response to the parasite Toxoplasma gondii. They achieve this using a mouse model of infection and a range of genetically modified mice to systematically prove the pathway involved.

      A major strength of the study is the use of strategic immune cell/factor-deficient mice in combination with recombinant proteins in vivo. This may be further strengthened by future studies that test the impact of inhibitory antibodies against the primary factor of interest, IL-33. This would allow for a loss- and gain-of-function approach, supporting the existing in vivo data with recombinant mouse IL-33.

      Overall, the approach taken achieves the goal of the study. The manuscript is well written and systematically addresses the steps in the pathway that are required to mount an effective IL-33-mediated immune response to T. gondii.

      The likely impact of this work is new knowledge of the function of IL-33 in response to infection and of the interaction between different components of the immune system to achieve a balanced, context-dependent response. The study does not highlight new methods or technical advances, but does provide important new information on immune responses to infection.

    2. Reviewer #2 (Public Review):

      The authors eloquently showed that IL-33 was produced from stromal cells following Toxoplasma infection and that the absence of IL-33 signaling resulted in increased parasitemia. In agreement with this observation, they found that exogenous IL-33 reduced parasite load and increased the recruitment of inflammatory monocytes that are critical for resistance. The manuscript is well written and the data presented here support the major findings of this work.

    3. Reviewer #1 (Public Review):

      In initial experiments, low levels of IL-33 were detected in Toxoplasma-infected mice. How do these levels compare with normal physiological levels? It would help the reader to understand the relative levels to expect.

      The authors identify that most IL-33 is produced by stromal cells rather than hematopoietic cells. The frequency of tdTomato parasites appears to be much lower than that of IL-33-producing cells. Does the parasite expression reflect 100% of parasites, or is the number of IL-33-producing stromal cells stimulated in the infection much larger than the identifiable parasite number? That is, is the activation of the stromal cells a direct effect of the Toxoplasma infection or does it depend on intermediates to amplify the effect?

      Although the data presented are interesting and the authors identify that both stromal cells and hematopoietic cells contribute to the protective effect of IL-33, it is somewhat unclear which of the hematopoietic cells categorized as 'innate' are really driving the response. Within the hematopoietic compartment, a number of associations are delineated but the causal connections are less clear. The provision of exogenous cytokines indeed has the effect shown in the results, but it remains unclear to this reviewer whether these effects act directly on the hematopoietic cells or on stromal cells, which alone are not sufficient to contain the infection and thus develop a higher pathogen load, confounding their contributions.

      This work would be strengthened significantly by delineating more clearly the contributions of each compartment. Currently, the correlations are modelled on the responses in the omentum and it would be useful to understand if this reflects the broader response.

      This work would benefit from a schematic to indicate how the authors believe the different cells are connected and which are the real drivers/where connections have been demonstrated in driving the immune response.

    4. Evaluation Summary:

      This study sheds new light on the function of an immune system protein termed interleukin (IL)-33 in response to parasite infection. The study provides information on alternative functions of this immune protein and details the path taken to achieve a beneficial immune response. This study is of interest to immunologists who deal with the host response to infection, particularly to parasites. Immunotherapies that enhance or inhibit IL-33 are in development. Understanding the role of this immune factor in a broad range of infections is important when considering future treatments that target this pathway.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #3 agreed to share their name with the authors.)

    1. Reviewer #2 (Public Review):

      Overall this is a solid and technically sound manuscript, and I have only two relatively minor suggestions for improvement:

      1) Tetramer versus dimer

      The particles that were analyzed by cryoEM were composed of four THO-Sub2 protomers, yet the authors argue that a dimer is the functional unit. Why? The tetramer versus dimer organization needs to be better discussed, also in light of the observation that the human complex can also form a tetramer.

      2) Sub2 activation mechanism

      The authors should more carefully discuss how THO 'activates' Sub2 (and how the 'semi-open state' leads to activation) and indicate the RNA binding surface of Sub2 in their model.

    2. Reviewer #1 (Public Review):

      1) I found the initial description of the overall structure confusing. At first the authors say the complex is a tetramer, which is not what was seen by the Conti lab and then follow that with a confusing discussion leading to the conclusion that the dimer with a rigid subunit and a flexible one is the functional unit. It rather feels like they arrive at this conclusion because that's what Conti's lab saw, rather than any other reason. Since the human complex is a tetramer, perhaps the tetrameric complex observed here is one possible form and that possibility should be considered more carefully. Please state whether there is any similarity in the arrangements between the human tetramer and the tetramer observed here. I found the figure 2 supp 1C was not easy to follow. Coloring each of the four protomers differently would make things clearer.

      2) The authors previously determined the structure of the Yra1 C domain bound to Sub2, and several labs have shown that this interaction activates Sub2 ATPase activity. Are the interactions observed previously between Yra1 and Sub2 compatible with this new structure? If so, perhaps the authors could provide a model showing how Yra1 fits into this larger complex. Also, could the Yra1 C domain and Gbp2 bind simultaneously to a single THO-Sub2 protomer, or would one protomer bind Yra1 and perhaps another bind Gbp2? This is worth considering because it would strengthen the concept that TREX acts as a general platform engaging with multiple export factors to drive recruitment of multiple Mex67 molecules and eventual export of the Mex67:mRNP complex. In the human system, the SR proteins and Alyref have an overlapping binding site on Nxf1, suggesting they may not act together to recruit a single Nxf1, but rather recruit different Nxf1 molecules, perhaps to the same mRNP via a single multimeric THO platform.

    3. Evaluation Summary:

      This is an interesting paper describing the structure of the yeast THO:Sub2 complex and how it interacts with the SR-like protein Gbp2. The paper extends what we have learned from two recently published THO:Sub2 complex structures by the Conti and Plaschka groups in two ways. Firstly, it shows how Gbp2 interacts with the THO complex. Secondly, it reveals a substantially different orientation between THO:Sub2 protomers compared with the earlier structure, and so provides more information on the flexibility and range of movements that the two protomers might make with respect to each other. The structural inferences are supported by some biochemical experiments but mechanistically the work has limitations, similar to other recent cryo-EM structures of this complex. However, this is an important structure of wide interest to people working on gene expression in eukaryotes and it undoubtedly advances the field.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers opted to remain anonymous to the authors.)

    1. Reviewer #3 (Public Review):

      This paper examines the role of neutrophils, inflammatory immune cells, in disease caused by genital herpes virus infection. The experiments describe a role for type I interferon stimulation of neutrophils later in the infection that drives inflammation. Blockade of interferon, and to a lesser degree, IL-18 ameliorated disease. This study should be of interest to immunologists and virologists.

      This study sought to examine the role of neutrophils in pathology during mucosal HSV-2 infection in a mouse model. The data presented in this manuscript suggest that late or sustained IFN-I signals act on neutrophils to drive inflammation and pathology in genital herpes infection. The authors show that while depletion of neutrophils from mice does not impact viral clearance or recruitment of other immune cells to the infected tissue, it does reduce inflammation in the mucosa and genital skin. Single cell sequencing of immune cells from the infected mucosa revealed increased expression of interferon stimulated genes (ISGs) in neutrophils and myeloid cells in HSV-2 infected mice. Treatment with anti-IFNAR antibodies or neutrophil-specific conditional knockout of IFNAR1 decreased disease and IL-18 levels. Blocking IL-18 also reduced disease, although these data show that other signals are likely to also be involved. It is interesting that viral titers and anti-viral immune responses were unaffected by IFNAR or IL-18 blockade when this treatment was started 3-4 days after infection, because data shown here (for IFN-I) and by others in published studies (for IFN-I or IL-18) have shown that loss of IFN-I or IL-18 prior to infection is detrimental.

      These data are interesting and show pathways (namely IFN-I and IL-18) that could be blocked to limit disease. While this suggests that IL-18 blockade might be an effective treatment for genital inflammation caused by HSV-2 infection, the utility of IL-18 blockade is still unclear, because the magnitude of the effect in this mouse model was less than IFNAR blockade. Additionally, further experiments, such as conditional loss of IL-18 in neutrophils, would be required to better define the role and source(s) of IL-18 that drive disease in this model.

    2. Reviewer #2 (Public Review):

      This manuscript will be of interest to a broad audience of immunologists especially those studying host-pathogen interactions, mucosal immunology, innate immunity and interferons. The study reveals a novel role for neutrophils in the regulation of pathological inflammation during viral infection of the genital mucosa. The main conclusions are well supported by a combination of precise technical approaches including neutrophil-specific gene targeting and antibody-mediated inhibition of selected pathways.

      In this study by Lebratti et al., the authors examined the impact of neutrophil depletion on disease progression, inflammation and viral control during a genital infection with HSV-2. They find that removal of neutrophils prior to HSV-2 infection resulted in ameliorated disease as assessed by inflammatory score measurements. Importantly, they show that neutrophil depletion had no significant impact on viral burden, nor did it affect the recruitment of other immune cells, thus suggesting that the observed improvement in inflammation was a direct effect of neutrophils. The role of neutrophils in promoting inflammation appears to be specific to HSV-2, since the authors show that HSV-1 infection resulted in comparable numbers of neutrophils being recruited to the vagina yet HSV-1 infection was less inflammatory. This observation thus suggests that there might be functional differences in neutrophils in the context of HSV-2 versus HSV-1 infection that could underlie the distinct inflammatory outcomes observed in each infection. In order to uncover potential mechanisms by which neutrophils affect inflammation, the authors examined the contributions of classical neutrophil effector functions such as NETosis (by studying neutrophil-specific PAD4 deficient mice), reactive oxygen species (using mice with a global defect in NADH oxidase function) and cytokine/phagocytosis (by studying neutrophil-specific STIM-1/STIM-2 deficient mice). The data shown convincingly ruled out a contribution by the neutrophil factors examined. The authors thus performed an unbiased single cell transcriptomic analysis of vaginal tissue during HSV-1 and HSV-2 infection in search of potentially novel factors that differentially regulate inflammation in these two infections.
      tSNE analysis of the data revealed the presence of three distinct clusters of neutrophils in vaginal tissue in mock-infected mice; the same three clusters remained after HSV-1 infection, but in response to HSV-2 only two of the clusters remained and showed a sustained interferon signature primarily driven by type I interferons (IFNs). In order to directly interrogate the impact of type I IFN on the regulation of inflammation, the authors blocked type I IFN signaling (using anti-IFNAR antibodies) at early or late times after infection and showed that late (day 4) IFN signaling was promoting inflammation while early (before infection) IFN was required for antiviral defense, as expected. Importantly, the authors examined the impact of neutrophil-intrinsic IFN signaling on HSV-2 infection using neutrophil-specific IFNAR1 knockout mice (IFNAR1 CKO). The genetic ablation of IFNAR1 on neutrophils resulted in reduced inflammation in response to HSV-2 infection but no impact on viral titers, findings that are consistent with observations shown for neutrophil-depleted mice. The use of IFNAR1 CKO mice strongly supports the importance of type I IFN signaling on neutrophils as a direct regulator of neutrophil inflammatory activity in this model. Since type I IFNs induce the expression of multiple genes that could affect neutrophils and inflammation in various ways, the authors set out to identify specific downstream effectors responsible for the observed inflammatory phenotype. This search led them to IL-18 as a possible mediator. They showed that IL-18 levels in the vagina during HSV-2 infection were reduced in neutrophil-depleted mice, in mice with "late" IFNAR blockade and in IFNAR1 CKO mice. Furthermore, they showed that antibody-mediated neutralization of IL-18 ameliorated the inflammatory response of HSV-2 infected mice, albeit to a lesser extent than what was seen in IFNAR1 CKO mice.
      Altogether, the study presents intriguing data to support a new role for neutrophils as regulators of inflammation during viral infection via an IFN-IL-18 axis.

      In aggregate, the data shown support the authors' main conclusions, but some of the technical approaches need clarification and, in some cases, further validation that they are working as intended.

      1) The use of anti-Ly6G antibodies (clone 1A8) to target neutrophil depletion in mice has been shown to be more specific than anti-Gr1 antibodies (which target both monocytes and neutrophils); thus, anti-Ly6G antibodies are a good technical choice for the study. Neutrophils are notoriously difficult to deplete efficiently in vivo due at least in part to their rapid regeneration in the bone marrow. In order to sustain depletion, previous reports indicate the need for daily injection of antibodies. In the current study the authors report the use of only one intra-peritoneal injection (500 mg) of 1A8 antibodies and that this single treatment resulted in diminished neutrophil numbers in the vagina at day 5 after viral infection (Fig 1A). Data shown in figure 2B suggest that there are neutrophils present in the vagina of uninfected mice, that there is a significant increase in their numbers at day 2 and that their numbers remain fairly steady from days 2 to 5 after infection. In order to better understand the impact of antibody-mediated depletion in this model, the authors should have examined the kinetics of depletion from day 0 through 5 in the vaginal tissue after 1A8 injection as compared to the effect of antibodies in the periphery. These additional data sets would allow for a deeper understanding of neutrophil responses in the vagina as compared to what has been published in other models of infection at other mucosal sites.

      2) The authors used antibody-mediated blockade as a means to interrogate the impact of type I IFNs and IL-18 in their model. The kinetics of IFNAR blockade were nicely explained and supported by data shown in supplementary figure 4. IFNAR blockade was done by intra-peritoneal delivery of antibodies at one day before infection or at day 4 after infection. When testing the role of IL-18 the authors delivered the blocking antibody intra-vaginally at 3 days post infection. The authors do not provide a rationale for changing the delivery method and timing of antibody administration to target IL-18 relative to IFNAR signaling. Since the model presented argues for an upstream role for IFNAR as an inducer of IL-18, it is unclear why the time point used to target IL-18 is earlier than the one used for IFNAR.

      3) An open question that remains is the potential mechanism by which IL-18 is acting as an effector cytokine of epithelial damage. As acknowledged by the authors, the rescue seen in IFNAR1 CKO mice (Fig 5C) is more dramatic than that achieved by targeting IL-18 (Fig 6D). It is thus very likely that IFNAR signaling on neutrophils is affecting other pathways. It would have been greatly insightful to perform a single cell RNA seq experiment with IFNAR CKO mice as done for WT mice in Fig 3. Such an analysis might have provided a more thorough understanding of neutrophil-mediated inflammatory pathways that operate outside of classical neutrophil functions.

      4) The inflammatory score scale used is nicely described in the methods and it took into consideration external signs of vaginal inflammation by visual observation. It would have been helpful to mention whether the inflammation scoring was done by individuals blinded to the experimental groups.

      5) The presence of distinct clusters of neutrophils in the scRNA-seq data analysis is a fascinating observation that might suggest more diversity in neutrophils than what is currently appreciated. In this study, the authors do not provide a list of the genes expressed in each cluster within the data shown in the paper. Although the entire data set is deposited and publicly available, having the gene lists within the paper would have been helpful to provide a deeper understanding of the current study.

    3. Reviewer #1 (Public Review):

      Overall this is a well-done study, but some additional controls and experiments are required, as discussed below. The authors have done a considerable amount of work, resulting in quite a lot of negative data, and so should be commended for their persistence in eventually identifying the link between neutrophils and IL-18, through type I IFN signaling.

      Major Comments:

      • A major conclusion of this manuscript is prolonged type I IFN production following vaginal HSV-2 infection, but the data presented herein did not actually demonstrate this. At 2 days post infection, IFN beta was higher (although not significantly) in HSV-2 infection, but much higher in HSV-1 infection compared to uninfected controls. At 5 days post infection the authors show mRNA data, but not protein data. If the authors are relying on prolonged type I IFN production, then they should demonstrate increased IFN beta during HSV-2 infection at multiple days after infection including 5dpi and 7dpi.

      • Does the CNS viral load or kinetics of viral entry into the CNS differ in mice depleted of neutrophils, IFNAR cKO mice, or mice treated with anti-IL-18? Do neutrophils and/or IL-18 participate at all in neuronal protection from infection?

      • In Figure 3 the authors show that neutrophil "infection" clusters 2 and 5 express high levels of ISGs. Only 4 of these ISGs are shown in the accompanying figures. Please list which ISGs were increased in neutrophils after both HSV-2 and HSV-1 infection, perhaps in a table. Were there any ISGs specifically higher after HSV-2 infection alone, any after HSV-1 infection alone?

      • The authors claim that HSV-1 infection recruits non-pathogenic neutrophils compared to the pathogenic neutrophils recruited during HSV-2 infection. Can the authors please discuss if these differences in inflammation or transcriptional differences between the neutrophils in these two different infections could be due to differences in host response to these two viruses rather than differences in inflammation? Please elaborate on why HSV-1 was used as opposed to a less inflammatory strain of HSV-2. Furthermore, does HSV-1 infection induce vaginal IL-18 production in a neutrophil-dependent fashion as well?

    4. Evaluation Summary:

      This manuscript will be of interest to a broad audience of immunologists especially those studying host-pathogen interactions, mucosal immunology, innate immunity and interferons. The study reveals a novel role for neutrophils in the regulation of pathological inflammation during viral infection of the genital mucosa. The main conclusions are well supported by a combination of precise technical approaches including neutrophil-specific gene targeting and antibody-mediated inhibition of selected pathways.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

  2. Feb 2021
    1. Reviewer #3 (Public Review):

      In this manuscript, Filipowicz and Aballay present a nice story that characterizes a new learned behavioral phenotype prompted by intestinal distention during infection with the bacterial pathogen E. faecalis. The authors show that distention of the anterior portion of the intestine by E. faecalis induces an aversive behavioral response. Importantly, the authors show that this aversive learning response is controlled by multiple sets of neurons, including some that express the GPCR NPR-1 and others that express the ion channels TAX-2/4. The authors nicely showed that TAX-2 expression in ASE neurons, but not in other chemosensory neurons, was sufficient for pathogen avoidance. Next the authors examined the mechanism of aversive learning following ingestion of E. faecalis, showing that AWB and AWC neurons are required. Finally, the authors show that two proteins that could be mechanoreceptors in the intestine (GON-2 and GTL-2) are required for pathogen avoidance. Together these data characterize important mechanisms of pathogen avoidance and an aversive learning response.

      I have one issue for the authors to consider. The title of the manuscript emphasizes the role of TRPM channels in mediating the learned pathogen avoidance response. Demonstrating that the site of action of the TRPM channels is the intestine could further strengthen this exciting finding.

    2. Reviewer #2 (Public Review):

      Work in the nematode C. elegans has shown that these worms learn to avoid pathogens like Pseudomonas aeruginosa after consumption and infection over a period of 12 or more hours. Here, the authors confirm and expand upon earlier observations that, in contrast to P. aeruginosa, avoidance of Gram-positive pathogens such as E. faecalis, E. faecium and S. aureus occurs rapidly, on a timescale as short as several minutes. Consistent with this more rapid response, they present evidence that behavioral avoidance occurs via distinct molecular, neuronal and phenotypic mechanisms from those of P. aeruginosa.

      The first major finding that the authors describe is that behavioral avoidance of E. faecalis occurs as a consequence of rapid intestinal distension and not through immune responses or other pathways. They show that anterior intestinal distension occurs rapidly, as early as 1 hr, which is a striking finding and is consistent with rapid behavioral effects. They show that neither E. faecalis bacterial RNA nor bacterial virulence is necessary for behavioral avoidance and that immune response genes are induced only after distension. These data are consistent with a model in which intestinal distension underlies behavioral avoidance, but this assertion could be strengthened by showing that bloating is necessary for behavioral avoidance, that it occurs prior to observable behavioral avoidance, and by more definitively ruling out a role for immune responses.

      Next, the authors show that behavioral avoidance in laboratory conditions requires intact neuropeptide signaling via the npr-1 receptor, because worms tend to avoid the high oxygen conditions that typically exist outside of bacterial lawns in the lab. At lower oxygen concentrations, npr-1 is dispensable for avoidance. This is consistent with previous work implicating this neuropeptide pathway in lawn avoidance and is convincingly demonstrated.

      The second major finding presented in this manuscript is that rapid behavioral avoidance of Gram-positive bacteria occurs via a learning process involving both gustatory and olfactory neurons. This suggests that worms may rapidly learn to avoid the taste and smell of these bacteria. They show that lawn avoidance of E. faecalis occurs in minutes and coincides with changes in lawn leaving and re-entry rates. They identify sensory neurons involved in lawn avoidance through genetic ablation and cell-specific rescue of signal transduction in the ASE, AWC and AWB neurons. A role for ASE in avoidance is specific to E. faecalis and is a new finding. The authors also show that after a 4hr training exposure to E. faecalis, worms switch from their naïve preference for E. faecalis odors to preferring E. coli odors. This switch in olfactory preference appears to require the AWC and AWB neurons, but not the ASE neurons. While the authors show a clear change in olfactory preference with these data, it is currently unclear whether this reflects associative learning as opposed to non-associative olfactory plasticity resulting from, for example, intestinal distension. Previous work from this group showed that longer-term bloating from bacteria could induce avoidance of different bacteria, arguing against a strictly associative learning role for previously described bloating phenotypes. It is also not currently clear from the authors' data whether ASE plays a role in training-dependent changes in food preference, how this training process relates to the timecourse of intestinal distension, and what role nutrient status might play here.

      Lastly, the authors present the intriguing hypothesis that TRPM family channels may sense bloating either directly or indirectly to mediate this colonization-dependent aversive behavior. Mutations in TRPM channels gon-2 and gtl-2 block lawn aversion that occurs after intestinal distension elicited by E. faecalis colonization or through interference with the defecation motor program. The authors convincingly show that these channels, which are expressed in the intestine but also play known roles in the germline, do not act via the germline in this context. The hypothesis that these channels act in the intestine to sense bloating is an exciting and particularly important one; however, both of these channels are known to be expressed in multiple tissues, and there is no data demonstrating a sensory function for these receptors in the intestine as opposed to other roles.

    3. Reviewer #1 (Public Review):

      In this work, the authors set out to better understand the mechanisms by which the nematode C. elegans responds to bacterial pathogens.

      Using behavioral assays and genetic manipulations, the authors find that C. elegans can rapidly learn to avoid the pathogen E. faecalis (E.f.). While recent studies from other groups have shown that small RNAs (sRNAs) produced by some pathogenic bacteria can trigger aversive learning, the authors find that this seems not to be the case for E. faecalis. Instead, they provide evidence that E. faecalis causes abdominal distention, and that this may provide the trigger for learning. Because the evidence for this is largely correlative, alternative explanations may still be possible. Further, the authors identify two TRPM-class ion channels whose function appears to be necessary for learned avoidance of E.f. The authors propose that one or both of these may mediate detection of abdominal distention, an interesting idea that merits further study. While the paper's title indicates that these channels "mediate" this function, this remains speculative.

      The authors also find that wild-type C. elegans prefer olfactory stimuli from E.f. to those of their regular diet, E. coli, but that this pattern is reversed after exposure to E.f. This plasticity involves the function of the chemosensory neurons ASE, AWC, and AWB, as well as the cyclic-nucleotide-gated channel TAX-2/TAX-4. This finding provides important insight into the nature of the changes in neural circuit function that are triggered by pathogen exposure, leading to pathogen avoidance.

      The paper also examines a role for the neuropeptide receptor npr-1 in learned E.f. avoidance. Animals lacking npr-1 function are known to strongly avoid high (ambient) oxygen concentrations, and instead prefer the lower-oxygen environment of a bacterial lawn. The authors find that this oxygen avoidance overcomes any avoidance of E.f.; thus, npr-1 mutants do not avoid E.f. when tested with ambient oxygen, but they do avoid it in a low-oxygen environment. This indicates that npr-1 is not required for pathogen avoidance per se. Although the authors suggest that npr-1 may be a target of the learning process, this is not well justified by the data and it may be more likely that oxygen avoidance and pathogen avoidance are separate processes.

      Together, these findings demonstrate that the mechanisms underlying learned pathogen avoidance in C. elegans differ substantially depending on the nature of the pathogen, and that worms likely use a combination of strategies to deal with these threats in the wild.

    4. Evaluation Summary:

      In this manuscript, the authors aim to address an important and interesting question: when an animal's intestine is colonized by pathogenic bacteria, how can it sense these bacteria and learn to avoid consuming them? Here the authors suggest that in C. elegans, sensing of intestinal distension or bloating caused by Gram-positive bacteria via intestinal ion channels may drive rapid behavioral responses through a process involving associative learning. While these findings are of broad interest to both the microbiology and neurobiology community, some of their conclusions are not currently fully supported by their data, and reasonable alternative interpretations exist.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

    1. Reviewer #2 (Public Review):

      In this manuscript, Galbraith et al add to our understanding of COVID19 pathobiology by undertaking a cross-sectional survey of 73 hospitalized COVID19 patients with non-severe disease. They perform very broad multi-omics analysis, including plasma proteomics, cytokine profiling and mass cytometry. The authors propose that disease course can be classified by the titer of anti-CoV2 antibodies, which in turn is associated with distinct changes in circulating proteins, cytokines and immune subsets. Interesting correlations with complement and coagulation factors are noted. These findings suggest an alternative way to map disease progression in COVID19 and have implications for broader studies of COVID19 pathobiology. In particular, it will be interesting to extend this framework to analyze a broader spectrum of COVID19 patients, particularly those with poor outcome.

    2. Reviewer #1 (Public Review):

      Galbraith et al., using a systems immunology approach, document in a very detailed manner a textbook example of innate and adaptive immune responses over time following an infection. Here, their clinical assessment is linked to SARS-CoV2 infection. While the novel aspects are not immense, this study is nonetheless well executed, detailed and thorough.

      The authors perform association studies and propose that a simple seroconversion test should be considered in determining the clinical treatment. While some would argue that this is already practiced and perhaps expected, the authors have done an excellent job at detailed immune analyses, which they coupled with statistically sound associations. Thus these findings are important to document, and should be considered as experimental ex vivo evidence of what clinical practice may have implicitly already considered.

    3. Evaluation Summary:

      In this study, the authors use a systems immunology approach to document innate and adaptive immune responses during clinical SARS-CoV-2 infection. The general impact of this work is a better understanding of COVID19 pathobiology and, more specifically, the identification of serum antibodies as a novel classification framework to understand COVID-19 disease course and associated changes.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

    1. Joint Public Review:

      The manuscript by Tachinawa et al. presents a new method (named RhIP), to study incorporation of recombinant epitope-tagged histone dimers into permeabilized cell nuclei. Using RhIP, the authors demonstrate that both H3-H4 and H2A-H2B and their variants are incorporated in this setup. They proceed with investigating context-specific features of these events, providing evidence that ongoing replication and overall chromatin structure may influence histone dimer incorporation in RhIP. This argues for RhIP having the potential to reveal the mechanisms of chromatin assembly and disassembly genome-wide, and determine how cell cycle and chromatin structure influence these dynamics.

      The system is capable of recapitulating major known chromatin assembly pathways and supports existing knowledge of histone dimer dynamics on chromatin. RhIP is also valuable for directly testing histone mutants or variants, as shown by the authors.

      H3.1 incorporation is shown to be exquisitely dependent on replication, demonstrating that replication itself, as well as replication-dependent chromatin assembly are successfully reconstituted with isolated nuclei, cytosolic extracts and recombinant histones.

      The focus of the study is on the incorporation of H2A variants, in particular H2A.Z. These data support known notions about H2A.Z dynamics in chromatin, showing a preference for transcription start sites and a dependence on the M6 region.

      However, the major limitation of the current manuscript is that it remains unclear what properties are driving the observed RhIP effects. This is not fully elucidated and thus limits the ability of RhIP to enable the discovery of new mechanisms.

      While replication-dependent mechanisms are well captured by RhIP, it is less clear if transcription and chromatin remodeling are functional in this system and thus if transcription-dependent nucleosome exchange processes are faithfully recapitulated. It is important to improve the comparison of RhIP with 'in vivo' localisation (i.e. existing ChIP-seq datasets) and to explicitly develop hypotheses for why in some cases the data match the 'in vivo' situation and in others they do not. It would be helpful to improve the interpretation of the data to include all existing caveats to the assay setup.

    2. Evaluation Summary:

      The method presented in this article is of interest for all fields that interface with chromatin dynamics. It could provide a powerful tool to dissect the mechanisms of chromatin assembly and disassembly genome-wide, and determine how cell cycle and chromatin structure influence these dynamics. However, in the current form, the article falls short of its potential. Further validation of the data and clarification of its implications are requested.

      (This preprint has been reviewed by eLife. We include the public review from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

    1. Reviewer #3:

      In this paper Werkhoven et al. ask a fundamental question in behavioral neuroscience: what is the structure of co-varying behaviors among individuals within populations? While questions in the context of inter-individual behavioral differences have been studied across organisms, this work represents a highly novel and comprehensive analysis of the behavioral structure of inter-individual variation in the fly, and the underlying biological mechanism that may shape this structure of covariation. In particular, for their experiments they combined a set of behavioral tests (some of them were explored in previous studies) into a 13-day long behavioral paradigm that tested single individuals in a highly controlled and precise way. Through clever analysis the authors showed strong correlations only between a small set of behaviors, indicating that most of the behaviors that they tested do not co-vary, exhibiting many dimensions of inter-individual variation in the data. They further used perturbations of neuronal circuits and showed that temperature and circuit perturbations can change dependencies among sets of behaviors. In a different set of experiments where they integrated gene-expression data (from the brains of single individuals), they showed that some of the genes are correlated with individual-specific parameters of behaviors. Interestingly, through comparison of inbred and outbred populations they demonstrated that outbred populations also show relatively low covariance of behaviors across individuals.

      Overall, the data in the paper indicate that surprisingly, even for a 'simple' organism, there are many dimensions of inter-individual variation, i.e. many specific characters that can change among individuals independently of one another. The ability of the authors to measure such dependencies in a highly robust and precise way allowed their investigation of the underlying processes that may generate this variation. The results in this study are highly interesting and novel. They uncover a general picture of the structure of behavioral variation among individuals and open many avenues for further analyses of the underlying neuronal and molecular mechanisms that control variation in sets of behaviors. Furthermore, the methods that were developed in this paper can be of great use to other researchers in the field.

      However, while the key claims of the manuscript are well supported by the data and analysis methods, some aspects of data analysis need to be clarified or extended:

      • It is not clear what the motivation is for using the 'Effective dimensionality spectrum' analysis presented in the paper and how it significantly adds to existing methods of clustering that rely directly on the correlation/distance matrix (some of them were used in this study).

      • While it is clear that the distilled behavioral covariation matrix has many independent dimensions (as the authors indicated, most of the a-priori PCs are not strongly correlated), the number of 'significant' PCs was not calculated directly for the distilled matrix, and t-SNE analysis is presented only for the original covariation matrix (1L).

      • It is possible that some of the behaviors that covary across individuals in the high temporal resolution assay and also tend to be associated over time within an individual, may indicate sequences of behavior on longer time-scales (than the timescales in which parameters are quantified).

      • Further analyses are needed for extending the detection of correlations between variation in gene-expression data and the independent behavioral measures in the covariance matrix.
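
      As a point of reference for the dimensionality comments above, one simple and widely used scalar summary of how many independent dimensions a correlation matrix contains is the participation ratio of its eigenvalues. The sketch below is illustrative only: it is not the authors' 'Effective dimensionality spectrum', the function name `participation_ratio` is hypothetical, and the two toy matrices are made-up examples rather than data from the study.

```python
def participation_ratio(corr):
    """Effective dimensionality of a symmetric correlation/covariance matrix.

    PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues)
       = trace(C)^2 / trace(C @ C),
    so no eigendecomposition is required.
    """
    n = len(corr)
    trace = sum(corr[i][i] for i in range(n))
    # trace(C @ C) = sum over i, j of C[i][j] * C[j][i]
    trace_sq = sum(corr[i][j] * corr[j][i] for i in range(n) for j in range(n))
    return trace * trace / trace_sq


# Toy example 1 (assumed data): four fully uncorrelated measures.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Toy example 2 (assumed data): three perfectly correlated measures.
all_ones = [[1.0] * 3 for _ in range(3)]

pr_independent = participation_ratio(identity)
pr_redundant = participation_ratio(all_ones)
```

      Under these assumptions, fully independent measures give a value equal to the number of measures, while perfectly correlated measures collapse to one; applying such a summary directly to the distilled matrix would be one way to report its dimensionality as a single number.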

    2. Reviewer #2:

      In this paper, Werkhoven and colleagues describe a large-scale effort, using Drosophila, to study variation in behavior among individuals with identical genotypes, and raised in very similar environmental conditions. This addresses the important and basic question of how much behavioral variability exists under such conditions, e.g. due to stochastic processes during development. By looking across many different behaviors, the authors are also able to investigate the nature of this variability. The key conclusion of the paper is that this intragenotypic variability is high dimensional, and cannot be explained by a small set of behavioral syndromes. They find that this observation is robust to the method they use to quantify behavior, and also holds to different degrees in data sets acquired from outbred flies, or flies subjected to genetic perturbations of neural activity. Furthermore, they have generated a data set that allows correlation of behavioral biases in individual animals with transcriptomic data. Altogether, this is an impressive study that, beyond its important conclusions, opens up the possibilities for many further explorations in this area, and should be interesting to a broad audience. The experiments are well designed and overall the paper is very nicely written and clear to understand.

    3. Reviewer #1:

      The definition of individuality and its neurogenetic basis is a fundamental problem in ethology and neuroscience. Individuals might fall into discrete groups of personality types; alternatively, individuals might be better described by a broader spectrum of independent traits. An unbiased and quantitative analysis of behavioural traits that make up an individual's personality is a prerequisite of investigating the neuronal and genetic basis of individuality. Given the technical challenges in systematically measuring many behavioural traits across sufficiently large and genetically defined populations and over long time-scales, these questions remain unanswered. This manuscript represents a tour-de-force trying to shed more light in these directions. Werkhoven and colleagues aim at characterizing structure in correlations among a large set of quantitative behavioural measures obtained from the model organism Drosophila melanogaster. The authors performed a large number of high throughput behavioural experiments that cover behavioural paradigms ranging from locomotion to perceptual decision-making. Data were acquired from an inbred, hence isogenic fly line, an outbred line, and various neuronal circuit manipulations. In addition, gene expression data were obtained from individuals. In this way, the authors were able to capture hundreds of behavioural metrics from hundreds of flies, while keeping their individual identities over the course of 13 days. They developed a computational analysis pipeline that quantifies the correlation matrix computed from these metrics. In a 2-step procedure, they condense this matrix into a "distilled" matrix, the entries of which contain all remaining behavioural covariates that were not a priori expected by the authors.

      A central claim in this paper is that any structure in this distilled matrix should reveal the principal axes along which individuality should be described. Based on these measurements and analyses flies could not be categorized into discrete types. Moreover, behavioral covariates appear rather sparse and derive from a high-dimensional behavioral space. This would mean that each individual fly is better described by a large combinatorial set of parameters. The same qualitative finding was made between inbred and outbred flies, leading the authors to a conclusion that larger genetic diversity does not change the principal organization of behaviour. The authors perform a set of neuronal-circuit manipulations and claim in conclusion that specific neuronal activity patterns underlie structure in behavioural correlations. Some correlations between gene expression and behavioral metrics were discovered, for example gene expression of metabolic pathways can predict some variability found in the behaviour of flies. The behavioural pipeline is sophisticated and presents a great leap forward in enabling researchers to capture a large set of behavioural measures from a large fly population, keeping the identity of individuals. The work is also presenting an innovative and interesting analysis pipeline.

      Although we applaud these ambitious experimental paradigms and computational techniques used, we have several major reservations about this work. Reading through the manuscript multiple times, one is left confused whether the major finding is that no structure whatsoever can be found in these data and to what extent the remaining sparse correlations are of biological / ethological relevance. Another major concern arises from the high level of trial-trial variability that is found in the data, which seems to preclude identification of persistent idiosyncrasies in the behavioural traits of individuals and impedes the reproducibility of the data matrices in two repetitions of the main experiment. We feel that most of the authors' conclusions and claims are confounded by these caveats.

      1) Distinguishing persistent idiosyncrasies from trial-to-trial variability and reproducibility of decathlon data

      A major challenge in measuring personality traits or individuality is to distinguish between persistent idiosyncrasies and trial-to-trial variation; the latter could result from inherent stochastic properties of behaviors, environmental or measurement noise. To identify an idiosyncratic behavioral trait in an animal one needs to show that individuals exhibit a distinct distribution in a behavioral metric that cannot be explained by trial-to-trial variability. Such a distinction cannot be made if a behavioral metric is measured just once or during a short period, but requires repeated measures over longer time-scales from a sufficiently large population of animals. Unfortunately, in this study many measures have been taken during just one 1-2 h episode per individual of a decathlon. For other measures that were taken repeatedly (circadian assays, unsupervised video acquisition) no efforts have been undertaken by the authors to make the above distinction. Hence, the authors' conclusion that there are no "types" of flies seems premature. In Figure S1 we are surprised to see how weakly most behavioral measures auto-correlate when recorded on two subsequent days; most auto-correlations further drop to meaningless values when compared over time-periods that correspond to the different epochs of a decathlon. This indicates that trial-to-trial variability dominates the data. In our view it makes little sense to ask whether two behavioral metrics are correlated or not, if their autocorrelations measured over the same time-scale are already extremely low. Moreover, Fig S5B shows that the two decathlons generate largely different data matrices (correlation ~0.25), raising concerns that the results are not reproducible. We wonder whether any structure in behavioral correlations was masked by various sources of noise in this study.

      Related to above, there should be error bars and number of flies for the plots in Fig S1. This figure undermines the starting point of the paper claiming persistent idiosyncratic behaviors.
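
      The concern in point 1, that weak autocorrelations cap the correlations that can be observed between metrics, can be illustrated with a small simulation. This is a minimal sketch and not the authors' pipeline: the trait model, the noise level, the sample size and all function names are assumptions chosen for illustration. Two persistent traits with a true correlation of 0.8 are each measured with large trial-to-trial noise, and the observed cross-correlation is compared with the classical attenuation prediction (true correlation times the square root of the product of the two test-retest reliabilities).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_flies=20000, true_r=0.8, noise_sd=2.0, seed=0):
    """Simulate two noisy metrics measured on two days (all values hypothetical)."""
    rng = random.Random(seed)
    x_day1, x_day2, y_day1, y_day2 = [], [], [], []
    for _ in range(n_flies):
        # Persistent per-fly traits with correlation true_r.
        trait_x = rng.gauss(0, 1)
        trait_y = true_r * trait_x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        # Each measurement adds independent trial-to-trial noise.
        x_day1.append(trait_x + rng.gauss(0, noise_sd))
        x_day2.append(trait_x + rng.gauss(0, noise_sd))
        y_day1.append(trait_y + rng.gauss(0, noise_sd))
        y_day2.append(trait_y + rng.gauss(0, noise_sd))
    rel_x = pearson(x_day1, x_day2)      # test-retest reliability of metric x
    rel_y = pearson(y_day1, y_day2)      # test-retest reliability of metric y
    observed = pearson(x_day1, y_day1)   # correlation one would actually measure
    predicted = true_r * math.sqrt(rel_x * rel_y)  # attenuation formula
    return rel_x, rel_y, observed, predicted


rel_x, rel_y, observed, predicted = simulate()
print(f"reliability x={rel_x:.2f}, y={rel_y:.2f}")
print(f"observed r={observed:.2f}, attenuation prediction={predicted:.2f}")
```

      Under these assumptions the observed correlation is far below the true value, which is why reporting reliabilities alongside the correlation matrix would help readers judge how much structure could be detected at all.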

      2) Given the concerns above, it is not surprising that the outbred fly line delivers another set of covariates which lack otherwise any further structure. If experiments with >100 inbred flies cannot deliver reproducible results, it cannot be expected that a similarly sized population of outbred flies would. Perhaps the needed population size must be orders of magnitudes larger in this case.

      3) Figure 3. It is intriguing to observe how the relationship between switchiness and clumpiness is perturbed upon temperature shifts. But it seems rather uncorrelated at the restrictive temperature in the Iso line, with a slightly positive value. However, the switchiness-clumpiness correlation is not reproducible in both perturbation types at permissive temperatures. Note that at both temperatures the Shi and Trp datasets show no, or very low, correlations: the Trp lines produce correlations from approx. -0.2 (permissive T) to 0.1 (restrictive T); the Shi lines 0 and 0.1, respectively. Fig 3D is very misleading in showing the best fits to the combined datasets. We are not convinced that there is a robust sign-inversion in any of these correlations. The authors' major conclusion that "thermogenetic manipulation and specific neuronal activity patterns underlie the structure of behavioral variation" is not supported by these data. The effect of temperature in the control line, although interesting, is a major caveat for interpreting the Shi and Trp results.

      4) The authors measure a large set of low- and high-level behavioral metrics, e.g. walking speed and choices in Y-mazes respectively. A fundamental problem is that many of these metrics potentially have common underlying but trivial causes, e.g. covariation between speeds measured in various conditions is expected. Therefore, the authors condense their original correlation matrix (Fig 1E) into a distilled matrix (1G) by making such judgements. In the present form, it is impossible to evaluate how systematic or arbitrary these choices were. In many cases, where the same measure was recorded repeatedly (e.g. circadian bout length) or across different conditions (e.g. mean speed) it is obvious, but for other cases it is not obvious at all for the non-expert: for example, why are circadian-bout-length and LED-Y-maze-choice-number lumped into one block of expected behavioral covariates? The current manuscript lacks a detailed explanation of how the authors systematically created the distilled matrix. Can the sparseness of the distilled matrix be a consequence of too generous pre-allocations? See also point (6). The bulk of the analysis in this paper is done on the "distilled matrices" which are produced by removing correlations within previously defined groups of behavioral metrics. This is said to cleanly reveal unexpected correlations, leading to a main result of the paper, the correlations between "Switchiness" and "Clumpiness". However, if the a priori categories were defined differently, then in the extreme case this correlation would have been completely removed. How sensitive is this correlation to the choice of categories, especially given that many of the Switchiness and Clumpiness metrics are from similar assays (Fig. S8)?

      5) For the second pipeline that uses t-SNE and watershed (Fig. 2 and S3C), a previous publication from some of the authors [1] appears to show low repeatability of this analysis. Thus, the repeatability and noise levels of the pipeline must be investigated further. These were 3 x 1 h recordings per decathlon. Related to comments (1-2), the authors need to show that the differences across flies (Fig 2C,D) are not expected from the level of trial-to-trial variability. Perhaps more data from individual flies need to be recorded?

      6) 1G: To our understanding, within-block entries to the distilled matrix should indicate zero correlations, because these are correlations between PCA-projections. But we see many nonzero entries. Given the information provided in the methods it is unclear why this is the case; this requires further explanation.

      In any case, within-block correlations are expected to be at least very low. Hence, we expect the distilled matrix to be relatively sparse given how it was calculated. Of interest are then the across-block correlations, the authors should make this point more clear to the readers.
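
      The expectation stated in point 6 can be checked on toy data. The following is a minimal sketch that assumes, as the reviewers describe, that each block of metrics is replaced by its PCA projections; the two "speed" metrics, the sample size and the function names are hypothetical, and the closed-form two-variable PCA is used only to keep the example dependency-free. When the projections are computed and correlated on the same data, their correlation vanishes up to floating-point error.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pca_scores_2d(xs, ys):
    """Project two metrics onto their principal axes (closed-form 2x2 PCA)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    xc = [x - mx for x in xs]
    yc = [y - my for y in ys]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x in xc) / n
    c = sum(y * y for y in yc) / n
    b = sum(x * y for x, y in zip(xc, yc)) / n
    # Rotation angle of the first principal axis.
    theta = 0.5 * math.atan2(2 * b, a - c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    pc1 = [x * cos_t + y * sin_t for x, y in zip(xc, yc)]
    pc2 = [-x * sin_t + y * cos_t for x, y in zip(xc, yc)]
    return pc1, pc2


rng = random.Random(1)
# Two hypothetical metrics from the same a priori block (e.g. two speed measures).
speed_a = [rng.gauss(0, 1) for _ in range(500)]
speed_b = [s + 0.5 * rng.gauss(0, 1) for s in speed_a]

raw_r = pearson(speed_a, speed_b)
pc1, pc2 = pca_scores_2d(speed_a, speed_b)
score_r = pearson(pc1, pc2)
print(f"raw within-block r = {raw_r:.3f}")
print(f"r between PCA projections = {score_r:.2e}")
```

      Nonzero within-block entries would therefore require some additional step in the procedure beyond this simple case, which is what the methods should spell out.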

      7) Some of the authors' claims are related to the spectral dimensionality reduction technique described in Fig. S9. However, none of the real data shown in the main paper figures look qualitatively similar to the toy data. Indeed, the histograms from the main figures are on a log scale, and are thus not comparable to the toy data results. Although the technique might be well suited for certain classes of data, one interpretation of the main paper figures seems to be that no structure is revealed whatsoever. More work should be done to exclude this as a possible interpretation, at least by generating toy data that look like the real datasets; also with respect to point (6) above.

      8) Throughout the paper, the authors use the term "independence" for orthogonal / uncorrelated datasets. Correlation/uncorrelation and dependence/independence are not interchangeable terms. To my understanding, PCA decomposes into independent variables only under certain circumstances (multivariate normally distributed data). Have the authors tested for independence?
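
      The distinction drawn in point 8 can be made concrete with a standard textbook example, sketched here on synthetic values that have nothing to do with the fly data: a variable that is symmetric around zero and its square are uncorrelated, yet one fully determines the other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# x is symmetric around zero; y is a deterministic function of x.
x = [i / 100 for i in range(-100, 101)]
y = [v * v for v in x]

r_xy = pearson(x, y)                       # linear correlation: essentially zero
r_absx_y = pearson([abs(v) for v in x], y)  # dependence shows up after a transform
print(f"r(x, y) = {r_xy:.2e}")
print(f"r(|x|, y) = {r_absx_y:.3f}")
```

      A zero correlation coefficient therefore does not by itself license the word "independent"; a dedicated test of independence would be needed.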

      [1] Todd, J.G., Kain, J.S. and de Bivort, B.L., 2017. Systematic exploration of unsupervised methods for mapping behavior. Physical biology, 14(1), p.015002.

    4. Summary: This manuscript is interesting to circuit-neurobiologists, behavioural biologists and psychologists. The reviewers agree that this manuscript addresses an important unanswered question: what is the covariation-structure in the vast space of behavioural variables that individuals can explore, and what defines their individuality in this space? The reviewers also praise the great efforts made in the experimental approach and analysis methods, which potentially will set new benchmarks in the field. However, the work can be improved by accounting for the trial-to-trial variability in behavioural data and clearly distinguishing it from persistent idiosyncrasies observed in individuals.

    1. Reviewer #4:

      This manuscript by Huss, P., et al. is a major technological step forward for high throughput phage research and is a deep dive into the mutational landscape of a portion of the T7 phage receptor binding protein (RBP). The authors develop a new phage genome engineering method, ORACLE, that can generate a library of any region of the phage genome. They apply ORACLE to do a deep mutational scan of the tip domain of T7 RBP and screen for enrichment in several bacteria. The authors find that different hosts give rise to distinct mutational profiles. Exterior loops involved in specialization towards a host appear to have the highest differential mutational sensitivity. The authors follow up these general scans in the background of phage resistant hosts. They find mutations that rescue phage infection. To demonstrate the utility of the approach on a clinically relevant task, the authors apply the library to a urinary tract associated clinical isolate and produce a phage with much higher specificity, creating a potentially powerful narrow scope antibiotic.

      Overall, the ORACLE method will be of tremendous use for the phage field, solving a technical challenge associated with phage engineering, and will illuminate new aspects of bacterial host-phage interactions. It was also quite nice to see host-specialization validated and further explored with the screens done in the background of phage resistance mutations. The authors do a tremendous job digging, where possible, into potential mechanisms by which mutations could be altering fitness. We especially appreciate how well the identity of amino acids tracks host specialization within exterior loops.

      We have no major concerns about the manuscript but have some minor comments to aid interpretation. There are also some minor technical issues. We think this manuscript will be of broad interest, especially for those in the genotype-phenotype, phage biology, and host-pathogen fields.

      Minor comments:

      P5L20: In the introduction to the ORACLE section the authors mention homologous recombination then they mention using 'optimized recombination' that is done with recombinases. This contrast should be mentioned somewhere perhaps to highlight the benefit of having specific recombinases.

      P6L16: Using Cas9 to cut unrecombined variants is clever... Cool! This is a real 21st Century DpnI idea.

      P6L27: The authors state that there is a mild skew towards more abundant members after ORACLE. Why might this be? Do more abundant members simply become even more abundant over iterations? To be clear, this isn't a substantial limitation and it's common to see these sorts of changes during library generation. Just curious. Overall this looks like a fantastic method.

      P7L6: Authors mention ORACLE increases the throughput of screens by 3-4 orders of magnitude. How many variants can one screen? Is this screen of a little over 1k variants at about the threshold of the assay?

      P8L7: The authors assign functional scores based on enrichment and normalize to wild type. Is an FN = 1 equivalent to wild type?
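
      To make the question at P8L7 concrete, here is a minimal sketch of one common way such a score could be computed. It is an assumption, not the authors' stated definition: each variant's enrichment is taken as its frequency after selection divided by its frequency before selection, and the normalized score divides that by the wild-type enrichment. The read counts and variant names are invented for illustration. Under this convention wild type has a score of exactly 1 by construction.

```python
def normalized_scores(counts_pre, counts_post, wild_type="WT"):
    """Enrichment of each variant divided by the wild-type enrichment.

    Hypothetical convention: enrichment = post-selection frequency divided by
    pre-selection frequency. The manuscript's exact definition may differ.
    """
    total_pre = sum(counts_pre.values())
    total_post = sum(counts_post.values())
    enrichment = {
        name: (counts_post[name] / total_post) / (counts_pre[name] / total_pre)
        for name in counts_pre
    }
    return {name: value / enrichment[wild_type] for name, value in enrichment.items()}


# Invented read counts before and after selection on one host.
counts_pre = {"WT": 1000, "variant_A": 500, "variant_B": 500}
counts_post = {"WT": 2000, "variant_A": 4000, "variant_B": 100}

scores = normalized_scores(counts_pre, counts_post)
for name, score in scores.items():
    print(f"{name}: FN = {score:.2f}")
```

      If the authors use a definition like this one, stating explicitly that a score of 1 means wild-type-like performance would answer the question directly.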

      P9L5: Awesome!

      P10L7: Authors mention R542 forms a hook with a receptor. There should be a citation here.

      P10L21: For N501, R542, G479, D540 there are wonderful mechanistic explanations. However, for D520 there is not. Any hypothesis for why this is distinct from the others? Are there other residues that behave similarly? I feel it would be really helpful to have a color scale that discriminates between FN = 1 (assuming wild type) and enriched/depleted variants within Figure 3A.

      P12L4: Authors note residues that are surface exposed yet intolerant to mutations in the previous paragraph. Authors also calculate free energy changes with Rosetta and state that free energy maps pretty well onto tolerance. What is the 93% based on? Perhaps a truth/contingency table would be useful here to discriminate/compare groupings. Which residues make up the other 7%? Can the energy scores help to better understand the mechanisms behind the mutations?
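
      The truth/contingency table suggested at P12L4 could be tabulated along the following lines. This is a minimal sketch with invented labels: it assumes each substitution carries a binary stability prediction and a binary tolerance class, and it counts a substitution as concordant when a predicted-stable substitution is tolerant or a predicted-destabilizing one is intolerant. The actual thresholds and groupings would have to come from the manuscript.

```python
from collections import Counter


def contingency(predicted_stable, tolerant):
    """2x2 counts keyed by (predicted_stable, tolerant) and the concordant fraction."""
    table = Counter(zip(predicted_stable, tolerant))
    concordant = table[(True, True)] + table[(False, False)]
    return table, concordant / len(predicted_stable)


# Invented labels for eight substitutions (not data from the manuscript).
predicted_stable = [True, True, True, False, False, True, False, True]
tolerant = [True, True, False, False, False, True, True, True]

table, agreement = contingency(predicted_stable, tolerant)
print("stable & tolerant:            ", table[(True, True)])
print("stable & intolerant:          ", table[(True, False)])
print("destabilizing & tolerant:     ", table[(False, True)])
print("destabilizing & intolerant:   ", table[(False, False)])
print(f"concordant fraction: {agreement:.2f}")
```

      Reporting the four cells rather than a single percentage would show which kind of disagreement accounts for the remaining substitutions.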

      P12L7: Authors state that substitutions predicted to be stable and classified as intolerant could indicate residues necessary for all hosts. What about those that fall outside of the groupings? Unstable residues can also be necessary.

      P14L22: Authors mention comparing systematic truncations; however, they do not present any figure. These data should be shown in a figure, which would surely be helpful to people in the phage field, especially because this is one of the main discussion topics at the end of the manuscript.

      P16L2: The authors did the selection in the background of a clinically isolated strain and discussed 3 variants that were clonally characterized. Was this library sequenced as before?


      Barplots need significance tests.

      Figure 2C-E ; Fig 3A. All figures are colored white to red. With this color scale it's hard to appreciate which variants are neutral vs those that are enriched. A two or more color scale would be more appropriate. Log-scaling might be wise to get a better sense of the dynamic range that is clearly present in fig2F.

      Fig 4F: Needs a statistical test between bar plots.

      Fig6A-C: These figures have tiny symbols that represent the architecture at an insertion position. It's probably easier to look at if the same annotations from Fig 4B or C for architecture were used.

      Fig6D: needs tests for significance

      Supp fig 4E: This figure is the first evidence that the physicochemistry of amino acids within surface-exposed loops determines host specificity. This is followed up by Figure 4D and E. I would consider moving this to one of the main figures.

      Supp fig 5: A truth table could be useful here to test for the ability to classify based on Rosetta compared to FD. It looks like the tolerant residues have a distinct pattern here.

      Why are these colored white to red?

    2. Reviewer #3:

      Huss et al. describe a phage genome engineering technology that they call ORACLE. This technique uses recombineering of a phage target gene with a variant library to identify both gain and loss of function mutations. The beauty of this method and what makes it superior to other techniques is that it dramatically limits loss of mutants that are less fit during the initial round of library generation. Thus, the pool of variants is vast and is reduced in bias toward more fit species based on the host used for initial library amplification. They use the model coliphage T7 as a proof of principle and show that several previously unidentified residues in the T7 tail fiber play critical roles in both loss and gain of function for phage infectivity and they also identify residues that are major drivers of altered host tropism. Lastly, they apply this library to a pathogenic UTI associated strain of E. coli which is normally resistant to wild type T7 infection and identify tail variants of T7 that can now infect this strain, highlighting the applicability of this method toward the discovery of engineered phages that could be used therapeutically. Altogether this is an important advancement in phage engineering that shows potential promise for future phage therapies.

    3. Reviewer #2:

      The authors report a new approach termed ORACLE to develop locus-specific phage variants, which includes a recombination step, whose efficacy is improved by the overexpression of a dedicated recombinase, followed by an enrichment performed using CRISPR/Cas9. They applied this method to create a mutant library containing 1660 variants of the tip domain of the T7 tail fiber. Performance of each variant was determined by quantifying their abundance before and after selection on three E. coli strains compared to the WT phage. Their findings show that single amino acid changes in the tip of gp17 can have major consequences on phage performance on different hosts. Then they tested whether these variants would be less prone to select for phage resistance using a UTI strain. Finally, they searched for variants that would be more prone to infect one host than another and successfully tested their predictions.

      The ORACLE approach is overall novel and has some advantages over existing methods, mainly for generation of mutation libraries of genes. Authors did a nice (even if very lengthy) job of showing how mutants have consequences to structure and function of the tail fiber gene and how that influences performance on different hosts, including combating host resistance.

      The authors state that ORACLE overcomes three major hurdles that make it better than existing methods, one of which is "generalizability for virtually any phage", while denouncing other systems for being applicable for highly transformable hosts only. This is highly exaggerated since ORACLE requires transformation of two plasmids (helper and donor) including one with tunable gene expression, which is clearly not possible in many bacteria. Furthermore, the enrichment step requires a strain with a functional CRISPR/Cas9 system, which again is not so obvious in the bacterial world.

      The authors disregard bias that can be generated at the "O" step if a variant reproduces better than the wt. They should also mention, earlier than the brief mention that currently appears later in the manuscript, the bias arising from non-viable or severely infection-hampered variants, which would not pass the accumulation step.

      The weakest paragraph is the one dealing with the UTI strain. I have the feeling that this paragraph could simply be deleted without changing the overall story. Approaching resistance, selection, and evolution would require more experiments than the very simplistic lysis curves. The authors did not even show adequately that cells growing after 5-10 hours are either genotypically or phenotypically resistant cells. A more appropriate qualification would be "insensitive" instead of resistant.

    4. Reviewer #1:

      Huss et al. have developed a novel tool (ORACLE) for generating libraries of phage variants. They go on to apply this tool to study the residues important for T7 host specificity, providing a rich dataset for in-depth functional studies. They validate a subset of hits and use this information to engineer T7 variants that may be able to overcome bacterial resistance against a urinary tract infection associated strain, consistent with their in vitro results. Their approach provides both a valuable new tool and intriguing biological insights prompting future studies.

      Major suggestions for improvement:

      1) The writing could be much more concise.

      2) Claims about generalizability should either be removed or supported by additional data. This study focused on a single phage gene and a single host bacterial species. As such, it is not clear if ORACLE will work well in other contexts.

    1. Reviewer #3:

      In this manuscript, the authors investigated the role of PSD95 in the hippocampus in contextual fear extinction. The authors showed that PSD95 levels in the spine and the density of PSD-95-positive spines in the dorsal CA1 (dCA1) are changed following contextual fear conditioning and extinction learning. Interestingly, overexpression of the PSD95-S73A mutant or chemogenetic inhibition of dCA1 impairs only the second extinction learning at 24 hrs following the first extinction learning. Importantly, these manipulations also blocked the changes of PSD95-positive spines following the first extinction learning. These observations suggest that phosphorylation of PSD95 at S73 in the dCA1 of the hippocampus contributes to contextual fear extinction. This manuscript suggests the importance of PSD95 phosphorylation in the hippocampus in some aspects of the mechanisms of contextual fear extinction at the molecular and spine levels. However, the title, abstract and conclusions do not accurately reflect the observations and experimental design in this manuscript. I have several concerns as follows.

      Major concerns:

      1) The authors used viral overexpression of a PSD-95 S73A mutant, which may function as a dominant negative, rather than a knock-in mutation. Therefore, the function of PSD-95 phosphorylation at S73 in spine morphology and contextual fear extinction has not yet been investigated well. This experimental design limits interpretation of the behavioral results. A knock-in mutation strategy would be preferable to overexpression of the mutant. Alternatively, the authors could examine the phosphorylation levels of PSD95 following contextual fear conditioning and extinction learning and/or the function of this mutant at the molecular and cellular levels using biochemistry/molecular biology/cell culture.

      2) Overexpression of S73A or chemogenetic inhibition of CA1 impaired additional extinction learning. These observations are interesting. However, the authors have not characterized these findings well at the behavioral level; they should clarify the effects of these manipulations on contextual fear extinction. Given the extensive literature on fear memory extinction, the behavioral results in this manuscript raise many questions about the impact of these genetic manipulations on "contextual fear extinction". What are the effects on extended extinction learning (60 min), on an additional 30 min of extinction learning on the same day after the first extinction training, and on spontaneous recovery, renewal, and reinstatement? Answers to some of these questions would help to interpret the behavioral observations in this study and to identify the roles of PSD95 and its phosphorylation in the extinction of contextual fear memory. It is also important to examine PSD95-positive spines just after the additional extinction learning to understand the behavioral observations.

    2. Reviewer #2:

      Ziółkowska et al. investigate synaptic processes in the dorsal hippocampal CA1 (dCA1) region with the goal of testing the role of postsynaptic density protein 95 (PSD-95) dynamics in contextual fear extinction. They conclude that 1) extinction increases synaptic dCA1 PSD-95 levels and induces remodeling of dendritic spines, 2) extinction-related PSD-95 changes are mediated by phosphorylation of PSD-95 at serine 73, and 3) phosphorylation of PSD-95 at serine 73 as well as dCA1 activity are required to "update a partially extinguished fear memory". The experiments provide new insight and address a timely and important issue. The major strengths of the paper lie in the use of a wide range of complementary technical approaches, and the significance of addressing specific molecular mediators of fear attenuation. However, some of the analysis is based on inadequately justified or inappropriate measures (e.g. that do not directly assay the phenomenon under investigation), and there are concerns about independent effects of viral overexpression in this system as well as the relevance of the behavioral analysis. The conclusions from the paper, if true, would appear to support a very intricate model involving PSD95 phosphorylation and synaptic accumulation after extinction, but because of weaknesses in the underlying evidence, these mechanisms and their relationship to extinction memory were not persuasively demonstrated. Following are some specific concerns:

      1) The mean intensity of PSD95 labeling per spine appears to be affected in some hippocampal layers (Fig. 1), but this might be attributable in some cases to elimination of spines that have relatively lower PSD-95, rather than a change in PSD-95 levels, per se.

      2) The quantification of overexpressed PSD-95 in Fig. 2 leaves it unclear what specifically has been measured. The methods suggest that % area is defined as the total area of mCherry labeling divided by the total image area. This is not a direct measure of PSD-95 levels and may instead reflect morphological or protein localization changes. Furthermore, the localization of overexpressed PSD-95 (Fig. 2) is clearly very different from that of endogenous PSD-95 (Fig. 1) in that it accumulates throughout the dendrites. This makes it unclear what a "puncta" represents, or whether the analysis implies anything about synaptic function.

      3) The authors argue that S73 phosphorylation is required for synapse elimination during extinction, but Fig. S2 (which is not referenced or discussed in the manuscript) and Fig. 3 indicate that the effect of S73A overexpression is to dramatically reduce spine density in both behavioral groups. It is therefore not clear whether the manipulation interacted with extinction to prevent spine removal, or simply occluded such an effect because spine density was already at an artificial floor prior to any behavioral training. Overexpression of the wildtype construct also reduced spine density to a similar degree. Furthermore, the S73A mutant protein dramatically increased PSD area (Fig. 3d), which apparently contradicts the notion that phosphorylation of this site is required for synaptic accumulation, when applying the same logic used elsewhere in the paper. These are serious confounding issues because the central claim of the paper is that S73 phosphorylation mediates PSD95 synaptic accumulation and synaptic strengthening.

      4) The authors suggest that successive days of extinction represent a distinct process called updating of a partly extinguished memory, which they seem to imply has different molecular requirements. There appears to be no basis in the literature for this idea.

      5) The analysis of extinction relies on measurement of within-session decreases in freezing. However, within-session extinction has been shown to be neither sufficient nor essential for between-session extinction. It is not even clear that within-session extinction is really even extinction at all, rather than, for example, habituation. It is essential to examine the retention of decreased freezing across days in order to establish that the formation of long-term memory is involved.

      6) Finally, numerous comparisons are made between animals that received FC, with no further manipulation, and extinguished animals. This design leaves open the possibility that any differences are attributable not to an extinction process but instead to context exposure independent of fear regulation. A behavioral control in which animals receive context exposures, but no shocks, would be very useful.

    3. Reviewer #1:

      Patients with posttraumatic stress disorder show impaired fear extinction that leads to persistent fear memories. The CA1 subregion of the hippocampus has been implicated in the acquisition and extinction of contextual fear memories, and both mechanisms depend on glutamatergic synaptic plasticity in this region. Postsynaptic density protein 95 (PSD-95) is known to regulate structural and functional changes in glutamatergic synapses, but whether PSD-95 participates in the acquisition and extinction of contextual fear memories remains unclear. To address this question, here Ziółkowska and coworkers used nanoscale-resolution analyses of PSD-95 protein in the CA1 combined with genetic and chemogenetic manipulations in mice exposed to a classical Pavlovian contextual fear conditioning paradigm. The study revealed that PSD-95-dependent synaptic plasticity in the dorsal CA1 area is not necessary for fear acquisition or the initial phase of fear extinction, but is critical for updating a partially extinguished fear memory. In addition, phosphorylation of PSD-95 at serine 73 is necessary for contextual fear extinction-induced PSD-95 expression and remodeling of dendritic spines in this region, suggesting a potential mechanism for fear memory persistence.

      This timely study provides important and novel findings with regard to the role of PSD-95 protein in fear extinction formation and helps to advance our understanding of how dendritic changes in the hippocampus regulate fear maintenance. The present findings should be of general interest to the scientific community because extinction-based therapies are the gold-standard treatment for many fear-related disorders. The manuscript is clear, and the experiments were well-designed and executed. While the study is elegant, there are several important points including data interpretation that need to be clarified.

      Major points:

      1) The authors identified changes in PSD-95 expression levels and spine density after both fear acquisition and fear extinction. Similarly, S73-dependent phosphorylation of PSD-95 and changes in spine density were also reported following both phases. How do the authors explain the lack of effects on fear acquisition and extinction after the infusion of S73-deficient PSD-95 expressing virus? Does this suggest that the observed dynamics of PSD-95 are not important for fear memory expression? The interpretation of these findings should be clarified in the discussion.

      Previous studies have demonstrated a key role of the dorsal hippocampus CA1 area in fear retrieval and extinction acquisition using either lesions (e.g., Ji and Maren 2008, PMID: 18391185) or optogenetic tools (e.g., Sakagushi et al, 2015, PMID: 26075894). However, in the present study, chemogenetic inhibition of this same region had no effect on fear retrieval or extinction acquisition (Figures 5 and 6). How do the authors reconcile the lack of effects on fear retrieval and extinction acquisition with the previous literature? Similarly, previous studies on the role of hippocampal PSD-95 protein in extinction memory should be described and the main differences in the experimental design and findings should be discussed (e.g., Nagura et al, 2012, PMID: 23268962; Cai et al, 2018, PMID: 30143658; Li et al 2017, PMID: 28888982).

      2) The authors have used scanning electron microscopy to analyze the ultrastructure of dendritic spines and determine whether PSD-95 regulates extinction-induced synaptic growth. In addition, the authors complemented these studies by investigating the effect of PSD-95 overexpression and fear extinction training on synaptic transmission in the dorsal CA1 ex vivo. However, it is hard to understand what the observed changes in dendritic spines and EPSC amplitude mean if the behavior of the animals was the same. This point should be discussed in the article.

      3) In Figure 5, the authors showed that chemogenetic inactivation of CA1 changed PSD-95 expression in all the 3 subregions of CA1 (stOri, stRad and stLM). However, the extinction training behavior in Figure 1 demonstrated an effect only in 2 subregions (stOri and stLM). The authors should clarify this discrepancy. In addition, in the same series of experiments (Fig. 5Ciii), it is unclear whether the reduction in PSD-95 expression induced by chemogenetic inactivation is sufficient to bring the PSD-95 expression to the same post-conditioning levels.

      4) The authors showed an interesting behavioral effect in the second part of the extinction phase (Figure 6C), similar to the results in Figure 4C. However, to confirm that phosphorylated PSD-95 is crucial for the maintenance of extinction memory, the authors may want to consider a direct comparison between the levels of phosphorylated PSD-95 right after extinction 1 and extinction 2. Differences in the expression would clarify whether the phosphorylated PSD-95 expression is further increased after additional extinction training, which would help to link the effect of chemogenetic inactivation on behavior. At least some discussion is needed for this part.

      5) The authors used immunostaining and confocal tools to analyze 3 domains of dendritic tree of dorsal CA1 area in Thy1-GFP(M) mice (stOri, stRad and stLM) on different fear phases (conditioning and extinction). They found a significant decrease of PSD-95 expression, spine density and spine area in stOri and stRad during conditioning and a rescue of such decrease during extinction. However, the authors’ interpretation is that extinction resulted in an upregulation of PSD-95, which doesn't seem to be the case if you compare the numbers with the naïve group. Please clarify this point.

    4. Summary: This timely study provides important and novel findings with regard to the role of PSD-95 protein in fear extinction formation and helps to advance our understanding of how dendritic changes in the hippocampus regulate fear maintenance. The findings should appeal to those interested in hippocampal function, fear and fear-related conditions, and extinction-based therapies. The major strengths of the paper lie in the use of a wide range of complementary technical approaches, and the significance of addressing specific molecular mediators of fear attenuation. Reasonable alternative explanations were identified for some of the key findings and the conclusions may not perfectly reflect the observations and experimental designs.

      Reviewer #1 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      This paper is primarily about modeling the ERK pathway during the induction of synaptic plasticity. This pathway has been previously modeled, and this is cited in the paper. The main addition here is the effect of SynGAP, which is necessary in some forms of LTP. This is a very detailed study, and what it seems to primarily show is that the ERK pathway favors spaced over massed stimulation protocols. This is a very detailed paper, but no conceptually new ideas are presented here. The paper adds to an existing foundation, but fails to make the case that this is a very significant addition. What is the significant consequence at a higher level of these added details?

      The ERK pathway is just one component of a much larger set of pathways that control synaptic plasticity. How much do we learn from studying this pathway in isolation? Also, the paper cites the importance of this pathway to L-LTP; is it the induction phase of L-LTP? It seems so because ppERK decays in less than an hour. How then does this pathway contribute to the maintenance of L-LTP? These processes, such as a possible upregulation of protein synthesis, are not part of this model either.

      This paper studies in detail different pathways that influence ERK activation in synapses. This is a very detailed study, but how many details do we actually know? For a detailed paper though it seems that many of the details are missing. Is there a detailed diagram of reactions, or set of equations for all these reactions? Some coefficients are named in figure 1, and this might be sufficient for a schematic description of the model in the paper, however there must be somewhere a detailed description of all reactions. How many species are there here, how many coefficients? How are coefficient values known? How many coefficients are directly estimated? The paper does carry out an extensive robustness analysis, though it is not well explained.

      What are the major takeaways from this paper, and what experiments could test this model?

      To summarize, the paper is very detailed, carefully constructed and executed, but it fails to convince that the problem it addresses is very significant, and it makes no conceptual breakthroughs.

    2. Reviewer #2:

      Miningou and Blackwell in their manuscript titled "Temporal pattern and synergy influence activity of ERK signaling pathways during L-LTP induction" explored the contributions of upstream pathways to ERK activation during LTP. The authors expanded on their previously published LTP model to assess the influence on ERK activation of each of the upstream pathways originating from cAMP or Ca2+ activated with differing temporal patterns. This manuscript's aim is quite germane, since 1) ERK plays such a central role in learning and memory and its cellular proxy LTP; 2) the Ca2+/cAMP/PKA system is highly complex and nonlinear, with multiple feedback loops. The resulting manuscript has the potential to be impactful. The approach of using a stochastic reaction-diffusion model is state of the art and appropriate for the modeling of these subcellular events in spines. And the modeling insights are very intriguing as the authors predict that ERK activation by cAMP/PKA or Ca2+ pathways differ in their linearity, these pathways can synergize during LTP and this may involve a novel feedforward loop containing synGAP. The authors do a marvelous job placing their findings within the huge body of LTP literature.

      There are, however, a couple of points that I feel should be addressed:

      1) There needs to be additional technical detail on how the original models were expanded. The model presented here was developed by merging the Jȩdrzejewska-Szmek et al., 2017 and Jain and Bhalla, 2014 models. These models were developed based on experimental data and validated with independent experimental datasets in a rigorous manner. It is not clear how combining these two models, together with the added molecules and reactions, has affected the dynamics of ERK activation, or how comparable the results are to the original experimental data used for model development in the previous modeling efforts. It is not clear if the model was reparameterized.

      2) Beyond the ERK activation traces, it would be useful for clarity's sake to also include the simulated traces for the activation of the upstream molecules (PKA, RAS, RAP, etc). Given that additional changes have been made, additional information should be provided to ensure that the contribution of each pathway is accurately represented.

    3. Reviewer #1:

      This study takes on the question of the roles of the many pathways leading to ERK activation in long-term potentiation. This is an advance: few models consider more than a couple of input pathways. The authors consider two aspects: how pathways sum to give strong responses, and distinct temporal pattern selectivity. They show that both summation linearity, and pattern selectivity, are strongly governed by which pathways are engaged in driving the response.

      The model and analysis are potentially interesting, but the paper would be much strengthened if there were more convincing validation of the properties of the model by way of simulations to compare with experiments. Further, the pathways chosen are already one step into the synapse. Thus the actual combination of pathway activations would not be quite as cleanly separated if they were driven by synaptic input.

    1. Summary: This research makes important, incremental contributions to the fundamental understanding of propofol interactions with bacterial voltage-gated sodium channels.

      Public Review:

      The reviewers agree that this research adds to the fundamental understanding of propofol interactions with bacterial voltage-gated sodium channels. Here an objective avenue to binding site mapping is taken involving a photoactivated azide propofol derivative. The strategy identifies two adjacent sites at the intracellular face of Nav channels. These sites are provocative as they settle into a mechanistically rich channel region where the voltage-sensor is coupled to the pore. The manuscript is well-written and referenced, and the conclusions are aligned with the data presented. The methods are appropriate, and the data appear to be of high quality. The manuscript is internally consistent, and the findings are quite interesting.

      The primary concern is that these results were deemed to add incrementally to recently published studies (Yang et al., JGP, 2018) which came to similar conclusions, without the support of the photoaffinity ligand results. Additionally, there were questions about whether voltage-gated sodium channels are involved in the anesthetic actions of propofol, technical questions about molecular simulations, and suggestions for control experiments.

  3. Jan 2021
    1. Reviewer #3:

      Behaviours that are instrumental for producing reward can be either goal-directed or, after repeated practice, habitual. Tasks that dissociate these types of learning, notably outcome devaluation, are tricky to implement for studying intravenous drug delivery, although there is great interest in understanding the role of habits in controlling drug use and addiction, and so this paper is important in that regard. This article takes a new approach, analyzing response latencies to infer the type of decision-making process that underlies a reward-seeking behaviour. Goal-directed behaviours are argued to involve evaluation of the outcome of responding and/or deliberation between choices, both of which should take time and slow responding relative to an efficient but inflexible habit. So I think this approach is quite interesting. The paper is well written and the predictions are clear.

      My main issue in evaluating the current article is that while different predictions are made about when response latency should be relatively fast or slow, since the article is framed in terms of dissociating goal-directed and habitual processes, I feel there should be some independent evaluation of whether the target behaviour is in fact goal-directed or habitual. The authors rely on the amount of training as extended training has been shown to promote habitual control. However, exactly how much training is needed and how other parameters (type of reward, schedules of reinforcement, choice or single outcome) affect when habitual control may emerge varies widely in the literature and I don't think we can take for granted that after a certain amount of training responding will be habitual without testing that.

      It is also important to consider alternative explanations for differences in response latency. A behaviour that is well-practiced might well be expected to become more efficient and faster. This need not be due to habit formation. The authors acknowledge the possibility that responding could be at floor but don't really discuss it or whether it might apply more to the saccharin response.

    2. Reviewer #2:

      When animals are given a choice between drug and nondrug reinforcers, they will most often choose the nondrug alternative even when presented with highly reinforcing drugs of abuse. This is difficult to reconcile with known behavior in humans and complicates modeling aspects of addiction that are critical to the disorder, such as choosing to use drugs above all other reinforcers. Recent work by this same group has reported that responding for the nondrug reinforcer is, surprisingly, insensitive to devaluation. This suggests that the choice for the nondrug reinforcer is under habitual, rather than the presumed goal-directed, control and may explain why animals most often choose the nondrug reinforcer over drug reinforcers. Moreover, because there is no devaluation procedure for determining whether drug choice is habitual or goal-directed, it's not known if choice for drug is also habitual or remains goal-directed.

      The manuscript by Vandaele et al., therefore, sought to develop a procedure for determining whether behavior of rats making choices between saccharin and cocaine reinforcers was habitual or goal-directed based on reaction times (RT). Based on previous theories, the authors argue that goal-directed behavior should have slower RTs on choice trials versus sampling trials (e.g., because animals are deliberating between the alternatives) whereas habitual behavior should have similar RTs across both sampling and choice trials. The authors also present a third possibility in which options are evaluated sequentially, rather than simultaneously, resulting in RTs being longer in the sampling versus choice trials. The authors report that rats with minimal training and who are presumed to be goal-directed have slower RTs in choice trials compared to sample trials whereas rats that have had extensive training have similar RTs in the choice and sampling phases. These findings are consistent with their hypotheses. Moreover, they demonstrate that in the small subset of rats that prefer cocaine over saccharin, RTs in the sampling trials are longer than that in the choice trial suggesting that cocaine preferring rats are not evaluating each of the options. These data are the first to evaluate habitual responding for a drug reinforcer and suggest that comparing latencies across different task phases could be used to measure habitual and goal-directed behaviors.

    3. Reviewer #1:

      Vandaele et al. probe the mechanisms of decision making in rats when making a forced choice between drug and non-drug reward. The authors have led the field in this domain. In this manuscript, a retrospective analysis of choice response times from many rats in their past work is used to tease out potential decision-making mechanisms. We know already from decades of work that choice response times are almost always log-normally distributed (humans, non-human primates, rodents). The question here is whether differences in the mean and dispersion of these distributions can be used to derive insights into the nature of the decision-making mechanism - a deliberative comparison versus a race model - and how this may differ for rats that prefer cocaine over saccharin and how this might be altered by more extended training. These questions are framed in terms of the differences between goal-directed and habitual behavior which, to be frank, I found less compelling (these response time data are of significant interest in their own right). I enjoyed reading this manuscript. It was thoughtful and well presented. I have only two comments.

      First, much, if not all, of the absolute differences between latencies in sample and choice phases appear to be carried by the sample rather than the choice phase. Choice latencies for cocaine preferring rats, saccharin preferring rats, and the indifferent rats are all very similar. In contrast, the sampling latencies for cocaine preferring rats and the indifferent rats are longer. I am not sure why this should be. My reading was that the authors were more concerned with the choice side of the experiment being different, not the sample phase. Is this predicted by the models being tested? I struggled to understand why an SCM-like model would predict the difference being in the sample phase. Either way, the authors could be clearer about where the difference is expected to lie and why the sample phase is so obviously different in some conditions and the choice phase so similar.

      Second, the main and real issue for me is whether the differences between response latencies in the sample versus choice phases plausibly reflect operation of different decision-making mechanisms (race model versus deliberative processing) or different operation of the same decision-making mechanism. I don't know the answer, but I could not really derive the answer from the data and modelling provided. The authors frame the differences in response time as being uniquely predicted or explained by different forms of choice. The models that the authors are using are closely linked to, and intellectually derived from, models of human choice reaction time. The most successful of these models are the diffusion model (DDM) (Ratcliff, R., Smith, P.L., Brown, S.D., and McKoon, G. (2016). Diffusion Decision Model: Current Issues and History. Trends in Cognitive Sciences 20, 260-281) and the linear ballistic accumulator (LBA) (Brown, S.D., and Heathcote, A. (2008). The simplest complete model of choice response time: linear ballistic accumulation. Cognitive Psychology 57, 153-178).

      Even though the DDM and LBA adopt different architectures to each other (but the same architectures as those in Supp Fig 1A), they are intended to explain the same data. Of relevance, the same model (a DDM or an LBA) can explain differences in both the response distribution and the mean response time via changes in the starting point of evidence accumulation, rate of evidence accumulation, and/or the boundary or threshold at which evidence is translated into choice behavior. So, for either a difference accumulator model (DDM) or a race model (LBA), the difference between sampling and choice performance could reflect changes in how the model is operating between these two phases, including a change in the starting point of the decision [bias], a change in rate of accumulation [evidence], a change in threshold [caution] or collapsing boundary scenario, rather than reflecting operation of a completely different decision-making mechanism.

      In thinking of a way forward I readily concede I could be wrong and the authors may effectively rebut this point. Another option could be to acknowledge this possibility and discuss it. E.g., does it really matter if it is a qualitatively different decision-making process or different operation of the same decision-making mechanism? I don't really think the action-habit distinction lives or dies by reaction/response time data, this distinction is almost certainly far less absolute than often portrayed in the addiction literature, and it is generally intended as an account of what is learned rather than an account of how that learning is translated into behaviour (even if an S-R mechanism provides an account of both). Response time data tell me, at least, something different about how what has been learned is translated into behaviour. The third, marginally more difficult but more interesting option, would be to explore these issues formally and to move beyond simple descriptive or LDA analyses of response time distributions. The LBA has a full analytical solution and there are reasonable approximations for the DDM. Formal modelling of choice response times (e.g., Bayesian parameter estimation for a race model or DDM) could indicate whether a single decision-making mechanism (LBA or DDM or something else) can explain response times under both sample and choice conditions or not. This is a standard approach in cognitive modelling. This would be compelling if it showed the dissociation the authors argue - i.e. one model cannot be fit to both sample and choice datasets for all animals. However, if one model can be fit to both, then formal modelling would show which decision making parameters change between the sample and choice conditions for cocaine v saccharin v individual animals to putatively cause the differences in response times observed. 
      Either way, more formal modelling would provide a platform towards identification of those specific features of the decision-making mechanisms that are being affected.
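
      As a rough illustration of the point above that a single accumulator model can yield different mean response times through parameter changes alone, the following minimal sketch simulates a two-boundary diffusion process. It is not the authors' model or analysis: the function name `simulate_ddm`, the symmetric boundaries at plus and minus `threshold`, and all parameter values are assumptions chosen for illustration only.

```python
import random


def simulate_ddm(drift, threshold, start=0.0, noise=1.0, dt=0.01,
                 n_trials=500, seed=0, max_time=20.0):
    """Simulate a simple diffusion decision process (illustrative only).

    Evidence starts at `start` and accumulates with mean rate `drift`
    plus Gaussian noise until it reaches +threshold or -threshold
    (or until `max_time` elapses).
    Returns (mean decision time, fraction of trials ending at the upper bound).
    """
    rng = random.Random(seed)
    step_sd = noise * dt ** 0.5  # noise scaling for an Euler step of size dt
    total_time = 0.0
    upper_hits = 0
    for _ in range(n_trials):
        x, t = start, 0.0
        while abs(x) < threshold and t < max_time:
            x += drift * dt + rng.gauss(0.0, step_sd)
            t += dt
        total_time += t
        upper_hits += x >= threshold
    return total_time / n_trials, upper_hits / n_trials


if __name__ == "__main__":
    # Same mechanism, three hypothetical parameter settings.
    for label, drift, threshold in [("baseline", 1.0, 1.0),
                                    ("more caution", 1.0, 1.5),
                                    ("stronger evidence", 2.0, 1.0)]:
        mean_rt, p_upper = simulate_ddm(drift, threshold)
        print(f"{label}: mean RT = {mean_rt:.2f}, P(upper) = {p_upper:.2f}")
```

      In this sketch a higher threshold lengthens mean decision time and a higher drift rate shortens it, with no change to the underlying mechanism. Distinguishing these accounts in real data would require fitting full response time distributions, which this toy simulation does not attempt.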

    4. Summary:

      In this manuscript the authors perform a retrospective analysis in an attempt to delineate the role of goal-directed versus habitual mechanisms underlying choice between drug and non-drug rewards. Specifically, the authors utilized data generated in their laboratory to assess cocaine-versus-saccharin choice following limited and extended training paradigms. A sequential choice model was used to assess the prediction that increased latencies during choice reflect goal-directed control, whereas no change in latencies reflects habitual control. Based on this model, the authors report that rats engage in goal-directed control after limited training, and adopt more habitual responding after extended training. The authors conclude that the sequential choice model is specific to habitual choice.

      While the Reviewers appreciate the approach and conceptual framework described in this manuscript, they are all in agreement that additional data and analyses are needed to better support the claims surrounding goal-directed versus habitual control of reward-seeking behavior. For example, an independent evaluation of whether the target behavior is in fact goal-directed or habitual seems necessary to support such claims. Reviewers’ comments and suggestions for improvement are included below.

      Reviewer #1 and Reviewer #2 opted to reveal their name to the authors in the decision letter after review.

    1. Author Response:

      Reviewer #3:

      However, a lot of the data presented in the manuscript are not novel and were previously published. A recent Molecular Cancer Research paper by Llabata and collaborators published in April 2020 (referred to in the text) has already identified the same MGA interactors by Mass Spectrometry and the same binding sites by ChIP-Seq using human lung adenocarcinoma cell lines. Llabata et al. found that MGA interacts with the non-canonical PCGF6-PRC1 complex (named PRC1.6) that includes L3MBTL2 and that the complex also contains MAX and E2F6 but not MYC. They clearly show that MGA binds to and represses genes that are bound and activated by MYC, convincingly showing that MYC and MGA have opposite functions. This unfortunately tempers the enthusiasm of the reviewer.

      This reviewer states that "... a lot of the data presented in the manuscript are not novel and were previously published". The reviewer goes on to write that the Llabata et al. 2020 paper (referring to doi: 10.1158/1541-7786.MCR-19-0657 [https://mcr.aacrjournals.org/content/18/4/574]) "has already identified the same MGA interactors by Mass Spectrometry and the same binding sites by ChIP-Seq using human lung adenocarcinoma cell lines. Llabata et al. found that MGA interacts with the non-canonical PCGF6-PRC1 complex (named PRC1.6)..."

      We strongly disagree with the reviewer's statements.

      1) A major focus of our paper is that it provides and validates a mouse model in which we delete MGA and demonstrate its tumor suppressive activity. The experiments in Llabata et al., including the biological assays and the ChIP-Seq, were done by overexpressing MGA in cells which already express endogenous MGA. Therefore, all their data monitor the consequences of overexpression of MGA, a situation without clear biological relevance. In the experiments reported in our paper, we delete MGA. Therefore our molecular data refer to a comparison between MGA null cells and the same cells expressing endogenous MGA. This is important since MGA is a tumor suppressor and its loss of function is what is crucial biologically, as we show here for the first time in our lung adenocarcinoma model. Furthermore, by deleting MGA we were able to show that its loss corresponds to an increase in a core set of target genes previously associated with PRC1.6. We also show that members of this core group are relevant to the proliferation of tumors that lack MGA.

      2) The PRC1.6 complex has been known to be associated with MGA since at least 2012 as indicated in our references cited. Llabata et al confirmed that result. Our paper reports PRC1.6 subunits are associated with MGA through the DUF4801 domain of MGA. This is the first identification of the interface between PRC1.6 and MGA. It is important and relevant because multiple frame shift mutants in MGA have the consequence of deleting this region in a wide range of tumor types.

    2. Reviewer #3:

      Mathsyaraja and collaborators analyzed the role of the MAX-Gene associated protein, referred to as MGA, in mouse models and human cell lines and organoids of Non-Small Cell Lung Cancer. MGA is a repressor, a MYC antagonist that opposes its transcriptional activity. It has TBX and bHLH domains. They found that MGA loss by shRNA or CRISPR accelerated tumor development in vivo in the KP mouse models. Using RNA-Seq, the authors showed that MGA loss leads to the de-repression of targets of the atypical/non-canonical PRC1.6 polycomb complex, E2F and MYC, as well as increased invasion. ChIP-Seq/CUT&RUN as well as proteomics revealed that MGA, E2F6 and L3MBTL2 co-occupy thousands of promoters and that MGA interacts with E2F6 and many core members of PRC1.6. Finally, they mapped the DUF domain as required to bind the PRC1.6 complex and bring it to promoters.

      Overall, the experiments are well executed, the paper clearly written and the conclusions justified by the data.

      The new data in the present report are the in vivo data in the mouse models, the role of MGA in repressing invasion, in increasing IFN signaling and the anti-tumor response, and the identification of the DUF domain required for binding to the PRC1.6 complex.

      However, a lot of the data presented in the manuscript are not novel and were previously published. A recent Molecular Cancer Research paper by Llabata and collaborators published in April 2020 (referred to in the text) has already identified the same MGA interactors by Mass Spectrometry and the same binding sites by ChIP-Seq using human lung adenocarcinoma cell lines. Llabata et al. found that MGA interacts with the non-canonical PCGF6-PRC1 complex (named PRC1.6) that includes L3MBTL2 and that the complex also contains MAX and E2F6 but not MYC. They clearly show that MGA binds to and represses genes that are bound and activated by MYC, convincingly showing that MYC and MGA have opposite functions. This unfortunately tempers the enthusiasm of the reviewer.

    3. Reviewer #2:

      This manuscript by Mathsyaraja et al. studies the oncogenic loss of the Max-gene-associated (MGA) protein due to deletion or mutation in cell-lines, in mice and in human cancers (cell-lines and tumors). The authors knocked out MGA by aerosol-delivered, CRISPR-CAS expressing lentiviruses that simultaneously Cre-activated a Lox-stop Kras oncogene. The loss of MGA accelerated proliferation and oncogenesis, and shortened survival. Oncogenesis was further enhanced by enforced TP53 deletion in these lung tumors. RNA-seq and ChIP-seq of MGA+ or - cell-lines demonstrated the up and downregulation of various gene classes (thousands of genes) according to function and regulation including of PRC1.6 targets, meiosis regulators, TGF-beta signaling pathway components, EMT regulators, anti-tumor immunity, as well as of MYC, E2F, etc. Different cell lines exhibited both overlapping and distinct target sets. MGA knockout cells were more migratory and invasive and displayed actin-protrusions in accord with this behavior. They show that a Domain of Unknown Function in the mid-region of MGA engages PRC1.6 and is required to depress proliferation. The DUF is also required to limit actin-protrusions. Human colon organoids were studied since MGA mutations and deletions are also apparent in colon cancer. Again, shared and distinct targets of MGA action were inferred.

      The authors make a strong case that MGA is an important tumor suppressor that operates through PRC1.6 for some of its actions.

    4. Reviewer #1:

      The authors report the analysis of a Mga deletion and provide convincing evidence that Mga functions as a tumor suppressor during lung carcinogenesis. The data shown are clear, the message is important and the discussion is very careful. There is a certain overlap with a recent study by Llabata et al., but there is sufficient novelty in the current study.


      It seems that the investigation of publicly available datasets is essentially identical to the Schaub et al. analysis and not new data. If the authors want to maintain this, they would need to better explain what is new. One important piece of information that seems to be missing is whether the mutations are homozygous or heterozygous. So data on MGA and MYC protein expression in human tumors would greatly strengthen this part.

      Conceptually, one would like to know whether tumor development in an MGA-deleted situation depends on MYC. One would also like to know whether the polycomb complex that is assembled by MGA is tumor-suppressive. Therefore, the authors should perform a similar analysis as they did for MGA (introduce sgRNAs into the lung models) and score the phenotypes they get. Both experiments could be done in cell lines established from this model and either in vitro (that would allow a mechanistic analysis, e.g. RNA-seq) or upon re-transplantation. This would also prevent simply reporting negative results.

      The interpretation of the Venn diagram and the heatmaps in Figure 5A,B is somewhat uncertain. If one plots these for MYC, occupancy often simply parallels occupancy by RNAPII, so essentially being bound by MYC simply says the promoter is open/active. Is this the case for MGA and its complex partners? Or is there a specificity in binding? The authors should do RNAPII ChIP-Seq in these cells, preferentially +/- MGA, and then show these alongside (and plot a correlation between MYC, RNAPII and MGA occupancy).

      Along these lines, it is hard to understand how one obtains the extreme p-values shown in Figures 5E and 5H; I would challenge this. If the authors want to maintain this, they should not use ENCODE data, but simply determine what genes are active in the cells (e.g. what promoters are bound by RNAPII), use those as the background list, and calculate p-values for the overlap between MYC, MAX and E2F6.
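
      To make the suggested calculation concrete, here is a minimal sketch of a one-sided hypergeometric overlap test in which both gene lists are first restricted to a chosen background. It is only an illustration of the general technique, not the analysis used in the manuscript; the function name `overlap_p_value` and the toy gene identifiers are hypothetical.

```python
from math import comb


def overlap_p_value(background, set_a, set_b):
    """One-sided hypergeometric test for the overlap of two gene sets.

    Both sets are restricted to `background` (e.g. genes with active
    promoters) before testing. Returns (overlap size, P(overlap >= observed))
    under random sampling of |set_b| genes from the background.
    """
    bg = set(background)
    a = set(set_a) & bg
    b = set(set_b) & bg
    n_bg, n_a, n_b = len(bg), len(a), len(b)
    k = len(a & b)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(comb(n_a, i) * comb(n_bg - n_a, n_b - i)
               for i in range(k, min(n_a, n_b) + 1))
    return k, tail / comb(n_bg, n_b)


if __name__ == "__main__":
    # Toy data: the same two lists tested against a broad and a narrow background.
    all_genes = [f"g{i}" for i in range(20)]
    active_genes = all_genes[:5]
    targets_1 = all_genes[:5]
    targets_2 = all_genes[:5]
    print(overlap_p_value(all_genes, targets_1, targets_2))
    print(overlap_p_value(active_genes, targets_1, targets_2))
```

      The toy example shows why the choice of background matters: an overlap that looks very unlikely against all annotated genes can be entirely unremarkable once the background is limited to genes that are active in the cells.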

      Based on the description, the ChIPSeq analyses are not spike-normalized and I could not find information about the number of repeats. If it is n=1, the authors need to find a way to exclude that the differences are due to experimental variation.

      I think the Llabata reference is missing in the list.

    5. Summary:

      The reviewers agreed that the paper provides strong in vivo data for a tumor-suppressive role for Mga in lung carcinogenesis. The authors convincingly show that MGA is important in oncogenesis. We note here that MGA is highly understudied (~200 publications) in and of itself despite its involvement with the MYC network for oncogenesis (~41,000 publications at the current time). Given a protein of 3000 amino acids, the number of potential protein partners and PTMs that might modify its tumor suppressor functions are staggering. However, the reviewers also noted that a previous paper has addressed the same topic and the novelty of the data presented here needs to be better explained and additional experiments are needed to strengthen and expand the new aspects.

      Reviewer #1 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      This manuscript presents its two main results in Figure 3:

      In response to a non-hydrolysable glucose analogue, E. coli cells show...

      (1) Increase in fluorescence intensity of motors with labelled stator proteins, (2) Increase in speed of motor rotation and swimming

      Sufficient controls are described to rule out possible indirect explanations of this effect, via buffer refreshment, metabolism of glucose, proton motive force (Fig 3D) and rotation direction (Fig 4F). By contrast, the effect is demonstrated to depend upon the chemotaxis receptor for glucose (Fig 4B) and the phosphotransferase system (Fig 4D), which supports the chemotaxis system. These results are interpreted as evidence for a direct effect of the chemotaxis system upon the number of independent stator units, and thereby upon motor and swimming speeds.

      This is a novel finding, and with better statistics (more repeats of fluorescence experiments) and better presentation of the findings (see below), the paper would be an important contribution to the field of bacterial chemotaxis. However, especially without presenting or postulating a mechanism for the proposed direct effect, the paper might be more suitable for a more specialist journal.

    2. Reviewer #2:

      1) The authors hint at the involvement of c-di-GMP signaling via the YcgR protein. This hypothesis can be tested by knocking down the ycgR gene and repeating the assay, but this has not been done or reported. Adding these data to the manuscript would make the paper significantly stronger.

      2) Do other chemoreceptors (Tar, Tsr, Tap) also act in the same way with their respective ligands? It would be useful to know if this effect is specific to Trg or if it is also found in the other chemoreceptors.

      3) In figure 3C, what is the reason that the GFP intensity and the speed do not have the same range? In other words, why is the slope not equal to 1? Since there is 1:1 correspondence between the number of MotB and the number of GFP, shouldn't the slope be 1?

      4) The authors do not cite or discuss the recent literature on load-dependent stator remodeling (e.g. PMIDs: 29183968, 31142644). It would be helpful to have a more in-depth discussion on how the observed stator unit recruitment relates to stator remodeling in response to load.

    3. Reviewer #1:

      Bacterial chemotaxis is a well-studied process at many levels, from the chemical networks that control the rotation of the flagella to the fluid dynamics of the motility itself. In the present paper the authors address the widely held view that ligand sensing is responsible only for changing the rotational bias of the motor driving flagellar motion, and not its speed. Using a well-established method of quantifying motor activity by monitoring the rotation of the cell body when the flagella are stuck to a surface, a fluorescent labelling technique to determine the membrane potential, a mutant with fluorescently labelled stator units, and direct measurements of swimming speed, the authors show that the sensing of a non-metabolizable analogue of glucose leads to a momentary increase in motor speed and stator unit numbers. At the same time, control experiments make it clear that this is purely as a consequence of ligand sensing. This behaviour is indeed contrary to the accepted view, and although the fundamental mechanism is as yet unclear, this is an important result.

      On the whole I am very supportive of this work, which has been done with great care and clear logic. My only suggestion for improvement would be to make quantitative the changes in chemotactic behaviour that would be expected as a consequence of the motor speed changes revealed in this research. That is, can the authors put some numbers into a standard analysis of run-and-tumble dynamics to quantify any improvement in chemotactic efficiency or speed under such changes?
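      As a rough illustration of the kind of estimate meant here, the sketch below uses the textbook expression for the effective diffusion coefficient of a run-and-tumble swimmer, D = v^2 * tau / (3 * (1 - alpha)), where v is the swimming speed, tau the mean run duration and alpha the mean cosine of the turning angle between runs. The speeds, run time and alpha are illustrative assumptions, not values from the manuscript, and the expression describes only unbiased spreading; a full treatment of chemotactic drift would also need a model of how run length depends on the gradient.

```python
def run_and_tumble_diffusion(speed_um_s: float, mean_run_s: float,
                             mean_cos_turn: float = 0.33) -> float:
    """Effective diffusion coefficient (um^2/s) of a run-and-tumble swimmer.

    speed_um_s: swimming speed during runs
    mean_run_s: mean run duration
    mean_cos_turn: mean cosine of the angle between successive runs (assumed value)
    """
    if not -1.0 <= mean_cos_turn < 1.0:
        raise ValueError("mean_cos_turn must lie in [-1, 1)")
    return speed_um_s ** 2 * mean_run_s / (3.0 * (1.0 - mean_cos_turn))


if __name__ == "__main__":
    # Hypothetical example: a 20% speed increase at fixed run duration.
    before = run_and_tumble_diffusion(20.0, 1.0)
    after = run_and_tumble_diffusion(24.0, 1.0)
    print(before, after, after / before)
```

      Because D scales with the square of the speed, even a modest speed increase changes how fast a population spreads, provided the run duration and turning statistics stay the same.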

    4. Summary: This is an interesting study reporting an increase in the rotation speed of the E. coli flagellar motor upon the sensing of a non-metabolizable glucose analog (2Dg) by the cell. The authors conclude that this increase is due to an increase in the number of torque-generating stator complexes that drive the motor. Knockout of the trg gene abolished this effect, suggesting that sensing of 2Dg by the Trg chemosensor is responsible. Involvement of membrane potential, the PTS pathway, and the chemotaxis response regulator CheY is ruled out. The manuscript is well-written, and the data are convincing. But the mechanism remains unclear.

      Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      This study by Pipitone et al. combines SBF-SEM microscopy with quantitative proteomics and lipidomics to explore chloroplast differentiation. Authors describe that chloroplast biogenesis occurs in a first phase of structure establishment with thylakoid biogenesis, followed by a second phase of chloroplast division. The images and 3D reconstructions are beautiful, the quantitative data are novel, and their integration offers a new perspective into the seedling de-etiolation process, a model system for physiological and molecular studies. However, in my opinion some aspects need to be better explained and significantly improved.

      • In lines 276-282, the authors write: "After 8h of illumination (T8), we observed decreased abundance of only one protein (the photoreceptor cryptochrome 2, consistent with its photolabile property) and increased levels of only three proteins, which belonged to the chlorophyll a/b binding proteins category involved in photoprotection (AT1G44575 = PsbS; AT4G10340= Lhcb5; AT1G15820= Lhcb6)". This is striking, as many well-studied proteins change in abundance during the first hours of de-etiolation. In fact, looking into the data set with the quantification data for the ~5,000 proteins, it appears that many proteins do show significant changes between T0 and T8. Examples include PORA and ELIP, whose changes are also reflected in Figure 6A.

      • Related to the above, well-known proteins such as phyA and HY5, which undergo drastic changes in abundance when etiolated seedlings are first exposed to light, do not show changes at T4, T8 and T12 relative to T0 in the proteomics data set. This raises questions about the proteomic approach (sensitivity of the method?) or the experimental setup. Could the authors please comment on this? I feel that validation of the proteomics approach is critical, especially taking into account the central conclusion that "the first 12h of illumination saw very few significant changes in protein abundance".

      • Lines 570-572: A reference is needed. Also, it is mentioned that PSII appears later than PSI, which does not seem to match the observation that PSII proteins appear earlier than PSI, or that the surface area occupied at early time points by PSII is greater than the one occupied by PSI. Please check.

      • Are the calculations of thylakoid surface expansion over time consistent with previous available data using tomography? Please include.

      • In the introduction, authors could include mention of the massive transcriptional reprogramming that takes place during de-etiolation. In addition, I think that comparison of the proteomics data with the transcriptomic changes during de-etiolation (well described in the literature) would allow further understanding of the distinct phases proposed. For the chloroplast proteins already present in the dark, how does this correlate with expression of the corresponding genes?

    2. Reviewer #2:

      This impressive manuscript describes a comprehensive, multifaceted analysis of the morphological and molecular changes that accompany photosynthetic establishment during seedling de-etiolation. Morphological data, focusing in particular on the photosynthetic thylakoid membranes, are derived using transmission electron microscopy (TEM), serial block face scanning electron microscopy (SBF-SEM), and confocal microscopy, while quantitative molecular data on the abundances of proteins and lipids are derived using mass spectrometry and western blotting. The various data are acquired over a time course between 0 h and 96 h post illumination, and with a high level of temporal resolution. The data allow the authors to develop a mathematical model for the expansion of the surface area of thylakoids (reaching 500-times the surface area of the cotyledon leaf), which matches well with experimental observations from the SBF-SEM analysis for earlier, but not later, stages of de-etiolation. Moreover, the data point to a two-phase organization of the de-etiolation process, with the first phase ("Structure Establishment") characterized by thylakoid assembly and photosynthetic establishment, and the second phase ("Chloroplast Proliferation") characterized by chloroplast division and cell expansion.

      The data are of a high standard, and the depth and breadth of analysis in a single, unified study is unprecedented. While it is arguable that there are few major, completely novel insights reported here (indeed, in the Discussion, the authors very helpfully point out how many of the parameters they have measured are consistent with data reported elsewhere by others), this should not detract from the overall value of the study; a major and unique strength here is that all of the data have been acquired together and so are directly comparable. I have no doubt that this dataset will be extremely interesting to many researchers, and prove to be an invaluable resource for the plant science community. Consequently, I am sure that it will attract many citations.

      I have a few specific comments that I would like the authors to consider carefully, as follows.

      1) Figure 3. The 3D reconstructions are undoubtedly useful for deriving quantitative data, as they enable the derivation of thylakoid surface area data to verify the mathematical model. However, it is very difficult to see anything clearly in the images shown in the Figure. I wonder if the authors can make the images clearer, and then also point to and describe some of the key features. The videos do help a bit, but even these are not that clear.

      2) Page 9, second paragraph. It is here that the "two phases" model is first proposed. I really could not see a clear basis for proposing this model here, using the data that had been presented thus far. As I see it (and based on the way the two phases are described in the Discussion), one can't really propose this model until after the chloroplast number and cell size data have been presented.

      Moreover, the description of the second phase here ("and a second phase...") seems a bit inconsistent with the statement in the paragraph above that thylakoid surface area increases dramatically between T4 and T24, and much less between T24 and T96.

      3) Figure 6, and the related supplementary figure. Loading controls are missing here, and should be added. Also, it is stated that a number of proteins (PsbA, PsbD, PsbO, Lhcb2) are "detectable" at T0 (line 348, page 11). To me, they look UNdetectable.

      4) Dividing chloroplasts. On page 13, lines 412-413, it is stated that the volume of dividing chloroplasts was measured, and we are referred to Figures 8E and 4B in support of this statement. However, it is not explained how this was done. A clearer and more specific explanation is needed. Did the authors seek out dumbbell-shaped organelles and quantify those? If so, images are needed to illustrate this point. Also, I don't see anything relevant in Fig. 4B; this callout apparently belongs in the following sentence. The statement that the average size of dividing chloroplasts was higher than that of all chloroplasts (lines 413-414) is not really surprising if the authors were measuring organelles just on the point of becoming two organelles.

      5) Page 13, beginning of modelling section. The motivation for this section needs to be better introduced. When I first read it, I could not understand why the authors wished to again "determine the thylakoid membrane surface area", as this had already been discussed earlier in the manuscript.

      Also related to the modelling: Did the authors take into account the existence of appressed membranes when calculating the surface area exposed to the stroma (lines 431-432)? And, assuming it is clearly established that there is a 1:1 relationship between these proteins and the relevant complexes (lines 441-443), perhaps this should be stated and the relevant literature cited.

    1. Reviewer #3:

      Overall the manuscript is a valuable contribution and represents an important advance using the model that the authors have recently established in Doro et al. 2019.

      I have however a few suggestions for improvement, that I present below.

      Suggestions to strengthen the manuscript:

      1) Fig. 1 diagram is very useful. However, it would be very informative if the diagram could be followed by a representative quantification. For example, when injecting 200 T. carassii, what % of larvae is classified in the two infection categories? Could the authors also further discuss the % of T. low larvae where no parasites were observed during the clinical scoring? Have these larvae (or some of them) cleared the infection completely? Shouldn't they be classified/followed on their own?

      2) Fig. 2: Is the clinical scoring predictive of early death onset (or likelihood of death)? To show this, the authors could, for example, divide the T. car 200 survival curve into 2 separate curves, based on the clinical scoring at day 4-5.

      3) In Fig. 5 and Fig. 6 and related text, the authors describe their results as "macrophage proliferation" and "neutrophil proliferation". I would encourage them to avoid these terms and rephrase these sections. Normally "macrophage proliferation" is used to refer to resident tissue macrophages that occasionally are seen to divide/proliferate. To my knowledge, neutrophil proliferation in a similar manner has not been described. Most likely what the authors describe is myelopoiesis (in agreement, the authors also indicate that EdU staining is most commonly seen in hematopoietic tissues) and the EdU staining in mature macrophages/neutrophils is the result of a (recent) cell division of a hematopoietic progenitor cell. The authors do not have evidence that the terminally-differentiated cells (macrophages and neutrophils) are actually "proliferating". In the absence of more specific mechanistic insight, I would encourage the use of much broader terms, such as "increased production/number of macrophages/neutrophils" rather than "macrophage/neutrophil proliferation", throughout.

      4) The authors observe several very interesting phenotypes that they report in Fig. 7, 8, 9 & 10. The frequency of these phenotypes (association with infection and with each other) however is not quantified and tested statistically. In particular:

      • The authors report that macrophages, but not neutrophils, infiltrate in the cardinal vein, although both cell populations are accumulating on the outer side of the vasculature during infection. Can the authors quantify and test statistically these phenomena, i.e. by counting cells inside the vessel and associated (externally) with the vessel in the PVP, T. car-low and T. car-high groups? Also, do neutrophils ever interact with trypanosomes in other sections of the vasculature, if not in the cardinal vein? Do trypanosomes ever escape from the circulation and interact with neutrophils elsewhere?

      • The authors report that foamy macrophages occur inside the vasculature and are exclusive to high-infected larvae. Can the authors show some quantifications of these associations and perform statistical tests (i.e. count foamy/non-foamy mpeg+ cells inside/outside the vessels in the PVP, T. car-low and T. car-high groups)? Also, macrophages do not phagocytose T. carassii, but foamy macrophages are seen in the context of other (intracellular) Trypanosoma infection. Are macrophages here perhaps scavenging dead Trypanosoma from the circulation, and is this leading to the foamy macrophage phenotype? Trypanosomes are also leading to hemolysis and this could lead to increased phagocytosis of red blood cell debris by macrophages. Could this be linked to the foamy appearance? How specific is BODIPY, to distinguish cholesterol (typical of foamy macrophages), vs lipids derived by phagocytosis of cell debris (i.e. high in membrane phospholipids?)

      • The authors report that foamy macrophages occurring in T. car-infected larvae are characterised by a strong proinflammatory profile and are all il1beta and all tnfa positive. Significant differences are observed in the inflammatory response of macrophages in high- and low-infected individuals and in their susceptibility to infection. Can the authors quantify and test statistically these observations? For example, can the authors show that foamy macrophages are indeed more frequently il1b positive/tnfa positive than neighbouring non-foamy mpeg+ cells?

      • The authors report that a strong inflammatory profile is associated with the occurrence of foamy macrophages. However, it is not clear how widely spread the inflammation is and only images of macrophages and endothelial cells in the cardinal vein are shown. Moreover, only tnfa and il1b are assessed (using transgenic reporters). The authors also mention that they observe a mild inflammatory response in low-infected individuals and that this is strongly associated with control of parasitaemia and survival to the infection. Can they confirm strong vs mild inflammatory profiles and different association with survival in the 2 infection categories and PVP control with a panel of qRT-PCR for several inflammatory markers (i.e. il1beta, tnfa and other relevant cytokines and chemokines)?
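      For suggestion 2) above (splitting the T. car 200 survival curve by clinical score), a minimal sketch of the Kaplan-Meier estimate that could be computed separately for each score group is given below. The function name and the example data are hypothetical and do not come from the manuscript; a formal comparison of the two curves would additionally need a test such as the log-rank test.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: day of death or of last observation for each larva
    events: True if death was observed, False if the larva was censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # deaths plus censored larvae at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Hypothetical larvae scored at 4-5 dpi, then followed until death or day 15.
    high_score = kaplan_meier([5, 6, 6, 7, 8], [True, True, True, True, True])
    low_score = kaplan_meier([6, 9, 15, 15, 15], [True, True, False, False, False])
    print(high_score)
    print(low_score)
```

      Plotting the two resulting curves side by side would show directly whether the clinical score at day 4-5 separates larvae by subsequent survival.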

    2. Reviewer #2:

      Using this new Trypanosoma carassii infection model in larval zebrafish, Jacobs et al. have developed a new clinical scoring system to reliably separate high- and low-infected larvae in order to investigate their individual innate immune responses, with a special emphasis on macrophages and neutrophils.

      In summary, the separation system used in this study allows us i) to identify a strong macrophage and neutrophil proliferation response in high- and low-infected larvae, although it happens a bit earlier (5 dpi) for macrophages in low-infected larvae, and ii) to observe a differential distribution and morphology of macrophages, associated with the unique presence of more rounded foamy macrophages with a high pro-inflammatory profile in the vessels of high-infected zebrafish larvae. Together, this study constitutes the first report of the occurrence of foamy macrophages during an extracellular trypanosome infection.

      Although the paper is well-written and the findings are interesting as they bring new insights into the development of foamy macrophages in response to an extracellular pathogen, i.e. Trypanosoma carassii, using a zebrafish larvae model, I have a few concerns regarding the following:

      • The experimental infection model in zebrafish: figure 2 summarizes that only 15% of the infected larvae, named low-infected larvae, are able to survive the infection. As an explanation the authors refer to the trypanosusceptible vs. trypanotolerant background of the host observed in non-zebrafish models. However, in this particular setting, all the larvae possess an identical genetic background. Why, then, would the larvae behave differently in response to the same pathogen? In addition, there are no clear differences in either parasitic load at 2 dpi (figure 3F) or myeloid cell accumulation at 3 dpi (figure 4AB) that could explain the drastic difference in parasitic load based on mRNA expression at 4 dpi (figure 3F). The authors should discuss this briefly.

      • Figure 4: the representative pictures in Fig4B do not seem to clearly match the histograms depicted in Fig4C. For example, in Fig4B there seems to be a decrease in red fluorescence from 7 dpi to 9 dpi in low-infected larvae, which is not reflected in the histogram. Also, the representative picture of 7 dpi high-infected larvae seems to show at least as much red fluorescence as that of 9 dpi low-infected larvae, if not more.

      • Lines 494-496 state "No significant difference was observed between high-and low-infected fish, confirming that macrophages react to the presence and not to the number of trypanosomes.", reflecting that there are no differences in total macrophages or in their proliferation between low- and high-infected zebrafish larvae (Figure 5B&C). It is therefore not sufficiently clear on what basis the authors state a few lines later, as a conclusion, that "Altogether, these data confirm that T. carassii infection triggers macrophage proliferation and that proliferation is higher in low-infected compared to high-infected individuals, possibly due to a higher haematopoietic activity." The authors should revise this conclusion or provide stronger data to reinforce their results. Similar conclusions also need to be adjusted in the discussion section, and new elements are needed to explain the higher number of macrophages observed in figure 4.

    3. Reviewer #1:

      The authors devised clinical criteria for identifying zebrafish larvae with high or low T. carassii infections in order to track them. Using transgenic fish lines marking macrophages and neutrophils, the authors showed that both groups of larvae increase macrophage (and to a lesser extent neutrophil) levels in response to infection. However, the macrophages in high parasitaemia animals migrated into the capillaries and had elevated levels of inflammatory markers (TNF, IL-1) and lipids, indicative of a foamy phenotype. The authors conclude that a measured inflammatory response allows animals to control the initial infection, while an exaggerated inflammatory response leads to an environment in which the bloodstream trypanosomes can proliferate. The findings support and extend data from murine models of infection, by allowing direct visualization of the host immune response.

    4. Summary: This study investigates the role of the innate immune response in controlling bloodstream trypanosome infection in the zebrafish infection model recently developed by the authors. The study found that an innate immune response, characterized by controlled inflammatory response was sufficient to control infection in some individuals, while failure to control infection was associated with a strong inflammatory response characterized by expansion of foamy macrophages. The findings highlight the importance of a balanced immune response in controlling bloodstream trypanosome infections that are likely relevant to mammalian infections.

      Reviewer #1 and Reviewer #2 opted to reveal their names to the authors in the decision letter after review.

    1. Reviewer #2:

      This manuscript, "Lactobacilli in a clade ameliorate age-dependent decline of thermotaxis behavior in Caenorhabditis elegans," is focused on the impact of diet on age-dependent behavioral decline. The authors utilize a thermotaxis screen using different lactic acid bacteria (LAB) and identify strains of LAB with the ability to ameliorate age dependent decline in thermotaxis behavior. The study introduces some interesting results, including the finding that many LAB strains of the same clade can improve thermotaxis in older nematodes, despite disparate results on longevity. However, there were some questions remaining about methodology, and more importantly, there is very little evidence provided on what the molecular mechanism might be behind this phenomenon. Overall, this study contains interesting findings that are not developed thoroughly enough.

      Major Comments/Questions:

      1) How is LAB different from E. coli? Does the metabolic composition of LAB dictate its impact on the thermotaxis behavior of worms? In the manuscript the authors argue that LAB are a "better" food source than E. coli. How does one define better for something as broad as a food source? There is a difference here but it is very unclear what aspects of LAB physiology may play a role.

      2) Does this phenomenon require eating LAB, or just perceiving it? The assays did not test whether perception of LAB diet is sufficient for its effect on thermotaxis, rather whether more time on LAB leads to better thermotaxis.

      3) Showing a potential daf-16 interaction is plausible, given that daf-16 interacts with many key pathways in the worm, but it is unclear whether this interaction is direct or indirect, or whether daf-16 is a major player in this pathway or just necessary for maintenance of health. What sensory pathways are activated when worms are fed a LAB diet, and how do they ultimately interact with daf-16?

      4) Similarly, the pha-4 and eat-2 data are interesting, but are not developed in any way. This is another avenue that could in principle lead toward a better mechanistic understanding.

    2. Reviewer #1:

      These investigators examine how lactic acid producing E. coli impact age-related decline in neurological function through the use of temperature-food associative learning or thermotaxis. In particular, they screen a panel of different lactate producing E. coli and identify a particular clade of bacteria, Lactobacilli, that are able to suppress age-dependent decline in thermotaxis in a daf-16 dependent manner. Moreover, they uncouple improvement in neurological function from lifespan determination and locomotion. Overall, this group presents an interesting phenomenon regarding the effects of the lactic acid producing bacteria. However, it is not clear what is happening in the worm to elicit this neurological response and much work remains to determine this mechanism of action.

      While I can appreciate the careful nature of these worm behavioral assays, including a host of different controls, these studies lack cellular and molecular details, which reduces my overall excitement for the story. It is interesting that a clade of lactic acid bacteria (LAB) can improve associative learning in C. elegans. However, I was very underwhelmed when I got to the final figure, which very briefly touched on molecular mechanism (only to give DAF-16 dependence). Since it has previously been shown that daf-16 mutant animals impact taste avoidance learning (Nagashima et al. PLOS Genetics, 2019), the dependence on DAF-16 and its role in associative learning seemed predictable. For future submissions, this previous study on DAF-16 should be referenced in the manuscript. Moreover, data regarding dietary restriction and the eat-2 mutation appear to be misinterpreted. Thus, more attention and analysis should be dedicated to the effects of dietary restriction on their paradigm. I thought that it was interesting that a clade of LAB consistently reduced expression of the PHA-4 transcription factor, and the authors might benefit from expanding upon this observation.

      In addition to the limited molecular characterization, the manuscript provides little explanation at the cellular level. It is unclear which neurons or neuronal circuits are responsible for this phenomenon. Although mentioned in the discussion, this manuscript would benefit from close examination of the thermosensory circuit including the AFD and AIY neurons. How are these lactic acid producing E. coli ultimately signaling to the neurons? Do the LAB slow the rate of degeneration of either neuron? Is this phenomenon the result of lactic acid production or something else in the bacteria? Would it be possible to supplement lactic acid to worm media and produce the same result?

      This is an interesting phenomenon and requires more in-depth cellular and molecular characterization.

    1. Reviewer #2:

      In this manuscript, Knight et al examine the genetic diversity in >12,000 publicly available C. difficile genomes in order to characterize genomic evidence of taxonomic incoherence among this genomically diverse pathogen. Their primary analysis employs average nucleotide identity thresholds to identify species boundaries, with secondary analyses examining core genome size changes, gene content, and estimated emergence dates. The authors' main conclusion is that the previously identified C. difficile cryptic clades CI-III are genomically divergent enough from the main clades C1-5 to warrant classification as different genomospecies. This paper is a useful contribution in benchmarking our understanding of the genetic diversity of C. difficile using all currently publicly available genomes, but the results are largely unsurprising given previous phylogenetic analyses involving clades 1-5 and CI-III, and is therefore probably best suited for a specialty journal. Additionally, in some instances, the methods lack details, reducing their interpretability and reproducibility.

      Major Comments:

      1) There are some claims that are too strong and not supported by the data or literature, including the claim that the rise of community-associated CDI is likely due to presence of C. difficile in livestock (Lines 53-54 - far too little evidence to make such a sweeping claim), the statement of apparent rapid population expansion into clades C1-4 (Lines 278-279 - only shown for certain sequence types and greatly impacted by observation bias), and the statement that these findings "impacts the diagnosis of CDI worldwide" (Lines 37-38 - too grandiose given limited evidence of the clinical importance of the cryptic clades).

      2) Generally, it is hard to discern which sets of genomes and variants were used for each of the bioinformatic analyses that are described. If there are a limited number of genome sets it might be useful to define them in the results to allow the reader to more easily follow along and understand the scope of different analyses.

      3) The dated phylogenomic analyses methods would benefit from a more thorough assessment of model assumptions along with more description of the sources of bias and uncertainty at play. Specific questions are:

      • Was the temporal signal in the data evaluated?

      • What are the potential impacts of using a single clock model and demographic prior for such a diverse set of taxa?

      • Was the clock rate restricted to the cited 2.5 × 10^-9 to 1.5 × 10^-8 range? What clock prior distribution was applied?

      • Were relaxed clock priors explored?

      • What went into the selection of the demographic model prior in BEAST? Were alternative models evaluated?

      • The significant uncertainty in the divergence estimates should be emphasized/listed as a limitation.

      4) Similarly, the pangenome analyses could be more thoroughly described, and the relevance of the core-genome size changes more robustly explored. Specifically:

      • How did the core genome change when excluding any of C1-5? Were these changes much different than when excluding CI-III?

      • The differences between Roary and Panaroo are notable, and potentially important for the microbial genomics community. More details should be provided on these results and how sensitive they are to the input parameters of the respective programs (e.g. collapsing paralogs in Roary and percent identity for orthologs). In addition, it is important to know if any filtering was done with respect to the quality of assemblies, which could have a significant impact on Roary's behavior.

    2. Reviewer #1:

      General Assessment:

      The work presented by Knight et al. in "Major genetic discontinuity and novel toxigenic species in Clostridioides difficile taxonomy" is of excellent quality and spans several of the themes of eLife. The manuscript provides a thorough and robust examination of publicly available C. difficile genomes, to deliver a much-needed update of C. difficile phylogeny, in particular the cryptic clades of C. difficile. However, some further clarifications could be included to confirm whether the cryptic clades of C. difficile, and the 26 unclassified STs (which seemingly form 4 distinct clusters), should indeed be assigned to the Clostridioides genus, distinct from both C. mangenotii and C. difficile.

      Specific comments:

      Lines 96-97 and Figure 2: Figure 2 suggests the 26 unclassified STs form at least 4 distinct clusters, yet these STs are classified as outliers. Could you please comment on why these are considered outliers? Or do these STs represent new cryptic clades? C-IV, C-V etc.? And do these unclassified STs also fit into the criteria for the novel independent Clostridioides genomospecies?

      Lines 161-162; Table 1: C. mangenotii is referred to as Clostridioides mangenotii on lines 161-162, but is listed as Clostridium mangenotii in Table 1. Was this intentional? Or should this be Clostridioides mangenotii, given that C. difficile is also listed as Clostridioides difficile?

      Figure 6: Many of the numbers and symbols on the figure are difficult to see; e.g., in Figure 6A the values listed above each data point are extremely small. Can these values/symbols be enlarged?

      Lines 224-225: Given that C. difficile strains lacking tcdA and tcdB can still cause infections, consider rephrasing "indicating their ability to cause CDI".

      Figure 7: As with Figure 6, many of the numbers and symbols on the figure are difficult to see. Can these values/symbols be enlarged?

      General comments:

      Were the unclassified STs included in the species wide ANI analyses in Figure 3? If similar analyses were performed for these STs and given the clusters that are presented in Figure 2 would this support the idea that they may also fit into the criteria for the novel independent Clostridioides genomospecies?

      Similarly, were these same unclassified STs included in the BactDating and BEAST analyses? Or the pairwise ANI and 16S rRNA value comparisons in Figure 5? Or the pangenome and toxin gene analysis also presented in Figures 6 and 7? And would this add further strength to the idea that these "outliers" could be the first typed representatives of additional genomospecies?

      Lastly, your conclusions are a little too on the fence. You have presented sufficient evidence to suggest that the cryptic clades of C. difficile likely represent novel independent Clostridioides genomospecies, but dilute out the importance of this throughout the discussion and conclusions. Although controversial, the evidence provided gives credence to these claims, and the text should be changed to reflect this.

    3. Summary: We appreciate this study and find that the conclusions that reclassify Clostridioides are largely justified by the data/analysis. The major concern is that the work represents the application of standard approaches to refine species classification, as opposed to either proposing a novel approach to classify species or defining a split that might be more surprising and/or clinically significant (e.g. Kumar et al. Nature Genetics, 2019). Consequently, despite being a useful contribution to the literature, we believe it is more suitable for a specialized audience.

      Reviewer #1 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #2:

      Recombinant antibodies are the most common and powerful reagents in life science research to identify and study proteins. Yet, every single antibody should always be validated and carefully tested for its relevant application, to ensure constructive and reproducible scientific endeavor. I was thus extremely pleased to review the manuscript of Terkild Buus et al., as it provides a careful assessment of oligo-conjugated antibody signal in CITE-seq. The authors tested four variables (antibody concentration, staining volume, cell numbers and tissue origin) and clearly showed that antibody titration is a crucial step to optimize a CITE-seq panel. The authors found that, as a general rule, concentrations in the 0.625 to 2.5 µg/mL range provide the best results, while vendor-recommended concentrations, in the 5 to 10 µg/mL range, increase background signal.

      In my opinion, the study is well-performed and may serve as a guideline to accurately validate antibodies for CITE-seq. As a consequence, I have only minor comments.

      • As stated by the authors, the starting concentration used for each antibody was based on historical experience and assumptions about the abundance of the epitopes. This approach may not be ideal, and the optimal concentration may have been missed. Do the authors think that a proper titration would be an advantage? Maybe this could be discussed in the text.

      • The authors showed, by testing four variables (see above), that they could define the optimal conditions to reduce background signal and increase the sensitivity of antibodies, and thereby improve CITE-seq outcomes. Nevertheless, the authors rely on the assumption that all antibodies used in their panel are specific for their targeted antigens. I am not asking here to test the specificity of every single antibody used in the study, as this would be a colossal amount of work. But I feel that this aspect should be discussed in the manuscript, especially when an "uncommon" antibody is intended to be used in the CITE-seq panel; the specificity of such an antibody should indeed be tested prior to its use.

    2. Reviewer #1:

      In the study by Buus et al., the authors set out to address an important need to understand how oligo-conjugated antibodies should be optimally utilized in droplet-based scRNA-seq studies. These techniques, often referred to as CITE-seq, complement techniques such as flow cytometry and mass cytometry yet also further extend them by the ability to jointly measure intra-cellular RNA-based cell states together with antibody-based measurements. As is the case with flow cytometry, manufacturers provide staining recommendations, yet encourage users to titrate antibodies on their specific samples in order to derive a final staining panel. Based on the ability to stain with hundreds of antibodies jointly, few studies to date have assessed how the antibodies present in these pre-made staining panels respond to a standard titration curve. In order to address this point, this study tests two dilution factors, staining volume, cell count, and tissue of origin to understand the relationships between signal and background for a commercially available antibody panel. They arrive at the general recommendation that these panels could be improved, grouping various antibodies into distinct categories.

      This study is of general interest to the scRNA-seq and CITE-seq communities as it draws attention to this important aspect of CITE-seq panel design. However, it would stand to be substantially improved by not only providing suggestions but also testing at least one, if not more, of their suggestions from Supplementary Table 2, and preferably performing experiments using more technical replicates or biological replicates. As it stands now, the study is largely based on one PBMC and one lung sample, that were stained once with each manipulation as far as can be gathered from the Methods.

      Major comments:

      1) Given the title is improving oligo-conjugated antibody... it would be important to functionally test one of the suggestions. We would suggest a full titration curve of selected antibodies, perhaps one from each of the categories, but if cost is a concern at least two or three antibodies, to identify how titration impacts antibodies, and especially those in categories labeled as in need of improvement. Relatedly, if the idea is that antibodies (such as gD-TCR) that do not have a cognate receptor lead to general background spread, does spiking in a cell that is a known positive in increasing ratios remedy this issue by acting as a target for the antibodies? Does adding extra washes help to remedy these issues of background?

      2) Another way of improving these panels is through reducing the costs spent on both staining and, perhaps more importantly, the sequencing-based readouts. Several times in the manuscript (at line 77, for example, or line 277) it is alluded to that the background signal of antibodies can make up a substantial cost of sequencing these libraries. However, no formal data on cost is presented, which would be important to formalize the authors' points. It would be important to provide cost calculations and recommendations on sequencing depth of ADT libraries based on variation of staining concentration. Relatedly, in the methods, the sequencing platform and read depth for ADT libraries were not discussed, nor are the RNA-seq quality control metrics provided other than a mention of ~5,000 reads/cell targeted. This is important to report in all transcriptomic studies, and especially in a methods development study.

      3) One of the powerful elements of joint multi-modal profiling, as mentioned in the title, is to be able to measure protein and RNA from a single cell. This study does not formally look at correlation of protein and RNA levels, and whether a decrease in concentration of antibody either improves or diminishes this correlation. This would be important to test within this study to ensure that decreasing antibody levels does not then adversely affect the power of correlating protein with RNA, and whether it may even improve it.

      4) How was the lack of antibody binding determined for Category E? CD56 is frequently detected on NK cells in peripheral blood, CD117 should be detected on mast cells in the lung, and CD127 should be found on T cells, particularly CD8+ T cells. From inspecting Figure 1E, it appears as if all three of these markers are detected on small but consistent cell subsets. As the clusters are only numbered and no supplementary table is provided to help the reader in their interpretation, it is difficult to determine if these represent rare but specific binding, or have not bound with any specificity.

      5) References: At 14 references, the paper overall could benefit from a more comprehensive citation of related literature including flow cytometry and/or CyTOF best practices for antibody staining and dealing with background, and joint RNA and protein measurement from single cells.

    1. Reviewer #3:

      The authors present a simple model that explains important outstanding controversies in the field of long-range gene regulation. These controversies include the fact that insulation boundaries tend to be weak; that acute inactivation of CTCF or cohesin (which inactivates insulation boundaries) leads to only minimal changes in gene expression; and that in live cells enhancer-promoter contacts appear not to be correlated with transcriptional bursting. The model involves a futile cycle of tag addition and removal from promoters, stimulation of more tag addition when tag is already present, and stimulation of tag addition by contacts with distal enhancers. The authors show that such a model explains all the above controversies, and indicate that the controversies are not inconsistent with mechanisms where long-range gene activation is driven by physical contacts with distal regulatory elements.

      The authors have explained and explored the properties of the model well. I have only minor comments.

      1) An alternative explanation for TAD-specific enhancer action is that an E-P interaction within a TAD (between two convergent CTCF sites), one that is brought about by extruding cohesin, is not equivalent to an interaction that occurs between two loci on either side of a CTCF site and that can be a random collision that is not mediated by extruding cohesin. In other words, two interactions can be of the same frequency but can be of a very different molecular nature. I agree that this model would not explain the results of the experiment where cohesin is acutely removed.

      2) In the beginning of the introduction the authors introduce TADs. I recommend that the authors present this in a more nuanced way: compartment domains also appear as boxes along the diagonal, an issue that has led some in the chromosome folding field to be confused. This reviewer believes TADs are those domains that strictly depend on cohesin-mediated loop extrusion, whereas compartment domains do not. If the authors agree, perhaps they can rewrite this section?

      3) If I understand the model correctly, the nonlinearity arises because of the increased rate of tag addition when tag is already present. The authors then speculate histone modifications can be one such tag. However, there are only so many sites of modification at a promoter. Can the authors analyze how the possible range of tag densities affects performance of the model? Is the range required biologically plausible?

      4) Can the authors do more analysis to explore how rapid changes in gene expression may occur (e.g. upon signaling a gene may go up within minutes)? How much more frequent does the E-P interaction need to be for rapid switch to the active promoter state? Can the authors do an analysis where they change the rates of the futile cycle upon some signal: at what time scale does transcription then change (keeping E-P frequency the same)?

    2. Reviewer #2:

      The main analyses of the study compare previously published experimental observations from Hi-C and ORCA to predictions of the authors' "futile cycle" model. The predictions are derived from simulations and differential equations analysis of the model as a dynamical system. Given its centrality to the manuscript, we recommend describing this overall strategy in more detail in Results. For example, at line 124 (Pg. 4) the authors could talk about how the simulations are done, including where the variability comes from (e.g., random starting conditions vs. probabilistic events vs. different parameters).

      Xiao et al. make several key assumptions to dramatically simplify their model. Namely, it is assumed that promoter modification and transcription are equivalent and that enhancer-promoter contact influences transcription instead of transcription influencing structure. Steady-state equilibrium must also be assumed. It would be helpful if the authors explicitly stated these assumptions and provided references to support their being reasonable.

      It is not totally clear why the authors decide to call their proposed approach the futile cycle model. There are similarities to other well-known models in biochemistry and biophysics that should be noted. It might make sense to simply call this a mechanistic model of cooperative promoter activation. If the authors stick with "futile cycle", the relationship between promoter activation through tags and metabolic signaling should be described in more detail.

      There is also an opportunity to emphasize that the proposed model is not necessarily absolutely correct, but one of many plausible models that can produce a non-linear relationship between genome structure (enhancer-promoter contact) and transcription. Any thoughts on other models that could generate similar dynamics would be a useful discussion point. There are parallels to both sigmoidal dose-response curves, where drug concentration is plotted against response, and transcription factor binding curves, where free ligand concentration is plotted against the fraction bound. We recommend providing background context on these types of models or the Hill equation to illustrate why non-linear behavior is or is not surprising given the proposed model.
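      To make the Hill-equation comparison above concrete, the following is a minimal sketch of how a cooperative (sigmoidal) response turns a modest change in input into a large change in output. It is not the authors' futile cycle model: the function names, the half-saturation constant `k_half`, and the Hill coefficients used here are hypothetical choices for illustration only.

```python
def hill(x, k_half, n):
    """Fraction of maximal response at input x for a Hill curve.

    x      : input level (e.g. contact frequency or ligand concentration)
    k_half : input giving half-maximal response (hypothetical value)
    n      : Hill coefficient; n = 1 is hyperbolic, n > 1 is sigmoidal
    """
    if x < 0 or k_half <= 0 or n <= 0:
        raise ValueError("x must be >= 0; k_half and n must be > 0")
    return x**n / (k_half**n + x**n)


def fold_response(x_low, x_high, k_half, n):
    """Ratio of responses at two input levels (output fold change)."""
    return hill(x_high, k_half, n) / hill(x_low, k_half, n)


# A twofold change in input (0.5 -> 1.0) around the half-saturation point:
non_cooperative = fold_response(0.5, 1.0, k_half=1.0, n=1)  # 1.5-fold
cooperative = fold_response(0.5, 1.0, k_half=1.0, n=4)      # 8.5-fold
print(non_cooperative, cooperative)
```

      With these illustrative values, the same twofold change in input yields a 1.5-fold change in output without cooperativity and an 8.5-fold change with a Hill coefficient of 4, which is the kind of background context that could help readers judge whether the reported non-linearity is surprising.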

      For clarity, it would be helpful to discuss model parameters in greater detail. First, we suggest noting which parameters shift the location of the curve and which increase the steepness of the curve. Second, we recommend including a phase diagram exploring when sigmoidal behavior and any other key model predictions arise across parameter space. In what circumstances does hypersensitivity or time lag emerge? The authors demonstrate that a narrow set of parameters is sufficient to produce a super-linear relationship between enhancer-promoter contact and transcription in Figure 6. One potential dilemma is this model's ability to explain many experimental observations by indicating that minimal changes all occur in the sub-linear regime while observable changes occur in the super-linear regime. Given that one needs specific parameters to replicate an example of the hyper-linear regime (including at least three degrees of stimulation and increasing stimulation of the successive states), it could be valuable to demonstrate how large the plausible parameter space is. Without an exhaustive search across the space of minimal parameters, it is not clear when this property emerges or how common it is within the full parameter space. The authors could vary model parameters and plot a grid visualizing behavior (e.g., steepness of the curve or Hill coefficient).

      Images throughout the manuscript are low resolution, making the figures difficult to read. Increase the resolution of figures throughout, especially those containing text (Fig 6A).

    3. Reviewer #1:

      Xiao et al. describe a kinetic model of enhancer-promoter interactions, which the authors use to explain the changes in transcription levels upon disruption of genomic contacts within topologically associated domains (TADs). The model uses the law of mass action to describe activity of promoters and enhancers, which are proposed to be able to accommodate multiple transcription activation tags. The authors use the model to explain the nonlinear relationship between the genomic contact frequencies within TADs and their corresponding transcription rates. They recapitulate the superlinear relationship between the changes in genomic contact probabilities and transcription rates within TADs observed in their recent experiments (Mateo et al., 2019). Inspired by the futile cycle of cell signaling, their model incorporates multiple tagging of promoters, allowing for transient amplification of transcription rates.

      Conceptually, this work is interesting and the model suggests possible reconciliation of seemingly contradictory experimental observations reported earlier.

      However, the manuscript in its current form fails to substantiate many of its claims.

      Here are my major concerns:

      1) The presentation of the model is unclear. It is currently presented in the text, lines 110-122, as a purely qualitative description. The authors define only rates in the text; definitions of other model parameters are not present. For example, E and a are not specifically defined in the text or Methods section. Since both terms "enzyme" and "enhancer" are used, and in fact "enzyme tagging" and "enhancer tagging" occur simultaneously in the model, it is not possible to say for sure which one the authors are referring to at any given point, and thus the Methods section can be interpreted in different ways. Moreover, the cartoon is missing a legend confirming which molecular player is which. The figure caption mentions only that the green triangles are the tags; no other parts of the cartoon are explained. Taken together, this makes it very difficult to verify the mechanics of the model.

      • The authors should provide a detailed technical description of their model directly in the text, including description of their parameters, list their constitutive equations and identify all parameters in their cartoon Fig. 1C.
      • Axes labels in all figures should be expressed in the parameters/variables of the model (as in Fig. 6C-D) directly connecting to inputs/outputs of the model.

      2) Due to the lack of description, in many sections it is not clear what the specific inputs and outputs of the model are (e.g. Fig. 2).

      3) The Methods section describes the chemical kinetics of the suggested reactions and the insulation score calculations. But it is not clear how these inform each other, or how the contact-frequency maps are chosen/computed and cross-referenced with the local E-P kinetics.

      4) In the Methods section, it appears that in lines 577-580 of the model description, the mass is not conserved.

      5) In lines 587-588, the index of k is 2(n+1), which equals 2n+2, but then in the next line the following assumption is made: 2n+1 → n+1.

      6) The authors make assumptions that their kinetic considerations hold for n>2. What is the evidence?

      7) The authors observe hysteresis in median transcription rate as a function of enhancer contact frequency. However, the presented violin plots suggest a presence of two states, one with low and one with high transcription rates. In the intermediate regime of enhancer contact frequency, where authors report hysteresis, the violin plots show bimodal distributions suggesting coexistence of these two states. This would suggest that the system exists in and switches between two distinct states with a discontinuous transition, instead of a continuous hysteretic behavior as suggested by the median behavior.

      8) The language of the paper is often not technically precise with qualifiers missing, which could lead to ambiguities and misinterpretations. Here are some examples:

      • p. 1, line 10, "difference in contact across TAD borders is usually less than twofold"
      • p. 1, line 17, "results from recent cohesion disruption"
      • p. 2, line 71, "A simple model of hypersensitivity to changes in contact frequency"

      9) On p. 13, line 483, authors define Ostwald ripening as given by weak multivalent interactions; however, Ostwald ripening is a thermodynamic process. In addition, they propose that liquid condensates become larger due to Ostwald ripening, but there are also other processes that may occur, such as coalescence of condensates, which would also lead to larger condensates.

      10) At the beginning of the Discussion section authors state they will propose future experiments in each section. However, in some of the sections it is not clear what specifically authors are proposing. These suggestions should be made clearer.

    4. Summary: The work describes a simple theoretical model for enhancer action that explains several major controversies in the field of long-range gene regulation and the role of topologically associating domains and insulating boundaries in modulating enhancer-promoter interactions. Further, the model makes predictions that can be experimentally tested. This is valuable for the field of gene regulation.

      Reviewer #2 and Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #2:

      This manuscript by Diamanti et al. describes their study on how visual neurons responded to identical visual stimuli at two different locations along a virtual linear track. Extending their previous result that spatial location modulates the neuronal activities in the primary visual cortex (V1), they now demonstrate that similar spatial modulation also occurred in the higher visual areas (HVAs), but not so much in a lower visual area, the lateral geniculate nucleus (LGN). In addition, they show that the modulation, measured by a spatial modulation index (SMI), was stronger when animals had more experience in the track and when the animals were actively performing a task rather than passively viewing the same virtual track. The authors have been responsive to comments by previous reviewers at a different journal. Data are appropriately analyzed and clearly presented.

      Since the finding that visual neurons are spatially modulated similarly to hippocampal place cells in spatial navigation tasks (Ji and Wilson, 2007; Haggerty and Ji, 2015; Fiser et al., 2016; Saleem et al., 2018), there has been increasing interest in identifying the source(s) of this modulation. This study adds new evidence to this puzzle, suggesting that it is more likely either generated within the visual cortex or top-down propagated from higher brain areas, rather than bottom-up propagated from the thalamus. This is an important contribution. However, there are concerns, mainly on the data interpretation and the clarification of the main conclusion, as elaborated below.

      1) Because experience and task engagement enhanced spatial modulation, the authors concluded in the abstract that "Active navigation in a familiar environment, therefore, determines spatial modulation...". This conclusion is too strong and not well-supported by the data. First, spatial modulation on Day 1, when the task was novel, was lower than on later days, but it was already much higher than 0 (Fig. 1h). Also the individual neuron data (Fig. 1e) display clear spatial modulation on Day 1. Therefore, "familiar environment" is not a requirement. Second, spatial modulation during passive viewing was much higher than 0 and was correlated with that during active navigation, as shown in Fig. 4e - Fig. 4l. Therefore, "active navigation" is not a requirement either. It is true that both active navigation and familiar environment enhanced spatial modulation. They did not "determine" spatial modulation.

      2) Related to the point above, the presence of spatial modulation in passive viewing reminds us that these cells in the visual system were still mainly driven by visual stimuli. The data in Fig. 4e,f are especially telling: the modulation in V1 was similar and highly correlated between active navigation and running replay. In addition, it is clear from all the raw traces in Fig. 1 and Fig. 2 that these cells did respond to the two segments with identical stimuli reliably with two peaks. The spatial modulation was just a change in one of the peaks. So the nature of the modulation is a "rate remapping" of the expected, classical visual responses. I believe, in order to maintain the big picture of what drives the activities of these neurons, it is beneficial to clarify that the "spatial modulation" is a modulation on top of the expected visual responses. This message is not explicitly conveyed in the current manuscript.

      3) The authors stated that spatial modulation is "largely absent in the main thalamic pathway into V1". This was based on the significantly weaker SMIs in LGN than those in V1 and HVAs. However, it is unclear whether the SMIs in LGN were still significant. First, the SMI values for both LGN boutons (Line #100) and LGN units (Line #130) might be significantly different from zero. The p-values for these statistical comparisons should be given in both cases. Second, Figure 3 - figure supplement 1 b,f show that the SMI values in LGN could be predicted by spatial modulation, but not by visual stimuli alone or behavioral variations, just like those in V1 and HVAs. This seems to me good evidence for the presence of spatial modulation in LGN. Therefore, it is my opinion that the data do not support the complete lack of spatial modulation in LGN, but do clearly demonstrate weaker spatial modulation in LGN than in V1 and HVAs.

    2. Reviewer #1:

      This paper investigates the modulation of spatial signals in higher order visual areas. A number of the findings are novel and interesting, including that signals in higher visual areas are not more influenced by spatial position than signals in V1, that this modulation is not a general feature of the entire visual circuit (i.e. LGN boutons in L4 of V1, as well as LGN units, show very little spatial modulation), and that spatial modulation decreases when mice are watching a replay of tunnel traversals. Overall, I think this paper provides new insight regarding position coding in visual systems. However, there are some points that should be addressed.

      1) The imaging data is from mice with different genetic backgrounds, as well as a mixture of gcamp6f and 6s. In addition, different reward protocols were used for different mice. Although the authors state in the methods that none of these factors impact their results, it would be good to include some quantifications to this effect (e.g. they could show the distribution of SMI for 6f data vs 6s data). While I don't expect the major observations to change if it turns out that some of these factors have a systematic effect, it could affect portions of the results where the dataset is split up - for example in the comparison between different higher visual areas, and the observation that spatial modulation appears to vary with receptive field location.

      2) The authors state that it is to be expected that LGN neurons respond more strongly in the first half of the corridor due to contrast adaptation mechanisms. However, I did not see any quantification that could support this statement.

      3) When looking at the spatial modulation index, the authors switch between using median (e.g. Fig 1 and 2) and mean (Fig 4), t-test and rank-sum - and sometimes there is missing information regarding which (mean or median) they are reporting. The authors need to include more detail regarding these statistics.

      4) It was not clear to me if the authors are only imaging from layer 2/3 or if they also attempted to image deeper layers.

      5) Throughout the paper, the authors use 'firing rate' to refer to deconvolved calcium signal. Although this is stated in the methods, this wording can be misleading, especially since the paper also contains extracellular recordings of spiking activity.

      6) It was not clear to me how the dotted lines (e.g. Fig 1 b) were calculated.

    3. Summary: This paper investigates the modulation of spatial signals in higher order visual areas in mice navigating virtual reality environments. Previous work demonstrated that the spatial position of an animal modulates neural activity in the primary visual cortex (V1). Here, the authors demonstrate that this spatial modulation, however, is not a general feature of the visual circuit. Similar spatial modulation occurs in higher visual areas but not in lower visual areas, such as the lateral geniculate nucleus. Moreover, this work finds that spatial modulation was stronger when animals had more experience on the track and when the animals were actively performing a task, rather than when the animal was passively viewing the same virtual track. Since the first reports that visual neurons show modulation by spatial position during spatial navigation tasks, similar to that observed in hippocampal place cells, the source of this modulation has been an open question. This work adds new insight regarding this question, suggesting that it is likely either generated within the visual cortex itself or propagated in a top-down manner from higher brain areas, rather than in a bottom-up manner from the thalamus.

    1. Reviewer #3:

      In this interesting paper authors compare MEG recordings of svPPA patients and 44 healthy controls during living vs. non-living categorization tasks. Both patients and the control group performed this task with similar accuracy. In addition, svPPA patients showed greater activation over bilateral occipital cortices and superior temporal gyrus, and inconsistent engagement of frontal regions. The authors conclude that patients with svPPA compensate for their semantic deficit by recruiting regions involved in perceptual processing.

      This is a well written study and the results are presented clearly. The findings are novel and interesting.

      1) One question for clarification is whether the recruitment of the occipital areas in semantic PPA is truly "compensatory" - does it indicate a shift of resources due to the anterior temporal atrophy? Is the recruitment of the parieto-occipital regions associated with more accurate performance?

      2) The main results concentrate on the differences between patients and controls in the low gamma range. There are also significant effects in the other frequency bands (e.g., high gamma, beta and alpha). Could the authors discuss the functional significance of these effects?

    2. Reviewer #2:

      Borghesani and colleagues aimed to understand how dysfunction in the ATL alters the dynamic activity during semantic categorization. To achieve this, they contrast MEG responses between patients with svPPA and age-matched healthy controls. Both groups show similar profiles of behavioural performance on the task, and broad similarities in MEG responses. Critically, svPPA patients show enhanced gamma synchronization in the occipital lobe compared to controls, while gamma synchronization was correlated to task RTs.

      In general, I found the manuscript interesting, with the major strength being the application of MEG analyses to a clinical population during a cognitive task. In terms of improvements, I think the results could be more fully characterized, which would allow for more expansive interpretations and inferences.

      Major comments:

      1) As the paper is about 'Neural dynamics', I felt this aspect could be developed, with the timing of the effects characterized further, and considered more in relation to the conclusions. For example, the main finding is the increased occipital gamma response in svPPA compared to controls. Looking at Figure 3, there is a peak in the svPPA group near 200 ms, and very little synchronized activity in the control group. This is interesting as there are many ways we could have seen svPPA > controls, but this suggests that the gamma synchronization response associated with compensation is specific to the svPPA group (and largely absent from controls - also from Supp fig 1), and is distinguished from an initial visual evoked response (peaking ~100 ms). I would recommend discussing and characterizing the dynamics of this effect more, such as what a later occipital effect could tell us about dynamics given ATL dysfunction? Is this increase a result of a lack of top-down effects from ATL? I think these kinds of issues could be explored and discussed more.

      2) The occipital gamma effect looks like the primary visual cortex, which might suggest the effects are not related to higher-level perceptual features (such as has eyes, teeth) as the authors suggest, but rather low-level visual effects. Do the authors perhaps think the effects could relate to enhanced processing of visual details (as related to the ideas of Hochstein and Ahissar's reverse hierarchy), or that the effects relate to additional visual input following a visual saccade?

      3) The VBM results for the svPPA patients were surprising given that all the atrophy appeared in the left hemisphere. There can be hemispheric differences in svPPA, but is this a true lateral pattern (meaning the right ATL is intact) or a product of VBM being run so that the most atrophied hemisphere is shifted to the left side? If the VBM maps are correct, and the svPPA patients are only showing left hemisphere atrophy, then what does this suggest about the role of the right ATL, and the bilateral nature of the occipital increases in svPPA?

      4) Both svPPA patients and healthy controls achieved around 80% accuracy in the categorization task. This seems surprisingly low given (1) the task (living vs. nonliving after seeing the image for 2 seconds), (2) that all the images were pretested and had high name agreement, and (3) that items were repeated on average 2.5 times. Is there something that explains this low performance for all individuals?

    3. Reviewer #1:

      This study examines MEG activity in a picture categorization task (decide living or non-living) in a sample of 18 patients with semantic variant PPA, compared to 18 controls. As svPPA is a rare (but scientifically informative) disorder, the sample size is impressive, and given that relatively few MEG studies exist in PPA at all, this is an interesting dataset. The authors show differences in engagement of oscillatory activity, specifically increased low-gamma ERS in occipital cortex and increased beta ERD in the superior temporal gyrus. The authors interpret this as reflecting increased engagement of / reliance on early perceptual mechanisms for completing the task, as opposed to semantic identification of the picture.

      Major concerns:

      1) My biggest methodological issue with this paper relates to a very old debate in neuroimaging that still comes up all the time: the choice of statistical threshold. Using a high threshold prevents false positives, but may also lead to false negatives, and I fear that is the case here, with the high threshold contributing to an unrealistic impression of spatial specificity in MEG. It is obvious from the average responses in both groups that these oscillatory responses are widespread through the brain. Indeed the alpha and beta responses are significant in the majority of cortical voxels. This basic property of the responses should be presented clearly and prominently in the paper - I don't think it's appropriate to put it in supplementary information where only a minority of readers will even see it. The authors then use what I think is an extremely high and conservative statistical threshold to contrast differences between the two groups. P<.005 uncorrected is a highly conservative threshold already, even before cluster-thresholding is added (although with data as smooth as MEG beamforming solutions, cluster-thresholding is unlikely to change anything). Basically this makes only the strongest part of the activation survive, and it is valid to conclude that a significant group difference exists there (protected from Type 1 error), but this can give a false impression that the difference is specific to that region. I think a more realistic characterization of the results would involve measuring differences in the strength of the responses between groups on a broader level, possibly the sensors or in large ROIs - and not ROIs pre-selected to show a dramatic difference by first searching the whole brain for the most significant effects - that is the classic "double-dipping" fallacy in neuroimaging.

      2) Similarly, the ERD/ERS in each frequency band is treated as a separate entity, ignoring the fact that these bands are arbitrary and frequency is a continuous quantity. This matters because much is made of the fact that PPA participants exhibited greater ERS in the low-gamma range, and that this was correlated with reaction time. Supplementary figure 1 shows that both groups had strong occipital ERS in the high-gamma range, but only PPA showed it in the low-gamma range as well. This suggests that the ERS in the PPA group may simply have been shifted to a lower frequency range. A fuller characterization of these group differences via time-frequency analysis and/or power spectral analysis would help clarify what is going on here.

      3) It is surprising that PPA participants only exhibited increased MEG responses compared to controls - assuming that both gamma ERS and beta ERD can be interpreted as increased neural activation, which is a reasonable assumption based on the literature. No decreases in the PPA group are found, and thus the observed increases can be plausibly attributed to compensatory processes as framed by the authors. However, I am concerned about the role of certain analysis choices in producing this data pattern. In particular, the authors state (line 611): "To remove potential artifacts due to neurodegeneration or eye movement (lacking electrooculograms), we masked statistical maps using patients' ATL atrophy maps (see section MRI protocol and analyses), as well as a ventromedial frontal mask."

      It is not clear whether this masking was done in group space from average atrophy maps, or on an individual level. In either case, I don't think this is well justified. I don't know any physical mechanism by which tissue undergoing neurodegeneration can be said to generate an artifactual signal. Atrophied tissue still contains living neurons with ionic currents; these are real signals not artifacts, and furthermore, atrophy is a continuous process with tissue further from the epicenter also undergoing similar neurodegenerative mechanisms. Atrophied tissue may well generate electromagnetic signals that are different from healthy tissue, and such differences should be included in this paper. I think that there may be regions of hypoactivation as well as hyperactivation in this PPA group. If the hypoactivation localizes to atrophied tissue and the hyperactivation to other regions, that will bolster the case that we are seeing compensatory processes, but it isn't certain with half the story masked. I also don't really see statistical masking of the frontal region as a valid solution to eye movement artifacts. The authors would have to present evidence that the region that they masked corresponds to the region potentially affected by eye movements. However, many studies have found that beamforming already does a pretty good job of removing ocular artifacts from estimated brain signals, except for very close to the eyes.

      4) The correlation with reaction time in the occipital cortex is consistent with the idea that the ERS there may reflect compensatory overreliance on perceptual information, but it isn't conclusive. The authors suggest that PPA patients are able to categorize the stimuli correctly based on visual features, but are unable to name them. What about testing for correlations with the out-of-scanner behavioural measures that established that the patients have a naming deficit? It would strengthen the case if atrophy or hypoactivation (see comment above) correlated with the naming deficit.

    4. Summary: Borghesani and colleagues aimed to understand how dysfunction in the anterior temporal lobe (ATL) alters dynamic activity during semantic categorization. They contrast MEG responses between 18 patients with semantic variant Primary Progressive Aphasia (svPPA) and 18 age-matched healthy controls. Both groups show similar profiles of behavioural performance on the task, and broad similarities in MEG responses. Critically, however, svPPA patients show enhanced gamma synchronization in the occipital lobe compared to controls. The authors interpret this as reflecting increased engagement of / reliance on early perceptual mechanisms for completing the task, as opposed to semantic identification of the picture.

      Overall, the reviewers found the manuscript interesting. As svPPA is a rare (but scientifically informative) disorder, the sample size is impressive, and given that relatively few MEG studies exist in PPA at all, this is an interesting dataset. However, the general opinion is that the results could be more fully characterized, which would allow for more expansive interpretations and inferences.

      This manuscript is in revision at eLife.

      Reviewer #2 and Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      Neuronal ensembles have been shown by this lab and others to constitute one basic functional unit for the representation of information in cortical circuits. It is therefore important to determine how stable these blocks of representation might be. If these ensembles were preserved across time and sensory stimuli, this would indicate a significant degree of structure underlying cortical representations. In a first attempt to address these important issues, this manuscript analyzes the long-term stability of ensembles of coactive neurons in layer 2/3 of mouse visual cortex across several days. Ensembles were recorded during periods of spontaneous activity as well as during visual stimulation (evoked). For this, the authors record spontaneous and evoked activity using two-photon calcium imaging one, ten and 40 days after the first recording session. In order to maximize overlap between successive imaging sessions, the authors record three planes separated by 5 microns almost simultaneously (9 ms interval) using an electrically-tunable lens. They show that ensembles extracted during visual stimulation periods are more stable on days 2 and 10 than those computed during spontaneous activity. Stable ensembles display a higher "robustness" (a parameter that quantifies how many times a given ensemble is repeated and how similar these repeats are). Neurons displaying stable membership are more functionally connected than unstable ones. It is concluded that such observed stability of spontaneous and evoked ensembles across weeks could provide a mechanism for memories. Long-term calcium imaging within the same population of neurons is a real challenge that the authors seem to overcome in the study. The conclusions are important; my main concern relates to the number of experiments and analyses supporting these findings, as detailed below.

      Number of experiments and statistics: According to Table 1, two mice with GCaMP6f have been through the complete imaging protocol (days 1, 2, 10 and 43) but none with the 6s, since 3 missed the intermediate measure (day 10) and one the last point (day 40+). Therefore five mice have been recorded over weeks with two different indicators, but only two were sampled on day 10. One mouse was only recorded until day 10. Altogether, this is quite a low sampling, but the experiments are certainly difficult. However, the total number of experiments analyzed is higher, due to the repeat of 3 sessions on the same mouse per day. This certainly contributes to reaching significance. However, the three samples from the same mouse are not independent points. Are the FOVs different for each session in the same mouse? If they are the same, then the statistics should be repeated but treating all experiments from the same mouse as single experiments. I would suggest repeating the analysis but using only one data point per mouse per day. Also, given that two different indicators were used (6s and 6f), one would need to see whether the statistics are the same in the two conditions.
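
      The suggestion of one data point per mouse per day could be sketched as follows. This is a minimal illustration, assuming each session yields a single stability value; the function name, record layout, and numbers are hypothetical and are not taken from the manuscript.

```python
from collections import defaultdict
from statistics import mean


def one_value_per_mouse_day(records):
    """Average repeated sessions so each (mouse, day) pair contributes one value.

    records: iterable of (mouse_id, day, value) tuples (hypothetical layout).
    """
    grouped = defaultdict(list)
    for mouse_id, day, value in records:
        grouped[(mouse_id, day)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Made-up numbers for illustration only (not data from the manuscript).
sessions = [
    ("m1", 1, 0.50), ("m1", 1, 0.60), ("m1", 1, 0.70),
    ("m2", 1, 0.40), ("m2", 1, 0.20),
]
per_mouse = one_value_per_mouse_day(sessions)
```

      Statistics would then be computed on the values of `per_mouse`, so that the sample size equals the number of mice rather than the number of sessions.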

      Robustness: the authors compute this metric as the product of ensemble duration and the average Jaccard similarity, and find that stable ensembles display higher robustness. Isn't it expected that robustness is higher in stable ensembles, given that stable ensembles should be observed more often?
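
      To make the concern concrete, here is a minimal sketch of such a metric. It assumes that "duration" means the number of frames in which the ensemble is active and that similarity is averaged over all pairs of activations; the manuscript's exact definition may differ, and all names and example sets are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of active neuron indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def robustness(activations):
    """Duration (number of activations) times mean pairwise Jaccard similarity.

    Assumed definition for illustration; not confirmed against the manuscript.
    """
    duration = len(activations)
    if duration < 2:
        return 0.0
    pairs = list(combinations(activations, 2))
    mean_similarity = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return duration * mean_similarity


# Hypothetical ensembles: one repeats often but imperfectly, one rarely but perfectly.
frequent = [{1, 2, 3}, {1, 2, 3}, {1, 2, 4}, {1, 2, 3}]
rare = [{1, 2, 3}, {1, 2, 3}]
```

      Under these assumptions the frequently observed ensemble scores higher even though its repeats are less similar, which is why a link between robustness and stability may partly follow from the definition.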

      Evoked ensembles: It seems to me that evoked ensembles are ensembles extracted during continuous imaging periods that include stimulation. However, one would expect evoked ensembles to be the cells activated time-locked to the visual stimulation. This notion only appears at the end of the paper with "tuned" neurons in Fig. 4. In the discussion, the authors conclude (lines 205-207) that "sensory stimulus reactivate existing ensembles". I do not think this is supported by the analysis performed here. For this, I believe that one would need to compare, within the same mouse, the amount of overlap between spontaneous ensembles and "tuned neurons".

      How representative are the illustrated examples in Figs. 2&3? The authors report that about 20 neurons remain active from day 1 to 46 but their main figures display example rasterplots with more than 60 neurons, which is three times more than the average. Is this example representative? Which indicator was used? Is there a difference in stability between 6f and 6s?

      Rasterplot filtering: The authors chose to restrict their ensemble analysis to frames with "significant coactivation". Why not use a statistical threshold to determine the number of cells above which a coactivation is significant instead of arbitrarily setting this number to three coactive neurons? In cases of high activity this number may be below significance.
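
      One way to derive such a statistical threshold is a shuffle test, sketched below. This is an assumption about how it could be done, not the authors' method: each neuron's binary activity train is circularly shifted by a random offset, and the threshold is set just above the 95th percentile of the resulting null distribution of coactive-cell counts. The function name and parameters are hypothetical.

```python
import random


def coactivity_threshold(raster, n_shuffles=200, alpha=0.05, seed=0):
    """Coactive-cell count a frame must reach to exceed the (1 - alpha)
    quantile of a shuffled null distribution.

    raster: list of per-neuron binary lists (1 = active), all the same length.
    Each neuron's train is circularly shifted by an independent random offset,
    which keeps its activity rate but breaks timing across neurons.
    """
    rng = random.Random(seed)
    n_frames = len(raster[0])
    null_counts = []
    for _ in range(n_shuffles):
        shifts = [rng.randrange(n_frames) for _ in raster]
        for t in range(n_frames):
            null_counts.append(
                sum(row[(t + s) % n_frames] for row, s in zip(raster, shifts))
            )
    null_counts.sort()
    # Approximate (1 - alpha) quantile of the null counts.
    cutoff = null_counts[int((1 - alpha) * len(null_counts)) - 1]
    return cutoff + 1
```

      With this approach the threshold rises with overall activity level, so a session with highly active neurons would require more than three coactive cells before a frame counts as significant.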

      Demixing neuronal identity: The authors assign a neuron to an ensemble if it displays at least a functional connection with another neuron. They use reshuffling to test significance of functional links but still it seems that highly active neurons are more likely to display a high functional connectivity degree and therefore to be stable members of a given ensemble with that definition of ensemble membership. What is the justification to define membership based on pairwise functional connectivity? The finding that core ensemble members display a high functional degree may simply reflect a property of highly active neurons (as previously described by Mizuseki et al. 2013).

      Type of neurons imaged: The authors use Vglut1-Cre mice and are therefore excluding GABAergic cells from their study. This should be clearly mentioned and even discussed.

      Volumetric imaging: I am not sure one can say that "volumetric imaging" was performed here; rather, this is multi-plane imaging.

      Mouse behavior: there is little detail concerning mouse behavior, are mice allowed to run? What is the correlation between ensemble activation and running?

      Abstract: the authors should say that 46 days is the longest period they have been recording, otherwise it gives the wrong impression that after 46 days ensembles are no longer stable. Also "most visually evoked ensembles" should be replaced by "ensembles observed during periods of visual stimulation" (see above). "In stable ensembles most neurons still belonged to the same ensemble after weeks": how could ensembles be stable otherwise?

      Discussion: I found the discussion quite succinct. It lacks discussion of the circuit mechanisms for assembly stability and plasticity (role of interneurons for example?), the limitations and possible biases in the analysis and the placing of the results in the perspective of other studies analyzing the long-term stability of neuronal dynamics.

    2. Reviewer #2:

      Overall I think the authors collected an interesting dataset. Analyses should be adjusted to include all cells rather than sub-selecting for stability. Additionally, the language needs to be adjusted to better reflect the data. I wish some behavioral data had been included, but if the authors compare their data to publicly available data in V1 for a single recording session during a visually guided task, these concerns could be quelled a bit.

      1) In general the language of this paper and title seem to mismatch the results. The fraction of cells that were 'stable' as the authors say on line 112 was very small, however the authors focus extensively on this small subset for the majority of analyses in the paper. Why ignore the bulk of data (line 119)? What happens if you repeat the same analysis and keep all cells in the dataset? The general language around stability of neural ensembles should be adjusted to better reflect the data (ex: lines 157, 225).

      2) There are claims in this paper about how ensembles 'implement long-term memories' in the introduction and conclusion and yet the authors never link the activity of ensembles to any behavioral or stimulus dependent feature. This language reaches far beyond the evidence provided in this paper. The introduction could provide some better framing for expectations of stability vs. drift in neural activity rather than focus on the link between ensembles and memory given that there isn't much focus on the ensembles' contribution to memory throughout. For example, the last sentence of the paper is not supported by data in the paper. Where is the link between ensembles and memory in the data? What is the evidence that transient ensembles are related to new or degraded memories? This reads as though it was the authors' hypothesis before doing the experiments and was not adjusted in light of the results.

      3) There is no discussion around the alternative to stability of neuronal ensembles. What are the current theories about representational drift? For example, in Line 34 the authors present an expectation for stability without any reasoning for why there need not be stability. This lack of framing makes their job of explaining results in line 217 more difficult. There is a possibility that the most stable cells aren't more important - what is the evidence that they are? Does an ensemble need a core? Would be interesting to include some discussion on the possibility of a drifting readout (Line 223). [https://doi.org/10.1016/j.conb.2019.08.005]

      4) How do activations in V1 in this dataset compare to other data collected from V1 while the animal is performing a task (where for example the angle of the gratings is relevant to how the mouse should respond)? I would be interested to know if the authors compared statistics of their ensembles to publicly available data recorded in V1 during a visually guided behavior. Are the ensembles tuned to anything in particular? Could they be related to movement? [http://repository.cshl.edu/id/eprint/38599/]

      5) The authors provide some hypotheses as to why fewer cells are active in the later imaging sessions (dead/dying cells?). This is worrisome in regards to how much it might have affected the imaged area's biology. One alternative hypothesis is that the animal is more familiar with the environment/ not running as much etc. Have the authors collected any behavioral data to compare over time?

      6) How much do the results change when you vary the 50% threshold of preserved neurons within an ensemble (Line 146)? Does it make sense to call an ensemble stable when 50% of the cells change? Especially given that the cells analyzed as contributing to an ensemble are already sub-selected to be within the small population of stable cells (Line 119)?

      7) Cells are referred to as 'stable' when they're active on 3 different sessions that are separated in time. However, the authors find a smaller number of cells are stable over extended time (43-46 days later). If we extrapolate this over more time, would we expect these cells to continue to be stable? Given these concerns, it might make more sense to qualify the language around stability by the timespan over which these cells were studied.

      8) Filtering frames to only coactive neurons for ensemble identification seems strange to me. Authors may be overestimating the extent of coactivation. What happens when you don't do this? How much do the results change when you don't subselect for Jaccard similarity? I would be interested to see how the results vary as you vary this threshold (Line 136).

      9) The term 'evoked activity' is misleading because the authors don't link these activations to the visual stimulus. There's no task, so the mice could be paying little attention to the stimulus. Should we really consider this activity to be visually driven? Could the authors provide any evidence of this?

      10) A method like seqNMF could reveal ensembles that are offset in time. This looser temporal constraint could potentially reveal more structure. This should be run on the entire dataset (without stability sub-selection). I suggest this as a potential alternative or supplement to the method described by the authors. [https://elifesciences.org/articles/38471]

    3. Reviewer #1:

      Perez-Ortega and colleagues performed rigorous experiments to determine if the activity of neurons in the visual cortex is similar across days, in particular comparing spontaneous activity in the absence of visual stimuli across days, which was previously not examined to my knowledge. The paper claims that evoked ensembles are more stable than spontaneous ensembles, but more convincing quantitative analyses are required to support these claims.

      Major Comments:

      1) There is only one mention of prior work with multi-day imaging in the visual cortex (Ranson 2017). Another related study to cite and compare your results to would be Jeon, ..., Kuhlman 2018 (and I think a comment about how similar/different your results are from this study + Ranson would be useful for the reader). I would also recommend mentioning that there are studies that have observed differences in evoked activity across learning in V1 (e.g. Poort, Khan et al 2015; Henschke, Dylda et al 2020). Do you think there was adaptation across days to the stimulus that you repeated?

      2) Some GCaMP6f mice have aberrant cortical activity (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5604087/). In the raw data (Fig 1F) it doesn't look present, but it would be useful to show more time and sort the neurons by their first PC weights perhaps to see the activity structure.

      3) The approach of 3 plane imaging taking the maximum projection seems useful for tracking cells across days. There is a claim that some cells are no longer found / no longer active. Based on Fig 1G it appears there may have been some Z-movement from day 10 to day 46. This Z movement may explain some of the lost active cells. As a sanity check I would recommend plotting the Z-plane on which the cells were maximally active on day 1 vs the Z-plane on which the cells were maximally active on day n.

      4) There is an emphasis on analyzing the data as ensembles but I think this may be missing other slow, gradual changes. The definition of stable is at least 50% of neurons were preserved across days. However, the fitting procedure of finding ensembles may produce different ensembles even if those neurons are still correlated to each other. I would recommend two possible additional analyses: 1) compare the correlation matrices for common neurons across days (unless there are too few neurons for this); 2) look at changes in single neuron statistics across days. For 2) this may include reliability of neural responses to the visual stimuli, the weights of the neuron onto the first principal component of spontaneous activity, or the correlation of a neuron with running speed. I think these results may solidify your ensemble result (evoked-related statistics change less across time).
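
      The first suggested analysis, comparing correlation matrices for common neurons across days, could look roughly like the sketch below. It assumes activity traces for the same neurons, in the same order, are available on both days (at least three neurons); the function names are hypothetical and this is not the authors' pipeline.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def pairwise_correlations(traces):
    """Upper-triangle entries of the neuron-by-neuron correlation matrix."""
    return [
        pearson(traces[i], traces[j])
        for i, j in combinations(range(len(traces)), 2)
    ]


def correlation_structure_similarity(traces_day_a, traces_day_b):
    """Correlate the pairwise-correlation vectors of the same neurons on two days."""
    return pearson(
        pairwise_correlations(traces_day_a),
        pairwise_correlations(traces_day_b),
    )
```

      A value near 1 would indicate that the pairwise correlation structure is preserved across days, independently of how the ensemble-fitting procedure partitions the neurons.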

    4. Summary: This work examines whether coincident firing of neurons in the visual cortex is preserved over a long timescale (one month), which is important because it provides insight into the stability and plasticity of neural circuits and visual representations. The authors find that subsets of identified neurons maintain coordinated firing despite some degree of flux in the firing activity across the population.

      All reviewers agreed that the question is important but found the analysis lacked depth and there were some technical issues in the experiments that should be addressed with a fuller discussion and potentially additional analysis to eliminate confounds/artefacts. In general, and in light of earlier work (some of which is not cited) the conclusions need to be more circumspect. Specifically:

      • There were concerns about movement/loss of cells/calcium indicator artefacts over this long imaging period that should be accounted for more rigorously.
      • The analysis applies a somewhat arbitrary criterion for stability (50% of cells remain responsive in an assembly). This threshold should be systematically explored and justified more carefully.
      • The wider literature on this topic should be more thoroughly cited, limitations of the study should be transparently laid out, claims about the overall stability found in this population response and its relevance to memories and behaviour should be moderated in line with the comments below.

      Reviewer #1 and Reviewer #2 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      From the technical perspective this manuscript provides clear results that are consistent with, but do not prove, what this reviewer believes is the main objective of the work; to establish the relevance of the open structure of the eukaryotic cysteine desulfurase complex. This reviewer has no good basis to either accept or reject the open structure as having physiological relevance. This could well be the case but it is not clear from my (limited) knowledge of the published literature that the relevance of the open structure is generally accepted. From this perspective I believe the manuscript is sound from the technical approach and experimental implementation but suffers from a lack of clarity about the case for and against the relevance of the open structure. If this is a point of controversy in the field the topic should be discussed in depth and the position of the authors more clearly articulated.

    2. Reviewer #2:

      In this manuscript, Barondeau and co-workers test a hypothesis for the role of the protein frataxin in iron-sulfur cluster assembly, seeking, inter alia, to explain the observation that mutations in the gene encoding this protein are associated with the incurable neurodegenerative disease, Friedreich's ataxia. Their notion is that, whereas the bacterial versions of the sulfur-providing cysteine desulfurase are stable homodimers - in which the interactions between the monomers help to organize the mobile loop harboring the key cysteine residue that serves as general acid and nucleophile in the C-S-cleavage reaction that mobilizes the sulfur for incorporation into the cluster - the human enzyme (i) has a dimer interface that has been weakened through evolution, (ii) can be monomeric or form non-optimal dimeric forms, and (iii) can be driven to adopt the optimally active dimer form by intervention of accessory proteins (e.g., frataxin). Their approach was to perturb a bacterial (E. coli) cysteine desulfurase (IscS) by structure-guided mutagenesis in an attempt to introduce into it the behavior of the human enzyme, specifically its activation by accessory proteins (here CyaA and FXN). The experiments were successful in this goal. I like this paper and believe that it is interesting and important. I would point out two aspects that perhaps leave room for improvement.

      1) In principle, it would have been a more powerful test of their hypothesis had they been able to perturb the human enzyme to get a constitutively active form, no longer dependent on the binding of the accessory proteins, either instead of, or in addition to, the converse perturbation of the bacterial system. Perhaps this approach was precluded by difficulties associated with the human enzyme?

      2) The second criticism is that the effects on quinonoid form decay and activity are rather modest. However, I believe that important biological effects can arise from even such modest regulation of enzyme activity levels.

    3. Reviewer #1:

      This manuscript presents a detailed and focused study of the structural basis for a regulation strategy used by a human iron-sulfur cluster biosynthesis system, elucidated by artificial installation of new amino acids into a bacterial system that lacks the allosteric elements of the human enzyme. The work includes quaternary structure analysis and activity assays of variant bacterial proteins. It is performed competently and supports the conclusions. But the focus may be too narrow for a general audience. To bring the work over the bar, the authors could test whether installing the bacterial residues into human NFS1 restores activity without frataxin (inactivated in the human genetic disorder Friedreich's ataxia). Furthermore, some elements of the study could be presented more clearly/rigorously to communicate the significance of the work to a general audience. These suggestions are listed below.

      1) It would be useful for an unfamiliar reader to include a diagram of the bacterial and human iron-sulfur cluster biogenesis pathway. It would also be helpful to depict the mechanism of the IscS/NFS1 cysteine desulfurase reaction - essentially a picture to go along with the description of the PLP-dependent transformations described in paragraph 2.

      2) In the first paragraph of the results section - I would be interested to see more details about the selection of the three residues targeted for mutagenesis. For example, did the authors inspect the interfaces of existing crystal structures of these complexes? Did they create sequence alignments for multiple eukaryotic/prokaryotic cysteine desulfurases and select sites conserved in bacterial proteins but not eukaryotic ones? More description of the experimental or bioinformatics basis for selecting these three sites would be important for convincing the reader that the basis for this work is sound.

      3) The structural basis for the dimer interaction and the enhanced activity isn't completely clear - how do the changed interactions enhance the enzyme activity? A good description of the different quaternary forms and why they are more/less active is given on page 4-5 - but perhaps another link could be made between the exact residues targeted for substitution and the features of the system important for catalysis.

      4) On page 10, the authors describe changes in IscS quaternary structure as a function of concentration. What is the estimated copy number or concentration inside the cell? Which concentration ranges would be most physiologically relevant?

      5) Addition of any helper protein appears to increase the proportion of variant IscS dimer and activity. Is there any reason to believe that this phenomenon is simply a crowding effect? If the same amount of an unrelated protein is added - does the activity/dimer fraction change compared to variant IscS alone?

      6) I found the color scheme in Figure 1 hard to follow - could the authors keep the subunit colors consistent and use text labels directly on the figure panels for the subunits and forms (open, ready, etc.)? I also don't think the "Clash!!" labels are necessary. A more effective approach might be to use zoomed-in insets for each clash.

      7) In Figures 4-6 - could the authors include a more complete description of the error bars? What kind of error is shown? Are the replicates different experiments done on different days? These presentations might also benefit from showing the actual data points on top of the bars/error bars.

    4. Summary: This study provides support for a proposed allosteric regulatory mechanism in a human iron-sulfur cluster biosynthesis protein that is linked to the human genetic disorder Friedreich's ataxia. In an approach guided by inspection of a structure of the human enzyme, the authors successfully converted a bacterial homolog lacking allosteric regulation into a system that behaves similarly to the human one. The work provides validation of the roles of accessory proteins in activating iron-sulfur cluster biosynthesis machinery. It also could open novel routes for therapeutic intervention in genetic disorders of this process in humans.

      The major concerns about the study center on the significance of the form of the human enzyme structure used as the basis for designing the mutagenesis/activity experiments in the bacterial enzyme. To bolster the underlying framework for the experiment design, the description of the existing human enzyme structures and how exactly they were used to select sites for mutagenesis in the bacterial counterparts should be improved to include more detail and balanced perspective. Experiments are suggested to show that activity enhancement upon addition of accessory proteins is specific to those factors, along with a more comprehensive discussion of the errors and reproducibility in activity measurements. Finally, the significance of the work would be elevated if the authors could use a similar approach to install activating mutations in the human enzyme - particularly if these could overcome the requirement for frataxin.

    1. Reviewer #3:

      This study combines two cutting-edge approaches for the study of polyclonal antibody responses to understand the molecular profiles of antibodies elicited by HIV envelope trimer immunization in a rabbit model. In one arm of the study, the authors performed mutational profiling of serum antibody neutralization escape variants, and in the second arm they used electron microscopy polyclonal epitope mapping (EMPEM) to track antibody binding sites. These authors performed large-scale data collection and present high-quality validation data and explorations of the resulting datasets that compare antibody binding and virus neutralization profiles. These approaches provide a comprehensive window into the molecular specificity and performance of HIV immunization and are expected to inform advanced HIV-1 vaccine designs.

      Summary of any substantive concerns:

      The authors have done a nice job validating the integrity of the NGS data, and the strong data in Figs 4/5/2B show the power of the NGS-based neutralization mapping assays. This adds a solid confirmation of the study findings and demonstrates the quality of the techniques. Overall this is a solid study and the findings are informative. I see just a few methods updates and analyses that would help finalize the presentation of methods and data.

      1) Additional information on the bioinformatic methods for data analysis is needed. How did the authors handle discrepancies in data across replicates or libraries, for example, if a mutation was enriched in one library or replicate but deleted in another? Were there any quality filters or metrics used to estimate true signal vs. noise?

      2) Differential selection statistics are mentioned briefly, along with citations to prior publications. Prior citations are definitely helpful. I think it is still important to state the key steps used in processing NGS data and the statistical techniques and quality metrics that were used. The authors should also state any criteria for acceptance or rejection or binning of individual data points, or acceptance/rejection of datasets or replicates, if quantitative criteria or metrics were used.

      3) Several replicates showed a low percentage infectivity (Fig S1, e.g. animals 5724 and 2124), but the text indicates averages between 0.3% and 2.7% infectivity. Were some groups omitted from analysis, or were all groups included?

      4) How well did the mutational profiles correlate between different libraries or replicates of the same samples?

    2. Reviewer #2:

      This manuscript by Dingens et al. develops a novel application of mutational antigenic scanning to identify dominant neutralizing antibody epitopes in polyclonal sera from vaccinated animals, and compares the findings of such techniques with those from cryo-EM based unbiased mapping of binding antibodies and from conventional mutational mapping of neutralizing epitopes. Overall, I find the experiments and analyses to be of high quality, thorough and of sound reasoning, and the manuscript to be well written. I also commend the authors for the development of a facile and easy-to-use interactive viewer for exploring the mutational scanning data. I think the dual approach of mutational scanning and cryo-EM based mapping has the potential to be a powerful approach for dissecting antibody content of polyclonal sera post-vaccination or in infected hosts.

      The only major concern I could identify is the following. One of the main advantages of the mutational scanning approach is that it can identify novel epitopes targeted by antibody responses in a high-throughput manner. It is a little disappointing that this advantage was not leveraged in the current manuscript, perhaps due to the choice of the vaccine (BG505 SOSIP trimers where the epitopes have been thoroughly mapped in the literature) and the selection of vaccinated animals. Looking at Fig. 2, animal 5727 was the only animal whose serum showed some selection signatures outside of the regions considered in depth (at sites 507 and 509) - have the authors analyzed these escape mutations? If not, and only if possible within reasonable workload, I urge the authors to pursue this example or any other example where a potential novel epitope discovery could be possible.

    3. Reviewer #1:

      Dingens et al. report a timely complementary study to map neutralizing and binding responses in polyclonal rabbit sera induced by immunization with the BG505 SOSIP Env trimer. Neutralizing responses are mapped using libraries of replication-competent HIV expressing all mutants of the BG505 Env, an approach developed in the Bloom laboratory. Binding responses were mapped using an EM-based method, EMPEM, developed in the Ward laboratory. The Env mutations that affect neutralization of the autologous BG505 strain in the BG505-SOSIP-immunized animals were largely known from other studies, as were the binding (not necessarily neutralizing) responses - the strength of this study is the combination of the two approaches. It is especially useful that the complex datasets have been deposited on-line where they can be interactively explored, including mapping onto Env trimer and monomer structures. Although results were anticipated, it is very nice to directly compare the neutralization epitopes to the binding epitopes determined by EMPEM. This is a well-written and beautifully illustrated paper.

    1. Reviewer #3:

      The authors probe mechanosensory processing in Hydra by measuring calcium activity in neurons and muscles in response to precise mechanosensory stimulation in whole and resected animals. The authors' claims are well supported by the evidence. The development of a mechanosensory delivery system for Hydra is also a significant methodological advancement. Taken together, the work advances our understanding of the Hydra nervous system and is a needed step towards developing Hydra as a powerful model for systems neuroscience.

      Substantive concerns:

      1) One weakness is that different measures of "mechanosensory response" are used at different places in the manuscript. In some contexts, a response is defined as calcium activity in neurons (Fig 2), and elsewhere as calcium activity in muscles (Fig 3 and 4). And in Fig2 SuppFig2 muscle contractions are also measured using MeKs. The relation between neural activity, muscle activity and body movement is of course of high interest, and the paper explores this. But, if technically possible, it would be helpful to report a single metric of behavior that could be used in all experiments. For example, it might be possible to use video of the animal's pose or body length to measure contractions in all experiments. At a minimum the reasoning behind choice of measurement of response for each experiment could be discussed explicitly.

      2) Related: Without a consistent measure of behavior, it will be important to further clarify figures so that a reader can tell at-a-glance how contraction probability is being measured.

    2. Reviewer #2:

      The Hydra, in the phylum Cnidaria, is a near-microscopic freshwater animal that has recently resurfaced as an attractive model organism in neuroscience due to its optically accessible transparent body, sparsely distributed neural network, and simple behaviors. In this manuscript, Badhiwala and colleagues use calcium imaging of the Hydra neural network, combined with surgical resection and microfluidic pressure stimulation, to identify body regions indispensable for mechanosensory activity. They report that while resection of the aboral region did not abolish the mechanical response, resection of the oral region attenuated this response, while combined resection of oral and aboral regions showed the greatest effect. They also find a correlation between reduced stimulated activity and spontaneous activity, suggesting a common mechanism that gives rise to both activities. While this study takes an innovative approach by using a microfluidic device to mechanically stimulate the Hydra under optical recording, there are a number of conceptual and technical limitations. Perhaps my biggest reservation is that despite real potential, the data are rather low resolution (body transections and bulk calcium responses) and as such the conclusions that can be reasonably drawn do not extend what is known in a significant way.

      Major comments:

      1) The authors have designed a microfluidic device that allows them to simultaneously mechanically stimulate, monitor movement and functionally image a hydra. The highly quantifiable nature of the microfluidic device is a great asset, although this potential is not deeply explored. While I can see how the microfluidic stimulation could offer benefits over fluid jet or blunt probe, more in-depth characterization is needed.

      2) What is the spatial distribution of the pressure pulse stimulus on the Hydra body? How far does the mechanical force spread from the region directly touching the pressure valve?

      3) The use of the microfluidic device was limited. Have the authors attempted to map mechanical sensitivity across the Hydra body by stimulating different sites?

      4) The authors have not attempted to record calcium responses from single neurons, but rather spatially average a population response from a large region of interest. This should be specifically stated in the results section. More importantly, to provide insight into network function, much smaller ROIs over multiple sites are needed instead of the bulk activity of the entire peduncle. This seems like a real lost opportunity, as the lure of the optically clear and small Hydra is that neural representation and coding can be tracked over large portions of the network at cellular resolution.

      5) It is unclear where the recorded signals are coming from and whether movement is creating artifacts. Have the authors made any attempts to correct for movement? The supplemental movies show a stationary region of interest and a moving animal, in some cases with parts of the animal moving in and out. Furthermore, is background subtracted, and if so, how? There is a large fluorescent signal coming from the entire body/middle columnar part of the body, as well as spontaneous firing, that makes interpretation of the data difficult.

      6) Contraction is a behavioral response of the animal; however, the authors use 'contraction' to describe calcium imaging responses throughout the figures and text. This should be avoided.

      7) I am unsure if the title of the paper is accurate. I do not think this work has demonstrated "multiple nerve rings" are important for coordinating mechanosensory behavior.

      8) Furthermore, the claim that the observed "linear relationship" between the spontaneous contraction probability and resection type is evidence for shared neural pathways is a stretch. These data are fairly coarse resolution and include only 3 animals in each group with highly variable responses (Figure 4C). Additionally, they do not provide evidence to distinguish the motor circuits they hypothesized these neural nets converge upon.

    3. Reviewer #1:

      The manuscript by Badhiwala et al. is an interesting study using the emerging model system Hydra, which has many advantages for studying the entire nervous system of an animal during simple behavior. Some of the foundational neuroscience papers in this field have only come out in the past few years, and new studies such as the one here might have the potential to contribute to an important early literature. Despite clear reasons for enthusiasm, the many shortcomings in this work greatly diminished my support for this study. Although I appreciate the building of the microfluidic device with simultaneous pan-neuronal imaging, the nature of the new biological insights provided here seems quite limited and easily predicted based on prior studies in Hydra and other model systems. Moreover, the crude nature of some experiments inhibits my ability to make a fair judgement of potential findings.

      Major concerns:

      1) The pressurized stimulation of the hydra appeared to be specific to the center of the body. The authors don't mention why this region was chosen, which seems critical to this study. Relatedly, why didn't they test multiple areas across the hydra with this system? Might we expect to see different sensorimotor behaviors, and thus different neural outputs?

      2) The authors reference a recent single-cell study characterizing multiple neuronal cell types in Hydra. This work would greatly benefit from using some cell-type resolution studies to determine the functional nature of the neurons being activated, as opposed to solely using pan-neuronal GCaMP imaging. If they can put GCaMP in all neurons, why not put it in specific subsets of neurons based on cellular identity? This point becomes more salient because a major take-home from this paper is that the spontaneous behavior and firing patterns are nearly identical to the stimulus-evoked patterns, except for an apparent increase in firing rate. The true nature of the mechanosensory response might be revealed with cell-type-specific experiments.

      3) Although the authors reference whole animal imaging, they focus imaging analysis on peduncle and hypostomal nerve rings, despite the videos showing calcium activity in other areas throughout the body. Moreover, are the authors certain their pan-neuronal genetic strategy equally samples neurons throughout the body? In other words, is the apparent increase in activity in the nerve ring over other areas driven by a technical artifact of these neurons being labeled better?

      4) While I appreciate the resection studies to get at "loss-of-function" experiments, this approach seems rather crude, and potentially confounding to clear interpretation. Exactly which neurons are killed and to what extent, and how many, if any, began to regenerate throughout this process? My alarm here is raised especially in light of the authors' surprising finding that "footless" animals show that the aboral nerve ring is not required for spontaneous or mechanosensory responses. What if residual activity from neurons not ablated is driving this response?

    4. Summary: Specifically, all of the reviewers agreed that the emerging Hydra system holds great promise for neuroscience discoveries. Moreover, some of the findings presented here have the potential to be of use to other scientists who work in this system. However, we felt that the findings here were too preliminary and underdeveloped. In particular reviewers felt that 1) multiple locations across the Hydra's body should be stimulated coupled with mapping the behavioral and neuronal correlates of such stimulation, 2) the pan-neuronal nature of the bulk calcium measurements made it challenging to fully appreciate which neuronal circuits might be driving the sensorimotor responses, 3) uniform proxies for measuring/plotting the behavior would be useful, 4) the ablation studies lacked cellular resolution, similar to the calcium imaging experiments.

    1. Reviewer #3:

      Lee and Daunizeau formulate a model of the effects of mental effort on the precision and mode of value representations during value-based decision-making. The model describes how optimal levels of effort can be determined from initial estimates of precision and relative value difference between competing alternatives, accounting for the subjective cost of incremental effort investment, as well as its impact on precision and value differences. This relatively simple model is impressive in its apparent ability to reproduce qualitative patterns across diverse data including choices, RTs, choice confidence ratings, subjective effort, and choice-induced changes in relative preferences successfully. The model also appears well-motivated, well-reasoned, and well-formulated.

      I have two sets of concerns. My first set relates to model fitting and validation. The model appears to do fairly well in predicting aggregate, group-level data, but does it predict subject-level data? Or does it sometimes make unrealistic predictions when fitted to individual subjects? The Authors should provide evidence of whether it can or cannot describe subject-level choices, confidence ratings, subjective effort, etc.

      Also, I think the Authors should do more to demonstrate that their model is an advance on simpler variants. The closest thing to model comparison is an exercise where the authors show that, relative to when their model is fit to random data, their model explains more variance in dependent variables when fit to real data. This exercise uses a straw man as a baseline because almost any model which systematically relates independent variables to dependent variables would explain more variance when fit to real data than to data for which, by definition, independent and dependent variables do not share variance. It would be more useful to know whether (and if so, how much) their model explains data better than, e.g., a model where effort only affects precision (beta efficacy), or a model in which effort only impacts value mode (gamma efficacy). Since the Authors pit their model against evidence accumulation models, it would be yet more useful to ask whether their model predicts these diverse data better than standard evidence accumulation model variants.
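
      The straw-man point can be illustrated with a minimal toy simulation. This is only a sketch under stated assumptions, not the authors' model or data: the quadratic data-generating process, the noise level, and the function names `r_squared` and `shuffled_baseline` are all hypothetical. A deliberately mis-specified straight-line model still explains far more variance in structured data than in shuffled data, so beating a random-data baseline says little about whether a model is the right one.

```python
import random


def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared Pearson correlation)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return (cov * cov) / (var_x * var_y)


def shuffled_baseline(n=200, seed=0):
    """Fit a (wrong) linear model to structured data and to shuffled data."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    # Hypothetical "true" process: quadratic in x plus Gaussian noise.
    ys = [x ** 2 + rng.gauss(0.0, 0.1) for x in xs]
    real_fit = r_squared(xs, ys)
    # Shuffling ys destroys any systematic relation between xs and ys.
    shuffled = ys[:]
    rng.shuffle(shuffled)
    null_fit = r_squared(xs, shuffled)
    return real_fit, null_fit


real_fit, null_fit = shuffled_baseline()
print(f"R^2 on real data: {real_fit:.3f}; R^2 on shuffled data: {null_fit:.3f}")
```

      A more informative comparison would pit the full model against restricted or competing models on the same real data, which is what the request above amounts to.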

      My second set of concerns are regarding the assumed effect of mental effort on the mode of subjective values. First, is it reasonable to assume that variance would increase as a linear function of resource allocation? It seems to me that variance might increase initially, but then each increment of resources would add diminishing variance to the mode since, e.g., new mnesic evidence should tend to follow old mnesic evidence. How sensitive are model predictions to this assumption? What about if each increment of resources added to variance in an exponentially decreasing fashion? Also, what about anchoring biases? Because anchoring biases suggest that we estimate things with reference to other value cues, should we always expect that additional resources increase the expected value difference, or might additional effort actually yield smaller value differences over time? If we relax this assumption, how does this impact model predictions?

    2. Reviewer #2:

      The manuscript introduces a computational account of meta-control in value-based decision making. According to this account, meta-control can be described as a cost-benefit analysis that weighs the benefits of allocating mental effort against associated costs. The benefits of mental effort pertain to the integration of value-relevant information to form posterior beliefs about option values. Given a small set of parameters, as well as pre-choice value ratings and pre-choice uncertainty ratings as inputs to the model, it can predict relevant decision variables as outputs, such as choice accuracy, choice confidence, choice induced preference changes, response time and subjective effort ratings. The study fits the model to data from a behavioral experiment involving value-based decisions between food items. The resulting behavioral fits reproduce a number of predictions derived from the model. Finally, the article describes how the model relates to well-established accumulator models, such as the drift diffusion model or the race model.

      Before I get into more detailed comments, I would like to highlight that this work addresses a timely and heavily debated subject, namely the role of cognitive control (or mental effort) in value-based decision making (see Shenhav et al., 2020). While there are plenty of models explaining value-based choice, and there is a growing number of computational accounts concerning effort-allocation, little theoretical work has been done to relate the two literatures (but see Major Comment 1). This work contributes a novel and interesting step in this direction. Moreover, I had the impression that the presented model can account for a broad range of behavioral phenomena and that the authors did a commendable amount of work to validate the model (but see Major Comments 2 and 3). The manuscript is also well written in that it seems accessible to a broad audience, including non-technical readers. However, while I remain curious about what the other reviewers have to say, the manuscript fails to address a few issues that I elaborate on below.

      Major Comments:

      1) Model Comparison(s): While the manuscript compares the presented computational approach to existing accumulator models, it could situate itself better in the existing literature, ideally in the form of formal model comparisons. For instance, as someone less familiar with choice-induced preference changes in value-based decision making, I wonder how the model compares to existing computational work on this matter, e.g. the models described in Izuma & Murayama (2013) or the efficient coding account of Polanía, Woodford, & Ruff (2019). I do understand that the presented model can account for some phenomena that the other models cannot account for, at least without auxiliary assumptions (e.g. subjective effort ratings), but the interested reader might want to know how well the presented model can explain established decision-related variables, such as decision confidence, choice accuracy or choice-induced preference changes compared to existing models, by having them contrasted in a formal manner. Finally, it would seem fair to compare the presented account to emerging, more mechanistically explicit accounts of meta-control in value-based decision making (e.g. Callaway, Rangel & Griffiths, 2020; Jang, Sharma, & Drugowitsch, 2020). As these approaches are still in preprint, it may not be necessary to relate them in a formal model comparison. However, the manuscript might benefit from discussing how these approaches differ from the presented model in the text.

      2) Fitting Procedure: This comment concerns the validation of the described model based on its fits to behavioral data. If I understand correctly, the authors first fit the model to each participant while "[a]ll five MCD dependent variables were [...] fitted concurrently with a single set of subject-specific parameters" and then evaluate whether model fits match the predicted qualitative relationship between experimental variables (e.g. pre-choice value ratings and pre-choice confidence ratings) and dependent variables (e.g. choice accuracy). I'm happy to be convinced otherwise, but it appears that the model's predictions could be tested in a more stringent manner. That is, it doesn't appear compelling to me that the model, once fitted, matches the behavior of participants -- please note that this is not to diminish the value of the results; I still think that these results are valuable to include in the manuscript. Instead, rather than fitting the model to all dependent variables at once, it would be more compelling to fit the model to a subset of established decision-related variables (e.g. accuracy, choice confidence, choice induced preference changes) and then evaluate how the fitted model can predict out-of-sample variables related to effort allocation (e.g. response time and subjective effort ratings). Again, I am happy to be convinced otherwise but the latter would seem like a much more stringent test of the model, and may serve to highlight its value for linking variables related to value-based decision making to variables related to meta-control.

      3) Parameter Recoverability: Given that many of the results rely on model fits to human participants, it would seem appropriate to include an analysis of parameter recoverability. That is, how well can the fitting procedure recover model parameters from data generated by the model? I apologize if I missed this, but the manuscript doesn't appear to report this kind of analysis.
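
      For concreteness, a parameter recovery analysis generally follows the loop sketched below: draw known parameters, simulate data from the model, refit, and compare fitted with true values. This is a minimal sketch under stated assumptions. A toy linear model with ordinary least squares stands in for the actual model and fitting procedure, and every name and setting here (`simulate`, `fit`, `recovery_study`, the parameter range, the noise level) is hypothetical rather than taken from the manuscript.

```python
import random


def simulate(slope, xs, noise_sd, rng):
    """Generate synthetic observations from a toy linear model with a known slope."""
    return [slope * x + rng.gauss(0.0, noise_sd) for x in xs]


def fit(xs, ys):
    """Ordinary least squares estimate of (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def recovery_study(n_subjects=50, n_trials=100, noise_sd=1.0, seed=0):
    """Correlate true and recovered slopes across simulated subjects."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_trials)]
    true_slopes, fitted_slopes = [], []
    for _ in range(n_subjects):
        true_slope = rng.uniform(0.5, 3.0)  # hypothetical plausible range
        ys = simulate(true_slope, xs, noise_sd, rng)
        fitted_slope, _ = fit(xs, ys)
        true_slopes.append(true_slope)
        fitted_slopes.append(fitted_slope)
    return pearson(true_slopes, fitted_slopes)


print(f"true vs. recovered slope correlation: {recovery_study():.3f}")
```

      In a real analysis the simulation would use the same trial counts and design as the experiment, and the true parameters would be drawn from the range of values estimated from participants, so that recoverability is assessed where it matters.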


      Callaway, F., Rangel, A., & Griffiths, T. L. (2020). Fixation patterns in simple choice are consistent with optimal use of cognitive resources. PsyArXiv: https://doi.org/10.31234/osf.io/57v6k

      Izuma, K., & Murayama, K. (2013). Choice-induced preference change in the free-choice paradigm: a critical methodological review. Frontiers in psychology, 4, 41.

      Jang, A. I., Sharma, R., & Drugowitsch, J. (2020). Optimal policy for attention-modulated decisions explains human fixation behavior. bioRxiv: 2020.08.04.237057.

      Polanía, R., Woodford, M., & Ruff, C. C. (2019). Efficient coding of subjective value. Nature Neuroscience, 22(1), 134-142.

      Shenhav, A., Musslick, S., Botvinick, M. M., & Cohen, J. D. (2020, June 16). Misdirected vigor: Differentiating the control of value from the value of control. PsyArXiv: https://doi.org/10.31234/osf.io/5bhwe

    3. Reviewer #1:

      The authors report a model about the confidence-effort tradeoff; explaining how subjects invest effort depending on how confident they want to be in their decision (and how costly this is). They fit their model to behavioural data and report qualitative similarities between model and data.

      I find this an interesting model, with interesting links between timely topics of interest, such as confidence, effort, and cost optimisation. But I have several requests for clarification.

      Major Comments:

      Line 274: Without loss of generality: what does it mean here? I guess that with a different cost function, not all conclusions remain the same?

      The model assumes that it is "rewarding" to choose the correct (highest-value) option (B = R*P). But is this realistic? If the two options have approx the same value, then R should be small (it doesn't matter which one you choose); if the options have different values, it is important to choose the correct one. Of course, the probability P_c continuously differentiates between the two options, but that is not the same as the reward. Can the predictions generalise toward a more general R that depends on value difference?

      In Figure 2, I guess that the important quantity to decide is a standardised delta-mu (similar to d' in signal detection theory). It might be useful to also plot that (essentially combining the current two plots). Or alternatively, plot P_c(z), which relates more directly to the theory.

      The section Probabilistic model fit is unclear. Are the MCD variables y the 5 variables mentioned above? Do different y's share the same alpha, beta, gamma? Are different transformation parameters a and b fitted for each y? Is estimation done per subject? It is mentioned that VBA is used, but what distribution is approximated exactly using VBA? Is it a mean-field approximation, optimised with gradient descent? Is the goal function a posterior across the 5 parameters? It would also be good then to have an intuition on the estimated model parameters (e.g., their standard error or Bayesian equivalent). Is there an estimate of model fit (in addition to checking qualitative predictions)? Figure S3 is a good start (and I think it is worth putting in the main MS), but it would be nice, for example, to see model comparisons where one or more parameters are restricted.

      Figure 4, 5, 6 should be better annotated. I have a hard time trying to fill in what is plotted exactly (eg, scale of the color bar). Why are the data grouped in percentiles? Also in Figure 4 legend, I guess that "beta" is not used as the MCD model parameter? Please avoid overloading definitions.

      Figure 7: It seems that "spreading" of alternatives occurs in the model only for alternatives that are initially close together? Is this consistent with their discussion around equation (14)? (I may be overlooking something; if so, consider making this more explicit.)

      I find it a really interesting feature of the model that it can explain spreading of alternatives from a statistical perspective. So I think it's worth commenting on it in the Discussion. For example, does the current model capture trends in the literature? To what extent is the effect (also in empirical data) dependent on initial value differences?

    4. Summary: This manuscript addresses a timely subject: the role of cognitive control (or mental effort) in value-based decision making. While there are plenty of models explaining value-based choice, and there is a growing number of computational accounts concerning effort-allocation, little theoretical work has been done to relate the two literatures. This manuscript contributes a novel and interesting step in this direction, by introducing a computational account of meta-control in value-based decision making. According to this account, meta-control can be described as a cost-benefit analysis that weighs the benefits of allocating mental effort against associated costs. The benefits of mental effort pertain to the integration of value-relevant information to form posterior beliefs about option values. Given a small set of parameters, as well as pre-choice value ratings and pre-choice uncertainty ratings as inputs to the model, it can predict relevant decision variables as outputs, such as choice accuracy, choice confidence, choice induced preference changes, response time and subjective effort ratings. The study fits the model to data from a behavioral experiment involving value-based decisions between food items. The resulting behavioral fits reproduce a number of predictions derived from the model. Finally, the article describes how the model relates to established accumulator models of decision-making.

      The (relatively simple) model is impressive in its apparent ability to reproduce qualitative patterns across diverse data including choices, RTs, choice confidence ratings, subjective effort, and choice-induced changes in relative preferences successfully. The model also appears well-motivated, well-reasoned, and well-formulated. While all reviewers agreed that the manuscript is of potential interest, they also all felt that a stronger case needs to be made for the explanatory power of the model, and that the model should be embedded more thoroughly in the existing literature on this topic.

    1. Reviewer #2:

      This is a nice study that is clearly written and makes use of several datasets. The authors show that a gene signature associated with increased myelopoiesis in utero is associated with increased risk of pediatric asthma. Furthermore, they show that cord blood serum PGLYRP-1 is associated with reduced risk of pediatric asthma and increased FEV1/FVC. Interestingly, sIL6Ra, which is derived from neutrophils but not associated with neutrophil granules, did not show any association with pulmonary outcomes. This suggests that it is the neutrophil granules, rather than the neutrophils per se, that underlie the association. The following should be addressed:

      1) While the manuscript is clearly written, the message regarding PGLYRP-1 is at times confusing. The manuscript is clear that PGLYRP-1 is inversely associated with mid-childhood asthma risk. The discussion, however, refers to animal models where PGLYRP-1 is proinflammatory and is associated with increased airway resistance and allergen sensitization. The apparent disparity should be clarified.

      2) What is the proposed role of neutrophil degranulation in the pathogenesis or long term susceptibility to asthma?

      3) While it was not the focus of the current study and may be beyond the scope of the data, it would be interesting to know if there is any association with the subsequent development of adult asthma.

    2. Reviewer #1:

      This paper attempts to explain perinatal risk factors and the associated risk of developing pediatric asthma in the mid-childhood and early teenage years. The authors found that some maternal characteristics such as atopy, BMI, race/ethnicity and demographics such as newborn sex, and birth characteristics such as birthweight, gestational age, and mode of delivery were associated with risks of subsequent asthma development in the pediatric population. The paper then goes on to demonstrate the differences in immune response during the different time frames of pregnancy. Throughout the majority of the pregnancy, fetal hematopoiesis generates mostly lymphoid and erythroid lineages. Towards term, the immune cells are predominantly neutrophils and monocytes. Pre-term is characterized primarily by lymphocytes. It was seen during term deliveries that the myeloid response produces several cytokines that shift CD4+ T cells away from the Th2 response. Enhanced production of IFN gamma upon leukocyte stimulation early in life is associated with reduced susceptibility to infections. However, the authors state that these findings do not extend to asthma diagnosis in childhood.

      Major comments:

      I would have liked the paper to readjust the introduction; a lot of emphasis is placed on IFN/infection/asthma, but after this fact, it seems neglected going forward and the paper explores another topic. Instead, the paper's focus was on determining the biological nature, serologically, with a granulocytic luminal marker (PGLYRP-1) and a membrane-bound marker (sIL6Ra) and its association to pediatric asthma.

      The take-home message for the paper - that there appears to be an inverse relationship between serum levels of PGLYRP-1 and overall risk for pediatric asthma - should be explored in relation to whether a therapeutic role for such proteins is possible, since they can accurately predict risk factors for disease and assess pulmonary function. Other proteins, like sIL6Ra, have no association with disease predictability or with pulmonary outcomes. This should be explored/explained in greater detail.

      Minor comments:

      As part of the validation efforts of the study - the rationale for using three different cohorts to assess pediatric asthma risk was not clearly explained.

      One of the main findings of the analysis was the conclusion that patients with higher levels of myeloid cells in their CBMCs are at lower risk of developing pediatric asthma, and vice versa. Furthermore, CBMC neutrophil abundance was negatively associated with the number of risk factors. (patients with more risk factors, as mentioned above, were found to have lower levels of neutrophils in their CBMC, and more at risk of pediatric asthma). This was further elucidated with measuring CBMC plasma levels of PGLYRP-1 with levels of mRNA and correlating it with risk of developing pediatric asthma. Increased levels of mRNA for the PGLYRP-1 protein was associated with an increased serum concentration of the protein. However, this was inversely correlated with risk factors. Patients with reduced risk factors for development of pediatric asthma were found to have increased levels of the protein and its mRNA.

      The association of PGLYRP-1 (present in neutrophil-specific granules) and sIL6Ra (derived from neutrophils, but not present in granules) with pediatric asthma at mid-childhood and early-teen years was determined. There were two follow-up points where asthma outcomes and pulmonary function, by way of the FEV1/FVC ratio, were determined. It was found that increased levels of PGLYRP-1 were significantly associated with current asthma at mid-childhood. However, there was no association between levels at the early-teen follow-up.

      In terms of correlations between each protein level and pulmonary function - the sIL6Ra protein was NOT associated with the FEV1/FVC ratio or a bronchodilator response at either age group. However, it was found that increased levels of PGLYRP-1 were associated with an INCREASED FEV1/FVC ratio (not indicative of asthma) and reduced odds of developing pediatric asthma at each age group.

      This analysis makes sense, as increased production of the neutrophil granule protein PGLYRP-1 serves a protective effect against infection, reducing incidence of disease states. The paper, however, should explore the rationale behind the lack of association for the sIL6Ra protein. Since this protein is NOT associated with neutrophilic granules, it can be inferred that it may not have a role in protecting against infection. However, this could have been explored in more detail in the paper.

    3. Summary: This is a nice study that is clearly written and makes use of several datasets. It attempts to explain perinatal risk factors and the associated risk of developing pediatric asthma in the mid-childhood and early teenage years. Maternal and perinatal characteristics associated with risk of subsequent asthma development included atopy, BMI, race/ethnicity and demographics, birth characteristics, and mode of delivery. The paper then goes on to demonstrate the differences in immune response during the different time frames of pregnancy. Most notably, a gene signature associated with increased myelopoiesis in utero is associated with increased risk of pediatric asthma. Furthermore, they show that cord blood serum PGLYRP-1 is associated with reduced risk of pediatric asthma and increased FEV1/FVC. Interestingly, sIL6Ra, which is derived from neutrophils but not associated with neutrophil granules, did not show any association with pulmonary outcomes. This suggests that it is the neutrophil granules, rather than the neutrophils per se, that underlie the association.

    1. Reviewer #3:

      The results of this study suggest that maternal loss alters the HPA stress axis in wild chimpanzees, but these effects are transient and are not evident later in life.

      Overall the study is the result of much careful fieldwork. The number of cortisol samples is impressive and these are robustly analysed. The conclusions are carefully and thoroughly discussed.

      I have very few comments, in part because I am not a specialist in stress hormones and so cannot fully assess the laboratory analysis or interpretation, but in part because my view is that this is a high-quality thorough study and a well-written manuscript.

      My only major point is that I am aware that measurement of cortisol is difficult in the wild. It is possible to inadvertently measure metabolites other than cortisol, and the most robust way to measure cortisol is using a challenge and subsequent measurements. While I cannot adequately assess this aspect of the manuscript, I think it is important that the other reviewers/editor ensure the hormone measurements are appropriate.

    2. Reviewer #2:

      The paper submitted by Girard-Buttoz and colleagues asks whether and how early maternal loss affects cortisol levels and diurnal slopes among wild chimpanzees at Tai Forest, Côte d'Ivoire. The major claim of the paper is that, like humans, chimpanzees experience altered HPA functioning after maternal loss, including alterations to both diurnal slope and overall cortisol levels. However, their chimpanzee orphans exhibited patterns in diurnal slope that were opposite to their predictions (predicted blunted slopes, observed steeper slopes). The authors should be commended for their efforts in collecting a large number of samples for this analysis. However, I am not convinced that it is sufficient for investigating the hypotheses put forth here and, therefore I am also not convinced that their results are solid. I also have concerns about the theoretical grounding for the paper.

      1) My principal concerns with this paper, as written, revolve around the methods/results. First and foremost, I am not convinced that the authors have a sufficient sample size to evaluate the predictions/hypotheses outlined in the introduction. While 849 urine samples is a large number, and again, their efforts here should be commended, the sample spread is actually quite thin once it is split up into appropriate categories, especially considering how many samples were collected per individual year, on average. As the authors indicate throughout and especially when describing their modeling approach, cortisol is inherently a very noisy hormone impacted by myriad factors, including age in at least one other densely-sampled chimpanzee community. I'm also surprised that time of day was modeled quadratically. It is my understanding that humans, other populations of chimpanzees, and other mammals follow a sigmoidal curve which should be modeled with a third-order term as well. For these reasons, it's difficult to tell whether model 1A is not significant because of insufficient sample size or a true lack of predictive power. Additionally, I'm concerned that the paper seems to focus so much on the results from a single model term in a model that did not reach significance.

      2) Despite acknowledging that the "significance of these predictors should be interpreted with caution" because model 1a did not reach significance, the authors make very strong claims about the results in the discussion and also feature the finding of that model in the title of the paper. That seems problematic to me, especially because the non-significant model results (more intense diurnal slopes among immature orphans) diverge from the expectations set forth by other works in humans and non-humans. The finding that this is to do with higher-than-expected morning cortisol is puzzling given that evening levels are generally considered more responsive or plastic. However, this could also be an artefact of fitting the models without the third-order term for time.

      3) The introduction needs refinement to help clarify and specify the authors' arguments.

      (a) Does the biological embedding model always lead to negative fitness outcomes? Or is it possible that phenotypic adjustments might be adaptive, or even just making the best of a bad job (e.g. earlier death, but not death today)?

      (b) Throughout the introduction it is unclear whether and where the authors refer to the human clinical literature as opposed to animal literature. It is also unclear how human patterns are similar versus different from those observed in animals. Further, I would recommend that the authors include a deeper review of the animal literature (e.g. early experimental work with macaques, cortisol at other chimpanzee field sites/captivity). It's also unclear whether and where the authors refer more broadly to early life adversity (and what this means for humans vs. animals) versus more specifically to maternal loss. Additionally, there should be further discussion specifically related to early maternal loss (rather than "early life adversity", which can include a lot of different factors), focused on the nutritional and social obstacles associated with early maternal loss, how these relate to HPA functioning, and how these effects are expected to change during development (Plasticity? Flexibility? The role of HPA in responding to changing environmental conditions?). What about the adaptive calibration model, which posits that the HPA can readjust during particular periods of developmental reorganization?

      4) It is difficult to assess the discussion without first dealing with the problems in the introduction/methods. However, despite their claims in the results section, it does not seem that the authors interpreted the results of model 1a with caution.

    3. Reviewer #1:

      A very interesting paper testing the biological embedding model in a wild long-lived mammal using an impressive dataset. However, the results for immature orphans are not entirely straightforward. The effect on the HPA axis is in the opposite direction to humans and there seems to be no significant increase in cortisol compared to non-orphans overall - it depends on time since maternal loss. The paper would be improved by communicating this more clearly and discussing exactly why this pattern may be different to that in humans. Some of the evolutionary ideas discussed in the paper also need to be more clearly conveyed or thought through.

      Substantive concerns:

      1) There are important sections in the introduction (L125-128 particularly) and discussion (L403-409) about the evolution of the HPA response and differences between humans and other mammals that are unclear. Greater detail on the evolutionary logic being used, the precise hypotheses being suggested and references to back the ideas up are required (further details in minor comments).

      2) Table 2/Model 1a doesn't directly test whether orphans have higher cortisol than non-orphans (or no p-value reported in table 2) and CIs in table 1 suggest that there is not a significant difference. Therefore, categorical statements that orphans have higher cortisol levels don't seem to be entirely justified. However, model 1B demonstrates that cortisol declines with years since maternal loss and figure 3 supports the idea that orphans do have higher cortisol than non-orphans in the first 2 years following maternal loss but that this declines to levels similar to those of non-orphans after 2 years. Could a statistical test be run to back this up? Perhaps instead of using a binary variable for orphan status (yes/no) it could be analysed as categories (orphaned within 2 years, orphaned more than 2 years ago, not orphaned as an immature) which could be used to directly test this and back up statements e.g. recently orphaned immatures had higher cortisol levels than non-orphans. A broader concern is why likelihood ratio tests have been used to calculate p values (and for only some of the predictors) rather than reporting the output from the models themselves. Could you explain what the benefit of this is over reporting values from the actual models and/or also provide the model outputs?

      3) The effect on cortisol slopes found in this study is in the opposite direction to that in humans. This is discussed in some detail but is lacking clarity in places and I think it would help to make this difference more obvious - it is really a key finding of the paper not a secondary point. The expected pattern is very nicely set out in the introduction so it would be good to format the discussion so there is a paragraph that outlines exactly how the results differ from hypothesized:

      (a) that the effect on cortisol slopes is in the opposite direction

      (b) that only the cortisol levels of recently orphaned immatures are significantly different to non-orphan immatures and then brings in the ideas discussed about why these differences may be present. I think this would really help communicate the findings more clearly, bringing the discussion more in line with what is set out in the introduction.
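
      The three-level recoding of orphan status suggested in point 2) could be sketched as follows. This is only an illustration: the function name, the category labels, and the handling of the exact 2-year boundary are assumptions, not part of the manuscript or the review.

```python
def orphan_category(is_orphan, years_since_loss=None, cutoff_years=2.0):
    """Recode binary orphan status into three analysis categories.

    Hypothetical helper: labels and the inclusive boundary at
    `cutoff_years` are illustrative choices.
    """
    if not is_orphan:
        return "non_orphan"
    if years_since_loss is None:
        raise ValueError("years_since_loss is required for orphans")
    if years_since_loss <= cutoff_years:
        return "recently_orphaned"
    return "earlier_orphaned"


# Example: recode a few (is_orphan, years_since_loss) records.
records = [(False, None), (True, 0.5), (True, 4.0)]
categories = [orphan_category(o, y) for o, y in records]
```

      The resulting categorical predictor could then replace the yes/no variable in the cortisol model, with non-orphans as the reference level, so that the contrast between recently orphaned immatures and non-orphans is tested directly.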

    4. Summary: This paper tests the biological embedding model by asking whether and how early maternal loss affects cortisol levels and diurnal cortisol slopes among wild chimpanzees at the Tai Forest, Côte d'Ivoire. The results suggest that maternal loss alters the HPA stress axis in wild chimpanzees, but these effects are not visible later in life. Authors suggest that the lack of a later life association between maternal loss and cortisol levels may be due to selective early mortality of individuals with high cortisol levels but did not provide any survival or behavioural data to show that orphans and non-orphans differ in any fitness-related traits other than cortisol. Furthermore, the association between cortisol and the HPA axis is in the opposite direction to that observed in humans and there seems to be no significant increase in cortisol in orphans compared to non-orphans. Overall, the study is the result of extensive fieldwork, the number of samples collected is impressive and the subject is very interesting.

      The analyses will benefit greatly if the authors use effect sizes and confidence intervals for inferences instead of p-values. This may solve the significance threshold issues. Moreover, the reliance on p-values seems to limit the value of the data. For example, the authors suggest that results from model 1 should be treated with caution because the full model is not significantly different from the null model, but by relying on it as the key finding of the study without exploring effect sizes, it does not seem that they did exercise sufficient caution.

    1. Reviewer #2:

      The manuscript addresses an interesting question: whether genetic effects of common variants on educational attainment (EA) differ between individuals with and without psychiatric diagnoses. The dataset they use is ideally suited for such an analysis. The authors find evidence that the influence of common variants on EA is attenuated in individuals with a diagnosis of autism spectrum disorder (ASD) or ADHD.

      My main concern with the paper is the statistical analyses used to support the authors' conclusions. The authors draw conclusions from dividing individuals into subgroups and comparing the R^2 of the EA PGS between those subgroups. This analysis is liable to bias due to range restriction: if the subgroups have been selected based on low/high education, then the R^2 of a predictor will tend to be lower in the subgroups than in the overall sample. Furthermore, here the selection into the subgroup (here diagnosis with ASD or ADHD) itself is related to both education and the EA PGS, which could be contributing to the differences in R^2 the authors observe between subgroups.
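
      The range-restriction concern can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a standardized score with one true effect (beta = 0.3, a hypothetical value) that is identical for everyone, and it uses the lower half of the outcome as a stand-in for a subgroup selected on education. All names and parameter values are illustrative.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def simulate_range_restriction(n=20000, beta=0.3, seed=1):
    """Return (R^2 in full sample, R^2 in the lower half of the outcome)."""
    rng = random.Random(seed)
    pgs = [rng.gauss(0, 1) for _ in range(n)]
    # Same true effect of the score for every individual.
    noise_sd = (1 - beta ** 2) ** 0.5
    ea = [beta * g + rng.gauss(0, noise_sd) for g in pgs]
    # Hypothetical selection rule: keep individuals below the median outcome.
    cutoff = sorted(ea)[n // 2]
    sub = [(g, e) for g, e in zip(pgs, ea) if e < cutoff]
    r2_full = r_squared(pgs, ea)
    r2_sub = r_squared([g for g, _ in sub], [e for _, e in sub])
    return r2_full, r2_sub


r2_full, r2_sub = simulate_range_restriction()
```

      Under these assumptions the subgroup R^2 comes out lower than the full-sample R^2 even though the effect of the score is the same in everyone, which is why a lower R^2 within cases does not by itself demonstrate an attenuated genetic effect.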

      A more powerful and robust analysis would be to fit an interaction model in the full sample. The authors could regress individuals' EA jointly onto their EA PGS, their diagnoses coded as binary variables, and the interactions between the EA PGS and the diagnosis codings. The authors could do this jointly for all diagnoses in the full sample, which would account for comorbidities between psychiatric disorders. If the influence of the EA PGS is truly weaker in ASD and ADHD cases, there should be a negative interaction effect between the EA PGS and ASD and ADHD diagnoses, which can be tested with a simple statistical test for a non-zero interaction effect.
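
      A minimal sketch of such an interaction model, using simulated data and a single hypothetical diagnosis, is given below. The coefficients used to generate the data (0.4 for the score, -0.5 for the diagnosis, -0.2 for the interaction), the 30% prevalence, and all function names are assumptions made purely for illustration. Standard errors and the significance test are omitted and would in practice come from a statistics package.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def simulate_interaction(n=5000, seed=7):
    """Fit EA ~ 1 + PGS + diagnosis + PGS:diagnosis on simulated data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        pgs = rng.gauss(0, 1)
        dx = 1.0 if rng.random() < 0.3 else 0.0  # hypothetical prevalence
        ea = 0.4 * pgs - 0.5 * dx - 0.2 * pgs * dx + rng.gauss(0, 1)
        X.append([1.0, pgs, dx, pgs * dx])
        y.append(ea)
    return ols(X, y)


intercept, b_pgs, b_dx, b_interaction = simulate_interaction()
```

      Here `b_interaction` estimates how much the slope of EA on the score differs between cases and controls, so a reliably negative value is what an attenuated effect in cases would look like. Extending the design matrix with further diagnosis columns and their products with the score gives the joint model across diagnoses.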

      It could also be worth first regressing the EA PGS onto the psychiatric diagnoses, and taking the residuals before assessing whether there are interactions between the EA PGS and ADHD/ASD diagnosis. It is possible that correlation between the EA PGS and ADHD/ASD diagnosis could generate a spurious interaction effect in the above analysis.

      It is interesting that controlling for SES appears to mediate the (potential) interaction between EA PGS and ADHD diagnosis. However, I worry again that this could be a function of SES influencing ADHD diagnosis. SES and its interaction with both EA PGS and ADHD diagnosis could also be included in a full interaction model that could help interpret this finding.

      The authors construct the PGS by using a pruning and thresholding approach. This is known to be suboptimal, which may explain why their R^2 is lower than in other studies. The authors could use LD-pred or other methods that account for linkage disequilibrium and non-infinitesimal genetic architectures. In the EA GWAS from which the score was constructed, the best R^2 was found by applying LD-pred to all variants without p-value thresholding.

      The hypothesis that indirect genetic effects differ between psychiatric cases and controls is interesting. Do the authors have sufficient sibling data within their samples to test this?

      Line 581: Closely related individuals were removed from the analysis. Why? How many were removed? Could inclusion of these help assess the hypothesis about indirect genetic effects and improve power? The authors could use a mixed model regression to control for relatedness without having to throw individuals out of their sample.

      The grammar in the writing of the paper is a little odd at times. Often, definite or indefinite articles are omitted preceding nouns, such as in 'association of EA-PGS' in the abstract, which should be 'association of the EA-PGS'.

      line 54: 'strongly influences', I think this is a little overconfident in its assignment of causality to highest level of education, perhaps 'strongly associated' would be better

      Paragraph 3 of the introduction: the authors should mention population stratification and assortative mating as possible mediators of the association between EA PGS and EA, especially when referencing the drop in association strength in within-family designs

      I found the decile based analyses a bit pointless. By arbitrarily dividing a continuous outcome into discrete subgroups, the authors are losing power and not gaining much compared to simply performing linear regression, which they already do. I would relegate these to supplementary figures.

      Line 452: I think that the stated equivalence between low EA PGS and learning difficulties goes a bit too far here. I understand the point the authors are trying to make, but I think it should be phrased more carefully.

      The authors used an MAF threshold of 5% for construction of the score. Typically, a threshold of 1% is used for construction of PGS from summary statistics by software such as LD-pred.

      Line 580: the authors state that an EA PGS based on summary statistics from European samples cannot be used to predict EA in non-European samples. This is not true. It is true that the prediction accuracy is attenuated, but it is not zero.

    2. Reviewer #1:

      This is overall a well written and methodologically sound study researching how educational achievement can be predicted using genomic data when the sample is stratified to those without and those with diagnoses of common psychiatric disorders. I think that it is a very important study area, the study is well powered using a fantastic representative sample and offers some insights into aetiology of associations between psychiatric traits and educational achievement.

      I suggest some minor adjustments for the authors to consider, mainly addressing the conclusions and implications of the findings. I also recommend some clarifications in the methods and the results sections; these suggestions might require some very modest additional analyses and rethinking/rewording some of the conclusions.

      • The major issue I have is that you discuss family SES as a purely environmental factor throughout the manuscript. However, we know that this is not the case and that there is substantial heritability for SES. It follows from what SES composite is made out of, in your case parental education and occupation, both of which are highly heritable (as you rightly note in the manuscript yourself). This needs to be addressed and discussed throughout the manuscript.

      • The major conclusion in the manuscript, even if you acknowledge that this is speculation, is that the attenuation of the association between EA-PGS and school grades after correcting for SES can be explained by genetic nurture. I agree, this can be one of the explanations, however, here you also control (partially) for transmitted genes, that is educationally related genetic variants present in both generations (so without genotyped trios here you cannot distinguish between direct and indirect genetic effects). In addition, this attenuation can also be explained by gene and environment correlation (not only passive which is addressed by genetic nurture hypothesis) but also active and evocative rGE. In addition, in your design, you need to consider assortative mating. I suggest directly addressing this in the manuscript.

      • I also think that you should address that you are dealing with diagnosed disorders only. It is a great strength of the paper, and you are using a fantastic resource, but we know that these disorders are quantitative traits and your study design does not allow you to take that into account, so individuals with high ADHD symptoms are possibly included in the control group; similarly, you cannot take symptom severity into account. In terms of symptom-level data, I see you have referenced the Selzam et al., 2019 paper that, among other things, related EA-PGS to ADHD symptoms and vice versa, and also controlled for SES.

      • In the introduction, you rightly state that individual differences are explained by genetic and environmental factors and the interplay between them, however, I suggest rephrasing it, because "much of the variance can be explained" is incorrect, all of the individual differences can be explained by the combination of these factors.

      • You report low rG between schizophrenia and E1. Can you specify how this was calculated?

      • You state that your prediction in the control sample is lower than in other studies and offer a possible explanation in the inclusion or exclusion of 23andMe data in the summary statistics. Please note that other studies have not used 23andMe statistics either (for example TEDS publications). You also discuss genetic heterogeneity; I think that the difference can be explained by both genetic and environmental heterogeneity. What is the rG between EA in your sample and the GWAS sample?

      • I think that the conclusion that the impact of low EA-PGS is comparable to the impact of ADHD is too strong, your data does not support this strong conclusion. I suggest rephrasing it, especially as we're not aware of the associated mechanisms. Note that people with ADHD in your sample also have lower EA-PGS compared to control conditions. In addition, symptom severity of ADHD varies greatly.

      • I also do not agree with the statement that having wealthy parents does not boost the performance as much for children with ADHD as compared to children without for the reasons mentioned above.

      • I think that you have fantastic data, and you have data available about how many of your participants have multiple diagnoses. I suggest adding a stratified group with multiple diagnoses to the analyses, that is adding groups with 2, 3 or 4 and more psychiatric diagnoses and checking their polygenic score prediction to EA.

      • I suggest making it clearer what covariates were used in every analysis (you say first that you added psychiatric diagnoses as covariate among the usual covariates, but later only that covariates were included 'as before', I assume you did not include diagnoses in later analyses, but this is not clear). In addition, it is not clear to me why you control for psychiatric diagnoses in the first set of analyses, I would have wanted to see full results without this covariate.

      Overall, this is a beautiful study and it was a pleasure to read/review it.

    3. Summary: This is an interesting study researching how educational achievement (EA) can be predicted using genomic data when the sample is stratified to those without and those with diagnoses of common psychiatric disorders. The study is well powered using an impressive and representative sample and offers insights into the etiology of associations between psychiatric traits and educational achievement. The authors find evidence that the influence of common variants on EA is attenuated in individuals with a diagnosis of autism spectrum disorder (ASD) or ADHD.

  4. Dec 2020
    1. Reviewer #3:

      In this paper the authors have developed a system to simultaneously generate two-, three- and four-photon fluorescence excitation from a single laser line and then proceed to apply this system to a number of turbid biological imaging applications to highlight its capabilities. Using a customised commercial LaVision BioTec Trimscope, they have incorporated a high-powered fiber laser source with an optical parametric amplifier and dispersion compensation to generate either 1330nm or 1650nm laser lines with high peak pulse energies at low pulse repetition rates. They then compare the relative capabilities of each laser line in terms of number of fluorescence emission channels measured (skin tumour xenografts), fluorescence bleaching analysis and functional toxicity thresholds and fluorescence signal attenuation (excised murine bone).

      Whilst the paper is well written, the concept of utilising high laser peak powers at low repetition rates to generate 3PE and 4PE at spectral excitations at 1300nm and ~1650nm is not new and has been presented previously (Cheng et al. 2014), as referenced by the authors. The authors have however gone into more detail and presented a number of comparative excitation approaches to compare and contrast low-duty-cycle high pulse-energy infrared with the more common high-duty-cycle low pulse-energy near-infrared alternative. Higher-order multiphoton microscopy combined with longer-wavelength excitation allows deeper imaging and more localised fluorescence excitation with reduced phototoxic and photobleaching effects per excitation pulse. One of the major issues associated with generating 4PE is that, since higher pulse energy is required, the repetition rate of the laser source must be reduced further to lower the average laser power and avoid sample heating effects. This in turn leads to much longer acquisitions and is limited by fluorophore saturation, particularly since they are using single-beam excitation.
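
      The trade-off between repetition rate, pulse energy and higher-order signal can be made concrete with a back-of-the-envelope calculation. The sketch below uses the textbook scaling for the time-averaged n-photon signal, proportional to P_avg^n / (f * tau)^(n-1), and ignores saturation, scattering and absorption. The 1 MHz and 80 MHz repetition rates are those compared in the manuscript; the average power and pulse duration are hypothetical placeholder values, as are the function names.

```python
def pulse_energy_nj(avg_power_mw, rep_rate_mhz):
    """Energy per pulse in nJ (mW divided by MHz gives nJ)."""
    return avg_power_mw / rep_rate_mhz


def relative_signal(avg_power_mw, rep_rate_mhz, pulse_fs, order):
    """Time-averaged n-photon signal up to a constant factor.

    Uses the idealized scaling P_avg^n / (f * tau)^(n - 1), where
    f * tau is the duty cycle of the pulse train.
    """
    duty_cycle = (rep_rate_mhz * 1e6) * (pulse_fs * 1e-15)
    return avg_power_mw ** order / duty_cycle ** (order - 1)


def low_rate_gain(order, low_mhz=1.0, high_mhz=80.0,
                  avg_power_mw=100.0, pulse_fs=100.0):
    """Signal ratio of the low-rate to the high-rate source at equal
    average power and pulse duration (both assumed values)."""
    low = relative_signal(avg_power_mw, low_mhz, pulse_fs, order)
    high = relative_signal(avg_power_mw, high_mhz, pulse_fs, order)
    return low / high


gains = {order: low_rate_gain(order) for order in (2, 3, 4)}
```

      In this idealized picture the gain from lowering the repetition rate grows as (80)^(n-1), which is why low-duty-cycle sources favour 3PE and 4PE. The same calculation shows the cost: fewer pulses per second means fewer excitation events per pixel dwell time once the per-pulse signal saturates.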

      Major comments:

      1) It seems as though when you take into consideration duty cycles, fluorescence saturation, water absorption effects and longer acquisition times, which lead to greater phototoxicity, 4-PE at 1700nm excitation is not appropriate for most dynamic biological applications where acquisition speed and/or continued image acquisitions are the key factors. Could the authors comment on this?

      2) How long does it take to acquire a single frame with four-photon excitation at 1700nm? Frame time is not mentioned for any of the data sets, in particular for the acquired 3D data sets. Can the authors ensure that these times are mentioned both in the main text and in the figures containing images?

      3) In line 131 and figure 3d the authors present data showing relative axial resolution measurements. Are the measured features diffraction limited, and how do the authors know? They are clearly not measuring like-for-like structures (different fluorescent species), so I do not think this can be used as a measure of resolution. Can the authors provide other resolution measurements?

      4) In lines 140-142 the authors present data showing the advantages of THG at 1650nm over other excitation lines. Aside from the excitation wavelength, could this data be explained by the greater absorption and scattering at the emission wavelengths generated at these laser lines?

      5) In figure 3A and 3C the SNR for 1650nm excitation increases whilst for 1300nm and 1180nm excitation it decreases. Is this simply due to more of the excited fluorophore species residing deeper in the tissue?

    2. Reviewer #2:

      Nonlinear microscopy is in the unique position that high-resolution images of cells and other tissue components can be obtained in live tissue. However, scattering and absorption limit the penetration depth. The impact of nonlinear microscopy in biomedicine and biology would be much improved if higher imaging depths can be achieved. Lately a few key studies have appeared achieving this. This manuscript contains a well-motivated extension of this research, in particular on the benefits of high-pulse-energy low-duty-cycle infrared excitation near 1300 and 1700 nm over 2-photon excitation, in heterogeneous and dense tissue. The authors compare three types of excitation, at 1650 and 1300 nm at 1 MHz and at 1100 (or 1270) nm at 80 MHz. They characterize photodamage in the tissue and determine the limits for power densities to stay below that. They study the achieved resolution at high depth for each of the processes and show a deeper imaging depth is resolved in bone and tumor core with 3P and 4P than with 2P. The article is a very solid and extensive study.

      Though I have no major concerns with the article, I do have some minor points:

      l.57: Are the resolutions reported for 2-, 3-, or 4-photon processes? Do you not expect these to differ for the different processes? l.60: It is not explained that power is increased from X to Y; in addition, the peak power of 87 nJ in l. 67 is not found in fig. S2.

      L. 103 Given is the power at the sample surface, after which the readout for cell stress via Ca imaging is done (very elegant). Is not the imaging depth of the readout relevant too, as it is probably the power density at the focus which matters? What imaging depths can be reached with this low power? This comes back later, but would be good to mention here.

      L.110 The phrase 'Furthermore' confuses me. I guess the authors mean to say that with their 2.8-8.7 nJ of pulse energy they were well below the 100 mW level, which is rather obvious at 1 MHz?
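
      The relation behind this remark can be written out as a rough worked calculation. It assumes that the quoted 2.8-8.7 nJ are pulse energies and uses the 1 MHz and 80 MHz repetition rates mentioned above; the symbols P_avg, E_pulse and f_rep are introduced here for illustration only and do not come from the manuscript.

```latex
% Average power is pulse energy times repetition rate (symbols are illustrative).
\[
  P_{\mathrm{avg}} = E_{\mathrm{pulse}} \, f_{\mathrm{rep}}
\]
% Upper end of the quoted range at 1 MHz:
\[
  8.7\,\mathrm{nJ} \times 1\,\mathrm{MHz} = 8.7\,\mathrm{mW} \ll 100\,\mathrm{mW}
\]
% The same pulse energy at 80 MHz, for comparison:
\[
  8.7\,\mathrm{nJ} \times 80\,\mathrm{MHz} = 696\,\mathrm{mW}
\]
```

      Under these assumptions the average power at 1 MHz stays about an order of magnitude below 100 mW, whereas the same pulse energy at 80 MHz would exceed it several-fold.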

      L. 126 Some words are missing, 'but 1180'.

      Why do some signals show a peak in intensity in fig. 3C and G rather than a slope?

    3. Reviewer #1:

      In this manuscript, the authors show they can accomplish imaging in complex specimens using 3- and 4-photon excitation, deeper in the specimen than comparable optics can accomplish with 2-photon excitation laser scanning microscopy. This is a clear advantage for imaging optically hostile specimens such as cultured organoids or spheroids, or in challenging in vivo settings. I am excited about these findings, but I am not at all supportive of the current version of the manuscript being used to present these lovely findings.

      There are two strong reasons for my opinion:

      i. The manuscript presents the findings in a manner that will only be understandable by the readers who are familiar with the topic, and who are likely to already have heard of the capabilities of 3- and 4-photon excitation to image deeper into specimens.

      ii. The results are not presented in a way that the large body of potential readers can understand. They will be unable to grasp the way that the experiments were performed, or understand what the figures are showing, or critically evaluate the results that are presented.

      Thus, there is a disconnect between the quality of the work and the quality of the presentation. There are many areas of quantitative imaging and intravital imaging that are well known to those that know about them (or use them), and that are a complete mystery to the vast majority of those that don't know about the tools or use them. The authors must take this as an opportunity to reach the many workers that could benefit from this powerful approach, rather than writing for the group that already knows (and even uses) the approaches presented.

      1) Provide needed background and present important things first. The authors should give the reader a clear view into the issues in imaging biological tissues with wavelengths longer than those used for confocal laser scanning microscopy (CLSM) and for two-photon laser scanning microscopy (TPLSM). There are several factoids presented, all seemingly true, but not presented in an accessible manner. Rather than starting with a mention of the expected temperature rise due to the dramatically higher absorbance by water of 1300nm and 1700nm light, the paper first presents the major absorbance of the light (~2/3 loss) and that this isn't a problem because there is sufficient laser power. For most readers, the need for a larger laser won't be their first question; instead it will be the viability after/during the imaging session. The expected temperature rise, and an indirect mention of burn marks (!), comes at the end of the section.

      2) Explain and perform cell viability tests. Calcium imaging for assessing tissue viability is not the technique of choice for most readers, and is presented in a way that assumes general knowledge that simply does not exist. Membrane patency assays using membrane-impermeant DNA dyes, or other live-dead assays are far more common, but not presented in this study. I am not insistent that the authors use any particular assay, but I am insistent that the authors present the need for viability assay(s), teach the reader the principles of the assay(s) used, and present the results in an understandable manner.

      3) Present the finding and the figures in an accessible manner. The figures are simply not digestible by the readers who do not perform this sort of work, and the legends do not help sufficiently. For those of us who do perform work of this sort, the figures are not as convincing as they should be, or presented in a way that they can be critically evaluated.

      Consider the legend for Figure 1: "Microscopy with simultaneous 2-, 3- and 4 photon processes excited in fluorescent skin tumor xenografts in vivo. Representative images were selected from median-filtered (1 pixel) z-stacks, which were taken in the center of fluorescent tumors through a dermis imaging window. a) Excitation at 1300nm (OPA) in day-10 tumor at 145 μm imaging depth with a calculated 3.3 nJ pulse energy at the sample surface, 24 μs pixel integration time and 0.36 μm pixel size. For calculation of pulse energy at the sample surface see Figure S3. b) Excitation at 1650 nm (OPA) in day-13 tumor at 30 μm depth with a calculated 6.3 nJ pulse energy at the sample surface, 12 μs pixel integration time and 0.46 μm pixel size. c) Excitation at 1650 nm (OPA) in day-14 tumor at 85 μm depth, with a calculated 5.4 nJ pulse energy at the sample surface, 12 μs pixel integration time and 0.46 μm pixel size. Cell nuclei containing a mixture of mCherry and Hoechst appear as green."

      If I gave any of the figures and legends to the people in my lab, the half that don't do multiphoton imaging (but that have sat through many lab meetings) would just hand them back to me with quizzical expressions on their faces.

      The figures are not as compelling as the results, defer to the body of the paper to explain what was done or what was shown, and assume that the average reader remembers the differences between OPO and OPA, for example (which they won't). The power plots showing nJ and mW in Figure 3 are inaccessible to most readers, and not well described.

      I should mention that the figures, legends and text are not satisfying for the readers who are familiar with 2-, 3- and 4-photon imaging either. These are fantastic findings, and deserve figures that are as lovely as the results, and are compelling. Some of these issues are due to typos: "Consistently, multiparameter recordings were achieved inside the tumor at 350 μm depth using excitation at 1650 nm and 1300 nm, but 1180 nm (Figure 3b)."

      However, the greater problem is that the text doesn't present the findings in a straightforward, convincing fashion and then interpret them. Instead, the conclusion often leads the evidence: "In line with an improved depth range, the signal-to-noise ratio (SNR) of 3PE TagRFP outperformed the SNR of 2PE TagRFP at depths beyond 150 μm (Figure 3c). Because H2B-eGFP expression in HT1080 tumors was very high, 3PE eGFP emission reached the highest SNR."

      The legend and figure that it describes should be able to stand on their own, and convince a skeptical reader with the help of the text in the body of the manuscript.

      In summary, these are lovely and important results that I am excited about. They are presented in a fashion that will make it difficult for most to appreciate because the body of the paper is not fashioned to teach the reader, and the figures themselves are challenging, and the legends inadequately present what is shown in the figures. Careful expansion and editing should resolve all of these issues and make the manuscript into the presentation these excellent findings deserve.

    4. Summary: Nonlinear microscopy is in the unique position that high-resolution images of cells and other tissue components can be obtained in live tissue. However, scattering and absorption limit the penetration depth. The impact of nonlinear microscopy in biomedicine and biology would be much improved if higher imaging depths can be achieved. In this manuscript, the authors show they can accomplish imaging in complex specimens using 3- and 4-photon excitation, deeper in the specimen than comparable optics can accomplish with 2-photon excitation laser scanning microscopy. Using a customised commercial system, the authors have incorporated a high-powered laser source with an OPA and dispersion compensation to generate either 1330nm or 1650nm laser lines with high peak pulse energies at low pulse repetition rates. They then compare the relative capabilities of each laser line in terms of number of fluorescence emission channels measured (skin tumour xenografts), fluorescence bleaching analysis and functional toxicity thresholds and fluorescence signal attenuation (excised murine bone).

      This is a very interesting study with some potentially important findings from a technical perspective. However, there is a disconnect at present between the quality of the work and the quality of the presentation. There are many areas of quantitative imaging and intravital imaging that are well known to those in the direct field, and that are a complete mystery to the vast majority of those that are not. It would therefore be highly beneficial to restructure the manuscript in such a way that the findings can reach the many researchers that could benefit from this powerful approach rather than the few who already use it.

    1. Reviewer #2:

      The authors quantify virulence factors in Cryptococcus neoformans and C. gattii in a large number of clinical isolates and correlate these virulence factors to survival in a G. mellonella infection model and to the clinical outcome. The authors found a correlation between secreted laccases and disease outcome in patients. In addition, the authors show that a faster melanization rate in C. neoformans correlated with phagocytosis evasion, virulence in the G. mellonella model and worse prognosis in humans.

      The manuscript is well structured with an appropriate abstract summing the main findings, a clear introduction, well described methods section and appropriate number of figures and tables. The results are clearly described.

      1) The authors identify and acknowledge the most important limitation of the study (lines 365-366): the patients were treated with different regimens in distinct health services. This reviewer agrees this is a limitation. However, to get a feeling for the impact of these differences, the authors should indicate how the patients were treated and whether treatment differed between patients who died and those who survived. Without this information clearly presented, I cannot interpret the correlations between virulence factors and outcome found in this study. Perhaps the authors can show how many of the patients included in the phenotype-survival analysis who died, and how many who survived, were treated according to Brazilian guidelines.

      2) The melanin production evaluation assay is an important tool that the authors use in this study, and the measurements from these assays were correlated with G. mellonella and patient survival and thus are essential to the conclusions of the study. The method is well standardized, and the authors show elegantly that the outcomes are highly reproducible. Can the authors describe when melanization occurs: does it occur in mature colonies, and might growth rate itself influence the measurements? Do isolates with a high growth rate/colony maturation have a low T-HMM or a high melanization Top? Do the final colonies of different species have a different final cell number after 7 days of incubation, and how does this correlate with melanization? And how does the growth rate/budding rate/colony maturation correlate with G. mellonella survival?

      3) Figures 1-5 give a clear picture of the wide distribution and variation of virulence parameters, e.g. the distribution of melanization kinetics parameters, the distribution of capsule sizes, GXM secretion and LC3 phagocytosis. But what does this distribution mean? It only shows that the isolates are not the same and does not contribute majorly to the final conclusion. Can the authors think of a way to give more meaning to these figures, e.g. indicate with colors which isolates were retrieved from patients that eventually died and which from patients that survived (although this may be inappropriate as not all clinical information is available)? Figure 6 really gives meaning to the numbers displayed in figures 1-5. Perhaps move some figures to the supplementary file.

    2. Reviewer #1:

      The manuscript describes the characterization of in total 85 Cryptococcus spp. clinical isolates with regard to virulence phenotypes including a Galleria mellonella infection model for cryptococcosis. The authors determined the melanization kinetics of all strains, measured the whole-cell and extracellular laccase activity, the capsule thickness, and the concentration of the cell wall polysaccharide glucuronoxylomannan. In addition, during macrophage interaction the proportion of Cryptococcus-containing LC3-positive phagosomes for each strain was determined as well as the survival of G. mellonella after infection with selected Cryptococcus strains. Finally, regression analyses were performed to estimate the relationship between the risk of death in cryptococcosis patients and the phenotypes of the isolated Cryptococcus strains. A major finding was that the risk of death in patients with disseminated cryptococcosis increased with the level of extracellular laccase activity and the time for half-maximum melanization in the Cryptococcus isolates. This suggests that the melanization rate, more than the total amount of melanin, impacts the outcome of a Cryptococcus infection.

      General assessment:

      The study is based on carefully performed experiments. However, the scientific significance of this work is moderate. Melanin and the laccases that are involved in its synthesis have been known as virulence factors of Cryptococcus spp. for many years, and similar studies have already been published elsewhere (e.g. Samarasinghe et al. 2018). The major new finding of the presented work is that the speed of melanization, rather than the total amount of melanin, has an impact on the virulence of Cryptococcus spp. The shortcoming of the manuscript is that the authors' hypothesis is mainly based on regression analyses, but the final proof based on a genetically well-defined background is missing. Therefore, the study provides only little new insight into fundamental mechanisms of Cryptococcus virulence but includes associations with patients and therefore might be more suited for a journal specialized in pathogenic fungi.

      The following points should be considered:

      1) The authors show the association between faster Cryptococcus melanization and more effective evasion from host immunity. However, the authors cannot totally exclude other factors that are associated with host evasion. It would be more appropriate either to create a mutant (e.g. overexpression of LAC1) that shows faster melanization in comparison to a wildtype strain, or to perform multilocus sequence typing (including the LAC1 locus) to capture the genetic variation of the clinical isolates and to find some correlations with the speed of melanization. The interesting question is which genetic factors contribute to the difference in the melanization rate.

      2) The authors should critically discuss the suitability of their Galleria mellonella infection model. It is a known fact that temperature has an influence on the melanization in Cryptococcus spp. Laccase activity is significantly inhibited at temperatures of 37°C and higher. The Galleria model can only be used at lower temperatures.

    3. Summary: The description of how faster melanization is associated with LC3-mediated phagocytosis evasion, virulence and outcomes in humans is interesting and does provide some new information. In general, the study has been executed well, with clear articulation of the results and accompanying conclusions. However, the work falls short of investigating any substantive mechanistic basis for the observations and how they relate to the broader metabolism in Cryptococcus.

    1. Reviewer #3:

      The manuscript by Shi and colleagues delineates an approach for labeling newly synthesized lipids thereby providing a method to examine how lipids move throughout the cell. The premise of this technical approach is that fluorescently labeled fatty acids are fed to a cell in the presence of another lipid which will incorporate the fluorescent acyl tail using the endogenous cellular acyltransferases. Cellular imaging is paired with this approach to show the subcellular accumulation of the lipid. As presented, the data are intriguing, but there are some concerns and questions about the technique that limit the interpretation of the data and could impact the overall utility of this approach. The authors should provide the additional requested data, and resolve the issues raised below to increase confidence that this labeling approach allows for the monitoring of physiologic lipid trafficking pathways.

      Specific concerns and questions are delineated below.

      1) The authors initially exploit the remodeling of PLs as described in figure 1a. This involves the addition of lyso-PL and NBD-labeled palmitoyl-CoA. The authors imply from their schematic in Fig 1a that they are using lyso-PLs that are being remodeled at the sn1 position by NBD-labeled palmitoyl-CoA. Unless I am missing something, lyso-PA and other related lyso-PLs are generally remodeled at the sn2 position. Additionally, there is specificity for PUFA acylation to the lyso-PL. So I am a bit confused about the enzymes that are working in this system. I tried to determine which lyso-PLs the authors are using, but the methods did not specify if they are using 1- or 2-lyso PLs. This should be clarified so that we can understand the enzymes the authors think are underlying the labeling reaction. On a minor, but related note, the lyso-PL in Figure 1a is missing an -OH group at the sn1 position.

      2) The authors use a cell system where the cells are starved of lipids and other metabolites for 1 hour and then fed a large bolus of lipids as substrates. It appears that the cells can remodel and label some PLs under these conditions, but it is not clear to me that this represents physiologic labeling that can be used to track the de novo labeling and trafficking into subcellular compartments. Nor can it be used to draw strong conclusions about required trafficking or enzymatic pathways under normal conditions. What happens if labeling occurs in complete media or defined media? This might help to resolve this.

      3) The labeling looks non-uniform in mitochondria as evidenced by the images in figure 2a. Why is the labeling only at the outer edge of the mito in these cells in this figure? What happens if labeling goes longer? Similarly, the authors quantify "30 cell images" or the like in the figures for Pearson correlations. How were the 30 cells selected, and since labeling across the mitochondria is not uniform, how were images selected? A much larger number of images scanned in an unbiased manner would increase confidence.

      4) Likewise, what happens if the labeling is allowed to proceed beyond 15 min? Can the authors provide a 30 min and 1 hr image?

      5) There are a number of conclusions drawn about specific pathways required for the trafficking or accumulation of labeled lipids. I realize that some of these studies are used as a specific proof-of-concept for the approach. However, there are many studies that go beyond proof-of-concept and draw conclusions about biology. Many of the studies are somewhat superficial and the conclusions reached by the authors should be tempered given that they have not deeply investigated this new biology.

    2. Reviewer #2:

      In this study, Zhang et al. use a semi-novel method to track acyltransferase activity using fluorescently labeled palmitic acid (NBD-16:0) to track where specific lipids are incorporated with subcellular specificity. They show that NBD-16:0 can be incorporated into different lipid classes that segregate based on previously known subcellular specificity. While this is an interesting technique, it is difficult to determine how much fidelity this method has in recapitulating biological function without additional experimentation, orthogonal measurements, and a more descriptive methods section.


      1) The authors do not specify which lysophospholipids were used in their study. In the method section they specify that they came from Avanti, but there are >100 different lPLs in their catalog. Also, the authors give a range of lPL concentrations in the methods, but do not specify which concentration was used for which experiment. Without this information and other unspecified aspects of their studies, interpreting subsequent experiments is difficult.

      2) One potential advantage of this method is that it is a method to track endogenous lipids in live cells; however, the authors show NBD-16:0 being transported to lipid species in which palmitate is almost never measured. For example, the authors use the transport of NBD-16:0 to CL as evidence that this is working. However, natural cardiolipins are almost completely devoid of 16:0. In mammalian cells >80% of the fatty acids in CL are 18:2, with most of the remainder being 18:1 and 18:3. Similarly (assuming you are using sn-1 lPL-16:0), phospholipids with two 16:0 are extremely rare in mammalian cells with the exception of lung surfactant. Further, 16:0 composes <5% of cholesteryl esters in typical cells. The authors should be clearer about how this discrepancy between natural sorting of palmitate and the sorting of NBD-16:0 supports this as an accurate model of acyltransferase activity and intracellular transport.

      3) The authors state that PA is primarily remodeled in the ER and transported to the mitochondria as a precursor to CLs (lines 108-111). This statement needs a source. In most studies I am aware of, the vast majority of both PA and 16:0 are primarily converted to TGs or PC/PE with only a small fraction going towards the CDP-DAG pathway required for CL biosynthesis. Are C2C12 cells unique in this regard? Does lPA stimulation specifically induce CL production? Does any of the NBD get into the TG or phospholipid fractions?

      4) This study would be much stronger if another fluorescently labeled fatty acid was added. A comparison of the sorting of 20:4 and 16:0 would be very informative. This is especially true if the studies were done in the context of a known acyltransferase, for example LPCAT3.

      5) This study would also be strengthened by an orthogonal technique showing similar sorting. For example, separation of the organelles and measurement of labeled fatty acids by MS or NanoSIMS analysis would greatly strengthen these studies.

      6) In figure 1A, the authors draw a schematic with an sn-2 lyso-PL in the figure. Sn-2 lyso-PLs are labile and will acyl migrate to the sn-1 position without careful handling of the PL in a basic solution. The authors make no mention of this type of handling in the method section. This figure should either be corrected or more details of how they handled their lysophospholipids should be provided.

    3. Reviewer #1:

      My general assessment of this work is that it is full of good ideas and presents a novel and general approach to examine lipid remodeling in cells and perhaps subsequent transport of lipids, mainly to mitochondria, but it lacks the scientific rigor necessary to be fully confident that their conclusions firmly support their claims. Often, insufficient information about the methods is provided and the manuscript is hard to follow critically.

      More specific comments:

      1) I am surprised that acyl-CoAs are transported into cells. I don't know of any precedent for this. Usually fatty acids are imported into cells and then converted to acyl-CoAs as part of the mechanism of import. Could it be that the acyl-CoAs are hydrolysed before uptake only to be reformed inside the cells? I would suggest feeding the NBD-palmitate plus the lysolipids to the cells as a control to see whether this is the case.

      2) In fig 1 as an example they choose a region to blow up. As one can see there is a large variation, even in the blowups of mitochondrial labeling and if one looks at the originals the variation is confirmed. How have they chosen these areas? Furthermore, in figure 1 there is quite a bit of label with MLCL outside of the mitochondria, in particular in regions that they did not choose to blow up. What are these structures? Remodeling of MLCL is thought to take place in mitochondria.

      3) They speak of transport of lipids from ER to mitochondria, but in fact the demonstration of this is very weak from what they show in the time course in supp fig 1. I am also disturbed by the difference in the NBD-PA patterns in a and b. They should be the same, but there are problems, maybe focus? I would say anyway that there is no clear evidence that the NBD-PA first appears in the ER then goes to mitos. It could be synthesized in both compartments from their data.

      4) The product characterization by TLC is insufficient. There are no standards, no characterization. Would they have seen the free NBD-palm by their methods?

      5) When they use mutants and find less "transport" the mitochondrial signal as seen by mitotracker is always more diffuse. This indicates to me that there is another problem.

      6) In fig 3 the fluorescent pictures do not correspond to what is seen in the quantification. There is more yellow in e than in h.

      7) How did they add cholesterol at 50 or 100 micromolar? It is soluble at less than 1 micromolar in aqueous solution. The cholesterol experiments are puzzling. From what we know about StAR protein, it recognizes cholesterol, not esters. There is no precedent for cholesterol ester transport into mitochondria. Can they rule out that the esters are transported to the surface of the mitochondria and the NBD-Palm cleaved off and transported into the mitochondria?

      8) The MAG and DAG experiments are overinterpreted. It could just be a kinetic problem since the MAG gets converted to DAG before TAG.

      9) They compare to externally added NBD lipids, but we don't know which ones they used. Are they using short chain NBD phospholipids? I could not find this in their manuscript. If they do not have the same NBD-palm in the sn-2 position then the comparison is meaningless.

      10) The excitation and emission spectra of their probes are sometimes overlapping. How did they deal with this? Are they sure that they are not seeing FRET?

    4. Summary: Zhang et al. describe an interesting method to label newly synthesized lipids with fluorescent fatty acids and track their movement in cells. All reviewers agreed that this could potentially be a useful tool. However, they all raised concerns regarding the rigor of the characterization of this methodology.

    1. Reviewer #3:

      In this manuscript, Icke and colleagues show that the secreted protein CexE/Aap from enterotoxigenic E. coli is acylated at an N-terminal glycine and suggest that acylation is required for secretion via a Type I Aat secretion system to the cell's surface or into the environment. The key finding is the identification of an N-acyltransferase (AatD) encoded nearby cexE/aap and demonstration that this enzyme is required for acylation.

      There is a concern about the novelty of the findings. The publication by Belmont-Monroy et al. (PLoS Pathogens, August 2020) cited by the authors is very similar to the current manuscript. That publication demonstrated that N-acylation of Aap (a CexE homolog) occurs at its N-terminal glycine (made available after signal peptide cleavage), that acylation is dependent on the acyltransferase AatD, that acylation is required for Aap secretion, and that N-terminal residues are sufficient for acylation of a heterologous protein (though this was poorly analyzed in that paper). Almost all of those findings are shown in this current manuscript by Icke et al., independently confirming the acylation reaction.

      This Icke et al. study is well done and convincing on the AatD-dependent acylation of CexE/Aap. Overall, the same conclusions are drawn as Belmont-Monroy et al., 2020. The major new advance (not previously described) is the observation that the N-terminal glycine is required for N-acylation by AatD.

      As described in my comments (below), the manuscript could be improved in a few instances by including key controls to support the conclusions. In other instances, broad conclusions are made from narrowly focused data and the text should be revised.

      Major comments:

      1) "To our knowledge this is the first report of enzyme mediated N-palmitoylation in nature". This statement is not correct. The lipoprotein N-acyltransferase Lnt (used as a reference for AatD analysis in this manuscript) performs N-palmitoylation (C16:0) in E. coli and distantly related bacteria such as mycobacteria/corynebacteria. See Jackowski & Rock 1986 (JBC 261, 11328-11333), Hillmann et al. 2011 (JBC 286, 27936-27946), Brulle et al. 2013 (BMC Microbiology 13, 223).

      2) The conclusion that "we reveal a new function for acylation - protein secretion" is not fully supported. The authors do not directly show that the secreted CexE is acylated (Fig 2A) or that acylation is required for secretion. The use of 17-ODYA is innovative and could be used to show that secreted supernatant CexE is acylated. The CexE N-terminal substitution mutants that are not acylated (Fig 7C) could be used to test if acylation is required for secretion.

      3) If the secreted CexE is acylated, some discussion is needed. How is the acylated form sometimes secreted into the aqueous environment but sometimes embedded in the outer membrane as shown in the model?

      4) Can the authors show/detect CexE acylation in the native system that doesn't rely on overproduction of the CfaD transcription factor? Is the observed acylation physiological or a consequence of strong overexpression?

      5) Claims of novelty in text should be altered following Belmont-Monroy et al., 2020.

    2. Reviewer #2:

      I think this is a superb manuscript - it is written in a clear way, such that the story starts at the historical understanding of lipoprotein trafficking and builds up convincingly using various experimental methods to show that a new class of lipoproteins is trafficked via acylation of glycine, through the Aat secretion system.

      It is highly exciting that a protein that does the acylation AND the secretion from the periplasm to the cell surface has been identified! Next step is to get a structure.

      The data are convincing and the paper is extremely well-written. My one comment is that I am not convinced by the argument that CexE would not be recognised by the Lol system. Once acylated, it likely would be, as the hydrophobic pockets of LolA and LolB are fairly indiscriminate (see, e.g., the binding of small hydrophobic molecules to these proteins). The authors should comment on this aspect.

      It is intriguing how glycine in particular is recognised for acylation.

      Overall, a great paper - authors should be commended.

    3. Reviewer #1:

      This study from the Henderson laboratory describes the identification of a hybrid secretion system involved in the acylation and trafficking of a conserved class of bacterial lipoproteins. Spurred by the serendipitous observation of posttranslational modification, Icke et al. identify the AatD protein as the factor responsible for CexE acylation. Combining alignment of conserved sequences and structural data, the authors isolate the site of acylation on the CexE polypeptide and identify AatD residues responsible for catalysis. Overall, this is a strong manuscript, densely packed with supporting data and extremely well written.

      My only significant concern is the issue of novelty. Although the authors seem to imply they are the first to report this type of system, they cite a 2020 PLOS Pathogens paper by Belmont-Monroy detailing nearly identical results in Enteroaggregative E. coli. Given the significant amount of overlap between these two manuscripts, it would seem prudent for the authors to spend some time in the introduction and discussion highlighting open questions that this paper addresses.

    4. Summary: All three reviewers were enthusiastic about the identification and characterization of a hybrid secretion system involved in lipoprotein acylation and trafficking. We were impressed by the strength and extent of the data and the clever use of genetic, biochemical and bioinformatic approaches. At the same time, there was agreement that the conclusion that acylation is involved in CexE secretion is not fully supported. There was also consensus that the overlap between this study and the 2020 PLOS Pathogens paper from Belmont-Monroy necessitates more direct acknowledgement.

    1. Reviewer #2:

      The manuscript prepared by Kim and colleagues provides a solid attempt at understanding the neural correlates associated with self-reassurance and self-criticism in relation to what they term neural pain. While it is well written and there is a clear story presented here, there appears to be insufficient detail in the introduction and discussion. The methodology appears sound for the most part, but I have some concerns relating to stimuli and gender effects that I believe would make the findings more compelling if addressed.


      1) The example items of the neutral statements appear to involve an external agent (i.e., a reference to a friend), while the neural pain is purely about the self. Are there also references to other people in the neural pain condition? If not, how have the authors ensured that the neutral condition is actually neutral? It seems likely that the inclusion of an external agent for many of the neutral statements could pose problems with interpretation, especially when talking about self-criticism and self-reassurance. The presence of an external agent in the neutral statements changes the meaning from a purely self-oriented experience to a shared experience.

      2) I am curious as to why the inverse contrasts (i.e., reassurance - criticism) were not run? Knowing whether there was a unique network associated with self-reassurance would provide a more comprehensive understanding of the authors' findings.

      3) I am wondering why the authors did not accommodate for gender differences in their study? Given recent evidence (See citation) it seems likely that this may play a part in self-compassion. The authors report an almost equal distribution of males and females, so it should be possible. If the authors did explore this and found no difference with gender as a regressor then this should be noted in the manuscript. Mercadillo, R. E., Díaz, J. L., Pasaye, E. H., & Barrios, F. A. (2011). Perception of suffering and compassion experience: brain gender disparities. Brain and cognition, 76(1), 5-14.

      4) It seems as though a whole body of literature is being very lightly touched on here but would benefit from inclusion. I think it would be useful to have some information in the introduction regarding moral emotions (i.e., compassion) and the link with empathy and emotion regulation (see work by Jean Decety). This would also be beneficial for the discussion as the authors are in essence describing empathy.

    2. Reviewer #1:

      This is a potentially interesting analysis, but there is a lack of framing, details, and specificity that dampens my enthusiasm for the work.

      1) As far as I can tell, the authors do not really demonstrate that "markers of negative emotion and pain" can be down-regulated during self-reassurance. They simply show that regions surviving multiple comparisons change depending on condition, but they don't show data supporting their hypothesis. How much do regions activated during criticism actually change during reassurance? What is the time course of these differences?

      2) Behaviorally, the neutral statements from the two "conditions" appeared to have distinct intensity levels. Specifically the "intensity" for neutral trials during criticism blocks appears significantly lower than neutral trials for reassuring blocks. Because of this behavioral effect, within their design it is difficult to identify the cause of the brain changes.

      3) How were subjects trained in self-criticism vs. reassurance? Is there any way to confirm that they were in fact doing the "task"? Further, at what point in the 2-week compassion training paradigm were fMRI data collected?

      4) Figure 2 is quite confusing to me: (1) the authors refer to brain maps as "neural pain"? I would strongly advise against this as it is very reverse-inferency. I would recommend against using this phrase throughout the paper. (2) How would one interpret the phrase "neural pain during self-reassurance"? Is this emotional > neutral during reassurance?

      5) Figure 3 refers to "trial by trial ratings of intensity" but if I am understanding the figure, this is not an accurate description. The authors are reporting the mean across subjects for each condition. It is unclear in fact how much variability there is on a trial-by-trial level within persons for the intensity of each condition. One idea is to use an amplitude modulation analysis to scale fMRI parameter estimates by the intensity rating on a per-trial basis. That would be an interesting analysis, IMO.

      6) It is unclear from this paper what was done previously. It appears that the authors examined physiological data (e.g., HRV) in their previous report but don't talk about other measures that were collected here. It would be useful to know the extent to which they buttress the authors' findings (or if they do not).
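
      The amplitude modulation analysis suggested in point 5 can be sketched as follows. This is a minimal toy illustration, not the authors' pipeline: the double-gamma response function, the onset spacing, the 1-to-7 rating scale, the noise level, and all function names are assumptions made for the example. Each trial contributes one unmodulated regressor and one regressor scaled by its mean-centered intensity rating, and an ordinary least-squares fit recovers the rating-dependent effect.

```python
import math
import numpy as np

def hrf(dt, duration=32.0):
    """Double-gamma haemodynamic response (assumed shape: peak near 5 s, late undershoot)."""
    t = np.arange(0.0, duration, dt)
    peak = t**5 * np.exp(-t) / math.gamma(6)
    undershoot = t**15 * np.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def design_matrix(onsets, ratings, n_samples, dt):
    """Columns: intercept, unmodulated trial regressor, rating-modulated regressor."""
    idx = np.round(np.asarray(onsets) / dt).astype(int)
    centered = np.asarray(ratings, dtype=float) - np.mean(ratings)
    sticks = np.zeros(n_samples)
    modulated = np.zeros(n_samples)
    sticks[idx] = 1.0            # every trial has unit height
    modulated[idx] = centered    # height follows the per-trial rating
    h = hrf(dt)
    main = np.convolve(sticks, h)[:n_samples]
    mod = np.convolve(modulated, h)[:n_samples]
    return np.column_stack([np.ones(n_samples), main, mod])

def simulate_and_fit(seed=0):
    """Simulate one voxel whose response scales with the rating, then fit it."""
    rng = np.random.default_rng(seed)
    dt = 1.0                                    # hypothetical sampling interval (s)
    n_samples = 600
    onsets = np.arange(10.0, 570.0, 14.0)       # hypothetical trial onsets (s)
    ratings = rng.integers(1, 8, size=onsets.size)  # hypothetical 1-7 intensity ratings
    X = design_matrix(onsets, ratings, n_samples, dt)
    true_beta = np.array([100.0, 2.0, 0.5])     # baseline, mean response, per-rating slope
    y = X @ true_beta + rng.normal(0.0, 0.05, n_samples)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return true_beta, beta_hat

if __name__ == "__main__":
    true_beta, beta_hat = simulate_and_fit()
    print("true:", true_beta, "estimated:", beta_hat)
```

      Mean-centering the ratings keeps the modulated regressor from duplicating the unmodulated one, so the third coefficient reflects only within-condition, trial-by-trial variation in intensity.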

    3. Summary: Kim and colleagues present a secondary analysis of an already published imaging dataset in 40 participants going through a two-week compassion training paradigm. They show participants standardized statements that are emotional or neutral and further have participants either engage in "self-criticism" or "self-reassurance" while considering the statements. The authors report on differences in brain regions (what they refer to as "neural pain") depending on criticism or reassurance condition. Concerns with the conceptual framework, approach, and interpretation substantially dampened our enthusiasm.

    1. Reviewer #3:

      Three different anti-asprosin mAbs were produced and tested in different metabolic syndrome animal models. Beneficial effects were noted on body weight, food intake and blood glucose and insulin levels. The effects were modest, but seemed to be relevant to elevated asprosin levels, as the antibody blocked the effects of adenoviral overexpression of the hormone. Some issues require attention:

      1) Additional characterization of the asprosin-neutralizing effect of the antibody is required. It will be helpful to show the endogenous free asprosin levels at different time points after a single or repeated mAb injection. This result is also important to tell whether this mAb will cause other immune responses and side effects that might confound interpretation of the results.

      2) In Figure 3 (a, e, j) and Figure 4 (a, e, i, m), please show body weight to rule out the stress or side effects caused by virus injection. For DIO mice, 14 days of IgG injection also caused weight loss; for db/db mice, IgG injection increased body weight. Please discuss.

      3) Although adenovirus and AAV are widely used for in vivo protein overexpression, it is important to show here that endogenous asprosin levels were increased after virus injection and decreased after antibody neutralization.

      4) In Figure 5, more data on liver weight, histology, etc. are required to support the conclusion on liver health. The current data from the three different mouse models are contradictory; this could be caused by side effects or off-target effects of this mAb.

      5) In Figure 6, it is important to demonstrate the neutralizing effect of the mAbs.

    2. Reviewer #2:

      Asprosin, as identified by the authors' group, is reported to stimulate glucose release from the liver and also centrally act as an orexigenic hormone. The present study developed monoclonal antibodies for asprosin and demonstrated that antibody-based asprosin depletion lowered food intake, prevented diet-induced body weight gain, and lowered blood glucose levels in mice. Overall the data are supportive of the conclusion; however, several concerns were identified as follows:

      1) One of the central issues is the specificity of the antibody action. The authors should demonstrate whether the effect of the asprosin antibodies is blunted in mice that lack either asprosin or its receptor OR4M1.

      2) Previous studies from the authors' group show that asprosin acts on hepatocytes and triggers cAMP signaling. The authors should examine whether the neutralizing antibodies blunt the cAMP signaling in DIO mice.

      3) Similarly, asprosin was shown to stimulate AgRP+ neurons. The authors need to demonstrate the effect of asprosin antibodies on AgRP+ neuronal activity.

      4) A recent paper (von Herrath et al. Cell Metabolism 2019) challenged the authors' observation of the metabolic action of asprosin. The authors claim that this is "due to use of poor quality recombinant asprosin". However, no scientific evidence was presented. This study needs a more rigorous assessment of data reproducibility.

      5) Most of the bodyweight data are presented as "body-weight change". However, the authors should present them as whole-body weight.

      6) Some of the data points and statistical analyses require further clarification, e.g., the lack of SE in Fig. 1c and the statistical analysis of Fig. 3 and Sup. Fig. 1.

    3. Reviewer #1:

      The study is interesting and does have potential translational relevance. There are some concerns, however: (1) In Figure 1 the blood glucose drops independent of food intake. Is this all related to decreased hepatic glucose output, or are there any effects on urine? Was urinary glucose measured? Is there increased glycosuria? (2) In previous papers you discuss the increased lean body mass when asprosin is not present. There are no body composition data in this study. Were there any body composition differences with the antibody among the different mouse models (e.g., DIO vs. NASH diet)? (3) Were any changes in lean body mass with the antibodies associated with increases in strength? (4) Several mouse ages are discussed in the Methods section: 12 weeks, 16 weeks, 12 weeks of high-fat diet or 24 weeks of NASH diet. It is not clear from the description whether mice were matched for age. Please clarify. (5) In Figure 5 there are a number of inflammatory markers, which can vary according to the model. Data on anti-inflammatory markers (cortisol, IL-10, etc.) would be helpful to get a better picture of the physiologic changes.

    4. Summary: Mishra et al. present data characterizing the effect of asprosin-neutralizing antibodies on the parameters of metabolic syndrome (weight, glucose, lipids, etc). This group initially discovered asprosin and characterized it as a hormone that increases blood sugar and stimulates appetite. In their Nature Medicine 2017 article they also present data on a neutralizing antibody. In this follow-up manuscript the group characterizes the impact of neutralizing monoclonal antibodies on metabolic parameters of three mouse models of obesity (DIO, NASH diet and leptin receptor knockout). The translational focus of the manuscript is the potential use of monoclonal antibodies against asprosin as a treatment for metabolic syndrome.

    1. Reviewer #3:

      This paper shows that during a second-order conditioning (SOC) task, a conditioned outcome is represented in the lateral orbitofrontal cortex (lOFC). The BOLD signal in this region shows increased functional coupling with the amygdala for second-order conditioned stimuli that indirectly predict a negative outcome. The authors suggest these findings reflect a mechanism by which value is conferred to stimuli that were never paired with reinforcement.

      The paper tackles an interesting question concerning the neural mechanisms that support second order conditioning. The task design includes relevant controls and, on the whole, the findings support the claims made by the authors. I have a few questions about interpretation of the data, but my main suggestion would be to revise the framing of the article. There are many previous studies that have investigated the mechanisms that support second order conditioning which are not always given due credit. I believe this paper would benefit from placing the hypotheses and findings more firmly within the context of previous literature.


      1) The authors test the hypothesis that CS2 is directly paired with a neural representation of the US. They state that this hypothesis 'has never been tested to date'. However, a number of studies have shown evidence for and against this hypothesis (for example: Wimmer and Shohamy 2012; Wang et al., 2020; Barron et al., 2020). Can the authors clarify how the hypothesis tested here differs from those investigated previously? In addition, it is not clear to me how the four potential mechanisms they propose are really distinct from each other.

      2) Relatedly, given the authors use an SOC paradigm that differs from sensory preconditioning studies used by many previous authors, does the difference in task paradigm provide new insight? Do the authors expect the neural mechanism to be the same or different between their version of SOC and sensory preconditioning?

      3) Why is the behavioural data in Figure 1F bimodal for CS1 and CS2? i.e. what does choice probability of 0 for CS2+ vs CS2- mean for a given subject?

      4) To test the authors' hypothesis, is it not necessary to assess evidence for US in response to CS2? They instead report reactivation of US in response to CS1, and for the PPI it is not clear to me how the authors distinguish between CS1 and CS2 given the temporal proximity in their presentation (Figure 1D).

      5) For the PPI, is there a main effect of CS- and CS+ versus CSn in lOFC? If not, how does this affect interpretation of the PPI? On a separate note, is the effect reported in Figure 3 really in the hippocampus? Does it survive small volume correction using a hippocampal mask?

      6) The following is stated as a premise: "To form an associative link between CS2 and US, the reinstated US patterns need to be projected from their cortical storage site to regions like amygdala and hippocampus, allowing for convergence of US and CS2 information." This potentially seems fair for the hippocampus, with added reference to relevant literature (e.g. publications from Shohamy and Preston labs), but in my opinion the jury is still out on this one. It is not clear to me why we necessarily expect amygdala here.

      7) There are various strong statements that in my opinion need to be toned down in light of existing literature. For example, the paper claims this study is the first to show evidence for implicit inference. However, as far as I'm aware, Wimmer & Shohamy 2012 also found no evidence for explicit memory of stimulus-stimulus associations with no relationship between measures of explicit memory and decision bias. Similarly, the authors claim this paper is 'the only report so far of behavioral evidence for associative transfer of motivational value during human second-order conditioning', overlooking a large number of other studies that have shown similar behavioural effects.

    2. Reviewer #2:

      The authors investigate the neural correlates of second order conditioning in carefully designed behavioural experiments coupling multivariate fMRI and functional connectivity. They found that the lateral OFC, in connection with the amygdala, plays an important role. I think the paper represents a valuable addition to the human cognitive literature, where second order conditioning is surprisingly under-investigated. I have only a few suggestions to make.

      I encourage the authors to complement the multivariate analyses with a standard univariate analysis. To be clear, I do see the added value of the multivariate approach. However, given the extensive literature on the neural bases of conditioning using univariate analyses and the strong prediction about the directionality of the effects in the OFC (which should positively encode expected values and rewards), I think the paper would definitely benefit from including the univariate results for the main contrasts / variables.

      I am also curious to see the reaction times in the attentional control task analyzed to check if they were affected by the underlying conditioning procedure. Following the Pavlovian-to-Instrumental transfer theory, we should observe that the reaction times are slower for negative (aversive) stimuli and faster for positive (appetitive) stimuli.

    3. Reviewer #1:

      This manuscript by Luettgau et al. describes a study of second-order conditioning in humans. The behavioral task involved visual first- (CS1) and second-order cues (CS2) and gustatory outcomes (US). Behavioral results show that subjects preferred both the CS1+ and CS2+ over the CS1- and CS2-, respectively. MVPA shows that the CS1 evokes US representations in the lateral OFC, and that US representations in the amygdala increase over second-order conditioning. This study addresses an important and novel question. However, I have several major concerns regarding the study design and data analysis:

      1) I do not see how it would be possible to disentangle responses to the CS1 and CS2 in this task. The delay between the CS2 and CS1 is only 500 ms, which is not long enough to disentangle fMRI responses to the two CS.

      2) For the main "reinstatement" analysis, activity was averaged across both CS2 and CS1, and so it is unclear whether reinstatement is driven by the CS1 or CS2. The authors argue that "US reinstatement during SOC could only be faithfully attributed to the respective CS1, but not to CS2, since only CS1 had been directly paired with the US, and CS2 had not previously been experienced." However, this is only strictly true for the very first trial during which the CS2 could have gained full access to the US representation.

      3) In this regard, it is unclear why the authors did not use data from the first-order conditioning phase to test for US reinstatement. Although the 4-second delay between CS1 and US is still quite short, TR-wise MVPA could provide evidence that signals are related to the CS1 and not the US itself.

      4) Relatedly, the authors perform analyses suggesting that, from early to late phases of second-order conditioning, representations of CS2 in the amygdala became more similar to US representations. Although here they attempt to model fMRI responses to the CS1 and CS2 separately, there is no evidence that this was indeed successful. As I see it, the delay between the two CS is just not long enough to dissociate these responses.

      5) Is there evidence for a CS1 evoked reinstatement of the US in the amygdala, and a CS2 evoked reinstatement of the US in the lateral OFC? In theory these signals should exist, but independently testing for activity related to the two CS requires a task design where the two CS are presented in isolation or with long enough delay between them.
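
      The separability concern raised in points 1 and 4 can be illustrated with a minimal sketch. It assumes a double-gamma haemodynamic response, brief events, a fixed 20 s trial spacing, and a 0.5 s onset asynchrony between the two cues; none of these values are taken from the manuscript (stimulus durations are not given here), and the function names are hypothetical. The sketch correlates the predicted BOLD time courses of two event types whose onsets differ by a given lag.

```python
import math
import numpy as np

DT = 0.1  # simulation grid in seconds (assumed; finer than any realistic TR)

def canonical_hrf(duration=32.0, dt=DT):
    """Double-gamma haemodynamic response (assumed shape: peak near 5 s, late undershoot)."""
    t = np.arange(0.0, duration, dt)
    peak = t**5 * np.exp(-t) / math.gamma(6)
    undershoot = t**15 * np.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def regressor(onsets, run_length, dt=DT):
    """Predicted BOLD time course for brief events at the given onsets (s)."""
    sticks = np.zeros(int(round(run_length / dt)))
    sticks[np.round(np.asarray(onsets) / dt).astype(int)] = 1.0
    return np.convolve(sticks, canonical_hrf(dt=dt))[: sticks.size]

def regressor_correlation(lag, run_length=400.0, iti=20.0):
    """Pearson correlation between regressors for two cues separated by `lag` seconds."""
    first = np.arange(10.0, run_length - 40.0, iti)   # hypothetical onsets of the first cue
    second = first + lag                              # second cue follows after `lag`
    r = np.corrcoef(regressor(first, run_length), regressor(second, run_length))
    return float(r[0, 1])

if __name__ == "__main__":
    for lag in (0.5, 2.0, 8.0):
        print(f"lag {lag:4.1f} s -> r = {regressor_correlation(lag):.3f}")
```

      Under these assumptions the two regressors at a 0.5 s lag are almost collinear, so a general linear model cannot assign variance to one cue rather than the other. Longer or jittered intervals reduce the correlation, which is the design change point 5 asks for.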

    4. Summary: All reviewers agreed that the neural mechanisms by which value is conferred to stimuli that were never directly paired with reinforcement is an important topic. However, individual reviewers raised questions regarding the study design and data analysis. In particular, reviewers agreed it was not clear how you could distinguish BOLD responses to CS1 and CS2 given the temporal proximity of their presentation. They also wondered whether the current results would provide enough advance beyond previous work.

    1. Reviewer #3:

      This manuscript examines data from the Young Adult Human Connectome Project's 900-subject release to compare both structural and functional connections between iso-eccentricity bands in striate cortex and the fronto-parietal, cingulo-opercular, and default mode networks. The authors find that central vision is most strongly connected to the fronto-parietal network, which is associated with attention, while the far periphery is more strongly connected to the default mode network. The questions asked in this manuscript are of considerable interest to the field, and this study has the potential to be impactful. However, substantial work is needed to make the methods and results sufficiently clear and reproducible to the reader.

      Major Comments:

      A major problem throughout this paper is that the authors have not been very careful in documenting their methods, what they are plotting, or how they are supporting their assertions. This is a major shortcoming of the work. I do not believe there is sufficient detail in this paper as is to reproduce the methods, nor was I able to understand what precisely was calculated in the statistical tests reported.

      The amount of work that has been put into this project's quality control (at minimum, visual inspection and filtering of 900 MR images) is very impressive! This information should really be shared with the broader research community in order to make this manuscript more reproducible and in order to ensure that other researchers can simply use and cite the authors' efforts rather than repeating them. This could be as simple as a supplemental table or text-file that includes the subject IDs of those HCP subjects that were included in all analyses.

      It should be crystal-clear from the Methods section whether the manuscript's data were collected or reanalyzed by the authors. My understanding is that all of this manuscript's analyzed data are from the HCP database. However, had I read only the "Data Acquisition" section I would have been left with the strong impression that the authors collected the data themselves using the same kind of scanner and the same analysis pipelines as the HCP. Unless this is the case, the opening sentence of this section should probably be something like "All data were acquired and preprocessed by the Human Connectome Project (Van Essen et al., 2013)" [10.1016/j.neuroimage.2012.02.018]. It may also be wise to reference the HCP in the Acknowledgements section. Further information: https://www.humanconnectome.org/study/hcp-young-adult/document/hcp-citations. This should apply equally to the data and the preprocessing methods-i.e., if the quality control mentioned in the above comment was performed by the HCP and not the authors, that should have been explicit.

      P3, ❡6. This paragraph is critical to the methods but is not at all clear. In particular, the paragraph eventually describes seven eccentricity segments per subject, yet it does not explain what the eccentricity boundaries of these segments are, nor does Figure 2 show these segments. It isn't clear from the manuscript if these are ever used (rather than the 3 central/mid-peripheral/far-peripheral segments) or exclusively used.

      In looking at Figure 4, my first and strongest impression is that the central connectivity is very similar to the far-peripheral connectivity, and the z-score differences are incredibly small. Additionally, the legend does not make the quantities plotted very clear (these are based on the averaged z-scores across subjects?) so I'm left wondering how to assess any sort of significance. I have a similar reaction to Figure 5. More help is needed to understand these results.

      Given that this paper consists of a large analysis of a large existing dataset, it would be especially nice if the authors would make their source code and intermediate analysis files publicly available. Having access to the source code directly is virtually a requirement of making this kind of study reproducible and would mitigate many of my concerns about the ambiguities of the methods.

    2. Reviewer #2:

      In this work, Sims and colleagues use resting-state functional connectivity and diffusion tractography in human connectome project data to examine the connectivity of the central and peripheral aspects of the primary visual cortex. They find that central V1 connects more strongly to regions of the prefrontal cortex interpreted as the Fronto-parietal network than does peripheral V1.

      The idea that central V1 may be directly connected to control-related networks is an interesting one, and has fascinating implications for the study of top-down modulation of visual cortex function. However, I must say I am somewhat skeptical of these findings, for several reasons.

      First, I find the a priori anatomical basis for these proposed connections to be dubious. The authors themselves describe how Markov et al. explicitly conducted tract tracing with central V1 and found connections with posterior frontal and parietal cortex, but nothing with areas classically associated with the fronto-parietal cortex. The authors propose that the inferior fronto-occipital fasciculus may connect V1 with lateral prefrontal regions only in humans. However, they provide no evidence for this suggestion. Indeed, my understanding of the iFOF is that it connects to inferior and lateral occipital cortex (see e.g. figures from the Takemura study cited in this work). Can the authors better support the idea that the iFOF might be the route of connection between V1 and frontal cortex?

      Second, I am concerned that both 1) the Central V1 ROI employed in this work and 2) the inferior frontal cortex region showing strong FC with that Central V1 ROI overlap very closely with regions where we have seen poor BOLD signal in our own fMRI data (I would like to attach a figure if possible).

      We are not confident what the source of the poor signal might be in posterior occipital or inferior frontal cortex; we suspect the presence of large veins (possibly the transverse sinus in V1; see Winawer et al., 2010, Journal of Vision). In any case, the data quality is low enough that we believe our data should not be considered to represent actual neural function in those regions. Can the authors demonstrate convincingly that this is not the case in their HCP data?

      Third, I have an issue with the localization of effects in this paper. The paper describes effects in the fronto-parietal network throughout the manuscript, including the title. How surprising, then, that the strongest effects are not in the FP network at all! Figure 4A makes it very clear that the largest effects are in the IFG, which is outside the green outlines describing the extent of the fronto-parietal network, but inside the Default network. Figure 3A also supports this Default-centric localization, with Central V1 effects in posterior lateral parietal, medial parietal, and superior frontal cortex, all outside FP but inside Default. Since the FC effects are not actually primarily in FP, I see no reason why FP should be used as a mask in Figure 5. Indeed, the authors should show the localization of SC effects throughout the cortex, not just in FP. I also see no reason why these V1-Default connections should be characterized in any way as "attention" or "control".

      Fourth, I feel that these FC and SC differences are wildly over-interpreted. From the scale, the actual strength of FC and SC between central V1 and lateral parietal cortex is extremely weak (around Z(r) = .1 for FC and p-track = .1 for SC). Under no circumstances would I believe that either of those values represents any sort of real connection. Cortical regions with direct structural connections have much stronger FC values, as do regions that influence each other indirectly via multi-step connections. Further, very large portions of the brain probably have both stronger FC and SC to central V1 than these FP regions (the authors show this for FC but exclude this info for SC). Most glaringly, I certainly don't believe there is a "direct structural connection" as is claimed in the discussion--a claim based, strangely, on the spatial correspondence between the structural and functional maps, which really has nothing to do with any evidence for a direct connection.

      Finally, the authors must note that p values may not be used for spatial correlations between brain maps. This is because these maps are always highly autocorrelated, which violates the independence assumption of the correlation procedure.
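
      The point about autocorrelated maps can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the "maps" are one-dimensional random vectors, spatial smoothness is imposed with a circular moving average, and significance is assessed with a Fisher z approximation that treats every point as independent. None of these choices come from the manuscript, and the function names are hypothetical.

```python
import numpy as np

def smooth(x, width):
    """Circular moving average of `width` points; width=1 leaves x unchanged."""
    kernel = np.zeros(x.size)
    kernel[:width] = 1.0 / width
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(kernel)))

def false_positive_rate(n_points=500, width=1, n_sims=1000, seed=0):
    """Fraction of independent map pairs declared correlated at nominal alpha = 0.05."""
    rng = np.random.default_rng(seed)
    z_crit = 1.96  # two-sided 5% threshold for a standard normal statistic
    hits = 0
    for _ in range(n_sims):
        # The two maps are generated independently, so any "significant" r is a false positive.
        a = smooth(rng.standard_normal(n_points), width)
        b = smooth(rng.standard_normal(n_points), width)
        r = np.corrcoef(a, b)[0, 1]
        z = np.arctanh(r) * np.sqrt(n_points - 3)  # assumes n_points independent samples
        hits += int(abs(z) > z_crit)
    return hits / n_sims

if __name__ == "__main__":
    print("unsmoothed maps:", false_positive_rate(width=1))
    print("smoothed maps:  ", false_positive_rate(width=25))
```

      With unsmoothed maps the false positive rate stays near the nominal 5%, whereas smoothing inflates it several-fold because the effective number of independent points is far smaller than the number of vertices or voxels. Null models that preserve the spatial autocorrelation of the maps are one way to obtain valid p values in this setting.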

    3. Reviewer #1:

      This manuscript extends prior work by the authors (Griffis et al, 2017), which originally reported eccentricity-dependent differences in resting state connectivity between V1 and regions brain-wide. This study builds on that work by expanding the pool of participants, using the HCP dataset, and by investigating any eccentricity-dependent effects that may emerge with tractography. Interestingly, both measures find that foveal areas in V1 are more strongly connected to frontoparietal networks. The study is interesting, but I have a few remaining points.

      1) While during the resting state scans, there was, in theory, no 'task', participants were asked to maintain fixation on the cross in the center of the screen throughout the scan. I think it would be important for the authors to note that there is a possibility that the resting state correlations observed wherein foveal areas were more correlated with frontoparietal regions (and far periphery with DMN areas) could be due to attention directed towards the fixation cross, and away from the periphery. While I acknowledge the authors have no way to test this with this data set, it is possible that if participants had been asked to covertly attend to a ring in their far periphery the entire time instead, the correlations might have been flipped, with frontoparietal connectivity highest in the periphery towards the attended eccentricity. The authors should either explain why this is not a concern, or acknowledge it in the manuscript.

      2) Related to the last point, what was the size of the screen used during the connectivity data acquisition? I ask because the far eccentricity bands determined using Benson et al's technique are very eccentric. If participants had their eyes open and were fixating, was that eccentricity outside the outer edge of the screen? If so, that region would effectively be 'unattended', thereby potentially influencing connectivity results.

      3) Was there any attempt at replicating these results in extrastriate cortex? Are these patterns still there, both in structural and functional connectivity, for V2 or V3?

    4. Summary: The manuscript is a replication of findings from Griffis et al., 2017, and it seeks to validate those findings using a different modality (diffusion-weighted imaging; DWI). While the questions asked in this manuscript are of considerable interest to the field, the findings' focus and implications are relatively narrow. Further, the study does not reveal new conclusions about brain function or organization. Authors may be cautious about interpreting the findings as representing direct structural connections between the occipital and frontal cortex -- as the reported structural and functional connectivity values may not be strong enough to support such a strong interpretation. The reviewers also agree that the methods are not presented clearly, in a manner that is straightforward to follow and critique.

    1. Author Response:

      This response corresponds to the essential revisions sent to the authors after review.

      1) Further characterization and clarification are needed regarding the sensor properties. This is crucial for the potential users in the field to judge and use the sensor, and for interpretation of the biology results using the sensor.

      We are grateful to the reviewers and editors for raising such important questions regarding the characterization of sensor properties. The feedback helps clarify important aspects of the sensor.

      i) Clear statement in prominent places about the improvement of the sensor and new potential for its biologic applications separating from the authors' 2015 paper.

      Previous enzyme-based biosensor designs, including the ChOx biosensor described in our 2015 publication (Santos et al, 2015), were based on the differential coating of electrode sites with matrices containing or lacking ChOx. These modifications render the sites Ch-sensitive or insensitive, respectively. The latter have been termed “sentinel” sites, as they are designed to respond to any perturbation except to the analyte of interest (Ch in this case). By subtracting the sentinel from the Ch-measuring site, this approach has been useful to decrease the contribution of interferent signals, namely those caused by electrochemical oxidation of electroactive compounds or by voltage fluctuations associated with LFP. However, cross-talk caused by H2O2 diffusion from enzyme-coated to sentinel sites poses important constraints on this design. The inter-site spacing required to avoid diffusional cross-talk leads, for example, to uncontrolled differences in the amplitude and phase of LFP across sites, compromising common-mode rejection.
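
      The subtraction logic described above can be sketched as follows. This is a minimal illustration and not the processing pipeline used in the study: the 8 Hz sinusoid standing in for LFP, the constant analyte current, and the gain values are all hypothetical, and only an amplitude mismatch (not a phase mismatch) is modelled.

```python
import math

def residual_artifact(gain_sentinel, n=1000, dt=0.001):
    """Peak artifact left on the cleaned signal after sentinel subtraction."""
    analyte = 0.5  # constant analyte-related current on the sensing site (a.u.)
    peak = 0.0
    for i in range(n):
        t = i * dt
        lfp = math.sin(2 * math.pi * 8 * t)   # shared LFP-like artifact (a.u.)
        sensing = analyte + lfp               # Ch-sensitive site: analyte + artifact
        sentinel = gain_sentinel * lfp        # sentinel site: artifact only
        cleaned = sensing - sentinel          # common-mode rejection by subtraction
        peak = max(peak, abs(cleaned - analyte))
    return peak

matched = residual_artifact(1.0)      # both sites see the same artifact amplitude
mismatched = residual_artifact(0.8)   # sentinel sees 80% of the artifact amplitude
print(matched, mismatched)
```

      When both sites see the artifact with the same amplitude, subtraction removes it almost entirely. A 20% amplitude mismatch leaves roughly 20% of the artifact on the cleaned signal, which is the sense in which site-to-site differences compromise common-mode rejection.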

      In the current study, we have circumvented diffusional cross-talk-related limitations by implementing a novel sensing approach. Rather than changing the coating composition across recording sites, we have differentially modified their electrocatalytic properties towards H2O2, resulting in Ch-sensitive and pseudo-sentinel sites. As Ch responses depended solely on the intrinsic properties of the metal surface, we could dramatically reduce the size and increase the spatial density of recording sites by using a tetrode configuration. Tetrodes, a bundle of four twisted wires glued together, are conventionally used for separating single neuron action potentials based on the spatial structure of their action potentials across wires. Here, the spatial structure of the electrochemical signal is created by electrochemical modification of wires. Importantly, this design allows the unbiased measurement of ChOx activity and O2 in the same brain spot by using a tetrode site to directly measure the latter. This has not been possible to achieve with conventional enzyme-based biosensor designs, including our own previous stereotrode design.

      We acknowledge that the improvements of the TACO sensor over our previous stereotrode design, published in 2015 (as well as other conventional enzyme-based biosensors in general), were not clearly emphasized in the manuscript. We added new paragraphs/sentences in the introduction and results of the revised manuscript (page 4 lines 10-16, page 5 lines 6-15 and page 6 line 8) highlighting the main difference between the two sensors and advantages of the new design for the unbiased measurement of the signals derived from ChOx activity (COA) and O2.

      ii) Regarding the choline responses: characterizing the linearity of choline response is important for users to understand the sensor properties.

      Responses to choline were highly linear within the concentration range tested (up to 30 μM). This information was added to Table 1 and mentioned in the text (page 7, line 18) of the revised manuscript.

      Relatedly, a demonstration of how to calibrate for movement artifacts in freely-moving rodents will be useful for future applications.

      Movement can cause electromagnetic or mechanical perturbations (movement artifacts) that are expected to scale with the impedance of individual recording sites. As the same applies for LFP-related currents, it is not trivial to discriminate both confounds. Nevertheless, our common-mode rejection approach, which is optimized by a frequency-domain correction of electrode impedances (please check Methods section, page 40, for detailed explanation), is designed to optimally remove both LFP- and movement-related artifacts.

      In our freely-moving recordings we did not have prominent movement-related perturbations, probably due to the proximity of the head-stage to the sensor and the shielding effect of the grounded copper mesh that covers the implant. Nevertheless, candidate events likely caused by movement consisted of current deflections aligned to locomotion bouts, which were completely removed by common-mode rejection. In the revised manuscript we added the average raw traces triggered on locomotion bouts in Figure 2D, highlighting the usefulness of our method to remove putative movement-related artifacts in addition to LFP and other interferents. We have also added a brief mention of this issue on page 10, lines 32-35 and page 11, lines 1-2.

      Further, since the COA signal is confounded by phasic O2 fluctuations, the authentic changes in COA are potentially interfered by O2-evoked enzymatic responses. The interpretation of the signal interference needs to be clearly discussed, including O2-evoked changes, and other related signaling changes, like DA.

      The main focus of our study was to investigate the effect of physiological O2 fluctuations on the ChOx biosensor signal, which is given by the activity of immobilized ChOx, which we abbreviate as COA across the manuscript. In order to address this issue in an unbiased manner, it is essential to remove artifacts that directly generate currents on the electrode surface (please see response to point 1vi for details). Our TACO sensor was designed to optimize the removal of such confounds, resulting in a clean COA signal. As this signal reflects the activity of immobilized enzyme, it is sensitive to changes in O2, not only choline. Thus, the COA signal is not confounded, but rather modulated by changes in O2. Our main finding was that phasic O2 modulation of COA is a major confound of phasic Ch dynamics measurements using ChOx sensors in vivo in the brain. In this sense, the central tenet of the paper is that COA does not reflect authentic choline concentration dynamics, but rather a nonlinear function of Ch and O2 dynamics, with no feasible analytical approach to separate the two.
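
      The dependence of COA on both substrates can be illustrated with a toy model. The sketch below assumes a generic two-substrate saturation form (a product of two Michaelis-Menten terms) chosen purely for illustration. It is not the kinetic model of ChOx used in the manuscript, and the function names, constants and concentrations are hypothetical.

```python
def coa(ch, o2, vmax=1.0, km_ch=100.0, km_o2=50.0):
    """Toy enzyme activity as a saturating function of choline and O2 (a.u.)."""
    return vmax * (ch / (km_ch + ch)) * (o2 / (km_o2 + o2))

def apparent_choline(signal, o2_assumed, vmax=1.0, km_ch=100.0, km_o2=50.0):
    """Invert the toy model for choline while assuming O2 is fixed."""
    s = signal / (vmax * o2_assumed / (km_o2 + o2_assumed))
    return km_ch * s / (1.0 - s)

baseline = coa(ch=5.0, o2=30.0)
o2_transient = coa(ch=5.0, o2=45.0)   # choline unchanged, O2 transiently higher

# Reading both signals as if O2 had stayed at its baseline value:
ch_at_baseline = apparent_choline(baseline, o2_assumed=30.0)
ch_at_transient = apparent_choline(o2_transient, o2_assumed=30.0)
print(ch_at_baseline, ch_at_transient)
```

      In this toy setting the O2 transient is misread as a choline increase even though choline was held constant. With a single COA signal and two unknowns, there is no way to tell the two cases apart unless O2 is measured independently.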

      We recognize that, in the Methods section, the description of how the COA signal was computed could lead to confusion between authentic COA and authentic Ch measurement. In the revised manuscript we have changed the terms used in the signal cleaning procedure (page 40-41).

      Regarding neurochemical confounds (e.g. ascorbate or dopamine and other monoamines), we acknowledge that the description of multichannel sensor properties in Table 1 could be confusing to readers. The table was also not conveying the important information on how sensitive our COA measurement is to these artifacts. In the revised manuscript we have removed the information about selectivity ratios for individual sites. Instead, the table section now called “Analytical properties for COA measurement” was expanded and now shows DA and AA sensitivities and selectivity ratios for the COA signal, computed from the difference between Au/Pt/m-PD and Au/m-PD sites.

      Additionally, we added a column in the color plot in Figure 1E describing the relative responses of the COA measurement to the different factors. This addition highlights the high selectivity of the COA signal for Ch, as compared with individual sites.

      Finally, we have detailed the interpretation of the freely-moving signals triggered on SWRs and locomotion bouts. In the Methods section of the revised manuscript (page 41, lines 4-11), we clarify how the differential signals COAnon-mPD and NCC (neurochemical confounds) presented in Figure 2 (revised version) were computed. In the description of these results, we also explain how the response patterns of raw and cleaned signals can be used to infer the contribution of different sorts of artifacts, including movement- and LFP-related and those caused by neurochemicals (page 10 lines 26-35, page 11 lines 1-5).

      iii) The dimensions of the sensor head need to be specified and spelled out clearly. It seems to be around 50 um, but the text seems to suggest 150 um. The individual sensing elements are 17 um in diameter. If this is true, it is very exciting because it exhibits hemispherical diffusion yielding higher response and enhanced sensitivity. This may improve spatial and temporal resolution if this is indeed a much smaller sensor as a disk-shaped one.

      We thank the reviewers for referring to this point. It is an important detail that was not clearly stated in the manuscript. In the Methods section (page 34 of original manuscript), the description of the insertion of the tetrode inside a silica tube might have been misleading. In fact, the tetrode actually protrudes 1-2 cm out of the silica tube. This distance assures that the latter is not in contact with the brain in in vivo recordings. The cutting of the twisted ending of the tetrode results in four disc-shaped sensing elements with 17 μm diameter. The total diameter of the tetrode is approximately 60 μm. In the revised manuscript we have clarified and emphasized these details in the Methods section (page 36 lines 10, 15-16), in the results (page 6, lines 3-5) and with an additional cartoon in Figure 1A.

      iv) The role of the sentinels with differential plating is very interesting, but the function of the sentinels is not clear (p. 4 "canceling LFP-related currents"). They consume oxygen. Why does this not result in overlap of the diffusion layer for the choline sensor and therefore affect choline response? Please explain why differential electroplating was employed.

      We further clarified the role of the pseudo-sentinel sites on the removal of LFP-related currents and neurochemical artifacts and expanded the reasoning behind this approach. Please check the Introduction of the revised manuscript (page 4 lines 4-18, page 5 lines 6-15).

      When polarized at +0.6 V vs. Ag/AgCl, the pseudo-sentinel channels display a residual activity towards electrochemical oxidation of H2O2. This electrochemical reaction generates O2, but the effect on the local O2 concentration is negligible due to the poor sensitivity and very small electrode surface area (17 μm diameter disc). We measured O2 (head-fixed mice and in vitro) by electrochemical reduction at -0.2 V vs. Ag/AgCl at a pseudo-sentinel site (gold-plated without m-PD). In this case O2 is consumed, but to a very limited extent that does not affect the local O2 level in the sensor. In accordance with the expected lack of effect on O2 levels, we have confirmed that switching the applied potential on a gold-plated site between +0.6 V and -0.2 V vs. Ag/AgCl has no effect on the COA signal. In the revised manuscript we added a supplementary figure (Figure S4) describing this observation. Accordingly, we extended the discussion of this topic in the results section (page 13, lines 17-18).

      v). Please explain how time-dependent behavior of the sensor was measured. This process typically leads to the formation of a film on this electrode surface which can affect sensitivity. According to the authors' 2015 paper, the method for measuring the response time seems rather crude, and may overestimate the response time which is related to the mixing of the solution. This needs to be discussed.

      The sensor response times were estimated from the rise of the current in response to analyte additions in a stirred buffer solution, as described in the Methods section (page 40, lines 9-10 of revised manuscript). In the revised manuscript, we added a sentence to further clarify the use of this setup to estimate response times (page 37, line 29). Indeed, this setup is not the most appropriate to precisely determine response times due to the bias introduced by the analyte mixing time after its addition to the buffer. Our previous study (Santos et al, 2015) suggests however that the biggest contribution to the estimated response time is due to diffusion of Ch in the sensor coating. Besides the fact that we cannot precisely determine response times, it is noteworthy that real response times are faster than the values we report. This further highlights the high temporal resolution of the TACO sensor. We added a paragraph discussing this topic in the revised manuscript (page 7, lines 19-21).
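
      The bias introduced by mixing can be sketched with a toy model. It assumes that both analyte mixing in the stirred buffer and the sensor response behave as first-order lags in series. The time constants are hypothetical and do not correspond to measured values.

```python
import math

def t90(tau_mix, tau_sensor, dt=0.001, t_max=60.0):
    """Time (s) for the sensor output to reach 90% of a unit analyte step."""
    bulk = 0.0   # normalised analyte concentration at the sensor surface
    out = 0.0    # normalised sensor current
    t = 0.0
    while t < t_max:
        if tau_mix > 0:
            bulk += dt * (1.0 - bulk) / tau_mix   # first-order mixing in the buffer
        else:
            bulk = 1.0                            # idealised instantaneous mixing
        out += dt * (bulk - out) / tau_sensor     # first-order sensor response
        t += dt
        if out >= 0.9:
            return t
    return float("nan")

sensor_only = t90(tau_mix=0.0, tau_sensor=0.5)   # intrinsic sensor response
with_mixing = t90(tau_mix=1.0, tau_sensor=0.5)   # what a stirred-beaker test sees
analytic = 0.5 * math.log(10)                    # exact t90 of a single first-order lag
print(sensor_only, analytic, with_mixing)
```

      Under these assumptions the apparent t90 in the stirred solution is always longer than the intrinsic one, so values estimated this way are upper bounds on the true response time.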

      vi). The effect of LFP and other perturbations of sensor responses need to be more clearly explained.

      Two main types of artifacts affect the response of enzyme-based electrochemical biosensors: electromagnetic or electrochemical sources that directly generate currents at the electrode surface and biochemical factors that affect the activity of the immobilized enzyme. The first group can be sub-divided into: a) artifacts that generate faradaic currents, arising from oxidation/reduction of electrochemically active molecules, such as ascorbate or dopamine; b) artifacts that change the charge distributions at the electrode surface, generating capacitive currents, which in the brain are mainly caused by local fluctuations in field potentials (LFP) generated by the transmembrane current sources of the surrounding neural tissue. Effectively, LFP causes potential changes at the electrode surface, whose voltage is clamped by the potentiostat circuit, giving rise to an apparent current, similar to voltage clamp measurement of the intracellular current. The second group, consisting of biochemical artifacts, comprises mainly the effect of oxygen on enzymatic activity (although other factors such as temperature and pH might have a minor effect, as discussed in the manuscript, page 34, lines 16-20).

      Importantly, the strategies devised to reduce artifacts that directly generate electrochemical currents (chemical surface modifications or common-mode rejection) do not control for factors influencing immobilized ChOx activity.

      Since O2 interference was the main focus of the paper and is thoroughly described throughout the manuscript, in the Introduction of the revised manuscript we extended the description of the factors directly generating currents on the electrode surface (page 4, lines 4-18).

      2) Re-organization of the manuscript to improve the readability. This manuscript contains the characterization of the TACO sensor and application of this sensor to monitor real-time behavior in freely moving rodents. The design and characterization of the sensor are intermingled with the application of studying the choline biology with the sensor, making the logic flow hard to follow. The arrangement and presentation of the figures need to be improved so readers can appreciate both characterization and applications aspects and how they are tightly linked. This might also involve properly arranging the main figures and associated supplementary figures.

      We believe this suggestion stems from the expectation that we may have conveyed to the readers rega